Brief-details: Quantized versions of Cohere's command-a-03-2025 model, offering multiple compression levels from 118GB down to 26GB with varying quality-size tradeoffs. Retains the base model's multilingual capabilities and runs on llama.cpp.
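Running one of these quants locally is straightforward with llama-cpp-python; a minimal sketch, assuming you have already downloaded a quant file (the filename below is illustrative):

```python
# Minimal sketch: loading a GGUF quant with llama-cpp-python.
# The local filename is illustrative; substitute whichever quant
# fits your RAM (e.g. a Q4_K_M file).
from llama_cpp import Llama

llm = Llama(
    model_path="command-a-03-2025-Q4_K_M.gguf",  # hypothetical local file
    n_ctx=8192,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to GPU if available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize GGUF quantization in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```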
Brief-details: Gemma 3B Instruct GGUF - Google DeepMind's lightweight, state-of-the-art text model with multiple quantization options for various hardware setups. 32K input context.
Brief-details: Comprehensive GGUF quantization suite for the QwQ-32B-Snowdrop model, offering 27 variants from 65GB down to 9GB, optimized for different RAM/performance trade-offs.
Brief-details: GGUF quantized versions of OlympicCoder-7B, ranging from 2.78GB to 15.24GB with varying quality-size tradeoffs for different hardware setups.
Brief-details: A specialized LoRA for Wan2.1 14B I2V that creates realistic hydraulic-press crushing animations from static images, trained for 20 epochs on crushing footage.
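Applying an image-to-video LoRA like this typically goes through diffusers; a minimal sketch, assuming a recent diffusers build with Wan 2.1 support (the base-model repo id and LoRA filename are illustrative):

```python
# Minimal sketch: Wan2.1 I2V with a LoRA applied. Repo id and LoRA
# path are illustrative; substitute the actual checkpoints.
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers",  # illustrative repo id
    torch_dtype=torch.bfloat16,
)
pipe.load_lora_weights("path/to/crush_lora.safetensors")  # hypothetical LoRA file
pipe.to("cuda")

image = load_image("press_subject.png")  # the static input image
frames = pipe(
    image=image,
    prompt="a hydraulic press slowly crushes the object",
    num_frames=81,
).frames[0]
export_to_video(frames, "crush.mp4", fps=16)
```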
Brief-details: A 32B-parameter coding-specialized LLM available in multiple GGUF quantizations (Q8_0 down to IQ2_XXS), suited to code generation and technical tasks.
Brief-details: Block Diffusion Language Model trained on OpenWebText, bridging autoregressive and diffusion approaches for text generation with 16-token blocks.
Brief-details: EuroBERT-610m is a 610M-parameter multilingual encoder supporting 15 languages and sequences of up to 8,192 tokens, suitable for a range of NLP tasks.
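A minimal sketch of using it as a feature extractor with transformers; trust_remote_code is assumed to be required for its custom architecture, and the CLS-style pooling here is an illustrative choice:

```python
# Minimal sketch: sentence embeddings from EuroBERT-610m.
import torch
from transformers import AutoTokenizer, AutoModel

name = "EuroBERT/EuroBERT-610m"
tok = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModel.from_pretrained(name, trust_remote_code=True)

batch = tok(["Hello world", "Bonjour le monde"], padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq, dim)
emb = hidden[:, 0]                             # first-token pooling (illustrative)
print(emb.shape)
```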
Brief-details: Ultra-lightweight Chinese LLM (26M-145M params) trained in 2 hours for $0.43. Includes pretraining, SFT, LoRA, and DPO implementations with minimal dependencies.
Brief-details: A tiny T5 variant built specifically for testing the TRL (Transformer Reinforcement Learning) library, with a minimal architecture meant for fast unit tests rather than real inference quality.
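This is how such tiny checkpoints are typically consumed in a test suite; the repo id below is hypothetical, so substitute the actual tiny-T5 checkpoint name:

```python
# Minimal sketch: a unit test built around a tiny T5 checkpoint -
# cheap to download and to run a forward pass through.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

def test_tiny_t5_generates():
    name = "trl-internal-testing/tiny-T5ForConditionalGeneration"  # hypothetical id
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    out = model.generate(
        **tok("translate: hello", return_tensors="pt"), max_new_tokens=5
    )
    assert out.shape[0] == 1  # one sequence back; output quality is irrelevant
```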
Brief-details: ConvBERT-base is a lightweight BERT variant developed by YituTech that replaces some self-attention heads with span-based dynamic convolutions, improving efficiency while maintaining BERT-like performance.
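Since it ships as a standard transformers architecture, it works as a drop-in encoder; a minimal sketch using the published YituTech/conv-bert-base checkpoint:

```python
# Minimal sketch: ConvBERT as a drop-in BERT-style encoder.
from transformers import AutoTokenizer, ConvBertModel

tok = AutoTokenizer.from_pretrained("YituTech/conv-bert-base")
model = ConvBertModel.from_pretrained("YituTech/conv-bert-base")

inputs = tok("Dynamic convolutions replace some attention heads.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, hidden_dim)
```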
Brief-details: 4-bit quantized version of a Llama 3.2 instruction-tuned model optimized for the MLX framework, offering efficient deployment on Apple Silicon.
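On an Apple Silicon machine, loading and prompting such a model takes a few lines with the mlx-lm package; a minimal sketch, with a repo id that is illustrative of mlx-community naming:

```python
# Minimal sketch: running a 4-bit MLX checkpoint with mlx-lm.
from mlx_lm import load, generate

# Illustrative repo id; substitute the actual MLX-converted checkpoint.
model, tokenizer = load("mlx-community/Llama-3.2-3B-Instruct-4bit")
print(generate(model, tokenizer, prompt="Explain 4-bit quantization briefly.", max_tokens=100))
```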
Brief-details: VRAM-48 is a HuggingFace entry published by unslothai that appears focused on optimizing VRAM usage for deep learning applications.
Brief-details: DBRX-Base by Databricks - a mixture-of-experts foundation language model (132B total parameters, 36B active per input) aimed at enterprise applications, with privacy-conscious data handling.
Brief-details: Llama-2-7b-chat is Meta's 7B-parameter chat-optimized language model, designed for dialogue applications with enhanced instruction-following capabilities.
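A minimal sketch of chatting with it through transformers using the built-in chat template; access to the gated meta-llama repo is assumed:

```python
# Minimal sketch: dialogue with Llama-2-7b-chat via transformers.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "meta-llama/Llama-2-7b-chat-hf"  # gated repo; request access first
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16, device_map="auto"
)

msgs = [{"role": "user", "content": "Give me three dialogue-system use cases."}]
ids = tok.apply_chat_template(msgs, return_tensors="pt").to(model.device)
out = model.generate(ids, max_new_tokens=200)
print(tok.decode(out[0][ids.shape[-1]:], skip_special_tokens=True))
```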
Brief-details: ECCO-BERT is a specialized BERT model trained on 18th-century UK documents, optimized for historical text analysis and downstream tasks on the ECCO (Eighteenth Century Collections Online) dataset.
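Probing a historical-language model like this with fill-mask gives a quick sanity check; the repo id below is hypothetical, so point it at the actual ECCO-BERT checkpoint:

```python
# Minimal sketch: fill-mask probing of a historical-text BERT.
from transformers import pipeline

fill = pipeline("fill-mask", model="TurkuNLP/eccobert-base-cased-v1")  # hypothetical id
for pred in fill("The [MASK] of London was crowded with carriages."):
    print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```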
Brief-details: GGUF-formatted version of Mistral-Small-Instruct-2409, optimized for local deployment with broad client support and GPU acceleration capabilities.
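Fetching a single quant file and running it with GPU offload is a common pattern; a minimal sketch, where the repo id and filename are illustrative of typical GGUF repo layouts:

```python
# Minimal sketch: download one quant file, then run it with GPU offload.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

path = hf_hub_download(
    repo_id="bartowski/Mistral-Small-Instruct-2409-GGUF",  # illustrative repo id
    filename="Mistral-Small-Instruct-2409-Q4_K_M.gguf",    # illustrative filename
)
llm = Llama(model_path=path, n_gpu_layers=-1, n_ctx=4096)
print(llm("Q: What is GGUF?\nA:", max_tokens=64)["choices"][0]["text"])
```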
Brief-details: A community-driven AI upscaling model from OpenModelDB, designed for image enhancement and super-resolution tasks, hosted on HuggingFace by uwg.
Brief-details: Mamba-Codestral-7B is a 7B-parameter model by Mistral AI that pairs the Mamba2 state-space architecture with code generation capabilities, optimized for programming tasks.
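A minimal sketch of code generation with it via transformers, assuming a transformers version with Mamba2 support; the mistralai repo may be gated, so acceptance of its terms is assumed:

```python
# Minimal sketch: code completion with Mamba-Codestral-7B.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "mistralai/Mamba-Codestral-7B-v0.1"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "def fibonacci(n):"
out = model.generate(
    **tok(prompt, return_tensors="pt").to(model.device), max_new_tokens=80
)
print(tok.decode(out[0], skip_special_tokens=True))
```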
Brief-details: Iroiro-LoRA is a collection of LoRA (Low-Rank Adaptation) adapters published on HuggingFace by 2vXpSwA7, primarily for fine-tuning Stable Diffusion image models.
Brief-details: SANA 1.5 is a 4.8B-parameter efficient text-to-image model built on a linear diffusion transformer (Linear DiT) architecture, capable of 1024px image generation with 60% lower training costs.
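A minimal sketch of 1024px generation, assuming a diffusers build that ships SanaPipeline; the repo id is illustrative of the Efficient-Large-Model naming on HuggingFace:

```python
# Minimal sketch: 1024px text-to-image with a SANA checkpoint.
import torch
from diffusers import SanaPipeline

pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers",  # illustrative repo id
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    prompt="a watercolor lighthouse at dawn",
    height=1024,
    width=1024,
    num_inference_steps=20,
).images[0]
image.save("sana_1024.png")
```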