BRIEF DETAILS: QwQ-32B quantized to INT4 using AutoRound algorithm with symmetric quantization. Optimized for efficiency while maintaining 99.5% of BF16 performance across benchmarks.
BRIEF DETAILS: Quantized version of OmniSQL-7B optimized for SQL tasks, offering multiple GGUF variants from 2.0GB to 6.4GB with imatrix quantization options for different performance/size tradeoffs.
BRIEF-DETAILS: OmniSQL-7B-GGUF is a quantized SQL-focused model offering multiple compression variants from 3.1GB to 15.3GB, optimized for SQL tasks and database operations.
Brief Details: QwQ-32B-8.0bpw-h8-exl2 is a 32.5B parameter reasoning model optimized for complex problem-solving, featuring 131K context length and advanced architecture components.
Brief-details: A 32B parameter LLM optimized for coding, mathematical reasoning, and problem-solving. Features enhanced memory utilization and supports 256K input tokens with multilingual capabilities across 35+ languages.
Brief-details: A 14B parameter language model from prithivMLmods' Opus series, optimized for general-purpose tasks with enhanced performance characteristics and Sm2 architecture improvements.
Brief Details: QwQ-32B quantized to 8.0 bits per weight using EXL2, achieving a perplexity score of 6.4393; an efficient variant of the 32B parameter model.
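The EXL2 entry above cites a raw perplexity figure. As context for that number: perplexity is the exponential of the mean per-token negative log-likelihood, so lower is better. A minimal sketch (the toy log-probability list is invented for illustration):

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """exp of the mean negative log-likelihood per token; lower is better."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# If every token were predicted with probability 1/6.44,
# perplexity would come out to ~6.44, matching the scale of the score above.
print(perplexity([math.log(1 / 6.44)] * 4))  # → ~6.44
```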
Brief Details: Sombrero-Opus-14B-Sm1 is a 14B parameter language model from prithivMLmods, optimized for specific tasks with compatibility for Opus architecture implementations.
BRIEF DETAILS: QwQ-32B-i1-GGUF is a quantized version of the QwQ-32B model, offering various compression levels from 7.4GB to 27GB with imatrix quantization for optimal performance trade-offs.
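Many entries above list file sizes for different quantization levels. Those sizes track a simple rule of thumb: parameters × bits-per-weight / 8, plus some overhead for tensors kept at higher precision and file metadata. A minimal estimator sketch (the 5% overhead factor is an assumed ballpark, not a GGUF specification value):

```python
def quant_size_gib(n_params: float, bits_per_weight: float,
                   overhead: float = 1.05) -> float:
    """Rough on-disk size of a quantized model in GiB.

    overhead covers tensors kept at higher precision (embeddings, norms)
    and file metadata; the 5% default is an assumption.
    """
    return n_params * bits_per_weight / 8 * overhead / 2**30

# A 32B model at ~2 effective bits lands near the low end of the
# 7.4GB-27GB range quoted above; ~6.5 effective bits lands near the high end.
print(round(quant_size_gib(32.5e9, 2.0), 1))
print(round(quant_size_gib(32.5e9, 6.5), 1))
```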
BRIEF-DETAILS: 4-bit quantized version of QwQ-32B using BitsAndBytes, optimized for efficient deployment while maintaining model quality
BRIEF DETAILS: Quantized version of Breeze-7B-FC offering multiple compression variants (1.8GB-6.2GB). Features imatrix quantization for optimal performance/size trade-offs.
Brief-details: Quantized version of Haphazardv1 with multiple GGUF variants optimized for different size/performance tradeoffs. Features imatrix quantization for improved efficiency.
BRIEF DETAILS: A quantized version of experimental_R1-8x22b offering multiple compression variants from 29.7GB to 115.6GB, with imatrix quantization options for optimal performance trade-offs.
Brief Details: Quantized version of L3.1-Athena-j-8B with multiple GGUF variants, optimized for different size/performance tradeoffs. Features imatrix quantization for improved efficiency.
BRIEF-DETAILS: Quantized version of Breeze-7B with multiple GGUF variants (2.7-15GB), offering flexible performance-size tradeoffs. Q4_K_M is recommended for balanced usage.
BRIEF-DETAILS: FP8-scaled version of the LTXV model optimized for ComfyUI, specializing in high-quality landscape and scenic image generation with precise prompt following.
Brief-details: An 8B parameter Vietnamese language model focused on step-by-step reasoning, using an XML format for structured responses and trained with GRPO.
BRIEF DETAILS: 8B parameter GGUF quantized model with multiple compression variants (Q2-Q8), optimized for efficient deployment. Features both standard and IQ-based quantization options.
Brief Details: MFANN-Llama3.1 is a quantized GGUF model offering multiple compression variants (Q2_K to Q8_0), optimized for different size-performance tradeoffs with sizes ranging from 3.3GB to 16.2GB.
Brief Details: Efficient GGUF quantized version of Llama 3.1 8B Instruct model optimized for Jopara language, offering multiple compression options from 3.3GB to 16.2GB
BRIEF-DETAILS: Quantized version of Qwen-2.5-7B-Woonderer offering multiple compression options (Q2-Q8), a 7B parameter model optimized for efficient deployment.