BRIEF DETAILS: QwQ-32B quantized to INT4 using AutoRound algorithm with symmetric quantization. Optimized for efficiency while maintaining 99.5% of BF16 performance across benchmarks.
BRIEF DETAILS: Quantized version of OmniSQL-7B optimized for SQL tasks, offering multiple GGUF variants from 2.0GB to 6.4GB with imatrix quantization options for different performance/size tradeoffs.
BRIEF-DETAILS: OmniSQL-7B-GGUF is a quantized SQL-focused model offering multiple compression variants from 3.1GB to 15.3GB, optimized for SQL tasks and database operations.
Brief Details: QwQ-32B-8.0bpw-h8-exl2 is a 32.5B parameter reasoning model optimized for complex problem-solving, featuring 131K context length and advanced architecture components.
Brief-details: A 32B parameter LLM optimized for coding, mathematical reasoning, and problem-solving. Features enhanced memory utilization and supports 256K input tokens with multilingual capabilities across 35+ languages.
Brief-details: A 14B parameter language model from prithivMLmods' Opus series, optimized for general-purpose tasks with enhanced performance characteristics and Sm2 architecture improvements.
Brief Details: QwQ-32B quantized to 8.0 bits per weight using EXL2, achieving a perplexity score of 6.4393; an efficient variant of the 32B parameter model.
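The EXL2 entry above cites a raw perplexity figure. As context for that number: perplexity is the exponential of the mean per-token negative log-likelihood, so lower is better. A minimal sketch (the toy log-probability list is invented for illustration):

```python
import math

def perplexity(token_logprobs: list[float]) -> float:
    """exp of the mean negative log-likelihood per token; lower is better."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# If every token were predicted with probability 1/6.44,
# perplexity would come out to ~6.44, matching the scale of the score above.
print(perplexity([math.log(1 / 6.44)] * 4))  # → ~6.44
```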
Brief Details: Sombrero-Opus-14B-Sm1 is a 14B parameter language model from prithivMLmods, optimized for specific tasks with compatibility for Opus architecture implementations.
BRIEF DETAILS: QwQ-32B-i1-GGUF is a quantized version of the QwQ-32B model, offering various compression levels from 7.4GB to 27GB with imatrix quantization for optimal performance trade-offs.
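Many entries above list file sizes for different quantization levels. Those sizes track a simple rule of thumb: parameters × bits-per-weight / 8, plus some overhead for tensors kept at higher precision and file metadata. A minimal estimator sketch (the 5% overhead factor is an assumed ballpark, not a GGUF specification value):

```python
def quant_size_gib(n_params: float, bits_per_weight: float,
                   overhead: float = 1.05) -> float:
    """Rough on-disk size of a quantized model in GiB.

    overhead covers tensors kept at higher precision (embeddings, norms)
    and file metadata; the 5% default is an assumption.
    """
    return n_params * bits_per_weight / 8 * overhead / 2**30

# A 32B model at ~2 effective bits lands near the low end of the
# 7.4GB-27GB range quoted above; ~6.5 effective bits lands near the high end.
print(round(quant_size_gib(32.5e9, 2.0), 1))
print(round(quant_size_gib(32.5e9, 6.5), 1))
```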
BRIEF-DETAILS: 4-bit quantized version of QwQ-32B using BitsAndBytes, optimized for efficient deployment while maintaining model quality
BRIEF DETAILS: Quantized version of Breeze-7B-FC offering multiple compression variants (1.8GB-6.2GB). Features imatrix quantization for optimal performance/size trade-offs.
Brief-details: Quantized version of Haphazardv1 with multiple GGUF variants optimized for different size/performance tradeoffs. Features imatrix quantization for improved efficiency.
BRIEF DETAILS: A quantized version of experimental_R1-8x22b offering multiple compression variants from 29.7GB to 115.6GB, with imatrix quantization options for optimal performance trade-offs.
Brief Details: Quantized version of L3.1-Athena-j-8B with multiple GGUF variants, optimized for different size/performance tradeoffs. Features imatrix quantization for improved efficiency.
BRIEF-DETAILS: Quantized version of Breeze-7B with multiple GGUF variants (2.7-15GB), offering flexible performance-size tradeoffs. Q4_K_M is recommended for balanced usage.
BRIEF-DETAILS: FP8-scaled version of the LTXV model optimized for ComfyUI, specializing in high-quality landscape and scenic image generation with precise prompt following.
Brief-details: An 8B parameter Vietnamese language model focused on step-by-step reasoning, using an XML format for structured responses and trained with GRPO.
BRIEF DETAILS: 8B parameter GGUF quantized model with multiple compression variants (Q2-Q8), optimized for efficient deployment. Features both standard and IQ-based quantization options.
Brief Details: MFANN-Llama3.1 is a quantized GGUF model offering multiple compression variants (Q2_K to Q8_0), optimized for different size-performance tradeoffs with sizes ranging from 3.3GB to 16.2GB.
Brief Details: Efficient GGUF quantized version of Llama 3.1 8B Instruct model optimized for Jopara language, offering multiple compression options from 3.3GB to 16.2GB
BRIEF-DETAILS: Quantized version of Qwen-2.5-7B-Woonderer offering multiple compression options (Q2-Q8), a 7B parameter model optimized for efficient deployment.