Brief-details: 8B parameter imatrix-quantized GGUF model with multiple compression variants (2.1GB-6.7GB), optimized for efficient deployment and conversational tasks.
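Several of the GGUF entries in this list follow the same deployment pattern: pick one quant file from the repo and load it with llama.cpp or its Python bindings. A minimal sketch using llama-cpp-python, where the repo ID and quant filename are hypothetical placeholders rather than any model listed here:

```python
# Minimal sketch: load one GGUF quant variant and run a single chat turn.
# repo_id and filename are placeholders, not a real model reference.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="example-org/example-8b-GGUF",  # hypothetical GGUF repo
    filename="*Q4_K_M.gguf",                # select one quant variant by glob
    n_ctx=4096,                             # context window; raise if memory allows
)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF quantization in one sentence."}],
    max_tokens=64,
)
print(resp["choices"][0]["message"]["content"])
```

Smaller quants (Q2/Q3) trade answer quality for footprint, while Q4_K_M is a common middle ground, which is why these repos ship the whole range of variants.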
Brief-details: RQwen-v0.1-GGUF is a 14.8B parameter bilingual (English/Russian) language model with multiple GGUF quantization options for efficient deployment.
Brief-details: An 8B parameter GGUF-quantized language model featuring multiple compression variants (2.1GB-6.7GB), with imatrix quantization for a better quality/size balance.
Brief-details: 8B parameter GGUF-quantized language model offering multiple compression variants (Q2_K to f16), optimized for efficient deployment and inference.
Brief-details: 8B parameter GGUF-quantized language model with multiple compression variants (Q2-Q8), optimized for efficient deployment and inference.
Brief-details: An 8B parameter GGUF-quantized language model available in multiple quantization variants ranging from 3.3GB to 16.2GB in file size.
Brief-details: 7B parameter GGUF-quantized conversational model offering multiple quantization variants from 3.1GB to 15.3GB, optimized for efficient deployment and released under the Apache 2.0 license.
Brief-details: 8B parameter GGUF-quantized language model with multiple compression variants (Q2-Q8), optimized for efficient deployment and memory usage.
Brief-details: A 9.24B parameter GGUF-quantized Gemma model offering multiple compression variants, optimized for conversational AI with strong performance/size trade-offs.
Brief-details: 8B parameter GGUF model optimized for inference with various quantization options, featuring an extended 1024k-token context length and imatrix quantization.
Brief-details: 7B parameter GGUF model optimized for efficient inference with multiple quantization options (Q2_K to f16), offering a range of size/quality trade-offs.
Brief-details: Thor v1.1e, an 8B parameter GGUF model with various quantization options, optimized for inference with a 1024k context window.
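For long-context builds like the two 1024k entries above, the usable window is set at load time and is bounded by memory, since the KV cache grows with context length. A sketch assuming huggingface_hub and llama-cpp-python, with hypothetical repo and file names:

```python
# Sketch: fetch one specific quant file, then load it with an enlarged context.
# repo_id and filename are placeholders; n_ctx must fit in RAM/VRAM, so most
# machines will run far below the advertised 1024k maximum.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

path = hf_hub_download(
    repo_id="example-org/example-8b-1024k-GGUF",  # hypothetical repo
    filename="example-8b.Q5_K_M.gguf",            # hypothetical quant file
)
llm = Llama(model_path=path, n_ctx=131072)        # 128k here, not the full 1024k
```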
Brief-details: A 1.24B parameter LLaMA-based model fine-tuned with LoRA on NVIDIA's ChatQA dataset, optimized for conversational AI and QA tasks with a 1024-token context.
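When a card like this one ships its LoRA weights as an adapter rather than merged into the base model, loading follows the standard PEFT pattern. A sketch assuming transformers and peft, with both repo IDs as hypothetical placeholders:

```python
# Sketch: attach a LoRA adapter (e.g., a ChatQA-style fine-tune) to a small base.
# Both repo IDs are placeholders, not the actual model described above.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "example-org/llama-1b-base"            # hypothetical ~1.24B base model
adapter_id = "example-org/chatqa-lora-adapter"   # hypothetical LoRA adapter repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)  # applies adapter on top of base
```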
Brief-details: A 3.21B parameter multilingual model supporting 12 Indian languages, offered in various GGUF quantizations for efficient deployment.
Brief-details: Qwen2-VL-2B-Instruct is an ONNX-compatible vision-language model optimized for Transformers.js, enabling image+text to text generation with 2B parameters.
Brief-details: Behemoth 123B v2.2, an advanced Largestral 2411 finetune with system-prompt support, optimized for creative tasks; uses the Metharme format with Mistral tokens at 5-bit precision.
Brief-details: A quantized 7.62B parameter conversational AI model offering multiple GGUF variants optimized for different performance/quality trade-offs and hardware configurations.
Brief-details: RQwen-v0.1, a 14.8B parameter bilingual (EN/RU) instruction-tuned model based on Qwen2.5, featuring reflection tuning and strong logical reasoning capabilities.
Brief-details: A 1.24B parameter LLaMA-based conversational AI model optimized for mobile devices, featuring offline capabilities and instruction-following abilities.
Brief-details: A quantized version of the Sue Ann 11B model available in multiple GGUF formats, optimized for different performance/quality trade-offs and ranging from 4.1GB to 21.6GB.
Brief-details: 8B parameter merged language model built with the Model Stock method, combining multiple LLaMA-based models for text generation. Created by MrRobotoAI and released in FP16 precision.