Brief Details: Compact Russian BERT variant (35M params) trained on 2T tokens, supporting 8K context window with strong performance on Russian NLP tasks
BRIEF-DETAILS: 8B parameter text-to-image model quantized to float8_e4m3fn, derived from Freepik's flux.1-lite-8B for efficient deployment
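A minimal sketch of running a Flux-Lite checkpoint with diffusers, assuming the repo ships diffusers-format weights; the repo id, sampling settings, and the bfloat16 upcast (float8_e4m3fn checkpoints are usually consumed by runtimes with native fp8 support) are assumptions, not confirmed details of this release.

```python
import torch
from diffusers import FluxPipeline

# Sketch: load a Flux-Lite checkpoint with diffusers.
# Repo id is an assumption; we upcast to bfloat16 for compute since
# fp8-stored weights are typically handled by fp8-aware runtimes.
pipe = FluxPipeline.from_pretrained(
    "Freepik/flux.1-lite-8B",      # assumed repo id
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()    # helps fit on a single consumer GPU

image = pipe(
    "a watercolor fox in a snowy forest",
    num_inference_steps=24,
    guidance_scale=3.5,
).images[0]
image.save("fox.png")
```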
BRIEF-DETAILS: Hunyuan-7B-Instruct is a powerful Chinese-focused 7B parameter LLM from Tencent featuring 256K context length, strong benchmark results, and grouped-query attention (GQA).
Brief Details: Baichuan-Omni-1.5 is a 7B parameter multimodal model supporting text, image, video, and audio I/O with state-of-the-art performance in medical imaging and real-time voice interactions.
BRIEF-DETAILS: A LoRA model trained using Flux, specialized in generating fashion and modeling images of a female character named ForestRoss in various settings and styles.
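A hedged sketch of attaching such a character LoRA to a Flux base model via diffusers; the base and LoRA repo ids, the "ForestRoss" trigger phrase, and the sampling settings are illustrative assumptions.

```python
import torch
from diffusers import FluxPipeline

# Sketch: attach a character LoRA to a Flux base model with diffusers.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.load_lora_weights("user/forestross-flux-lora")  # hypothetical LoRA repo id
pipe.enable_model_cpu_offload()

image = pipe(
    "ForestRoss wearing a tailored linen suit, studio fashion photo",
    num_inference_steps=28,
).images[0]
image.save("forestross.png")
```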
Brief-details: A specialized GGUF quantized version of the FuseO1-DeepSeekR1 32B model, offering multiple compression variants from 7.4GB to 27GB with varying quality-size tradeoffs.
BRIEF DETAILS: A 7B parameter GGUF-formatted instruction-tuned language model, optimized for research with Q6_K quantization, deployable via llama.cpp
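A minimal llama-cpp-python sketch for running a Q6_K GGUF file like this one; the local file name and chat prompt are placeholders, not part of the release.

```python
from llama_cpp import Llama

# Sketch: run a Q6_K GGUF checkpoint via llama-cpp-python.
llm = Llama(
    model_path="model-7b-instruct.Q6_K.gguf",  # hypothetical local file
    n_ctx=4096,
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize llama.cpp in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```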
Brief-details: Selene-1-Mini-Llama-3.1-8B is an 8B parameter evaluation (LLM-as-a-judge) model that outperforms larger models on scoring tasks, and supports multiple languages with a 128K context window.
BRIEF-DETAILS: 32B parameter LLM combining DeepSeek-R1 and Qwen2.5-Coder capabilities, specialized in long-short reasoning fusion for enhanced mathematics and coding tasks
BRIEF DETAILS: Optimized 11B vision-language model using Unsloth's Dynamic 4-bit quantization, offering 2x faster performance and 60% less memory usage while maintaining accuracy.
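One plausible way to load an Unsloth dynamic 4-bit vision checkpoint is through Unsloth's FastVisionModel wrapper; the repo id below is an assumption, and the snippet covers loading only, not the image-plus-text generation loop.

```python
from unsloth import FastVisionModel

# Sketch: load a dynamically 4-bit-quantized vision model with Unsloth.
# Repo id is assumed; load_in_4bit keeps the pre-quantized weights.
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Llama-3.2-11B-Vision-Instruct-unsloth-bnb-4bit",  # assumed repo id
    load_in_4bit=True,
)
FastVisionModel.for_inference(model)  # switch to inference mode
# The returned "tokenizer" also handles image inputs for this model family.
```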
Brief-details: Layout processing model by vikp for handling document layouts and structure analysis, optimized for the Surya framework.
Brief Details: Qwen2.5-32B optimized for 4-bit quantization (BNB), featuring 32.5B parameters, 128K context window, and multilingual support for 29+ languages
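A minimal sketch of 4-bit (NF4) loading with transformers and bitsandbytes; the instruct repo id and quantization settings are typical choices, not details taken from this specific card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Sketch: quantize-on-load with bitsandbytes NF4.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model_id = "Qwen/Qwen2.5-32B-Instruct"  # assumed repo id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)

messages = [{"role": "user", "content": "Say hello in three languages."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
print(tok.decode(model.generate(inputs, max_new_tokens=64)[0], skip_special_tokens=True))
```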
Brief-details: Merlin is a specialized 3D Vision Language Model for computed tomography scans, combining EHR data and radiology reports for enhanced medical image understanding.
BRIEF-DETAILS: Meta's 70B parameter LLM, part of the Llama family; a general-purpose language model with robust capabilities across a wide range of NLP tasks.
Brief Details: Norwegian Wav2vec2 model optimized for Bokmål ASR, featuring 1B parameters and built on the wav2vec 2.0 self-supervised speech recognition architecture.
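A short sketch of transcription with the transformers ASR pipeline; the repo id and audio file are assumptions for illustration.

```python
from transformers import pipeline

# Sketch: transcribe Norwegian Bokmål speech with a wav2vec2 CTC model.
asr = pipeline(
    "automatic-speech-recognition",
    model="NbAiLab/nb-wav2vec2-1b-bokmaal",  # assumed repo id
)
print(asr("sample_norwegian.wav")["text"])   # placeholder audio file
```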
Brief-details: LingMess is a specialized coreference resolution model achieving 81.4 F1 on OntoNotes, categorizing mention-pair decisions into 6 linguistically motivated types.
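A quick sketch using the fastcoref package, which provides a LingMess implementation; the device choice and example sentence are placeholders.

```python
from fastcoref import LingMessCoref

# Sketch: resolve coreference clusters with the fastcoref LingMess model.
model = LingMessCoref(device="cpu")
preds = model.predict(
    texts=["Alice met Bob before she moved to Oslo, where he later visited her."]
)
print(preds[0].get_clusters())  # clusters of coreferent mentions as strings
```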
Brief Details: A tiny, randomly initialized MoE (Mixture of Experts) variant of Qwen 1.5, published by katuni4ka for testing and experimentation rather than production use.
BRIEF DETAILS: Qwen2-1.5B-Instruct-IMat-GGUF is a quantized version of Qwen's 1.5B parameter instruction model, offering multiple compression variants from 436MB to 3GB using IMatrix optimization.
Brief Details: RWKV's 7B parameter Finch (RWKV-6) model with improved performance over Eagle-7B (RWKV-5). Strong in both English and Chinese text generation, with solid evaluation results.
BRIEF DETAILS: CausalLM 7B: A powerful LLaMA 2-compatible model trained on 1.3B tokens of synthetic data, outperforming most models ≤33B in benchmarks like MMLU and CEval.
Brief Details: A 7B parameter LLaMA-based model converted to GGML format. Now obsolete but historically significant as part of the Vicuna model family.