Brief-details: A 3B-parameter vision-language model designed for web browsing automation, built on Qwen2.5-VL-3B-Instruct and scoring 72.4% accuracy on the WebVoyager benchmark.
Brief-details: NVIDIA's quantized version of DeepSeek R1, optimized for efficient inference with FP4 precision and a 128K context length, served via TensorRT-LLM.
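A minimal serving sketch with TensorRT-LLM's high-level LLM API, assuming the `nvidia/DeepSeek-R1-FP4` repo id (verify the exact name and FP4 hardware requirements on NVIDIA's model card):

```python
# Sketch only: assumes a TensorRT-LLM release with the LLM API and a GPU
# that supports FP4 (e.g., Blackwell-class). Repo id is an assumption.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="nvidia/DeepSeek-R1-FP4")  # assumed Hugging Face repo id
params = SamplingParams(max_tokens=256, temperature=0.6)

for out in llm.generate(["Explain FP4 quantization in one paragraph."], params):
    print(out.outputs[0].text)
```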
Brief-details: DeepSeek-V3 is a 671B-parameter MoE model (37B active per token) with 128K context, achieving SOTA performance on math and code tasks and supporting commercial use.
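At 671B total parameters, local inference is out of reach for most users, so a hosted endpoint is the usual route. A sketch using `huggingface_hub`'s InferenceClient; whether a serverless provider hosts this model id is an assumption to check:

```python
# Assumes HF_TOKEN is set in the environment and that an inference
# provider serves the "deepseek-ai/DeepSeek-V3" repo.
from huggingface_hub import InferenceClient

client = InferenceClient()
resp = client.chat_completion(
    model="deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```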
Brief-details: A unified multimodal model that decouples visual encoding for understanding and generation tasks, built on DeepSeek-LLM-7b-base with a SigLIP-L vision encoder.
Brief-details: A powerful 7B-parameter vision-language model that understands images, video, and structured content such as charts and tables, with enhanced visual comprehension and agent-like capabilities.
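A single-image Q&A sketch, assuming the model is Qwen/Qwen2.5-VL-7B-Instruct and a transformers version that ships Qwen2.5-VL support (roughly 4.49+); the image URL is a placeholder:

```python
# Sketch under the assumptions above; check the model card for the
# currently recommended loading pattern.
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "Qwen/Qwen2.5-VL-7B-Instruct"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

messages = [{"role": "user", "content": [
    {"type": "image", "url": "https://example.com/chart.png"},  # placeholder
    {"type": "text", "text": "Summarize what this chart shows."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(out[:, inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True)[0])
```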
Brief-details: A 1.5B-parameter model distilled from DeepSeek-R1, focused on reasoning, built on the Qwen2.5-Math architecture with strong performance on mathematical and logical tasks.
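A 1.5B model fits comfortably on a single consumer GPU. A minimal transformers sketch; the repo id follows DeepSeek's published naming but should be verified on the Hub:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # verify on the Hub
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
                                 return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=512)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```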
Brief-details: Microsoft's phi-4, a 14B-parameter model with 16K context, trained on 9.8T tokens. Optimized for reasoning and safety; MIT licensed.
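A quick-start sketch with the transformers pipeline API; at 14B the model needs roughly 28 GB in bf16, and the `microsoft/phi-4` repo id is the published one but worth confirming:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="microsoft/phi-4",
                torch_dtype="auto", device_map="auto")
messages = [
    {"role": "system", "content": "You are a careful reasoner."},
    {"role": "user", "content": "Why is the sky blue? Answer in two sentences."},
]
# Recent transformers pipelines accept chat messages directly and return
# the full conversation; the last message is the model's reply.
print(pipe(messages, max_new_tokens=128)[0]["generated_text"][-1]["content"])
```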
Brief-details: Meta's latest 70B-parameter instruction-tuned Llama model, built for advanced natural-language understanding and generation with enhanced instruction following.
Brief-details: An educational notebook collection demonstrating AI-agent creation from scratch, part of the Hugging Face Agents Course, with practical examples and implementation guides.
Brief-details: CohereForAI's 7B-parameter Command-family model focused on Arabic language understanding and instruction following, released in 2025.
Brief-details: TinyR1-32B-Preview is a 32B-parameter reasoning-focused model that approaches DeepSeek-R1 performance on math, code, and science tasks through specialized domain training and model merging.
Brief-details: QwQ-32B-AWQ is a 4-bit AWQ-quantized reasoning-focused language model with 32.5B parameters and a 131K-token context length, built for enhanced problem solving.
Brief-details: Wan2.1-T2V-1.3B is a lightweight (1.3B-parameter) text-to-video generation model that runs on consumer GPUs, generating 480p video with only 8.19 GB of VRAM.
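A generation sketch assuming the diffusers port of the checkpoint (`Wan-AI/Wan2.1-T2V-1.3B-Diffusers`) and a diffusers release that includes `WanPipeline`; verify both against the model card:

```python
# Sketch under the assumptions above; resolution and frame count follow
# the 480p settings commonly quoted for this checkpoint.
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # keeps VRAM usage low on consumer GPUs

frames = pipe(
    prompt="A cat surfing a wave at sunset",
    height=480, width=832, num_frames=81,
).frames[0]
export_to_video(frames, "cat_surfing.mp4", fps=15)
```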
Brief-details: Stable Diffusion 3.5 Large, Stability AI's latest flagship text-to-image model, with improved prompt adherence and image quality over prior releases.
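A minimal text-to-image sketch with diffusers' `StableDiffusion3Pipeline`; the `stabilityai/stable-diffusion-3.5-large` repo is gated, so the license must be accepted on the Hub first:

```python
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    "a macro photo of a dew-covered spider web at dawn",
    num_inference_steps=28, guidance_scale=3.5,
).images[0]
image.save("web.png")
```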
Brief-details: NotaGen, a symbolic music generation model that adopts LLM training paradigms, with a three-stage pipeline: pre-training on 1.6M pieces, fine-tuning, and CLaMP-DPO reinforcement learning.
Brief-details: NeoBERT, a 250M-parameter next-generation BERT model trained on RefinedWeb, with a 4,096-token context length and state-of-the-art results on the MTEB benchmark.
Brief-details: An advanced UI screenshot parsing tool that converts interface elements into a structured format, with improved latency (0.6 s/frame on an A100) and 39.6% accuracy on ScreenSpot Pro.
Brief-details: Chroma, an 8.9B-parameter rectified-flow transformer for text-to-image generation, built on FLUX.1 with architectural enhancements.
Brief-details: QwQ-32B-GGUF is a GGUF release of Qwen's 32.5B-parameter reasoning model, featuring a 131K-token context length and a transformer architecture with RoPE, SwiGLU, and RMSNorm.
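GGUF checkpoints target llama.cpp-style runtimes. A sketch with llama-cpp-python's `from_pretrained` helper; the quantization filename pattern is an assumption, so list the repo's files to pick one that fits your RAM:

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Qwen/QwQ-32B-GGUF",
    filename="*q4_k_m.gguf",   # assumed quant name; check the repo listing
    n_ctx=32768,               # raise toward 131K only if memory allows
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```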
Brief-details: An 8B-parameter diffusion language model trained from scratch for instruction-following tasks, with performance comparable to LLaMA3 8B.
Brief-details: DiffRhythm-base is a pioneering diffusion-based song generation model capable of creating full-length songs in as little as 1m35s via latent diffusion.