Brief Details: MetaCLIP base-sized model trained on 2.5B CommonCrawl image-text pairs, offering CLIP-like capabilities for image-text understanding and zero-shot classification
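A minimal zero-shot classification sketch using the standard Hugging Face CLIP API; the checkpoint name `facebook/metaclip-b32-fullcc2.5b` is an assumption about which base-sized, 2.5B-pair MetaCLIP variant is meant, and the image URL is just a demo input.

```python
# Hedged sketch: zero-shot classification with a MetaCLIP checkpoint (repo name assumed).
import requests
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_id = "facebook/metaclip-b32-fullcc2.5b"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
labels = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)  # image-text similarity as probabilities
print(dict(zip(labels, probs[0].tolist())))
```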
Brief Details: A randomly initialized tiny Swin Transformer variant implementing patch size 4 and window size 7 for 224x224 image inputs, mainly useful for testing vision pipelines rather than real inference.
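A sketch of how such a "tiny random" checkpoint is typically reproduced: instantiate the Swin architecture with the stated patch/window sizes but deliberately small, randomly initialized weights for fast unit tests. The embedding dimension and depths below are illustrative assumptions, not values read from the actual repo.

```python
# Hedged sketch: building a tiny, randomly initialized Swin model for pipeline tests.
import torch
from transformers import SwinConfig, SwinModel

config = SwinConfig(
    image_size=224, patch_size=4, window_size=7,
    embed_dim=32, depths=[1, 1, 1, 1], num_heads=[1, 2, 4, 8],  # deliberately tiny, assumed values
)
model = SwinModel(config)  # random weights, no pretraining

pixel_values = torch.randn(1, 3, 224, 224)
print(model(pixel_values).last_hidden_state.shape)
```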
Brief-details: A tiny experimental DeciLM variant (Deci's decoder-only language model architecture), likely used for research, testing, or educational purposes.
BRIEF DETAILS: dt_style_1 is a style-focused AI model by dtger, available in Safetensors format for specialized artistic transformations.
Brief Details: Vision-language model for analyzing chest X-rays and generating detailed medical findings, built on ViT architecture with BERT-based decoder
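A hedged sketch of report generation with a ViT-encoder / BERT-decoder checkpoint via the generic VisionEncoderDecoder API; the repository path is a placeholder and the exact preprocessing may differ from the actual model card.

```python
# Hedged sketch: generating chest X-ray findings with a ViT encoder + BERT decoder model.
from PIL import Image
from transformers import AutoImageProcessor, AutoTokenizer, VisionEncoderDecoderModel

model_id = "path/to/chest-xray-report-model"  # placeholder, not a verified repo name
model = VisionEncoderDecoderModel.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
image_processor = AutoImageProcessor.from_pretrained(model_id)

image = Image.open("chest_xray.png").convert("RGB")
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values, max_new_tokens=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```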
Brief-details: Azure, published by unslothai - a monitoring and logging utility for tracking environment statistics and debugging AI model deployments
Brief-details: Vietnamese document embedding model with 8096 token context, trained on XNLI-vn and STSB-vn datasets, achieving 82.45% mean Spearman score across STS benchmarks.
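A minimal embedding sketch with sentence-transformers; the repo name and the need for `trust_remote_code` are assumptions to verify against the actual card.

```python
# Hedged sketch: Vietnamese sentence embeddings via sentence-transformers (repo name assumed).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dangvantuan/vietnamese-document-embedding", trust_remote_code=True)
sentences = ["Hà Nội là thủ đô của Việt Nam.", "Thủ đô của Việt Nam là Hà Nội."]
embeddings = model.encode(sentences, normalize_embeddings=True)  # unit-normalized vectors
print(embeddings.shape)  # (2, embedding_dim)
```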
Brief Details: A compact model named tiny-random-orion created by katuni4ka, hosted on HuggingFace. Limited information available suggests experimental or research purposes.
Brief-details: 4-bit quantized version of Meta's Llama-3.3-70B-Instruct model, optimized for multilingual dialogue. Requires ~35GB VRAM, supports multiple inference frameworks.
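A hedged loading sketch with transformers; the repo name below is an assumption about which pre-quantized 4-bit release is meant, and roughly 35GB of GPU memory is still required.

```python
# Hedged sketch: loading a pre-quantized 4-bit Llama-3.3-70B-Instruct checkpoint (repo name assumed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/Llama-3.3-70B-Instruct-bnb-4bit"  # assumed 4-bit repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",           # shard across available GPUs
    torch_dtype=torch.bfloat16,  # compute dtype for non-quantized layers
)

messages = [{"role": "user", "content": "Summarize the Llama 3.3 release in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```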
BRIEF-DETAILS: Mamba-2.8b-slimpj: 2.8B parameter language model built on the Mamba state-space architecture, trained on the SlimPajama dataset (600B tokens)
BRIEF-DETAILS: AnimateDiff ControlNet model enables precise control over image/video generation and transformation, supporting both img2video and vid2vid workflows
Brief Details: KoGPT is a powerful Korean language model with 6B parameters, capable of text generation and understanding. Built by KakaoBrain for Korean text processing.
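A Korean text-generation sketch with transformers; the revision tag and special-token setup follow the public kakaobrain/kogpt card as best recalled and should be checked against the actual model card.

```python
# Hedged sketch: Korean text generation with KoGPT (revision and token names assumed from the card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kakaobrain/kogpt"
revision = "KoGPT6B-ryan1.5b-float16"  # assumed fp16 branch
tokenizer = AutoTokenizer.from_pretrained(
    model_id, revision=revision,
    bos_token="[BOS]", eos_token="[EOS]", unk_token="[UNK]",
    pad_token="[PAD]", mask_token="[MASK]",
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, revision=revision, torch_dtype=torch.float16, device_map="auto"
)

prompt = "인공지능이란"  # "Artificial intelligence is..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```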
BRIEF-DETAILS: 7B parameter GGUF quantized model offering multiple compression variants (Q2-Q8) with file sizes from 3.1GB to 15.3GB, optimized for efficient deployment.
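A usage sketch with llama-cpp-python for running one of the GGUF variants locally; the file name is a placeholder, and the Q-level should be chosen to fit the available RAM/VRAM (roughly 3.1GB at Q2 up to 15.3GB at Q8 for this 7B model).

```python
# Hedged sketch: running a GGUF quantization variant with llama-cpp-python (file name is a placeholder).
from llama_cpp import Llama

llm = Llama(
    model_path="model-7b.Q4_K_M.gguf",  # placeholder path to the downloaded variant
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if available
)
out = llm("Explain GGUF quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```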
BRIEF DETAILS: 70B parameter LLaMA-based model optimized for creative expression and reasoning. Features SCE merge method combining EVA, EURYALE, Cirrus, and other specialized components.
Brief-details: Audio Flamingo 2 is a state-of-the-art 0.5B parameter audio-language model with expert reasoning abilities and long-audio understanding for inputs up to 5 minutes
BRIEF-DETAILS: Qwen2.5-7B-Medicine: A medical-focused LLM fine-tuned on 340K medical dialogues, achieving 55.7 BLEU-4 score. Optimized for healthcare applications using LoRA.
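A hedged sketch of attaching a LoRA medical fine-tune to a Qwen2.5-7B base with peft; both repository names are assumptions, and the release may also ship with the adapter already merged.

```python
# Hedged sketch: loading a LoRA medical adapter on top of Qwen2.5-7B (repo names assumed).
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen2.5-7B-Instruct"          # assumed base model
adapter_id = "your-org/Qwen2.5-7B-Medicine"   # placeholder adapter repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA weights

prompt = "A patient reports a persistent dry cough for three weeks. What follow-up questions are appropriate?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```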
BRIEF DETAILS: PrecompiledWheels is a specialized package featuring pre-compiled wheels for Blackwell torch.compile and sageattention, optimized for Debian 13 with torch 2.7 nightly and CUDA 12.8.
Brief-details: TokenSwift-DeepSeek-R1-Distill-Qwen-32B builds on DeepSeek's R1-distilled Qwen-32B model under the TokenSwift project, focusing on efficient generation while maintaining the base model's capabilities
Brief-details: Hiber-Multi-10B is a 10B parameter multilingual LLM with a transformer architecture featuring a 4096-token context window, 32 attention heads, and optimized performance characteristics.
Brief Details: A specialized image generation model focused on posterior anatomy, triggered by specific keyword "asstastic". Available in Safetensors format.
Brief Details: Sky-T1-mini is an AI model developed by NovaSky-AI, though detailed specifications and capabilities are not fully documented in the model card.