Brief Details: MetaCLIP base-sized model trained on 2.5B CommonCrawl image-text pairs, offering CLIP-like capabilities for image-text understanding and zero-shot classification
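A minimal zero-shot classification sketch using the standard Hugging Face CLIP API; the checkpoint name `facebook/metaclip-b32-fullcc2.5b` is an assumption about which base-sized, 2.5B-pair MetaCLIP variant is meant, and the image URL is just a demo input.

```python
# Hedged sketch: zero-shot classification with a MetaCLIP checkpoint (repo name assumed).
import requests
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_id = "facebook/metaclip-b32-fullcc2.5b"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
labels = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)  # image-text similarity as probabilities
print(dict(zip(labels, probs[0].tolist())))
```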
Brief Details: A randomly initialized tiny Swin Transformer variant implementing patch size 4 and window size 7 for 224x224 image inputs, mainly useful for testing vision pipelines rather than real inference.
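A sketch of how such a "tiny random" checkpoint is typically reproduced: instantiate the Swin architecture with the stated patch/window sizes but deliberately small, randomly initialized weights for fast unit tests. The embedding dimension and depths below are illustrative assumptions, not values read from the actual repo.

```python
# Hedged sketch: building a tiny, randomly initialized Swin model for pipeline tests.
import torch
from transformers import SwinConfig, SwinModel

config = SwinConfig(
    image_size=224, patch_size=4, window_size=7,
    embed_dim=32, depths=[1, 1, 1, 1], num_heads=[1, 2, 4, 8],  # deliberately tiny, assumed values
)
model = SwinModel(config)  # random weights, no pretraining

pixel_values = torch.randn(1, 3, 224, 224)
print(model(pixel_values).last_hidden_state.shape)
```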
Brief-details: A tiny experimental DeciLM variant (Deci's decoder-only language model architecture), likely used for research, testing, or educational purposes.
BRIEF DETAILS: dt_style_1 is a style-focused AI model by dtger, available in Safetensors format for specialized artistic transformations.
Brief Details: Vision-language model for analyzing chest X-rays and generating detailed medical findings, built on ViT architecture with BERT-based decoder
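A hedged sketch of report generation with a ViT-encoder / BERT-decoder checkpoint via the generic VisionEncoderDecoder API; the repository path is a placeholder and the exact preprocessing may differ from the actual model card.

```python
# Hedged sketch: generating chest X-ray findings with a ViT encoder + BERT decoder model.
from PIL import Image
from transformers import AutoImageProcessor, AutoTokenizer, VisionEncoderDecoderModel

model_id = "path/to/chest-xray-report-model"  # placeholder, not a verified repo name
model = VisionEncoderDecoderModel.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
image_processor = AutoImageProcessor.from_pretrained(model_id)

image = Image.open("chest_xray.png").convert("RGB")
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values, max_new_tokens=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```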
Brief-details: Azure, published by unslothai - a monitoring and logging utility for tracking environment statistics and debugging AI model deployments
Brief-details: Vietnamese document embedding model with 8096 token context, trained on XNLI-vn and STSB-vn datasets, achieving 82.45% mean Spearman score across STS benchmarks.
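A minimal embedding sketch with sentence-transformers; the repo name and the need for `trust_remote_code` are assumptions to verify against the actual card.

```python
# Hedged sketch: Vietnamese sentence embeddings via sentence-transformers (repo name assumed).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dangvantuan/vietnamese-document-embedding", trust_remote_code=True)
sentences = ["Hà Nội là thủ đô của Việt Nam.", "Thủ đô của Việt Nam là Hà Nội."]
embeddings = model.encode(sentences, normalize_embeddings=True)  # unit-normalized vectors
print(embeddings.shape)  # (2, embedding_dim)
```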
Brief Details: A compact model named tiny-random-orion created by katuni4ka, hosted on HuggingFace. Limited information available suggests experimental or research purposes.
Brief-details: 4-bit quantized version of Meta's Llama-3.3-70B-Instruct model, optimized for multilingual dialogue. Requires ~35GB VRAM, supports multiple inference frameworks.
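A hedged loading sketch with transformers; the repo name below is an assumption about which pre-quantized 4-bit release is meant, and roughly 35GB of GPU memory is still required.

```python
# Hedged sketch: loading a pre-quantized 4-bit Llama-3.3-70B-Instruct checkpoint (repo name assumed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/Llama-3.3-70B-Instruct-bnb-4bit"  # assumed 4-bit repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",           # shard across available GPUs
    torch_dtype=torch.bfloat16,  # compute dtype for non-quantized layers
)

messages = [{"role": "user", "content": "Summarize the Llama 3.3 release in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```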
BRIEF-DETAILS: Mamba-2.8b-slimpj: 2.8B parameter language model built on the Mamba state-space architecture, trained on the SlimPajama dataset (600B tokens)
BRIEF-DETAILS: AnimateDiff ControlNet model enables precise control over image/video generation and transformation, supporting both img2video and vid2vid workflows
Brief Details: KoGPT is a powerful Korean language model with 6B parameters, capable of text generation and understanding. Built by KakaoBrain for Korean text processing.
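A Korean text-generation sketch with transformers; the revision tag and special-token setup follow the public kakaobrain/kogpt card as best recalled and should be checked against the actual model card.

```python
# Hedged sketch: Korean text generation with KoGPT (revision and token names assumed from the card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kakaobrain/kogpt"
revision = "KoGPT6B-ryan1.5b-float16"  # assumed fp16 branch
tokenizer = AutoTokenizer.from_pretrained(
    model_id, revision=revision,
    bos_token="[BOS]", eos_token="[EOS]", unk_token="[UNK]",
    pad_token="[PAD]", mask_token="[MASK]",
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, revision=revision, torch_dtype=torch.float16, device_map="auto"
)

prompt = "인공지능이란"  # "Artificial intelligence is..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```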
BRIEF-DETAILS: 7B parameter GGUF quantized model offering multiple compression variants (Q2-Q8) with file sizes from 3.1GB to 15.3GB, optimized for efficient deployment.
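A usage sketch with llama-cpp-python for running one of the GGUF variants locally; the file name is a placeholder, and the Q-level should be chosen to fit the available RAM/VRAM (roughly 3.1GB at Q2 up to 15.3GB at Q8 for this 7B model).

```python
# Hedged sketch: running a GGUF quantization variant with llama-cpp-python (file name is a placeholder).
from llama_cpp import Llama

llm = Llama(
    model_path="model-7b.Q4_K_M.gguf",  # placeholder path to the downloaded variant
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if available
)
out = llm("Explain GGUF quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```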
BRIEF DETAILS: 70B parameter LLaMA-based model optimized for creative expression and reasoning. Features SCE merge method combining EVA, EURYALE, Cirrus, and other specialized components.
Brief-details: Audio Flamingo 2 is a state-of-the-art 0.5B parameter audio-language model with expert reasoning abilities and long-audio understanding for inputs up to 5 minutes
BRIEF-DETAILS: Qwen2.5-7B-Medicine: A medical-focused LLM fine-tuned on 340K medical dialogues, achieving 55.7 BLEU-4 score. Optimized for healthcare applications using LoRA.
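A hedged sketch of attaching a LoRA medical fine-tune to a Qwen2.5-7B base with peft; both repository names are assumptions, and the release may also ship with the adapter already merged.

```python
# Hedged sketch: loading a LoRA medical adapter on top of Qwen2.5-7B (repo names assumed).
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen2.5-7B-Instruct"          # assumed base model
adapter_id = "your-org/Qwen2.5-7B-Medicine"   # placeholder adapter repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA weights

prompt = "A patient reports a persistent dry cough for three weeks. What follow-up questions are appropriate?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```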
BRIEF DETAILS: PrecompiledWheels is a specialized package featuring pre-compiled wheels for Blackwell torch.compile and sageattention, optimized for Debian 13 with torch 2.7 nightly and CUDA 12.8.
Brief-details: TokenSwift-DeepSeek-R1-Distill-Qwen-32B builds on DeepSeek's R1-distilled Qwen-32B model under the TokenSwift project, focusing on efficient generation while maintaining the base model's capabilities
Brief-details: Hiber-Multi-10B is a 10B parameter multilingual LLM with a transformer architecture featuring a 4096-token context window, 32 attention heads, and optimized performance characteristics.
Brief Details: A specialized image generation model focused on posterior anatomy, triggered by specific keyword "asstastic". Available in Safetensors format.
Brief Details: Sky-T1-mini is an AI model developed by NovaSky-AI, though detailed specifications and capabilities are not fully documented in the model card.