Brief-details: BMC_CLIP_CF is a CLIP-based model from BIOMEDICA featuring a cross-fusion architecture for enhanced vision-language understanding, with a tutorial available via Google Colab.
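For orientation, a minimal zero-shot sketch using the standard transformers CLIP classes; the repo id is an assumption, and the BIOMEDICA tutorial may instead rely on an open_clip loader:

```python
# Minimal zero-shot sketch with the standard transformers CLIP classes.
# The repo id is a hypothetical placeholder; the official tutorial may use open_clip.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

repo = "BIOMEDICA/BMC_CLIP_CF"  # hypothetical repo id
model = CLIPModel.from_pretrained(repo)
processor = CLIPProcessor.from_pretrained(repo)

image = Image.open("slide.png")
labels = ["H&E stained tissue", "chest X-ray", "fluorescence microscopy"]

# Score the image against each candidate label and normalize to probabilities.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]
print({label: round(p.item(), 3) for label, p in zip(labels, probs)})
```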
Brief-details: AceMath-72B-Instruct is NVIDIA's large-scale math reasoning model built on Qwen, excelling at solving complex mathematical problems through chain-of-thought reasoning.
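A hedged sketch of the usual chain-of-thought generation loop with transformers; the repo id nvidia/AceMath-72B-Instruct and the presence of a chat template are assumptions:

```python
# Hedged sketch: chain-of-thought generation with a chat-tuned math model.
# Assumes the checkpoint lives at nvidia/AceMath-72B-Instruct and ships a chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/AceMath-72B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Solve step by step: what is the sum of the first 50 odd numbers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```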
Brief-details: Qwen2.5-Math-PRM-7B is a 7B parameter Process Reward Model designed to evaluate mathematical reasoning steps, offering step-by-step quality assessment with reward scores between 0 and 1.
Brief-details: A 72B parameter process reward model designed to evaluate mathematical reasoning steps, offering feedback scores between 0 and 1 for solution-quality assessment.
Brief-details: A 7B parameter process reward model fine-tuned on the PRM800K dataset, specialized in evaluating mathematical reasoning steps with scores between 0 and 1.
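The three process reward model entries above share one usage pattern: score each reasoning step (or growing solution prefix) and read off a 0-1 reward. A minimal sketch of that pattern, assuming a checkpoint that exposes a scalar classification head; the actual Qwen2.5-Math PRMs ship their own scoring recipe, so consult each model card rather than treating this as the exact API:

```python
# Hedged sketch of step-level scoring with a process reward model.
# ASSUMPTION: the checkpoint loads as a sequence classifier with a scalar head;
# the real Qwen2.5-Math PRMs define their own scoring procedure in their cards.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Qwen/Qwen2.5-Math-PRM-7B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, torch_dtype=torch.bfloat16)

question = "If 3x + 5 = 20, what is x?"
steps = ["Subtract 5 from both sides: 3x = 15.", "Divide both sides by 3: x = 5."]

# Score each growing prefix so the reward reflects the solution so far.
prefix = question
for step in steps:
    prefix += "\n" + step
    inputs = tokenizer(prefix, return_tensors="pt")
    with torch.no_grad():
        reward = torch.sigmoid(model(**inputs).logits[0, 0]).item()  # map to 0-1
    print(f"{reward:.3f}  {step}")
```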
Brief-details: OuteTTS-0.3-1B is a powerful 1B parameter text-to-speech model supporting 6 languages, trained on 20K hours of speech, with punctuation support and voice cloning capabilities.
Brief-details: Specialized AI model for generating pose variations of cartoon characters while maintaining character identity. Perfect for dataset augmentation and LoRA training.
Brief-details: A merged variant of Microsoft's Phi-4 model using the passthrough merge technique across different layer ranges, implemented in bfloat16 precision.
Brief-details: FLUX.1 [dev] Abliterated is a 12B parameter text-to-image model with its safety filters removed, based on a rectified flow architecture, enabling unrestricted image generation.
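A hedged loading sketch with diffusers' FluxPipeline, which serves stock FLUX.1 [dev]; the abliterated checkpoint's repo id below is hypothetical:

```python
# Hedged sketch: loading a FLUX.1 [dev] variant with diffusers' FluxPipeline.
# The repo id is hypothetical; substitute the abliterated checkpoint's actual id.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "some-org/FLUX.1-dev-abliterated",  # hypothetical repo id
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # 12B params rarely fit in consumer VRAM outright

image = pipe(
    "a watercolor lighthouse at dusk",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("lighthouse.png")
```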
Brief-details: A 14B parameter LLM created by continuous fine-tuning of Qwen-14B using the TIES merge method, achieving a 34.52 average on the OpenLLM benchmarks.
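To make the TIES idea concrete, here is a toy tensor-level sketch of its trim / elect-sign / disjoint-merge steps; real merges go through mergekit, and this deliberately ignores per-layer handling:

```python
# Toy illustration of the TIES merge idea on raw tensors (not mergekit's code):
# trim small task-vector entries, elect a majority sign, average the agreers.
import torch

def ties_merge(base: torch.Tensor, tuned: list[torch.Tensor], density: float = 0.2) -> torch.Tensor:
    deltas = [t - base for t in tuned]              # task vectors vs. the base model
    trimmed = []
    for d in deltas:                                # keep only the top-density magnitudes
        k = max(1, int(density * d.numel()))
        thresh = d.abs().flatten().topk(k).values.min()
        trimmed.append(torch.where(d.abs() >= thresh, d, torch.zeros_like(d)))
    stacked = torch.stack(trimmed)
    sign = torch.sign(stacked.sum(dim=0))           # elect a sign per weight
    agree = (torch.sign(stacked) == sign) & (stacked != 0)
    merged = (stacked * agree).sum(0) / agree.sum(0).clamp(min=1)
    return base + merged

base = torch.zeros(4, 4)
models = [torch.randn(4, 4) for _ in range(3)]
print(ties_merge(base, models))
```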
Brief-details: TIPO-500M-ft is a 500M parameter LLaMA-based model for text-to-image prompt optimization, trained on the Danbooru and Coyo-HD-11M datasets (42B tokens).
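A hedged sketch of the prompt-expansion pattern as a plain causal-LM generate() call; the repo id is an assumption, and TIPO's exact prompt format comes from its KGen tooling rather than raw tags:

```python
# Hedged sketch: prompt expansion as plain causal-LM generation.
# ASSUMPTIONS: the repo id below, and that raw tags are an acceptable input;
# TIPO's real prompt format is defined by its KGen tooling.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KBlueLeaf/TIPO-500M-ft"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

tags = "1girl, silver hair, rainy street"
inputs = tokenizer(tags, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.95)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```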
Brief-details: MicroDiT is a cost-efficient text-to-image diffusion model trained on a micro-budget ($1,890), achieving competitive performance with 1.16B parameters and a 12.7 FID score.
Brief-details: A routing model based on ModernBERT-large that selects among different approaches for LLM optimization, showing 13.33% pass@1 on AIME 2024.
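Routing of this kind reduces to sequence classification: encode the query, pick a label, dispatch accordingly. A minimal sketch under that assumption, with a hypothetical repo id and label set:

```python
# Hedged sketch: a ModernBERT-style router as plain sequence classification.
# The repo id and label names are hypothetical placeholders.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "some-org/modernbert-llm-router"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
router = AutoModelForSequenceClassification.from_pretrained(model_id)

query = "Prove that the square root of 2 is irrational."
inputs = tokenizer(query, return_tensors="pt")
route = router(**inputs).logits.argmax(-1).item()
print(router.config.id2label[route])  # e.g. "reasoning-model" vs. "fast-model"
```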
Brief-details: Multilingual visual document retrieval model supporting 5 languages: Italian 🇮🇹, Spanish 🇪🇸, English 🇬🇧, French 🇫🇷, and German 🇩🇪. Enables OCR-free document search with cross-lingual capabilities.
Brief-details: RAG-Instruct-Llama3-8B is an 8B parameter LLM optimized for RAG tasks, showing significant improvements across multiple benchmarks through diverse instruction tuning.
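A hedged sketch of the RAG prompting pattern the model is tuned for: retrieved passages are concatenated ahead of the question. The repo id is an assumption:

```python
# Hedged sketch of the RAG pattern: stuff retrieved passages into the prompt.
# ASSUMPTION: the repo id below; swap in the checkpoint's actual id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FreedomIntelligence/RAG-Instruct-Llama3-8B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

docs = [
    "Doc 1: The Amazon river discharges about 209,000 m^3/s.",
    "Doc 2: The Nile is roughly 6,650 km long.",
]
question = "Which river carries more water?"
prompt = "\n".join(docs) + f"\n\nQuestion: {question}"

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```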
Brief-details: A specialized AI model by Danrisi focused on generating 2000s-style aesthetic imagery, capturing the distinctive look of the early-millennium era.
Brief-details: Text classification model built on Qwen2.5-1.5B, achieving 89.97% accuracy on Korean product categorization. Optimized for e-commerce product classification across 17 categories.
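A minimal sketch of the 17-way categorization as standard sequence classification, assuming the checkpoint exposes a classification head; the repo id is hypothetical:

```python
# Hedged sketch: Korean product categorization as sequence classification.
# The repo id is a hypothetical placeholder.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "some-org/qwen2.5-1.5b-ko-product-classifier"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

title = "무선 블루투스 이어폰 화이트"  # "wireless Bluetooth earbuds, white"
inputs = tokenizer(title, return_tensors="pt")
pred = model(**inputs).logits.argmax(-1).item()
print(model.config.id2label[pred])  # one of the 17 product categories
```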
Brief-details: NSFW content detection model by Jonny001, available on HuggingFace, specialized in identifying and filtering inappropriate content.
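A hedged sketch of the typical screening call, assuming an image-classification head; the repo id and label name are hypothetical:

```python
# Hedged sketch: NSFW screening as an image-classification pipeline call.
# The repo id and "nsfw" label name are hypothetical placeholders.
from transformers import pipeline

classifier = pipeline("image-classification", model="Jonny001/nsfw-detector")  # hypothetical id
results = classifier("upload.jpg")

# Flag the image if the classifier is confident it is NSFW.
flagged = any(r["label"].lower() == "nsfw" and r["score"] > 0.8 for r in results)
print(results, "->", "blocked" if flagged else "allowed")
```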
Brief-details: A compact, quantized version of Whisper optimized for faster speech recognition, using int8 precision to balance efficiency and accuracy.
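Quantized Whisper checkpoints are commonly run through faster-whisper (CTranslate2), whose compute_type flag selects int8 inference; a minimal sketch, with the model size as an assumption:

```python
# Hedged sketch: int8 Whisper inference via faster-whisper (CTranslate2).
# The "small" model size is an assumption; substitute the actual checkpoint.
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")
segments, info = model.transcribe("meeting.wav")

print(f"detected language: {info.language}")
for seg in segments:
    print(f"[{seg.start:.1f}s -> {seg.end:.1f}s] {seg.text}")
```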
Brief-details: A compact GLM4-based model created by katuni4ka, focusing on efficient language processing with a streamlined architecture, available on HuggingFace.
Brief-details: A binary gender classification model created using HuggingPics, designed to automatically distinguish between male and female subjects in images.
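HuggingPics produces ViT image classifiers, so the stock pipeline applies; the repo id below is hypothetical:

```python
# Hedged sketch: running a HuggingPics-style ViT classifier via the pipeline API.
# The repo id is a hypothetical placeholder.
from transformers import pipeline

classifier = pipeline("image-classification", model="some-user/gender-classification")  # hypothetical id
for pred in classifier("portrait.jpg"):
    print(f"{pred['label']}: {pred['score']:.3f}")
```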