Brief-details: A lightweight MobileNetV3 variant with 1.6M parameters, optimized for ImageNet classification using LAMB optimizer and EMA weight averaging.
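The entry above mentions EMA (exponential moving average) weight averaging, a common trick where a "shadow" copy of the weights tracks a decayed average of the live training weights and is used at evaluation time. A minimal sketch of the update rule in pure Python (the decay value and toy weights are illustrative, not taken from any model card):

```python
# EMA weight averaging: shadow weights track a decayed average of the
# live training weights; the shadow copy is what gets used at eval time.
def ema_update(shadow, weights, decay=0.999):
    """Return updated shadow weights: decay * shadow + (1 - decay) * weights."""
    return [decay * s + (1.0 - decay) * w for s, w in zip(shadow, weights)]

# Illustrative usage with toy "weights" (plain floats, not real tensors).
shadow = [0.0, 0.0]
for step in range(3):
    live = [1.0, 2.0]  # pretend the trained weights stay constant
    shadow = ema_update(shadow, live, decay=0.5)
print(shadow)  # shadow converges toward the live weights as steps accumulate
```

In practice the decay is close to 1 (e.g. 0.999), so the shadow weights change slowly and smooth out noise from individual optimizer steps.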
Brief-details: Advanced 72B multimodal model optimized with AWQ quantization. Excels at image/video understanding, mobile/robot operations, and multilingual support.
Brief-details: Text-to-image LoRA model trained on the FLUX.1-dev base model, specializing in a unique illustration style with 21k+ downloads and strong community adoption.
Brief-details: Efficient 13B parameter LLaMA2 chat model quantized to 4-bit precision using GPTQ, offering an optimal balance of performance and resource usage.
Brief-details: A fast CTranslate2-optimized version of distil-whisper-large-v3 for English speech recognition, featuring an MIT license and high efficiency.
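Several entries here (GPTQ, AWQ, GGUF) describe 4-bit weight quantization. A minimal sketch of the underlying idea, symmetric round-to-nearest 4-bit quantize/dequantize, in pure Python (this is not the actual GPTQ algorithm, which additionally uses error-compensating rounding per column):

```python
def quantize_4bit(weights):
    """Symmetric 4-bit quantization: map floats to ints in [-8, 7] with one scale."""
    scale = max(abs(w) for w in weights) / 7.0  # signed 4-bit range is [-8, 7]
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate floats from the 4-bit integers and the scale."""
    return [qi * scale for qi in q]

w = [0.21, -0.7, 0.05, 0.33]
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)
# Reconstruction error is bounded by half a quantization step (scale / 2).
err = max(abs(a - b) for a, b in zip(w, w_hat))
```

Real deployments quantize per group (e.g. 32 or 128 weights per scale) rather than per tensor, which keeps the step size, and therefore the error bound, small.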
Brief-details: One-align is a unified AI model for image/video quality assessment and aesthetics scoring, achieving SOTA results across multiple benchmarks.
Brief-details: A powerful 32.8B parameter language model with multiple GGUF quantizations, supporting both Chinese and English, optimized for efficient deployment and high-quality text generation.
Brief-details: Compact 247M parameter Nougat model for converting academic PDFs to markdown, using Swin Transformer + mBART architecture. Built by Facebook for document understanding.
Brief-details: Danish speech recognition model based on wav2vec2, achieving 6.69% WER after extensive training over 142,000 steps with strong performance on Danish audio transcription.
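The 6.69% WER figure above is word error rate: edits (substitutions, deletions, insertions) needed to turn the hypothesis into the reference, divided by reference length. A minimal sketch using word-level Levenshtein distance (the standard definition, not this model's specific evaluation script):

```python
def wer(reference, hypothesis):
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Classic dynamic-programming edit distance, computed over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)
```

Note that WER can exceed 100% when the hypothesis contains many insertions relative to a short reference.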
Brief-details: T5-based model trained on 370k research papers for one-line summarization, offering efficient abstract-to-headline conversion with multiple output options.
Brief-details: DeepSeek-V2-Lite: 15.7B parameter MoE model with 2.4B active params. Features Multi-head Latent Attention, optimized for single 40GB GPU deployment.
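The "2.4B active params" figure above reflects Mixture-of-Experts routing: each token is sent to only a few experts, so only a fraction of the total parameters run per forward pass. A minimal sketch of top-k softmax gating in pure Python (the expert count and k below are illustrative; DeepSeek-V2-Lite's actual router differs in its details, e.g. it also uses shared experts):

```python
import math

def topk_gate(logits, k=2):
    """Softmax gating over experts, keeping only the top-k (others get weight 0)."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exp = {i: math.exp(logits[i]) for i in top}
    z = sum(exp.values())
    return {i: exp[i] / z for i in top}  # expert index -> routing weight

# Each token activates only k of the routed experts, so active parameters per
# token are roughly (k / num_experts) of the routed-expert parameter count.
gates = topk_gate([0.1, 2.0, -1.0, 1.5], k=2)
```

The token's output is then the gate-weighted sum of the selected experts' outputs, which is why total parameter count and active parameter count diverge so sharply in MoE models.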
Brief-details: A German-focused 7B parameter LLM based on Mistral, optimized for German language tasks while maintaining English capabilities. Features ChatML format and DPO training.
Brief-details: DUSt3R is a geometric 3D vision model with 532M parameters using a ViT-Large encoder and ViT-Base decoder, optimized for 224x224 resolution images.
Brief-details: NextPhoton is a merged text-to-image model combining Next Photo 2 and Photon, specialized in photorealistic imagery with emphasis on lighting and natural compositions.
Brief-details: VideoMAE base model fine-tuned on Kinetics-400 for video classification. 86.5M params, achieves 80.9% top-1 accuracy. Built on MAE architecture.
Brief-details: Fast and efficient text reranking model with 37.8M parameters, capable of processing up to 8,192 tokens using the JinaBERT architecture.
Brief-details: A specialized finance article title classifier using the DistilBERT architecture to categorize news into 51 distinct labels covering bullish, bearish, and unrated categories.
Brief-details: SmolLM2-135M-Instruct: Compact 135M parameter language model optimized for instruction following, trained on 2T tokens with DPO and SFT.
Brief-details: A highly optimized 14.8B parameter LLM with multiple GGUF quantizations, offering flexible deployment options from 4.3GB to 29.5GB with varying quality-size tradeoffs.
Brief-details: Vietnamese sentiment analysis model based on PhoBERT, classifying text as positive, negative, or neutral with 135M parameters.
Brief-details: Multilingual BERT model for passage reranking, supporting 102 languages. Trained on the MS MARCO dataset; reported to improve search relevance by up to 100% over baseline retrieval.