Brief-details: Skywork's 8B parameter reward model based on Llama-3.1, achieving top performance on RewardBench. Trained on 80K high-quality preference pairs and served as a sequence classifier that outputs a scalar reward score.
Brief Details: Fine-tuned Whisper medium model for language identification, achieving 88.05% accuracy on the FLEURS dataset. 308M params, FP16 precision.
Brief Details: SigLIP (Sigmoid Loss for Language-Image Pre-training) vision transformer model for zero-shot image classification, trained on the WebLI dataset.
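SigLIP's defining difference from standard CLIP training is its loss: every image-text pair in the batch is scored independently with a sigmoid, rather than jointly with a batch-wide softmax. A minimal NumPy sketch of that pairwise loss is below; the temperature `t` and bias `b` values are illustrative defaults, not the trained model's parameters.

```python
import numpy as np

def siglip_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    """Pairwise sigmoid loss: each image-text pair is an independent
    binary match/no-match classification, unlike softmax CLIP loss."""
    # L2-normalize both sets of embeddings
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = t * img @ txt.T + b          # (n, n) similarity logits
    n = logits.shape[0]
    labels = 2.0 * np.eye(n) - 1.0        # +1 on diagonal, -1 elsewhere
    # loss = -mean(log sigmoid(label * logit)), written stably:
    # log sigmoid(x) = -logaddexp(0, -x)
    return np.mean(np.logaddexp(0.0, -labels * logits))
```

Because each pair is scored independently, the loss does not need a large batch-wide normalization term, which is part of why sigmoid training scales well.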
Brief-details: Indonesian BERT model fine-tuned for sentiment analysis, achieving 93.73% accuracy. Based on bert-base-indonesian-1.5G, MIT licensed, with 125K+ downloads.
Brief Details: A fine-tuned Wav2vec2 model for Urdu speech recognition, achieving 39.89% WER with LM. Based on XLS-R 300M, optimized for Common Voice 8.0.
Brief Details: Hungarian speech recognition model based on XLSR-53, achieving 31.40% WER and 6.20% CER on Common Voice test set. Optimized for 16kHz audio.
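Several entries above report WER (word error rate), the standard ASR metric: the word-level Levenshtein edit distance between reference and hypothesis, divided by the reference length. A self-contained sketch of how such a score is computed:

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # DP table: d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                       # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j                       # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)
```

CER is the same computation applied to characters instead of words, which is why CER is typically much lower than WER for the same model.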
Brief Details: Multilingual zero-shot text classifier based on XLM-RoBERTa, supporting 16 languages with 561M parameters. Ideal for cross-lingual text classification tasks.
Brief Details: BERT model for Japanese language processing using character-level tokenization, trained on Wikipedia data with 12 layers and 768-dimensional hidden states.
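Character-level tokenization, as used by this Japanese BERT variant, simply treats each character as its own token, sidestepping the need for a Japanese word segmenter or subword merges. A trivial sketch of the idea (real tokenizers additionally map characters to vocabulary IDs and add special tokens):

```python
def char_tokenize(text):
    """Character-level tokenization: every character is its own token,
    so no word segmentation or subword merging is required."""
    return list(text)
```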
Brief-details: Vision Transformer model with 86.6M params, trained on ImageNet-21k and fine-tuned on ImageNet-1k. Optimized for 224x224 images with 8x8 patches.
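A ViT turns an image into a sequence of flattened non-overlapping patches before the transformer sees it; with 224x224 input and 8x8 patches that yields 28x28 = 784 tokens. A NumPy sketch of the patch-extraction step (the linear projection that follows is omitted):

```python
import numpy as np

def to_patches(image, patch=8):
    """Split an HxWxC image into non-overlapping patch x patch tokens,
    as a ViT does before the linear patch-embedding projection."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0
    x = image.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)            # (gh, gw, p, p, c)
    return x.reshape(-1, patch * patch * c)   # (num_tokens, patch_dim)

# 224/8 = 28 patches per side -> 28*28 = 784 tokens of dim 8*8*3 = 192
tokens = to_patches(np.zeros((224, 224, 3)), patch=8)
```

The small 8x8 patch size quadruples the token count relative to the more common 16x16, trading compute for finer spatial resolution.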
Brief Details: Vision Transformer model using SigLIP loss for zero-shot image classification, trained on the WebLI dataset with 512x512 input resolution.
Brief-details: A 160M parameter language model from EleutherAI's Pythia suite, designed for interpretability research with 12 layers and 768 model dimension.
Brief Details: OpenHermes 2.5 is a 7B parameter Mistral-based LLM fine-tuned on 1M GPT-4 generated entries, achieving 50.7% HumanEval pass@1 and improved benchmark scores.
Brief Details: CLIP model trained on 5B filtered images from 43B pairs, achieving 83.4% ImageNet accuracy. Uses ViT-H-14 architecture with advanced data filtering.
Brief Details: Text-to-image model focused on photorealistic outputs, particularly excelling in human portraits and landscapes. Popular with 126K+ downloads.
Brief Details: Swin Transformer tiny variant optimized for ImageNet classification, featuring 28.5M parameters with hierarchical architecture and shifted windows.
Brief-details: A specialized BERT-based model for English hate speech detection, achieving 72.6% validation accuracy. Fine-tuned on monolingual English data and released alongside published research.
Brief-details: A 1.54B parameter base (pretrained) language model from Qwen's 2.5 series, with a 32K context length and improved capabilities in coding and mathematics.
Brief-details: AWQ-quantized 4-bit version of Mistral-7B-Instruct, optimized for efficient inference with 128-group size, achieving 4.15GB model size while maintaining performance.
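The "128-group size" in AWQ-style 4-bit quantization means each run of 128 weights shares one scale and zero-point, keeping quantization error local. A simplified NumPy sketch of group-wise asymmetric quantization; real AWQ additionally rescales salient channels by activation statistics, which this sketch omits.

```python
import numpy as np

def quantize_groups(w, group_size=128, bits=4):
    """Group-wise asymmetric quantization: each `group_size` run of
    weights shares one scale/zero-point (cf. AWQ's 128-group scheme)."""
    w = w.reshape(-1, group_size)
    qmax = 2 ** bits - 1                        # 15 for 4-bit
    lo = w.min(axis=1, keepdims=True)
    hi = w.max(axis=1, keepdims=True)
    scale = (hi - lo) / qmax
    scale = np.where(scale == 0, 1.0, scale)    # guard constant groups
    q = np.clip(np.round((w - lo) / scale), 0, qmax).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    """Reconstruct approximate weights from codes + per-group params."""
    return q * scale + lo
```

Smaller groups reduce reconstruction error (each scale fits a narrower range) at the cost of storing more scale/zero-point metadata; 128 is a common middle ground.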
Brief Details: Spanish sentence similarity model with 110M parameters. Maps sentences to 768-dim vectors. Strong performance (82.8% Pearson correlation) for semantic tasks.
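Sentence-similarity models like this one are scored by comparing their fixed-size embedding vectors, almost always with cosine similarity. A minimal sketch of that comparison step (the embeddings themselves would come from the model):

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors: the standard
    score for sentence-similarity models emitting dense vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Benchmark figures such as the 82.8% Pearson correlation above are computed by correlating these cosine scores against human-annotated similarity ratings.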
Brief-details: InternVL2-8B is a powerful 8.1B parameter multimodal LLM combining InternViT vision encoder and InternLM2 language model, excelling in document understanding, OCR, and visual reasoning tasks.
Brief-details: SpeechT5 HiFi-GAN vocoder for text-to-speech and voice conversion, developed by Microsoft. MIT-licensed with 128K+ downloads.