Brief-details: BEiT vision transformer pre-trained on ImageNet-22k with BERT-style masked patch prediction and fine-tuned for image classification at 224x224 input resolution.
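A minimal usage sketch with Hugging Face transformers; the checkpoint id `microsoft/beit-base-patch16-224-pt22k-ft22k` is an assumption (the card's exact id and model size may differ):

```python
from PIL import Image
from transformers import BeitImageProcessor, BeitForImageClassification

# Assumed checkpoint id; swap in the model id from the card.
ckpt = "microsoft/beit-base-patch16-224-pt22k-ft22k"
processor = BeitImageProcessor.from_pretrained(ckpt)
model = BeitForImageClassification.from_pretrained(ckpt)

image = Image.open("cat.jpg")                            # any RGB image
inputs = processor(images=image, return_tensors="pt")    # resized/normalized to 224x224
logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])   # predicted class label
```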
Brief-details: XLSR-53 large model fine-tuned for Chinese speech recognition, achieving 19.03% CER. Popular with 1.8M+ downloads, Apache 2.0 licensed.
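A transcription sketch using the transformers ASR pipeline; the checkpoint id below is an assumption inferred from the description (an XLSR-53 model fine-tuned for zh-CN), so replace it with the id from the card:

```python
from transformers import pipeline

# Assumed checkpoint id; use the actual model id from the card.
asr = pipeline("automatic-speech-recognition",
               model="jonatasgrosman/wav2vec2-large-xlsr-53-chinese-zh-cn")
print(asr("speech_zh.wav")["text"])  # 16 kHz mono audio works best
```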
Brief-details: Quantized GGUF version of Microsoft's Phi-3.5-mini-instruct model with 3.82B parameters, optimized for efficient text generation and conversational tasks.
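A minimal local-inference sketch with llama-cpp-python; the GGUF file name and quantization level are assumptions, so point it at whichever variant you downloaded:

```python
from llama_cpp import Llama

# Assumed file name/quant level; use the GGUF file you actually downloaded.
llm = Llama(model_path="Phi-3.5-mini-instruct-Q4_K_M.gguf", n_ctx=4096)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

The same pattern applies to the other GGUF entries in this list; only the model file changes.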
Brief-details: A 22.1B parameter GGUF-formatted instruction model based on SOLAR-PRO, available in multiple quantization levels (2-8 bit) and compatible with multiple inference platforms.
BRIEF DETAILS: WizardLM-2-7B-GGUF: A 7B parameter quantized language model based on Mistral, optimized for complex chat, multilingual tasks, and reasoning. Features GGUF format for efficient deployment.
Brief-details: A biomedical BERT model specialized in entity representation, trained on UMLS data with 109M parameters. Optimized for medical terminology alignment.
Brief-details: Compact 12M parameter BERT model for Russian/English tasks, optimized for speed and size. Supports masked language modeling and sentence embeddings.
Brief-details: OWLv2 is a 155M parameter zero-shot object detection model built on a CLIP backbone with a ViT image encoder, detecting objects conditioned on free-form text queries.
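A sketch of text-conditioned detection with transformers, assuming a hub id like `google/owlv2-base-patch16-ensemble` (the exact checkpoint is not given in the card):

```python
import torch
from PIL import Image
from transformers import Owlv2Processor, Owlv2ForObjectDetection

ckpt = "google/owlv2-base-patch16-ensemble"    # assumed checkpoint id
processor = Owlv2Processor.from_pretrained(ckpt)
model = Owlv2ForObjectDetection.from_pretrained(ckpt)

image = Image.open("street.jpg")
queries = [["a person", "a bicycle"]]           # free-form text queries, no fixed label set
inputs = processor(text=queries, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to boxes in original image coordinates.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(outputs, threshold=0.3,
                                                   target_sizes=target_sizes)
for score, label, box in zip(results[0]["scores"], results[0]["labels"], results[0]["boxes"]):
    print(queries[0][label.item()], round(score.item(), 3), box.tolist())
```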
Brief Details: BERT large uncased - 336M parameter transformer model for masked language modeling and next sentence prediction, trained on BookCorpus and Wikipedia.
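A quick masked-language-modeling sketch via the fill-mask pipeline, using the standard `bert-large-uncased` hub id for this checkpoint:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-large-uncased")
for pred in fill("The capital of France is [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))  # top candidate tokens with probabilities
```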
Brief Details: Sentence embedding model with 82.1M parameters, trained on 1B+ sentence pairs and optimized for semantic similarity and search tasks.
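A semantic-similarity sketch with sentence-transformers; `sentence-transformers/all-distilroberta-v1` is an assumed checkpoint of roughly this size, so substitute the actual id from the card:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-distilroberta-v1")  # assumed id
emb = model.encode(["How do I reset my password?",
                    "Steps to recover a forgotten password"], convert_to_tensor=True)
print(util.cos_sim(emb[0], emb[1]).item())  # cosine similarity; higher = more similar
```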
Brief Details: Tokenizer-free multilingual model with 132M params supporting 104 languages. Uses character-level processing via Unicode code points for efficient NLP tasks.
Brief-details: ByT5-small is a tokenizer-free language model that operates directly on raw UTF-8 bytes, supporting 102 languages and optimized for noisy text processing.
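Because ByT5 consumes raw UTF-8 bytes, there is no learned vocabulary: each byte maps to one input id (plus a small offset reserved for special tokens). A short sketch with the standard `google/byt5-small` id:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/byt5-small")
text = "Héllo, wörld!"
ids = tok(text).input_ids
# One id per UTF-8 byte plus a trailing </s>; accented characters take two bytes each.
print(len(text.encode("utf-8")), len(ids))
```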
Brief-details: Large-scale (560M params) reranking model optimized for Chinese/English text similarity scoring, achieving SOTA performance on MTEB/C-MTEB benchmarks.
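A reranking sketch using a cross-encoder; `BAAI/bge-reranker-large` is an assumed checkpoint id inferred from the description:

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("BAAI/bge-reranker-large", max_length=512)  # assumed id
scores = reranker.predict([
    ("what is vector search?", "Vector search finds documents by comparing embedding similarity."),
    ("what is vector search?", "The weather in Paris is mild in spring."),
])
print(scores)  # higher score = query/passage pair judged more relevant
```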
Brief Details: Quantized 8B parameter LLaMA-3 model optimized for instruction-following with 32k context window, available in multiple GGUF precision variants.
Brief Details: An image captioning model combining ViT (Vision Transformer) and GPT-2 architectures. Popular with 1.9M+ downloads, Apache 2.0 licensed.
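A captioning sketch with the image-to-text pipeline; `nlpconnect/vit-gpt2-image-captioning` is an assumed checkpoint id matching this description:

```python
from transformers import pipeline

captioner = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")  # assumed id
print(captioner("photo.jpg")[0]["generated_text"])  # one generated caption per input image
```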
Brief details: BERT-based NER model fine-tuned on CoNLL-2003, achieving 91.3% F1 for named entity recognition. 108M parameters; supports the 4 CoNLL entity types (PER, ORG, LOC, MISC).
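An entity-extraction sketch via the token-classification pipeline; `dslim/bert-base-NER` is an assumed checkpoint id inferred from the description:

```python
from transformers import pipeline

ner = pipeline("token-classification", model="dslim/bert-base-NER",  # assumed id
               aggregation_strategy="simple")                        # merge word pieces into spans
for ent in ner("Angela Merkel visited the Siemens plant in Munich."):
    print(ent["entity_group"], ent["word"], round(float(ent["score"]), 3))
```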
Brief Details: RoBERTa-based sentiment analysis model trained on 58M tweets, fine-tuned for TweetEval benchmark. Classifies text into negative, neutral, or positive sentiments.
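A classification sketch; the checkpoint id below is an assumption (the card's model may be the original `cardiffnlp/twitter-roberta-base-sentiment`, which emits LABEL_0/1/2 rather than named labels):

```python
from transformers import pipeline

clf = pipeline("text-classification",
               model="cardiffnlp/twitter-roberta-base-sentiment-latest")  # assumed id
print(clf("I can't believe how good this update is!"))  # e.g. [{'label': 'positive', 'score': ...}]
```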
BRIEF-DETAILS: FLUX.1-schnell is a 12B parameter text-to-image model capable of high-quality image generation in 1-4 steps, using latent adversarial diffusion distillation.
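A few-step generation sketch with diffusers, assuming the `black-forest-labs/FLUX.1-schnell` hub id and a GPU with enough memory for the bf16 weights:

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell",
                                    torch_dtype=torch.bfloat16).to("cuda")
image = pipe("a lighthouse at dusk, oil painting",
             num_inference_steps=4,         # schnell is distilled for 1-4 steps
             guidance_scale=0.0).images[0]  # the distilled model does not use CFG
image.save("lighthouse.png")
```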
BRIEF DETAILS: Qwen2-1.5B-Instruct-GGUF is a quantized instruction-tuned language model with 1.54B parameters, distributed in GGUF format for use with llama.cpp-compatible local deployment options.
BRIEF-DETAILS: Microsoft's GIT-base model (177M params) for image-to-text generation, capable of captioning and VQA tasks. Pairs a CLIP image encoder with a transformer text decoder.
Brief-details: Mimi is a neural audio codec by Kyutai that compresses speech to 1.1 kbps with high fidelity, using a 96.2M parameter transformer architecture.