Brief-details: A powerful English text processing model that restores punctuation, capitalization, and sentence boundaries in a single pass. Handles acronyms and complex capitalization patterns.
Brief-details: Advanced speaker segmentation model for voice activity detection and overlapped speech detection, based on pyannote.audio 2.0 framework with MIT license
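A minimal sketch of overlapped speech detection built on this segmentation model, assuming the checkpoint is pyannote/segmentation (a gated repo requiring a Hugging Face token); the threshold values are illustrative, not tuned:

```python
# Sketch: overlapped speech detection with pyannote.audio, assuming the
# checkpoint is pyannote/segmentation (gated; needs a Hugging Face token).
from pyannote.audio import Model
from pyannote.audio.pipelines import OverlappedSpeechDetection

model = Model.from_pretrained("pyannote/segmentation", use_auth_token="HF_TOKEN")
pipeline = OverlappedSpeechDetection(segmentation=model)
# Illustrative hyperparameters; tune them on a development set.
pipeline.instantiate({"onset": 0.5, "offset": 0.5,
                      "min_duration_on": 0.1, "min_duration_off": 0.1})

overlap = pipeline("conversation.wav")
for segment in overlap.get_timeline().support():
    print(f"overlapped speech: {segment.start:.1f}s - {segment.end:.1f}s")
```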
Brief-details: Voice Activity Detection (VAD) model powered by pyannote.audio 2.1, offering precise speech detection in audio files with MIT license and 286K+ downloads.
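As a usage sketch, assuming the checkpoint is pyannote/voice-activity-detection and that you have a Hugging Face access token for its gated repo:

```python
# Sketch: running the pyannote voice-activity-detection pipeline on a file.
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/voice-activity-detection",
    use_auth_token="HF_TOKEN",  # replace with your token
)

vad = pipeline("audio.wav")  # returns an Annotation of speech regions
for segment in vad.get_timeline().support():
    print(f"speech from {segment.start:.1f}s to {segment.end:.1f}s")
```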
Brief-details: HHEM-2.1-Open: a 110M-parameter hallucination detection model for evaluating LLM outputs, outperforming GPT-3.5/4 at this task while using far fewer compute resources
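A rough scoring sketch, assuming the checkpoint is vectara/hallucination_evaluation_model and that its custom modeling code exposes a predict() helper for premise/hypothesis pairs:

```python
# Sketch: factual-consistency scoring with HHEM; the checkpoint name and the
# predict() helper (shipped via the repo's custom code) are assumptions.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "vectara/hallucination_evaluation_model", trust_remote_code=True
)
pairs = [
    ("The capital of France is Paris.", "Paris is the capital of France."),
    ("The capital of France is Paris.", "The capital of France is Lyon."),
]
scores = model.predict(pairs)  # ~1.0 = consistent, ~0.0 = likely hallucinated
print(scores)
```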
Brief-details: Instruction-tuned text embedding model that generates task-specific embeddings via natural language prompts, achieving SOTA on 70+ embedding tasks.
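A sketch of prompt-based embedding, assuming this is the hkunlp/instructor family (the model id below is an assumption):

```python
# Sketch: task-specific embeddings via natural-language instructions
# (pip install InstructorEmbedding sentence-transformers).
from InstructorEmbedding import INSTRUCTOR

model = INSTRUCTOR("hkunlp/instructor-large")  # model id is an assumption
pairs = [
    ["Represent the scientific title for retrieval:", "Attention Is All You Need"],
    ["Represent the question for retrieving supporting documents:", "What is self-attention?"],
]
embeddings = model.encode(pairs)  # one vector per [instruction, text] pair
print(embeddings.shape)
```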
Brief-details: Stable Diffusion v2 depth-aware model that enables depth-controlled image generation and modification, building on SD2-base with MiDaS integration
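A minimal depth-to-image sketch with diffusers, assuming the checkpoint is stabilityai/stable-diffusion-2-depth and a CUDA GPU is available:

```python
# Sketch: depth-conditioned image generation with diffusers.
import torch
from diffusers import StableDiffusionDepth2ImgPipeline
from PIL import Image

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("room.jpg")
result = pipe(prompt="a cozy cabin interior", image=init_image, strength=0.7).images[0]
result.save("depth_guided.png")
```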
Brief-details: BART-based model fine-tuned for keyphrase generation across scientific and news domains, with 291K+ downloads and support for multiple datasets
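The exact checkpoint is not named here, so the model id below is a placeholder; a BART-based keyphrase generator is typically driven through the text2text-generation pipeline:

```python
# Sketch: keyphrase generation with a seq2seq pipeline; the model id is a
# placeholder for whichever BART keyphrase checkpoint is being described.
from transformers import pipeline

generator = pipeline("text2text-generation", model="your-org/bart-keyphrase-generation")
abstract = ("Transformer language models have become the dominant approach "
            "to natural language processing across scientific and news text.")
print(generator(abstract, max_new_tokens=32)[0]["generated_text"])
```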
Brief-details: Large-scale Portuguese BERT model (335M params) by neuralmind, optimized for Brazilian Portuguese NLP tasks with state-of-the-art performance
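A quick masked-language-model check, assuming the checkpoint is neuralmind/bert-large-portuguese-cased:

```python
# Sketch: fill-mask with BERTimbau Large; the checkpoint id is an assumption.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="neuralmind/bert-large-portuguese-cased")
for pred in fill_mask("Tinha uma [MASK] no meio do caminho."):
    print(pred["token_str"], round(pred["score"], 3))
```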
Brief-details: Efficient 33M parameter embedding model supporting 8K sequence length, built on BERT with ALiBi. Optimized for English text embeddings and RAG applications.
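An encoding sketch, assuming the checkpoint is jinaai/jina-embeddings-v2-small-en, whose repo ships custom modeling code that exposes an encode() helper:

```python
# Sketch: long-context English embeddings; the checkpoint name and the
# encode() helper (from the repo's custom code) are assumptions.
from transformers import AutoModel

model = AutoModel.from_pretrained("jinaai/jina-embeddings-v2-small-en",
                                  trust_remote_code=True)
embeddings = model.encode([
    "A query about retrieval-augmented generation.",
    "A passage describing RAG pipelines.",
])
print(embeddings.shape)  # (2, embedding_dim)
```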
Brief-details: BLIP-2 vision-language model with 7.75B parameters, combining a CLIP-like image encoder and the OPT-6.7b LLM for image captioning and VQA tasks. MIT licensed.
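A captioning sketch via transformers, assuming the checkpoint is Salesforce/blip2-opt-6.7b:

```python
# Sketch: image captioning with BLIP-2 (checkpoint id is an assumption).
import torch
from transformers import Blip2Processor, Blip2ForConditionalGeneration
from PIL import Image

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-6.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-6.7b", torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("photo.jpg")
inputs = processor(images=image, return_tensors="pt").to(model.device, torch.float16)
out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))
```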
Brief-details: English-to-Spanish translation model with a BLEU score of 54.9, trained on OPUS data and built on a transformer architecture.
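A translation sketch, assuming the checkpoint is Helsinki-NLP/opus-mt-en-es (a Marian model):

```python
# Sketch: English-to-Spanish translation with the transformers pipeline.
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")
print(translator("The weather is nice today.")[0]["translation_text"])
```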
Brief-details: A fine-tuned XLSR-53 large model for Persian speech recognition, achieving 30.12% WER and 7.37% CER on the Common Voice dataset. Supports 16kHz audio input.
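A CTC decoding sketch; the model id below is a placeholder for the Persian XLSR-53 checkpoint, and the audio must be 16 kHz mono:

```python
# Sketch: transcription with a fine-tuned XLSR-53 checkpoint (placeholder id).
import torch
import librosa
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

model_id = "your-org/wav2vec2-large-xlsr-53-persian"  # placeholder
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

speech, _ = librosa.load("sample.wav", sr=16_000)  # resample to 16 kHz
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```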
Brief-details: A powerful 66.4M parameter DistilBERT model trained on 215M question-answer pairs, optimized for semantic search and sentence similarity tasks.
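A semantic search sketch with sentence-transformers; the checkpoint name is an assumption (multi-qa-distilbert-cos-v1 matches the description):

```python
# Sketch: ranking passages against a query with cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/multi-qa-distilbert-cos-v1")
query_emb = model.encode("How do I reset my password?")
doc_embs = model.encode([
    "Click 'Forgot password' on the login page.",
    "Our office is open Monday to Friday.",
])
scores = util.cos_sim(query_emb, doc_embs)
print(scores)  # higher score = more relevant passage
```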
Brief-details: Qwen1.5-0.5B is a 620M-parameter transformer language model offering a 32K context length and enhanced multilingual capabilities.
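A minimal generation sketch with transformers, using the Qwen/Qwen1.5-0.5B checkpoint:

```python
# Sketch: text generation with Qwen1.5-0.5B (requires a recent transformers).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B")

inputs = tokenizer("The capital of France is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```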
Brief-details: GPT-2 XL: 1.5B-parameter transformer language model by OpenAI and the largest GPT-2 variant, offering strong open-ended text generation after extensive pre-training on web text.
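A one-liner generation sketch using the gpt2-xl checkpoint via the pipeline API:

```python
# Sketch: open-ended text generation with GPT-2 XL.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2-xl")
print(generator("Once upon a time,", max_new_tokens=40)[0]["generated_text"])
```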
Brief-details: A powerful English speech recognition model with 764M parameters, trained on 680k hours of data, achieving 4.12% WER on LibriSpeech clean test set.
Brief-details: Pre-trained Spanish BERT model using Whole Word Masking, achieving SOTA results on Spanish NLP tasks with 309K+ downloads and strong benchmark performance.
Brief-details: SAM-ViT-Large is a powerful vision segmentation model with 312M parameters, capable of generating high-quality object masks from input prompts such as points or boxes, with strong zero-shot performance.
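A prompted-mask sketch via transformers, assuming the checkpoint is facebook/sam-vit-large and using a single 2D point as the prompt:

```python
# Sketch: point-prompted mask generation with SAM.
import torch
from transformers import SamModel, SamProcessor
from PIL import Image

processor = SamProcessor.from_pretrained("facebook/sam-vit-large")
model = SamModel.from_pretrained("facebook/sam-vit-large")

image = Image.open("scene.jpg").convert("RGB")
input_points = [[[450, 600]]]  # one (x, y) point on the object of interest
inputs = processor(image, input_points=input_points, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
masks = processor.image_processor.post_process_masks(
    outputs.pred_masks, inputs["original_sizes"], inputs["reshaped_input_sizes"]
)
print(masks[0].shape)  # predicted binary masks for the prompted object
```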
Brief-details: A powerful latent diffusion model for 4x image super-resolution, developed by CompVis. Specializes in high-quality upscaling while being computationally efficient.
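An upscaling sketch with diffusers, assuming the checkpoint is CompVis/ldm-super-resolution-4x-openimages:

```python
# Sketch: 4x super-resolution with the latent diffusion upscaler.
from diffusers import LDMSuperResolutionPipeline
from PIL import Image

pipe = LDMSuperResolutionPipeline.from_pretrained(
    "CompVis/ldm-super-resolution-4x-openimages"
)
low_res = Image.open("low_res.png").convert("RGB").resize((128, 128))
upscaled = pipe(low_res, num_inference_steps=100, eta=1.0).images[0]
upscaled.save("upscaled_4x.png")
```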
Brief-details: 4-bit quantized version of Meta's Llama 3.1 8B model optimized for efficiency, featuring multilingual capabilities and 128k context window
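One way to obtain a 4-bit model is to quantize the base checkpoint at load time with bitsandbytes; the base model id below is an assumption and the repo is gated (requires accepting Meta's license), with a GPU needed for 4-bit inference:

```python
# Sketch: loading Llama 3.1 8B in 4-bit via bitsandbytes (base id is an assumption).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumption; gated repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)

inputs = tokenizer("Explain why 4-bit quantization saves memory:",
                   return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```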
Brief-details: RAD-DINO is an 86.6M parameter vision transformer model specialized in chest X-ray encoding, trained using self-supervised DINOv2 methodology across 882,775 medical images
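A feature-extraction sketch, assuming the checkpoint is microsoft/rad-dino and that it loads through the standard AutoModel/AutoImageProcessor classes:

```python
# Sketch: encoding a chest X-ray with RAD-DINO (checkpoint id is an assumption).
import torch
from transformers import AutoImageProcessor, AutoModel
from PIL import Image

processor = AutoImageProcessor.from_pretrained("microsoft/rad-dino")
model = AutoModel.from_pretrained("microsoft/rad-dino")

image = Image.open("chest_xray.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
cls_embedding = outputs.pooler_output  # one vector summarizing the image
print(cls_embedding.shape)
```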