Brief-details: Optimized 1.5B parameter Llama-3 model quantized to FP8, offering ~50% memory reduction while retaining 99.8% of the original model's accuracy. Supports 8 languages.
Brief-details: A high-performance CLIP model using the ConvNeXt-Large architecture, trained on the LAION-2B dataset and achieving 76.9% ImageNet accuracy via a weight-averaged model-soup approach.
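A minimal zero-shot classification sketch with the open_clip library follows; the hub tag below is a placeholder (the exact checkpoint is not named above) and should be replaced with the one from the model card.

```python
# Zero-shot image classification with open_clip (hub tag below is a placeholder).
import torch
import open_clip
from PIL import Image

tag = "hf-hub:laion/CLIP-convnext_large-placeholder"  # replace with the real checkpoint
model, _, preprocess = open_clip.create_model_and_transforms(tag)
tokenizer = open_clip.get_tokenizer(tag)

image = preprocess(Image.open("photo.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    img_feat /= img_feat.norm(dim=-1, keepdim=True)
    txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)
print(probs)  # probability per candidate caption
```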
Brief-details: Turkish sentiment analysis BERT model with 163M parameters, achieving 99.72% accuracy. Fine-tuned from the TurkishBERTweet base model; MIT-licensed.
Brief-details: Google's T5-v1.1 base model - a text-to-text transfer transformer with GEGLU activation, trained on the C4 dataset. Offers key improvements over the original T5; 262K+ downloads.
Brief Details: A specialized document understanding model fine-tuned for invoice processing, combining Swin Transformer vision encoding with BART text decoding. Supports multilingual invoice analysis.
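The Swin-encoder-plus-BART-decoder combination matches the Donut family in transformers, so a hedged parsing sketch might look like the following; the repo id and task prompt are placeholders, not taken from the entry above.

```python
# Invoice parsing with a Donut-style VisionEncoderDecoder model (ids are placeholders).
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

repo_id = "your-org/donut-invoice-parser"  # placeholder repo id
processor = DonutProcessor.from_pretrained(repo_id)
model = VisionEncoderDecoderModel.from_pretrained(repo_id)

image = Image.open("invoice.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

task_prompt = "<s_invoice>"  # placeholder; check the model card for the real task token
decoder_input_ids = processor.tokenizer(
    task_prompt, add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(pixel_values, decoder_input_ids=decoder_input_ids, max_length=512)
sequence = processor.batch_decode(outputs)[0]
print(processor.token2json(sequence))  # decoded fields as structured JSON
```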
Brief Details: A specialized English ASR model with 242M parameters, based on OpenAI's Whisper architecture. Optimized for English speech recognition with strong accuracy and noise resilience.
Brief-details: Fast and efficient speech recognition model supporting 99 languages, based on OpenAI's Whisper medium variant and optimized with CTranslate2 for faster inference.
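A short transcription sketch with the faster-whisper runtime follows; the model identifier and decoding options are illustrative, not taken from the entry above.

```python
# Transcription with faster-whisper (CTranslate2 backend); model id is illustrative.
from faster_whisper import WhisperModel

model = WhisperModel("medium", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.wav", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```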
BRIEF DETAILS: A cross-encoder model fine-tuned on MS Marco, achieving 71.01 NDCG@10 on TREC DL 19 with 4100 docs/sec processing speed. Optimized for passage ranking.
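A hedged re-ranking sketch with the sentence-transformers CrossEncoder class; the checkpoint name is inferred from the description and should be verified against the model card.

```python
# Passage re-ranking with a cross-encoder: each (query, passage) pair is scored jointly.
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-2-v2")  # assumed checkpoint

query = "How many people live in Berlin?"
passages = [
    "Berlin had about 3.7 million registered inhabitants in 2019.",
    "Berlin is well known for its museums and vibrant art scene.",
]
scores = model.predict([(query, p) for p in passages])
for score, passage in sorted(zip(scores, passages), reverse=True):
    print(f"{score:.3f}  {passage}")
```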
Brief Details: InstructBLIP-Vicuna-7B: A 7.91B parameter vision-language model combining the BLIP-2 architecture with the Vicuna-7B LLM for advanced image-text tasks.
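A hedged image-question sketch using the InstructBLIP classes in transformers; the repo id is assumed from the model name above.

```python
# Image-grounded instruction following with InstructBLIP (repo id is an assumption).
import torch
from PIL import Image
from transformers import InstructBlipProcessor, InstructBlipForConditionalGeneration

repo_id = "Salesforce/instructblip-vicuna-7b"  # assumed checkpoint
processor = InstructBlipProcessor.from_pretrained(repo_id)
model = InstructBlipForConditionalGeneration.from_pretrained(repo_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

image = Image.open("scene.jpg").convert("RGB")
prompt = "What is unusual about this image?"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)

generated = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(generated, skip_special_tokens=True)[0].strip())
```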
Brief-details: Large-scale Vision Transformer model trained on ImageNet-21k (14M images) and fine-tuned on ImageNet 2012, specializing in image classification at 384x384 resolution.
Brief Details: SigLIP base model (203M params) for vision-language tasks. Features sigmoid loss function, 224x224 resolution, and zero-shot capabilities.
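A hedged zero-shot classification sketch via the transformers pipeline; the SigLIP repo id is an assumption, not stated above.

```python
# Zero-shot image classification with a SigLIP checkpoint (repo id is an assumption).
from transformers import pipeline

classifier = pipeline(
    task="zero-shot-image-classification",
    model="google/siglip-base-patch16-224",  # assumed checkpoint
)
print(classifier("photo.jpg", candidate_labels=["a photo of a cat", "a photo of a dog"]))
```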
BRIEF-DETAILS: General-purpose text embedding model with 109M parameters. Optimized for LLM retrieval augmentation and diverse embedding needs, with state-of-the-art performance.
BRIEF DETAILS: Multilingual sentence embedding model (135M params) optimized for semantic similarity tasks. Maps text to 768D vectors. Built on DistilBERT.
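A hedged similarity sketch with sentence-transformers; the model name below is a placeholder for the checkpoint described above.

```python
# Multilingual semantic similarity: encode, then compare 768-d vectors by cosine.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("your-org/multilingual-distilbert-embeddings")  # placeholder

sentences = ["The weather is lovely today.", "Das Wetter ist heute schön."]
embeddings = model.encode(sentences, convert_to_tensor=True)
print(float(util.cos_sim(embeddings[0], embeddings[1])))
```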
Brief Details: A Vision Transformer model trained with DINO self-supervision on ImageNet-1k, offering robust image feature extraction using 16x16 patches.
Brief-details: Depth Anything V2 Large - A state-of-the-art depth estimation model with 335M parameters, trained on 595K synthetic + 62M real images for robust monocular depth estimation.
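A hedged depth-estimation sketch through the transformers pipeline; the repo id is an assumption based on the model name above.

```python
# Monocular depth estimation (repo id is an assumption; verify against the model card).
from PIL import Image
from transformers import pipeline

depth_estimator = pipeline(
    task="depth-estimation",
    model="depth-anything/Depth-Anything-V2-Large-hf",  # assumed checkpoint
)
result = depth_estimator(Image.open("room.jpg"))
result["depth"].save("room_depth.png")  # predicted depth rendered as a grayscale image
```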
Brief-details: A distilled version of Whisper medium.en that is 6x faster and 49% smaller (394M params) while keeping WER within 1% of the original for English ASR.
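A hedged English transcription sketch using the transformers ASR pipeline; the repo id is assumed from the description.

```python
# Long-form English transcription with a distilled Whisper model (repo id is assumed).
from transformers import pipeline

asr = pipeline(
    task="automatic-speech-recognition",
    model="distil-whisper/distil-medium.en",  # assumed checkpoint
    chunk_length_s=30,  # chunked decoding for audio longer than 30 seconds
)
print(asr("meeting.wav")["text"])
```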
Brief-details: M3E-base is a 102M parameter bilingual (Chinese-English) embedding model trained on 22M+ sentence pairs, optimized for text similarity and retrieval tasks with state-of-the-art performance.
Brief Details: RoBERTa-based toxicity classifier trained to detect toxic content while minimizing unintended bias. Achieves a score of 0.93639 on the Jigsaw dataset.
Brief-details: A fine-tuned wav2vec2-xls-r-300m model for Turkish speech recognition, achieving 8.62% WER on Common Voice 7, trained with comprehensive preprocessing and a custom language model.
BRIEF DETAILS: DeBERTa-v3 based cross-encoder for Natural Language Inference, achieving 92.38% accuracy on SNLI. Excellent for zero-shot classification and NLI tasks.
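A hedged zero-shot classification sketch with the transformers pipeline, which maps candidate labels onto NLI entailment scores; the repo id is an assumption inferred from the description.

```python
# NLI-based zero-shot text classification (repo id is an assumption).
from transformers import pipeline

classifier = pipeline(
    task="zero-shot-classification",
    model="cross-encoder/nli-deberta-v3-base",  # assumed checkpoint
)
result = classifier(
    "The new graphics card renders 4K scenes at 120 frames per second.",
    candidate_labels=["technology", "cooking", "politics"],
)
print(result["labels"][0], round(result["scores"][0], 3))
```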
Brief Details: A specialized BERT model pre-trained on 12GB of legal texts, optimized for legal NLP tasks with variants for specific legal domains like contracts and EU law.