Brief-details: 319M-parameter Speech Emotion Recognition model built on the WavLM architecture that predicts arousal, dominance, and valence from audio. Trained on the MSP-Podcast dataset.
Brief-details: Norwegian ASR model with 315M parameters, fine-tuned from VoxRex for Nynorsk speech recognition. Achieves 12.22% WER with KenLM integration.
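A minimal usage sketch with the transformers ASR pipeline; the repo ID below is an assumption, not taken from the entry, and the audio should be 16 kHz mono:

```python
# Minimal sketch: transcribing Nynorsk audio with the transformers ASR pipeline.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="NbAiLab/nb-wav2vec2-300m-nynorsk",  # assumed repo ID
)

# Accepts a path to a 16 kHz mono audio file (or a NumPy array of samples).
result = asr("sample_nynorsk.wav")
print(result["text"])
```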
Brief Details: Deprecated RoBERTa-based sentence embedding model (768-dim vectors) for similarity tasks. 125M params. Not recommended for new projects.
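A minimal sketch of typical sentence-transformers usage; the repo ID is an assumption about which deprecated RoBERTa checkpoint is meant:

```python
# Minimal sketch: encoding sentences into 768-dim vectors with sentence-transformers.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/roberta-base-nli-mean-tokens")  # assumed repo ID
embeddings = model.encode(["A sentence to embed.", "Another sentence."])
print(embeddings.shape)  # (2, 768)
```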
BRIEF-DETAILS: VILA1.5-3b-s2 is a visual language model enabling multi-image understanding and reasoning, with edge deployment capability through 4-bit quantization, built on interleaved image-text training.
Brief-details: Large vision-language model (652M params) using sigmoid loss for improved image-text learning. Excellent for zero-shot classification and retrieval tasks.
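A minimal zero-shot classification sketch, assuming the checkpoint is a SigLIP model supported by the transformers zero-shot-image-classification pipeline; the repo ID is an assumption:

```python
# Minimal sketch: zero-shot image classification with a SigLIP checkpoint.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-image-classification",
    model="google/siglip-large-patch16-384",  # assumed repo ID
)
preds = classifier("cat.jpg", candidate_labels=["a cat", "a dog", "a car"])
print(preds)  # list of {"label": ..., "score": ...}
```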
Brief-details: Quantized version of Microsoft's 141B-parameter WizardLM model, converted to GGUF format with multiple precision options (2-8 bit) for efficient inference.
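A minimal sketch for running a GGUF quantization locally with llama-cpp-python; the file name below is an assumption, and the chosen quantization level should match your hardware:

```python
# Minimal sketch: loading and prompting a GGUF quantization with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="WizardLM-2-8x22B.Q4_K_M.gguf",  # assumed local file name
    n_ctx=4096,          # context window to allocate
    n_gpu_layers=-1,     # offload all layers to GPU if available
)
out = llm("Explain mixture-of-experts routing in two sentences.", max_tokens=128)
print(out["choices"][0]["text"])
```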
Brief Details: A powerful multilingual CLIP model supporting 48 languages, optimized for text-image understanding with state-of-the-art R@10 retrieval performance.
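A minimal text-to-image retrieval sketch in the sentence-transformers style; both repo IDs are assumptions about which multilingual CLIP pairing is meant:

```python
# Minimal sketch: multilingual text-to-image retrieval with paired image/text encoders.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

img_model = SentenceTransformer("clip-ViT-B-32")                  # assumed image encoder
txt_model = SentenceTransformer("clip-ViT-B-32-multilingual-v1")  # assumed multilingual text encoder

img_emb = img_model.encode([Image.open("beach.jpg"), Image.open("city.jpg")])
txt_emb = txt_model.encode(["Ein Strand bei Sonnenuntergang", "夕暮れのビーチ"])  # German / Japanese queries

print(util.cos_sim(txt_emb, img_emb))  # queries x images similarity matrix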
Brief-details: A lightweight RegNetY model with 3.18M parameters optimized for image classification, offering an efficient balance of performance and size.
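A minimal classification sketch via timm; "regnety_002" (~3.2M params) is an assumption about which RegNetY variant is meant:

```python
# Minimal sketch: image classification with a small RegNetY from timm.
import timm
import torch
from PIL import Image

model = timm.create_model("regnety_002", pretrained=True).eval()  # assumed variant name
cfg = timm.data.resolve_data_config({}, model=model)
transform = timm.data.create_transform(**cfg)

img = transform(Image.open("dog.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    probs = model(img).softmax(dim=-1)
print(probs.topk(5))
```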
Brief Details: LLaVA 1.5 13B is a powerful multimodal vision-language model with 13.4B parameters, capable of understanding and discussing images in natural conversations.
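A minimal single-image chat sketch; the repo ID and prompt template are assumptions based on the common llava-hf packaging in transformers:

```python
# Minimal sketch: describing an image with a LLaVA-1.5-13B checkpoint.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-13b-hf"  # assumed repo ID
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "USER: <image>\nDescribe this picture in one sentence. ASSISTANT:"
inputs = processor(text=prompt, images=Image.open("photo.jpg"), return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(out[0], skip_special_tokens=True))
```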
Brief Details: Intel's 7B-parameter LLM optimized for math and reasoning, achieving a 69.83 average score on the LLM leaderboard with strong performance on HellaSwag (85.26%) and Winogrande (79.64%).
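A minimal generation sketch with a standard causal-LM interface; the repo ID is an assumption and should be replaced with the actual Intel checkpoint name:

```python
# Minimal sketch: prompting a 7B causal LM for a reasoning question.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Intel/neural-chat-7b-v3-1"  # assumed repo ID
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

prompt = "If a train travels 120 km in 1.5 hours, what is its average speed?"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```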
Brief-details: A DistilRoBERTa-based sentence embedding model with 82.1M parameters, optimized for NLI tasks. Efficiently maps sentences to a 768-dimensional dense vector space.
Brief-details: A 1.3B parameter Korean language model trained on 863GB of Korean text, optimized for text generation with strong performance on Korean NLP tasks.
BRIEF DETAILS: State-of-the-art monocular depth estimation model offering 10x faster performance than SD-based alternatives, trained on 595K synthetic + 62M real images and released under the Apache 2.0 license.
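A minimal sketch with the transformers depth-estimation pipeline; the repo ID is an assumption about which checkpoint size is meant:

```python
# Minimal sketch: monocular depth estimation from a single RGB image.
from transformers import pipeline

depth = pipeline("depth-estimation", model="depth-anything/Depth-Anything-V2-Small-hf")  # assumed repo ID
result = depth("street.jpg")
result["depth"].save("street_depth.png")  # PIL image of the predicted depth map
```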
Brief Details: A hyperrealistic text-to-image model combining HyperRealism 1.2 and DreamPhotoGASM, specialized in analog-style photography and detailed portraits.
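A minimal text-to-image sketch with diffusers; the repo ID is a hypothetical placeholder for the merged checkpoint described above:

```python
# Minimal sketch: generating an analog-style portrait with a text-to-image checkpoint.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "author/hyperrealism-dreamphoto-merge",  # hypothetical repo ID
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "analog-style portrait photo of an elderly fisherman, natural light, 35mm film grain",
    num_inference_steps=30,
).images[0]
image.save("portrait.png")
```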
Brief Details: BERT-based model for business process text analysis with 108M parameters. Achieves 90.31% F1 score for extracting process elements from text.
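A minimal sketch, assuming the checkpoint is packaged with a standard token-classification head; the repo ID is hypothetical:

```python
# Minimal sketch: extracting process elements as labeled spans from a process description.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="author/bert-business-process-extraction",  # hypothetical repo ID
    aggregation_strategy="simple",                    # merge word pieces into spans
)
text = "The clerk checks the invoice and then forwards it to the accounting department."
for span in ner(text):
    print(span["entity_group"], "->", span["word"])
```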
Brief-details: A 7B-parameter code-focused LLM built on the Qwen architecture and distributed in GGUF format with multiple quantization options (2.7GB-15GB), optimized for coding tasks.
BRIEF DETAILS: State-of-the-art monocular depth estimation model trained on 595K synthetic + 62M real images, offering 10x faster performance than SD-based alternatives.
BRIEF DETAILS: A 3B-parameter language model from the Qwen2.5 series with a 32K context window, optimized for text generation and coding tasks and offering multilingual capabilities across 29+ languages.
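A minimal chat-style generation sketch using the tokenizer's chat template; the repo ID is an assumption (instruct variant shown):

```python
# Minimal sketch: chat-style generation with a Qwen2.5 checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-3B-Instruct"  # assumed repo ID
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Write a Python one-liner that reverses a string."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```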
Brief-details: Mixtral-8x22B-Instruct-v0.1-GGUF is a powerful 141B-parameter mixture-of-experts (MoE) language model supporting 5 languages, available in various quantization options (2-16 bit).
Brief-details: Danish speech recognition model fine-tuned on FTSpeech dataset (1,800hrs), achieving 17.91% WER on Common Voice. Based on XLS-R-300m with 315M parameters.
BRIEF-DETAILS: A state-of-the-art multimodal embedding model (223M params) that excels in both text-to-text and text-to-image retrieval tasks, bridging CLIP and text embedding capabilities.