Brief-details: DUSt3R is a geometric 3D vision model from NAVER with 571M params, pairing a ViT-Large encoder with a ViT-Base decoder for image-to-3D tasks.
Brief-details: Specialized Italian-language variant of Llama-3 (8B params) with strong performance on Italian NLP tasks. Released in BF16 precision with extensive evaluation results.
Brief-details: Multilingual BART model (610M params) fine-tuned for English grammar correction on the FCE dataset. Popular, with 98K+ downloads.
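A minimal usage sketch with the transformers text2text pipeline; the checkpoint name below is hypothetical, since the entry does not identify the exact model.

```python
from transformers import pipeline

# Hypothetical checkpoint name -- substitute the actual grammar-correction model ID.
corrector = pipeline("text2text-generation", model="your-org/mbart-grammar-correction")
print(corrector("She are going to school yesterday.", max_new_tokens=64)[0]["generated_text"])
```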
Brief-details: AWQ-quantized version of Mistral-7B-Instruct-v0.2 at 4-bit precision, offering efficient inference with a 4.15GB footprint and 4096-token context length.
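A minimal inference sketch, assuming the entry refers to TheBloke/Mistral-7B-Instruct-v0.2-AWQ; transformers loads prequantized AWQ checkpoints directly when the autoawq package is installed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Mistral-7B-Instruct-v0.2-AWQ"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "[INST] Explain AWQ quantization in one sentence. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```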
Brief-details: A fine-tuned DistilBERT model with 67M parameters, achieving 92.95% training accuracy. Optimized for text classification; compatible with both PyTorch and TensorFlow.
Brief-details: OLMo-1B-0724-hf is a 1.28B-parameter open language model trained on the Dolma dataset; improved dataset quality and staged training give it strong performance on language tasks.
Brief-details: Multilingual sentence-embedding model with 278M parameters that maps sentences to 768-dimensional vectors. Built on XLM-RoBERTa; supports semantic search and clustering across languages.
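A minimal sketch, assuming the entry refers to a checkpoint such as sentence-transformers/paraphrase-multilingual-mpnet-base-v2 (XLM-RoBERTa base, 768-dim output):

```python
from sentence_transformers import SentenceTransformer, util

# Assumed checkpoint matching the description (278M params, 768-dim vectors).
model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-mpnet-base-v2")

sentences = ["The cat sits on the mat.", "Le chat est assis sur le tapis."]
embeddings = model.encode(sentences)  # shape: (2, 768)

# Cross-lingual semantic similarity via cosine score.
print(util.cos_sim(embeddings[0], embeddings[1]))
```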
Brief-details: State-of-the-art long-context text-embedding model with 137M parameters, optimized for retrieval tasks and supporting sequences up to 8192 tokens via RPE.
Brief-details: Cutting-edge 894M parameter multimodal LLM optimized for single-image, multi-image, and video tasks, built on Qwen2 with extensive fine-tuning.
Brief-details: Multilingual translation model supporting 100 languages across 9,900 translation directions. Built by Facebook; translates directly between language pairs (no English pivot) using a transformer architecture.
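A minimal direct-translation sketch, assuming the facebook/m2m100_418M checkpoint (the larger 1.2B variant loads the same way):

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "en"
encoded = tokenizer("Life is like a box of chocolates.", return_tensors="pt")

# Force the target-language token to translate English -> French directly.
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("fr"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```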
Brief-details: CogVideoX-5b-I2V is a sophisticated image-to-video generation model with 5B parameters, creating 6-second videos at 8 fps from an input image and a text prompt.
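A minimal sketch with diffusers' CogVideoXImageToVideoPipeline, assuming the THUDM/CogVideoX-5b-I2V checkpoint and a CUDA GPU:

```python
import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import load_image, export_to_video

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # helps fit the 5B model on smaller GPUs

image = load_image("scene.jpg")  # placeholder input image
video = pipe(prompt="A slow pan across the scene", image=image, num_frames=49).frames[0]
export_to_video(video, "out.mp4", fps=8)  # 49 frames at 8 fps ≈ 6 seconds
```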
Brief-details: UniEval-fact is a pre-trained evaluator for assessing factual consistency in text generation, with over 100K downloads; introduced at EMNLP 2022.
Brief-details: DeepSeek Coder 6.7B is an advanced code-generation model trained on 2T tokens (87% code, 13% natural language) with a 16K context window and fill-in-the-blank capabilities.
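A minimal code-completion sketch, assuming the deepseek-ai/deepseek-coder-6.7b-base checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-6.7b-base"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "# Python function that computes a factorial iteratively\ndef factorial(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```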
Brief-details: An advanced 938M-parameter multimodal LLM combining the InternViT-300M-448px vision encoder with the Qwen2-0.5B-Instruct language model for versatile vision-language tasks.
Brief-details: A lightweight question-answering model based on MobileBERT, optimized for mobile devices with 24.6M parameters. Achieves 75.2 exact-match (EM) on SQuAD v2.0.
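A minimal extractive-QA sketch, assuming the entry refers to a checkpoint such as csarron/mobilebert-uncased-squad-v2:

```python
from transformers import pipeline

# Assumed checkpoint matching the description (MobileBERT fine-tuned on SQuAD v2.0).
qa = pipeline("question-answering", model="csarron/mobilebert-uncased-squad-v2")
result = qa(
    question="What is MobileBERT optimized for?",
    context="MobileBERT is a compact BERT variant designed for resource-limited mobile devices.",
)
print(result["answer"], result["score"])
```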
Brief-details: Moshiko-pytorch-bf16 is a 7.69B-parameter speech-text foundation model optimized for real-time dialogue, featuring BF16 precision and 160ms latency.
Brief-details: Stability AI's image-to-video diffusion model generates 14-frame video clips from still images at 576x1024 resolution. Popular, with 103K+ downloads.
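A minimal sketch with diffusers' StableVideoDiffusionPipeline, assuming the 14-frame stabilityai/stable-video-diffusion-img2vid checkpoint and a CUDA GPU:

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

image = load_image("input.jpg").resize((1024, 576))  # placeholder conditioning image
frames = pipe(image, decode_chunk_size=4).frames[0]  # 14 frames by default
export_to_video(frames, "clip.mp4", fps=7)
```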
Brief-details: Multilingual T5 model supporting 101 languages, pre-trained on the mC4 corpus. Requires fine-tuning for downstream tasks. Created by Google.
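A minimal fine-tuning step for google/mt5-small, illustrating why the raw checkpoint needs adaptation; it computes a supervised loss on an input/target pair:

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# One supervised step: raw mT5 only emits sentinel tokens until fine-tuned.
inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt")
labels = tokenizer(text_target="Das Haus ist wunderbar.", return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss
loss.backward()
```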
Brief-details: High-performance photorealistic image-generation model optimized for portrait and full-body shots, with enhanced resolution support up to 896x896px.
Brief-details: In-Context-LoRA enables customizable image-set generation with defined relationships, supporting 10 specialized tasks such as visual effects, design templates, and storyboarding, using FLUX as its base model.
Brief-details: Dense Prediction Transformer for monocular depth estimation: 111M params, MIT license, uses a BEiT backbone to predict depth from a single image.
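A minimal depth-map sketch with the transformers depth-estimation pipeline; the checkpoint is an assumption (Intel/dpt-beit-base-384 matches the described BEiT backbone):

```python
from PIL import Image
from transformers import pipeline

# Assumed checkpoint -- a DPT variant with a BEiT backbone.
depth = pipeline("depth-estimation", model="Intel/dpt-beit-base-384")
result = depth(Image.open("photo.jpg"))  # placeholder input photo
result["depth"].save("depth_map.png")  # PIL image of the predicted depth map
```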