BRIEF-DETAILS: A giant-sized Vision Transformer with register tokens for self-supervised image feature extraction, yielding cleaner attention maps and improved downstream performance.
BRIEF-DETAILS: H2OVL-Mississippi-2B: A 2B parameter vision-language model excelling in image captioning, VQA, and document AI with strong performance across benchmarks.
Brief Details: A public test repository from ProtectAI hosted on HuggingFace, serving as a sandbox for experimental AI model hosting and testing.
Brief-details: A lightweight LLaMA-based causal language model designed for TRL (Transformer Reinforcement Learning) library testing purposes, optimized for rapid unit testing.
Brief Details: EfficientNet-B2 is Google's optimized ConvNet that uses compound scaling for balanced depth/width/resolution scaling, trained on ImageNet-1k at 260x260 resolution.
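The compound scaling mentioned in the EfficientNet-B2 entry can be sketched in a few lines: EfficientNet scales depth, width, and resolution jointly by a single coefficient φ, using the paper's base multipliers α=1.2, β=1.1, γ=1.15, chosen so that α·β²·γ² ≈ 2 (roughly doubling FLOPs per unit of φ). A minimal illustrative sketch, not the model implementation itself:

```python
# Sketch of EfficientNet's compound scaling rule (Tan & Le, 2019).
# Depth, width, and input resolution are scaled jointly by one
# coefficient phi, using the paper's grid-searched base multipliers.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth / width / resolution bases

def compound_scale(phi: float) -> tuple[float, float, float]:
    """Return (depth_mult, width_mult, resolution_mult) for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

# The constraint alpha * beta^2 * gamma^2 ~= 2 means each unit of phi
# roughly doubles FLOPs (FLOPs scale with depth * width^2 * resolution^2).
flops_factor = ALPHA * BETA**2 * GAMMA**2
print(round(flops_factor, 3))  # ~1.92, close to the target of 2
```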
BRIEF-DETAILS: MLX-optimized 4-bit quantized version of Qwen2.5-Coder-32B-Instruct, specialized for code generation and understanding tasks, compatible with Apple Silicon
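The 4-bit quantization behind models like the Qwen2.5-Coder entry above can be illustrated with grouped affine quantization: each small group of weights shares one scale and offset, and each weight is stored as a 4-bit code. This is a minimal sketch of the general idea; the group size and layout here are illustrative, not MLX's exact format.

```python
# Minimal sketch of grouped 4-bit affine quantization: each group of
# weights shares one (scale, min) pair, and each weight becomes a
# 4-bit code in 0..15. Illustrative only, not MLX's exact scheme.
def quantize_4bit(weights: list[float], group_size: int = 4):
    """Quantize weights to 4-bit codes with per-group scale and minimum."""
    groups = []
    for i in range(0, len(weights), group_size):
        g = weights[i:i + group_size]
        lo, hi = min(g), max(g)
        scale = (hi - lo) / 15 or 1.0  # 4 bits -> 16 levels
        codes = [round((w - lo) / scale) for w in g]
        groups.append((codes, scale, lo))
    return groups

def dequantize_4bit(groups):
    return [lo + c * scale for codes, scale, lo in groups for c in codes]

w = [0.1, -0.2, 0.05, 0.3, 1.0, -1.0, 0.0, 0.5]
restored = dequantize_4bit(quantize_4bit(w))
# Each value is recovered to within half a quantization step of its group.
```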
Brief-details: Brouhaha by pyannote - A multi-task audio model jointly performing voice activity detection, speech-to-noise ratio estimation, and C50 room acoustics estimation.
BRIEF-DETAILS: Multilingual embedding model with 305M parameters optimized for retrieval, supporting 128-byte compression and 8192 context window, ideal for enterprise search.
BRIEF-DETAILS: ColBERTv2 - Advanced retrieval model enabling fast BERT-based search over large text collections using contextual late interaction and efficient token-level embeddings
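The "contextual late interaction" in the ColBERTv2 entry is the MaxSim operator: query and document are each kept as a bag of token-level embeddings, and relevance is the sum, over query tokens, of each token's best dot-product match among document tokens. A pure-Python sketch with toy 2-d embeddings:

```python
# Sketch of ColBERT-style late interaction ("MaxSim") scoring.
# Query and document are each a list of token embeddings; the score
# sums, for each query token, its best match among document tokens.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_embs, doc_embs):
    """ColBERT relevance: sum over query tokens of max dot-product match."""
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)

# Toy 2-d "embeddings": doc1 matches both query tokens, doc2 only one.
query = [[1.0, 0.0], [0.0, 1.0]]
doc1 = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc2 = [[1.0, 0.0], [1.0, 0.0]]
print(maxsim_score(query, doc1))  # 2.0
print(maxsim_score(query, doc2))  # 1.0
```

Because document token embeddings can be indexed independently, this scoring stays fast over large collections while preserving token-level matching.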
BRIEF-DETAILS: IC-Light - An OpenRail-M licensed model by lllyasviel for image relighting, imposing consistent illumination when compositing or editing images.
Brief-details: MiniCPM-Embedding is a 2.4B parameter bilingual embedding model for Chinese-English text retrieval, featuring 2304-dim embeddings and strong cross-lingual capabilities.
Brief Details: Mozilla's whisperfile is a high-performance speech-to-text model based on OpenAI's Whisper, optimized for multiple OS platforms with GPU support
Brief Details: A collection of Asian portrait generation models by samle on HuggingFace, specializing in AI-generated portraits with distinct Taiwanese, Korean, and Japanese aesthetics.
BRIEF-DETAILS: Japanese-optimized 32B parameter LLM, based on DeepSeek-R1-Distill-Qwen. Specialized for Japanese language tasks with MIT license by CyberAgent.
BRIEF-DETAILS: A LoRA model trained using Flux-dev-lora-trainer, designed for text-to-image generation with TOK trigger words and integration with 🧨 diffusers library.
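The LoRA technique named in the entry above factors the weight update as a low-rank product: instead of learning a full d_out × d_in delta for a weight matrix W, it learns two thin matrices B (d_out × r) and A (r × d_in) with small rank r, and applies W + B·A. A pure-Python sketch of the merge step (names and the toy matrices are illustrative):

```python
# Sketch of the low-rank update at the heart of LoRA: learn thin factors
# B (d_out x r) and A (r x d_in) with small rank r, then "bake" the
# adapter into the base weight as W + alpha * (B @ A).
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_merge(W, A, B, alpha=1.0):
    """Merged weight W + alpha * (B @ A)."""
    delta = matmul(B, A)
    return [[w + alpha * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]

# Rank-1 toy example on a 2x2 weight matrix.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d_out x r, with r = 1
A = [[0.5, 0.5]]     # r x d_in
merged = lora_merge(W, A, B)
# delta = [[0.5, 0.5], [1.0, 1.0]] -> merged = [[1.5, 0.5], [1.0, 2.0]]
```

Only A and B are trained, which is why LoRA checkpoints like this one are small enough to distribute separately and load on top of a base model.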
BRIEF DETAILS: Flux.1 Lite is an efficient 8B parameter text-to-image model distilled from FLUX.1-dev, using 7GB less RAM and running 23% faster while maintaining bfloat16 precision.
Brief Details: CogAgent-9B-20241220: Advanced bilingual GUI-focused VLM based on GLM-4V-9B, supporting screenshot analysis and natural language interaction
Brief-details: A Russian speech recognition model fine-tuned from Whisper Large V3 on 118k Mozilla Common Voice samples using dual A100 GPUs.
BRIEF-DETAILS: Ruyi-Mini-7B is a 7.1B parameter image-to-video generation model supporting 360p-720p resolution, 5-second duration videos with motion/camera control, released under Apache 2.0.
BRIEF DETAILS: A 12B parameter merged model combining Captain_BMO-12B and Violet_Twilight-v0.2, using SLERP merge with custom attention and MLP interpolation weights. 4-bit quantized.
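The SLERP merge named in the entry above interpolates two checkpoints' weights along the great circle between them rather than along a straight line, which model-merging tools favor because it better preserves the geometry of the weight vectors. A minimal pure-Python sketch of spherical linear interpolation (the per-layer interpolation weights mentioned in the entry are omitted here for brevity):

```python
import math

# Sketch of SLERP (spherical linear interpolation), the blend used by
# model-merging tools when combining two checkpoints' weight tensors.
# t = 0 returns the first vector, t = 1 the second.
def slerp(v0, v1, t, eps=1e-8):
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    u0 = [x / n0 for x in v0]
    u1 = [x / n1 for x in v1]
    cos_theta = max(-1.0, min(1.0, sum(a * b for a, b in zip(u0, u1))))
    theta = math.acos(cos_theta)
    if theta < eps:  # nearly parallel: fall back to linear interpolation
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

a = [1.0, 0.0]
b = [0.0, 1.0]
mid = slerp(a, b, 0.5)  # [0.7071..., 0.7071...]: stays on the unit circle
```

Unlike a straight average, the midpoint of two unit vectors under SLERP is itself a unit vector, which is the property merge tools rely on.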
Brief Details: 12B parameter merged LLM combining UnslopNemo and Mag-Mell models using NuSLERP. Optimized for ChatML with anti-GPT characteristics.