Brief-details: 4.76B-parameter int4-quantized vision-language model for image understanding and conversation. Optimized for low memory use (~7GB) with multilingual support.
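As a rough sanity check on the 7GB figure, the int4 weights alone account for only part of the footprint; the rest goes to activations, KV cache, and runtime overhead. A minimal back-of-envelope sketch (the function name is illustrative, not from any library):

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

params = 4.76e9
print(f"int4 weights: {weight_memory_gb(params, 4):.2f} GB")   # ~2.38 GB
print(f"fp16 weights: {weight_memory_gb(params, 16):.2f} GB")  # ~9.52 GB
```

So int4 quantization cuts the raw weight storage roughly 4x versus fp16, which is why the whole model plus runtime state can fit in about 7GB.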
Brief-details: Large Vision Transformer (ViT) model with 304M parameters, pre-trained on ImageNet-21k for image recognition tasks. Features 16x16 patch size and 224x224 resolution.
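The 16x16 patch size and 224x224 resolution above determine the sequence length the transformer processes. A minimal sketch of that arithmetic (standard ViT setup with a prepended [CLS] token; the helper name is illustrative):

```python
def num_patch_tokens(image_size: int, patch_size: int, cls_token: bool = True) -> int:
    """Sequence length a ViT sees for a square image split into non-overlapping patches."""
    patches = (image_size // patch_size) ** 2
    return patches + (1 if cls_token else 0)

# (224 / 16)^2 = 14^2 = 196 patches, plus one [CLS] token.
print(num_patch_tokens(224, 16))  # 197
```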
Brief-details: LCM_Dreamshaper_v7 is a fast text-to-image model distilled from Dreamshaper v7, capable of high-quality image generation in 4-8 inference steps via latent consistency distillation.
Brief-details: Efficient SPLADE query model for passage retrieval, achieving 38.0 MRR@10 on MS MARCO with fast 0.7ms inference latency. Part of a dual query-document architecture.
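MRR@10, the metric cited above, averages the reciprocal rank of the first relevant passage within each query's top-10 results (scoring 0 when no relevant passage appears). A minimal sketch of the computation, using made-up rankings:

```python
def mrr_at_10(first_relevant_ranks):
    """Mean reciprocal rank cut off at depth 10.

    first_relevant_ranks: for each query, the 1-based rank of the first
    relevant passage, or None if it is not in the top 10.
    """
    total = 0.0
    for rank in first_relevant_ranks:
        if rank is not None and rank <= 10:
            total += 1.0 / rank
    return total / len(first_relevant_ranks)

# Three toy queries: hits at rank 1 and rank 4, one miss.
print(round(mrr_at_10([1, 4, None]), 4))  # (1 + 0.25 + 0) / 3 = 0.4167
```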
Brief-details: A 7B-parameter chat assistant based on Llama 2, fine-tuned on ShareGPT conversations with a 16K-token context window.
Brief-details: Japanese Sentence-BERT model (111M params) optimized for sentence embeddings and similarity tasks, supporting efficient encoding of Japanese text.
Brief-details: Russian BERT-based sentiment analysis model for 3-class text classification (positive/negative/neutral), trained on multiple Russian datasets.
Brief-details: Vietnamese-focused automatic speech recognition model fine-tuned on 844 hours of diverse Vietnamese accents, based on multilingual Whisper architecture
Brief-details: DFN2B-CLIP is a powerful contrastive vision-language model trained on 2B image-text pairs filtered from a 12.8B-pair pool, achieving 81.4% accuracy on ImageNet.
Brief-details: A Japanese to English translation model by Helsinki-NLP, achieving a 41.7 BLEU score on the Tatoeba dataset, built on the transformer-align architecture with SentencePiece preprocessing.
Brief-details: A 1.4B parameter language model trained on deduplicated Pile dataset, optimized for research and interpretability with 143 checkpoints available.
Brief-details: A CLIPA-v2 vision-language model trained on the DataComp-1B dataset, achieving 81.1% zero-shot ImageNet accuracy, specialized for image-text understanding and classification tasks.
Brief-details: Efficient document encoder for passage retrieval using SPLADE architecture, optimized for fast inference with competitive MRR@10 performance on MS MARCO dataset.
Brief-details: A merged text-to-image model combining realisticStockPhoto v3 and ICantBelieveItSNotPhotography for enhanced photorealistic portraits with improved facial variety and details.
Brief-details: KLUE BERT base is a 111M-parameter Korean language model trained on 62GB of diverse Korean text, optimized for tasks like NER, NLI, and text classification.
Brief-details: Czech speech recognition model fine-tuned on XLS-R 300M, achieving 7.3% WER on Common Voice test set. Optimized for 16kHz audio processing.
Brief-details: A versatile ControlNet collection for the FLUX.1-dev model offering Canny, HED, and Depth (Midas) variants trained at 1024x1024 resolution for enhanced image control.
Brief-details: Inception-v3 model adversarially trained on ImageNet-1k, featuring 23.9M parameters and 299x299 input size. Optimized for robust image classification.
Brief-details: A powerful CLIP model using the ConvNeXt-Base architecture trained on the LAION-2B dataset, achieving 70.8% ImageNet zero-shot accuracy with an efficient training approach.
Brief-details: Quantized version of Mixtral-8x7B-Instruct featuring multiple GPTQ variants (3-bit to 8-bit), optimized for efficient GPU inference with reduced VRAM usage.
Brief-details: WavLM Base Plus model specialized for speaker diarization, trained on 94k hours of speech data. Features utterance mixing and gated relative position bias.