Brief Details: All-in-one ControlNet model for SDXL supporting 12 control types and 5 advanced editing features. Trained on 10M+ images with bucket training and re-captioning.
Brief-details: DeepSeek-Coder-V2-Lite-Instruct is a 15.7B parameter MoE coding model with 128k context length, optimized for code generation and instruction following.
Brief-details: SmolLM2-1.7B-Instruct is a compact 1.7B parameter language model optimized for instruction following, trained on 11T tokens with strong performance in reasoning and mathematics.
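A minimal usage sketch with transformers, assuming the HuggingFaceTB/SmolLM2-1.7B-Instruct Hub id and standard chat-template generation; the prompt and generation settings are illustrative, not taken from the entry:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-1.7B-Instruct"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Instruct models expect chat-formatted input; the tokenizer applies the template.
messages = [{"role": "user", "content": "Explain why 0.1 + 0.2 != 0.3 in floating point."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=200, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```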
BRIEF-DETAILS: LayoutLMv3-large is Microsoft's multimodal Transformer for Document AI, combining text and image processing for advanced document understanding tasks.
BRIEF DETAILS: Pre-trained language model for English Tweets, based on the RoBERTa architecture. Trained on 850M tweets (16B word tokens). MIT licensed, with strong results on Tweet NLP tasks such as POS tagging, NER, and text classification.
BRIEF-DETAILS: PickScore_v1: A 986M parameter CLIP-based model for scoring text-to-image generation quality. Trained on Pick-a-Pic dataset for human preference prediction.
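A hedged scoring sketch for a CLIP-style preference model like this one; the repo ids (yuvalkirstain/PickScore_v1 for the model, the LAION CLIP-H checkpoint for the processor) and file names are assumptions:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModel

# The processor comes from the underlying CLIP-H checkpoint; the model holds the preference weights.
processor = AutoProcessor.from_pretrained("laion/CLIP-ViT-H-14-laion2B-s32B-b79K")
model = AutoModel.from_pretrained("yuvalkirstain/PickScore_v1").eval()

prompt = "an astronaut riding a horse, photorealistic"
images = [Image.open("gen_a.png"), Image.open("gen_b.png")]  # candidate generations

image_inputs = processor(images=images, return_tensors="pt")
text_inputs = processor(text=prompt, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    image_embs = model.get_image_features(**image_inputs)
    image_embs = image_embs / image_embs.norm(dim=-1, keepdim=True)
    text_embs = model.get_text_features(**text_inputs)
    text_embs = text_embs / text_embs.norm(dim=-1, keepdim=True)
    # Higher score = higher predicted human preference for that image given the prompt.
    scores = (model.logit_scale.exp() * text_embs @ image_embs.T).squeeze(0)

print(scores.tolist())
```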
Brief Details: 1.08B parameter language model by EleutherAI, trained on The Pile dataset. Part of Pythia research suite focused on model interpretability.
BRIEF-DETAILS: Text-to-video model optimized for 16:9 compositions; generates high-quality, watermark-free videos at 576x320 resolution with smooth motion. Uses about 7.9GB VRAM to render 30 frames.
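A hedged diffusers sketch for this kind of text-to-video checkpoint; the cerspense/zeroscope_v2_576w repo id, step count, and prompt are assumptions:

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained("cerspense/zeroscope_v2_576w", torch_dtype=torch.float16)
pipe.enable_model_cpu_offload()  # keeps peak VRAM low on consumer GPUs

prompt = "A red panda surfing a wave at sunset"
# 576x320 matches the model's native 16:9 resolution; 30 frames per the entry above.
video_frames = pipe(prompt, num_frames=30, height=320, width=576, num_inference_steps=40).frames[0]
# Note: recent diffusers versions return a batch of clips, hence the [0] index.
export_to_video(video_frames, "panda.mp4")
```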
Brief-details: BLIP is a large-scale vision-language model specializing in VQA tasks. Uses bootstrapped caption generation and filtering (CapFilt) and achieves SOTA on multiple vision-language benchmarks.
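A minimal VQA sketch with transformers; the Salesforce/blip-vqa-capfilt-large repo id and image path are assumptions, so swap in the exact checkpoint as needed:

```python
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

model_id = "Salesforce/blip-vqa-capfilt-large"  # assumed checkpoint
processor = BlipProcessor.from_pretrained(model_id)
model = BlipForQuestionAnswering.from_pretrained(model_id)

image = Image.open("photo.jpg").convert("RGB")
inputs = processor(image, "How many people are in the picture?", return_tensors="pt")

# The answer is generated autoregressively by the text decoder.
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))
```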
Brief Details: TinyCLIP-ViT: Efficient CLIP distillation model with 23.4M params, trained on the YFCC15M dataset, achieving 41.1% zero-shot ImageNet accuracy at 2.0 GMACs.
Brief Details: RADAR-Vicuna-7B is an AI-text detector built on a RoBERTa backbone and trained adversarially against a paraphraser to identify AI-generated content, particularly text produced by Vicuna-7B.
Brief Details: Portuguese BERT model specialized for legal text analysis with 334M parameters, trained on legal documents and optimized for sentence similarity tasks.
Brief-details: Multilingual sentiment analysis model supporting 10 languages, fine-tuned from XLM-RoBERTa for text classification, with strong performance on positive/negative detection.
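A minimal sketch of running an XLM-RoBERTa sentiment classifier through the transformers pipeline. The checkpoint below is a stand-in (a real multilingual XLM-R sentiment model), not necessarily the exact model described above:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="cardiffnlp/twitter-xlm-roberta-base-sentiment")

# Inputs in different languages are handled by the shared multilingual vocabulary.
print(classifier(["Das Produkt ist fantastisch!", "El servicio fue terrible."]))
# -> e.g. [{'label': 'positive', 'score': ...}, {'label': 'negative', 'score': ...}]
```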
BRIEF-DETAILS: A community-driven uncased BERT model for Turkish language processing, trained on 35GB of text with 44B tokens using TPU v3-8, released under MIT license.
Brief-details: Optimized 7B parameter chat model using GGUF format for efficient inference. Features multiple quantization options and enhanced compatibility with llama.cpp.
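A hedged sketch of running a GGUF-quantized 7B chat model with llama-cpp-python (the Python bindings for llama.cpp). The local file name and settings are assumptions; choose the quantization (e.g. Q4_K_M vs Q5_K_M) to trade memory for quality:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./model.Q4_K_M.gguf",  # assumed path to the downloaded GGUF file
    n_ctx=4096,                        # context window
    n_gpu_layers=-1,                   # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GGUF quantization does."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```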
Brief Details: Wav2vec2-large-960h is Facebook's self-supervised speech recognition model, fine-tuned on 960 hours of LibriSpeech audio, achieving state-of-the-art WER of 1.8/3.3 on the LibriSpeech clean/other test sets.
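A minimal transcription sketch with transformers; the facebook/wav2vec2-large-960h repo id matches the entry, while the audio file is an assumption:

```python
import torch
import soundfile as sf
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-large-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-large-960h")

# The audio must be 16 kHz mono, matching the model's training data.
speech, sample_rate = sf.read("sample.wav")
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits
# CTC decoding: take the argmax per frame, then collapse repeats and blanks.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids))
```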
Brief-details: Swedish BERT model fine-tuned for Named Entity Recognition (NER), trained on SUC 3.0 dataset, capable of identifying entities like organizations, locations, and time expressions.
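A short NER sketch via the transformers pipeline; the KB/bert-base-swedish-cased-ner repo id is an assumption (the National Library of Sweden checkpoint), not stated in the entry:

```python
from transformers import pipeline

# aggregation_strategy="simple" merges sub-word pieces into whole entity spans.
ner = pipeline("ner", model="KB/bert-base-swedish-cased-ner", aggregation_strategy="simple")

for ent in ner("Greta Garbo föddes i Stockholm i september 1905."):
    print(ent["entity_group"], ent["word"], round(float(ent["score"]), 3))
```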
Brief Details: A large-scale Indonesian SBERT model that maps sentences to 1024-dimensional vectors, optimized for semantic search and similarity tasks.
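A hedged sentence-transformers sketch showing the 1024-dimensional embeddings in use; the repo id below is an assumption, so substitute the actual Indonesian SBERT checkpoint:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("naufalihsan/indonesian-sbert-large")  # assumed Hub id

sentences = [
    "Saya suka membaca buku.",
    "Membaca adalah hobi saya.",
    "Cuaca hari ini sangat panas.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)                              # (3, 1024) per the entry above
print(util.cos_sim(embeddings[0], embeddings[1:]))   # cosine similarity for semantic search
```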
Brief Details: A small-sized depth estimation model (24.8M params) based on DPT architecture with DINOv2 backbone, trained on 62M images for state-of-the-art depth perception.
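A minimal depth-estimation sketch via the transformers pipeline; the LiheYoung/depth-anything-small-hf checkpoint id and image path are assumptions:

```python
from PIL import Image
from transformers import pipeline

depth = pipeline("depth-estimation", model="LiheYoung/depth-anything-small-hf")

result = depth(Image.open("room.jpg"))
result["depth"].save("room_depth.png")    # PIL image of the predicted depth map
print(result["predicted_depth"].shape)    # raw per-pixel depth tensor
```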
Brief Details: FLAN-T5 Large model fine-tuned for grammar correction, with 783M parameters. Corrects grammar in a single pass without rewriting text that is already correct.
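A hedged sketch assuming a FLAN-T5 grammar-correction checkpoint exposed as a text-to-text model; the repo id below is a stand-in for the model described above:

```python
from transformers import pipeline

corrector = pipeline("text2text-generation", model="pszemraj/flan-t5-large-grammar-synthesis")

broken = "she go to school every days and dont likes math"
print(corrector(broken, max_length=64)[0]["generated_text"])
# Expected: a corrected sentence; text that is already grammatical passes through unchanged.
```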
BRIEF DETAILS: GLiNER multi-language PII detection model supporting 6 languages. Specializes in identifying 40+ types of personal information using transformer architecture.
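A hedged sketch using the gliner library (pip install gliner); the urchade/gliner_multi_pii-v1 repo id, labels, and threshold are assumptions. GLiNER labels are free-form strings, so PII types can be added or renamed without retraining:

```python
from gliner import GLiNER

model = GLiNER.from_pretrained("urchade/gliner_multi_pii-v1")  # assumed Hub id

text = "Jean Dupont lives at 12 rue de la Paix, Paris; his IBAN is FR76 3000 6000 0112 3456 7890 189."
labels = ["person", "address", "iban", "phone number", "email"]

# Each entity comes back with its span text, label, and confidence score.
for ent in model.predict_entities(text, labels, threshold=0.5):
    print(ent["label"], "->", ent["text"])
```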