Brief-details: Google's T5 v1.1 XXL encoder model optimized for text-to-image tasks, converted to bfloat16 precision and packaged as a single safetensors file.
Brief-details: OCR text-recognition model by vikp, part of the Surya AI suite, with companion text-detection and table-recognition models.
Brief-details: NVILA-15B is an efficient visual language model that processes multi-image and video inputs with optimized performance and reduced training/inference costs.
Brief-details: Advanced RST (Rhetorical Structure Theory) parser supporting English and Russian, with high accuracy for discourse segmentation and relation parsing.
Brief-details: Multilingual news classifier using XLM-RoBERTa for IPTC topic categorization. Supports 17 categories across multiple languages with 0.734 accuracy.
Brief-details: Moirai-1.0-R-Base is a 91M-parameter Transformer model for universal time series forecasting, pre-trained on the LOTSA dataset with masked-encoding capabilities.
Brief-details: RoBERTa-based model for classifying economic agents in central bank communications with 93% accuracy, specializing in 5 agent categories.
Brief-details: ChemBERTa-77M-MTR is a 77M-parameter chemical language model by DeepChem, pretrained with multi-task regression (MTR) for molecular property prediction.
Brief-details: NVIDIA's 14B-parameter diffusion model for video-to-world generation, part of the Cosmos-1.0 series with specialized video processing capabilities.
Brief-details: EfficientNetV2-M model trained on ImageNet-21k, optimized for efficient image classification with 80.8M params and 15.9 GMACs at 384x384 resolution.
Brief-details: YOLOv8x model fine-tuned for hand gesture recognition, optimized with SGD over 10 epochs. Built for accurate gesture detection at 640px resolution.
Brief-details: AudioLDM 2 is a state-of-the-art text-to-audio diffusion model capable of generating high-quality sound effects, speech, and music from text descriptions, featuring a 1.1B parameter architecture.
Brief-details: A Longformer model fine-tuned on SST-2 (Stanford Sentiment Treebank) for binary sentiment classification, optimized for long document processing.
Brief-details: PaliGemma 3B Mix 448 is Google's vision-language model with 448x448 input resolution, requiring an explicit license agreement for access on Hugging Face.
Brief-details: A 13B-parameter code-focused LLM built on Llama 2, achieving 64.0% pass@1 on HumanEval. Specialized for Python coding tasks with Evol-Instruct enhancement.
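The pass@1 score quoted above is conventionally computed with the unbiased pass@k estimator from the HumanEval benchmark. A minimal sketch in Python (the sample counts in the example are hypothetical, chosen only to illustrate the arithmetic):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn from n generated samples passes the unit tests,
    given that c of the n samples are correct."""
    if n - c < k:
        # Fewer failing samples than draws: at least one draw must pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k=1 the estimator reduces to the raw pass rate c/n.
# Hypothetical counts: 128 of 200 samples correct -> pass@1 ~ 0.64.
print(pass_at_k(200, 128, 1))
```

For k=1 the combinatorial term simplifies to (n-c)/n, so pass@1 is just the fraction of correct samples; the full estimator matters for k > 1.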
Brief-details: A specialized Stable Diffusion model fine-tuned on 1000 logo images (128x128) with augmentation, optimized for generating creative and diverse logo designs.
Brief-details: Phantasor-137M-GGUF is a compact 137M-parameter model offering multiple quantization options, optimized for efficient deployment with GGUF format support.
Brief-details: Weighted/imatrix quantized version of Citrus1.0-Qwen-72B offering multiple GGUF variants (22.8GB-64.4GB) with different quality-size tradeoffs. Optimized for efficient deployment.
Brief-details: GoldenLlama 3.1 8B GGUF quantized model offering multiple compression variants for efficient deployment, ranging from 2.1GB to 6.7GB with imatrix optimization.
Brief-details: GoldenLlama-3.1-8B-GGUF is a quantized version of the GoldenLlama 8B model, offering various compression formats from 3.3GB to 16.2GB.
Brief-details: Haphazardv1-GGUF is a quantized version of the Haphazardv1 model, offering multiple compression variants (Q2_K to Q8_0) for efficient deployment while maintaining performance.
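The file-size ranges quoted for the GGUF entries above follow roughly from parameter count times average bits per weight. A back-of-the-envelope sketch in Python (the bits-per-weight figures are approximate averages for common llama.cpp quant types, not values taken from the source):

```python
# Approximate average bits per weight for common llama.cpp quant types.
# These are rough illustrative figures, not exact format constants.
APPROX_BITS = {"Q2_K": 2.6, "Q4_K_M": 4.8, "Q6_K": 6.6, "Q8_0": 8.5}

def approx_size_gb(n_params: float, quant: str) -> float:
    """Estimated GGUF file size in GB: params * bits-per-weight / 8."""
    return n_params * APPROX_BITS[quant] / 8 / 1e9

# A hypothetical 8.03B-parameter model at Q4_K_M lands between the
# Q2_K floor and Q8_0 ceiling of the ranges quoted above:
for quant in ("Q2_K", "Q4_K_M", "Q8_0"):
    print(quant, round(approx_size_gb(8.03e9, quant), 1), "GB")
```

Real files deviate by a few percent because some tensors (embeddings, output head) are kept at higher precision and GGUF metadata adds overhead.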