Brief Details: DPN107 is a dual-path network with 87.1M parameters for ImageNet classification, combining ResNet-style residual connections (feature re-use) with DenseNet-style dense connections (new-feature exploration).
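The dual-path idea can be sketched in a few lines: one shared transform feeds a residual path (features are *added*, ResNet-style) and a dense path (features are *concatenated*, DenseNet-style). This is a toy NumPy illustration of that split, not DPN-107 itself; the shapes, growth rate, and linear transform are all illustrative assumptions.

```python
import numpy as np

def dual_path_block(x_res, x_dense, w):
    """Toy dual-path block: one shared transform feeds both paths.

    x_res:   residual-path features, shape (c_res,)
    x_dense: dense-path features, shape (c_dense,)
    w:       weights mapping concat(x_res, x_dense) -> (c_res + growth,)
    """
    h = w @ np.concatenate([x_res, x_dense])          # shared transform
    out_res = x_res + h[: x_res.shape[0]]             # ResNet-style: add (re-use)
    out_dense = np.concatenate([x_dense, h[x_res.shape[0]:]])  # DenseNet-style: concat (grow)
    return out_res, out_dense

rng = np.random.default_rng(0)
x_res, x_dense = rng.standard_normal(8), rng.standard_normal(4)
growth = 2
w = rng.standard_normal((8 + growth, 8 + 4))
r, d = dual_path_block(x_res, x_dense, w)
print(r.shape, d.shape)  # (8,) (6,) — residual width fixed, dense width grows
```

Stacking such blocks keeps a fixed-width residual stream while the dense stream keeps accumulating new channels, which is the feature-reuse tradeoff DPN exploits.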
Brief-details: Audio Spectrogram Transformer (AST) fine-tuned on the Speech Commands v2 dataset, achieving 98.12% accuracy. Based on the Vision Transformer architecture, applied to audio classification. 85.4M parameters.
Brief-details: A highly optimized GGUF-quantized version of WhiteRabbitNeo-2.5-Qwen-2.5-Coder-7B, offering multiple compression levels for different hardware requirements. Specialized for code generation and text tasks.
BRIEF DETAILS: A compact 1.1B parameter LLM trained on SlimPajama dataset with Llama 2 architecture, optimized for general text generation with strong performance on various benchmarks.
BRIEF-DETAILS: A fine-tuned version of Whisper-tiny for speech recognition, featuring 37.8M parameters under the Apache 2.0 license. Achieves a 55.05 WER on the quiztest dataset.
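Several entries here report WER (word error rate), which is the word-level Levenshtein distance between a reference transcript and the hypothesis, divided by the number of reference words. A minimal pure-Python sketch of the standard computation:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j] = edits to turn ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    return dist[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words ≈ 0.167
```

Reported WERs are usually scaled to percent, so a value like 23.18% means roughly one error for every four to five reference words.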
BRIEF DETAILS: DLA102 is a 33.3M parameter Deep Layer Aggregation model trained on ImageNet-1k, offering efficient image classification at a compute cost of 7.2 GMACs.
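GMACs (billions of multiply-accumulate operations per forward pass) are dominated by convolutions, and each layer's contribution is simple arithmetic: output spatial size × output channels × kernel area × input channels. A sketch with an illustrative layer (not an actual DLA-102 layer):

```python
def conv2d_macs(in_ch: int, out_ch: int, kernel: int, out_h: int, out_w: int) -> int:
    """Multiply-accumulate count for one dense 2-D convolution layer."""
    return out_h * out_w * out_ch * kernel * kernel * in_ch

# Hypothetical layer: 3x3 conv, 256 -> 256 channels, 28x28 output map.
macs = conv2d_macs(256, 256, 3, 28, 28)
print(f"{macs / 1e9:.3f} GMACs")  # 0.462 GMACs
```

Summing this over every layer at the model's input resolution gives figures like the 7.2 GMACs quoted above.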
Brief Details: Japanese CLIP model with ViT-B/16 architecture (197M params) for image-text understanding. Trained on CC12M dataset with Japanese captions.
Brief-details: A 756M parameter speech recognition model that runs 6x faster than Whisper large-v2 while staying within 1% WER of it. Optimized for English ASR.
Brief Details: ViViT - a Video Vision Transformer that extends ViT to video processing, released under the MIT license with strong PyTorch integration.
Brief Details: ResNeSt-based image classification model with split-attention networks, 48.4M params, trained on ImageNet-1k. Optimized for 256x256 images.
Brief Details: A Portuguese sentiment analysis model based on BERTabaporu, with 135M parameters, that classifies tweet sentiment as POS, NEG, or NEU.
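A three-way sentiment classifier like this ends in a softmax over the label set; mapping logits to a POS/NEG/NEU prediction is just argmax over the probabilities. A minimal sketch — the label ordering here is an assumption for illustration, since the real model defines its own id-to-label map:

```python
import math

LABELS = ["NEG", "NEU", "POS"]  # illustrative ordering, not the model's actual id2label

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    probs = softmax(logits)
    best = probs.index(max(probs))
    return LABELS[best], probs[best]

label, confidence = predict([-1.2, 0.3, 2.1])
print(label)  # POS
```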
Brief Details: A fine-tuned wav2vec2 model for Hebrew ASR with 315M parameters, achieving 23.18% WER. Trained on combined datasets totaling 97 hours of audio.
Brief-details: Juggernaut Reborn is a powerful text-to-image model optimized for ultra-realistic image generation, particularly excelling in portrait creation with cyberpunk and modern aesthetics.
Brief-details: CSPDarkNet53 is a 27.7M parameter CNN backbone trained on ImageNet-1k using RandAugment, optimized for enhanced learning capability and feature extraction.
Brief-details: Florence-2 variant that runs without flash attention - an advanced vision foundation model for captioning, detection & segmentation tasks at 0.77B params.
Brief-details: A specialized transformer model for detecting code vulnerabilities in C/C++, based on RoBERTa architecture with 125M parameters and MLP classification head.
BRIEF-DETAILS: A 110M parameter T5-based model that maps sentences to 768-dimensional vectors, optimized for semantic search with FP16 precision
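Semantic search with a sentence-embedding model reduces to cosine similarity between the query vector and each corpus vector. A NumPy sketch using toy 4-d stand-ins for the model's 768-d embeddings:

```python
import numpy as np

def semantic_search(query_vec, corpus_vecs, top_k=2):
    """Rank corpus vectors by cosine similarity to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    sims = c @ q                       # cosine similarity per document
    order = np.argsort(-sims)[:top_k]  # best matches first
    return list(zip(order.tolist(), sims[order].tolist()))

# Toy 4-d embeddings; a real deployment would use the model's 768-d outputs.
corpus = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.9, 0.1, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])
query = np.array([1.0, 0.05, 0.0, 0.0])
print(semantic_search(query, corpus))  # docs 0 and 1 rank highest
```

Normalizing once up front also means the ranking is unaffected by FP16 storage of the vectors, which only perturbs similarities slightly.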
BRIEF DETAILS: IP-Adapter for the FLUX.1-dev model enabling image-to-image generation, trained at 512x512 and 1024x1024 resolutions. Supports ComfyUI integration; non-commercial license.
Brief-details: Neural machine translation model (233M params) for English to Portuguese translation, achieving 50.4 BLEU score on flores101-devtest benchmark.
BRIEF-DETAILS: A comprehensive GGUF quantization of Google's Gemma 2 27B instruction-tuned model, offering various compression options from 108GB to 9.4GB with different quality-performance tradeoffs.
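The quoted file sizes translate directly into effective bits per weight, which is the usual way to compare GGUF compression levels. A quick sketch using the two endpoints from the entry above (the 27e9 parameter count is an approximation):

```python
def bits_per_weight(file_bytes: float, n_params: float) -> float:
    """Effective bits stored per parameter for a quantized model file."""
    return file_bytes * 8 / n_params

params = 27e9  # approximate parameter count of Gemma 2 27B
for label, gb in [("largest (108 GB)", 108.0), ("smallest (9.4 GB)", 9.4)]:
    print(label, f"~{bits_per_weight(gb * 1e9, params):.1f} bits/weight")
```

The 108 GB file works out to ~32 bits/weight (i.e., unquantized FP32), while the 9.4 GB file is under 3 bits/weight, which is where the quality-performance tradeoff becomes most visible.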
Brief Details: PiT-B: A 73.8M parameter Pooling-based Vision Transformer trained on ImageNet-1K for 224x224 images at a compute cost of 12.4 GMACs.