Brief Details: Qwen2.5-Coder-1.5B-Instruct is a specialized code-generation model with 1.54B parameters, offering a 32K context length and advanced coding capabilities built on the Qwen architecture.
Brief Details: LTX-Video is a real-time DiT-based video generation model capable of producing 24 FPS videos at 768x512 resolution, supporting both text-to-video and image-to-video generation.
Brief Details: LLaVA-OneVision is an 8.03B parameter multimodal model based on Qwen2, capable of processing images and videos with strong performance across 20+ benchmarks.
Brief Details: A RoBERTa-based model for detecting implicit and adversarial hate speech, trained on the ToxiGen dataset developed by Microsoft researchers.
Brief Details: Shuttle 3 Diffusion (FP8) is a high-performance text-to-image model capable of generating detailed images in just 4 steps, optimized for FP8 precision.
Brief Details: ResNet-101 image classification model from torchvision with 44.7M parameters, trained on ImageNet-1k. Features ReLU activations and 7x7 convolutions.
Brief Details: ONNX-optimized DistilBERT model for emotion detection, converted from GoEmotions student model. Features efficient inference and MIT license.
Brief Details: Italian CLIP model trained on 1.4M samples, combining an Italian BERT with a vision transformer for image-text understanding. Achieves state-of-the-art performance on Italian vision-language tasks.
Brief Details: A unified ControlNet model for FLUX.1-dev offering multiple control modes, including canny, depth, pose, and blur control with high validity, and supporting multi-control inference.
Brief Details: A quantized 8B-parameter LLaMA-3.1 variant optimized for different hardware configurations, offering multiple compression formats for efficient deployment.
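The practical difference between the compression formats is weight-storage size. A back-of-the-envelope sketch for an 8B-parameter model at common bit-widths (the specific precisions shown are illustrative assumptions, not the model's actual format list):

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage for a model, ignoring overhead such as
    quantization scales, embeddings padding, and the KV cache."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# Rough footprints for an 8B-parameter model at common precisions:
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{weight_memory_gb(8, bits):.0f} GB")
```

This is why 4-bit variants of 8B models fit comfortably on consumer GPUs while BF16 checkpoints may not.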
Brief Details: Vision Transformer (ViT) model with 85.8M parameters, trained on ImageNet-21k. Specialized in image classification and feature extraction using 16x16 patches.
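The 16x16 patch size directly determines the transformer's input sequence length. A quick sketch of that arithmetic, assuming the common 224x224 input resolution (the resolution is an assumption; the line above states only the patch size):

```python
def vit_sequence_length(image_size: int, patch_size: int) -> int:
    """Number of patch tokens plus one [CLS] token for a square input."""
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + 1  # +1 for the [CLS] token

# 224x224 input with 16x16 patches: 14*14 = 196 patches, 197 tokens total.
print(vit_sequence_length(224, 16))
```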
Brief Details: A 109M parameter text embedding model optimized for compressibility, capable of maintaining high quality even when compressed to 128 bytes per vector.
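One simple way a 128-byte budget can hold a useful embedding is int8 quantization of a 128-dimension vector (one byte per dimension). This sketch is purely illustrative and is not the model's actual compression scheme, which the summary above does not specify:

```python
def quantize_int8(vector, dims=128):
    """Truncate to `dims` dimensions and scale each value into an int8 byte.
    Illustrative only: real embedding compression schemes vary."""
    head = vector[:dims]
    scale = max(abs(v) for v in head) or 1.0
    blob = bytes(int(round(v / scale * 127)) & 0xFF for v in head)
    return blob, scale  # scale is needed to approximately invert the mapping

vec = [0.01 * i for i in range(256)]  # toy full-precision vector
blob, scale = quantize_int8(vec)
print(len(blob))  # 128 bytes per vector
```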
Brief Details: Text classification model fine-tuned from MiniLM, achieving 71.61% accuracy for evidence type detection with balanced performance across categories.
Brief Details: A state-of-the-art Romanian NER model recognizing 15 entity types, trained on RONEC v2.0 with 80,283 annotated entities and 94% accuracy.
Brief Details: EfficientNet B7 variant trained with Noisy Student learning on ImageNet-1K and JFT-300M. 66.7M params, optimized for 600x600 images.
Brief Details: BERT model variant (12-layer, 768-hidden) from Google's BERT miniatures collection, optimized for environments with restricted computational resources.
Brief Details: Vision Transformer-based watermark detection model achieving 65.74% accuracy. Fine-tuned from ViT-base with 3 epochs of training and a linear learning-rate schedule.
Brief Details: Deprecated multilingual sentence embedding model (768-d vectors) based on XLM-RoBERTa. Supports 100 languages, but newer alternatives are recommended.
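Sentence embedding models like the one above are typically compared with cosine similarity between their output vectors. A minimal, dependency-free sketch (the 3-d vectors are toy stand-ins for real 768-d embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors, e.g. the 768-d
    sentence vectors produced by a multilingual encoder."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d stand-ins for embeddings of the same sentence in two languages:
en = [0.2, 0.8, 0.1]
de = [0.25, 0.75, 0.05]
print(cosine_similarity(en, de))  # close to 1.0 for semantically similar pairs
```

Cross-lingual models aim for parallel sentences in different languages to land near each other in this vector space.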
Brief Details: 3.4B-parameter LLaMA-based text generation model using BF16 precision, with 20k+ downloads.
Brief Details: DeiT small vision transformer for image classification. 22.1M params, 224x224 input, trained on ImageNet-1k. Efficient training through attention distillation.
Brief Details: YOLOv8-based real-time stock market pattern detection model. Identifies 6 key trading patterns with 61.36% mAP@0.5. Trained on 9,800 annotated images.
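The mAP@0.5 figure means a predicted pattern box only counts as a true positive when its intersection-over-union (IoU) with a ground-truth box reaches 0.5. A minimal sketch of that IoU check (box coordinates are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two half-overlapping 10x10 boxes: IoU = 50 / 150 ≈ 0.333,
# below the 0.5 threshold, so this prediction would not count at mAP@0.5.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```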