Brief-details: An 8.4B parameter multimodal AI model that processes interleaved image-text sequences, featuring enhanced OCR capabilities and native-resolution image handling.
Brief-details: Qwen-Audio-Chat is an 8.4B parameter multimodal AI model that processes audio (speech, music, sounds) and text inputs, enabling natural conversations and audio analysis through multi-turn dialogues.
Brief-details: A 7B parameter LLM that unifies text representation (embeddings) and text generation, built on the Mistral architecture with SOTA performance on both embedding and generation tasks.
Brief-details: FP8-quantized 90B parameter multimodal LLM supporting text+image input across 8 languages; quantization roughly halves the memory footprint while maintaining performance.
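The memory saving above follows directly from bit-width arithmetic: going from 16-bit to 8-bit weights halves the bytes needed to store them. A minimal back-of-envelope sketch (illustrative only; real deployments add overhead for activations, KV cache, and unquantized layers):

```python
# Illustrative weight-storage arithmetic for FP8 vs. BF16 quantization.
# Real memory use also includes activations, KV cache, and any layers
# kept in higher precision, so treat these as lower bounds.
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

bf16 = weight_memory_gb(90e9, 16)  # 180.0 GB
fp8 = weight_memory_gb(90e9, 8)    # 90.0 GB
print(f"BF16: {bf16:.0f} GB, FP8: {fp8:.0f} GB, saving {1 - fp8 / bf16:.0%}")
```

This is why FP8 checkpoints of 90B-class models advertise roughly a 50% smaller footprint.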
Brief-details: LLaVA-OneVision 0.5B, a multimodal model built on Qwen2 (894M total parameters) that processes both images and videos. Supports English/Chinese interaction.
Brief-details: A visual retrieval model based on Qwen2-VL-2B-Instruct that uses the ColBERT strategy for efficient document indexing and retrieval.
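The "ColBERT strategy" mentioned above is late interaction: every query token embedding is matched against its best document token embedding, and the per-token maxima are summed (MaxSim). A minimal pure-Python sketch with toy 2-D vectors (a real retriever would use model-produced, normalized embeddings):

```python
# ColBERT-style late-interaction (MaxSim) scoring sketch.
# Embeddings here are toy hard-coded 2-D lists, not model outputs.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_embs, doc_embs):
    """Sum over query tokens of the max dot product with any doc token."""
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)

query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.9, 0.1], [0.2, 0.8]]   # matches both query tokens well
doc_b = [[0.5, 0.5]]               # weaker match for each token
scores = {name: maxsim_score(query, d) for name, d in [("a", doc_a), ("b", doc_b)]}
print(scores)  # doc_a outscores doc_b
```

Because scoring decomposes per token, document token embeddings can be indexed once and reused across queries, which is what makes this style of retrieval efficient.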
Brief-details: ParsBERT - A Persian BERT model trained on 2M+ documents, achieving SOTA performance in sentiment analysis, text classification, and NER tasks.
Brief-details: A Wav2Vec2 configuration for Habana's Gaudi HPU processors that streamlines speech recognition model deployment, with mixed-precision training support and custom operator implementations.
Brief-details: Optimized 400M parameter language model with a fused Matryoshka layer for efficient embedding generation while maintaining strong performance on MTEB benchmarks.
Brief-details: ResNet-18 computer vision model featuring 11.7M parameters, ReLU activations, and 7x7 convolutions. Trained on ImageNet-1k with 69.76% top-1 accuracy.
Brief-details: Lightweight text-to-speech model with 647M parameters, capable of generating natural speech with controllable features like gender, pitch, and speed.
Brief-details: A versatile text-to-image model merging Freedom and MangledMerge3, optimized for both anime and general art on the SD2.1 architecture.
Brief-details: Multilingual sentence embedding model supporting 109 languages, based on LaBSE architecture with 472M parameters. Ideal for cross-lingual sentence similarity tasks.
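Cross-lingual similarity with embeddings like LaBSE's comes down to cosine similarity between sentence vectors. A minimal sketch with toy hypothetical 3-D vectors standing in for model outputs (a real pipeline would embed the sentences with the model first):

```python
import math

# Cosine-similarity sketch for comparing sentence embeddings.
# The vectors below are hypothetical stand-ins for encoder outputs.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

en = [0.2, 0.9, 0.1]     # toy embedding of an English sentence
fr = [0.25, 0.85, 0.05]  # toy embedding of its French translation
print(round(cosine(en, fr), 3))  # close to 1.0 for translation pairs
```

Well-aligned multilingual encoders place a sentence and its translation near each other, so translation pairs score close to 1.0 while unrelated sentences score much lower.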
Brief-details: NSFW-XL is a specialized LoRA model built on Stable Diffusion XL Base 1.0, focusing on artistic photo-realistic content generation with film photography aesthetics.
Brief-details: A 500M parameter DNA language model trained on the genomes of 850 diverse species, specializing in molecular phenotype prediction and DNA sequence analysis.
Brief-details: Zephyr-7B-Alpha is a 7B parameter LLM fine-tuned from Mistral-7B, with supervised fine-tuning on UltraChat and DPO on UltraFeedback for enhanced chat capabilities.
Brief-details: CRNN-based Persian OCR model optimized for printed text, supporting up to 96 characters with enhanced capabilities for handling mixed LTR/RTL text and special characters.
Brief-details: Clinical BERT model for classifying medical assertions as PRESENT, ABSENT, or POSSIBLE in clinical notes. Fine-tuned on i2b2 challenge data.
Brief-details: Zero123-XL Diffusers is an MIT-licensed generative AI model for converting single images to 3D objects, focused on research applications with emphasis on safety and ethical use.
Brief-details: 13B parameter uncensored LLaMA-based model with multiple GGUF quantizations, optimized for unrestricted responses and creative freedom. Supports CPU/GPU inference.
Brief-details: A multilingual prompt compression model (559M params) that efficiently distills text while preserving meaning, based on the XLM-RoBERTa architecture.
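Extractive prompt compression of this kind can be pictured as token pruning: a compressor scores each token with a keep-probability and drops tokens below a threshold. A minimal sketch with hard-coded scores standing in for model outputs:

```python
# Token-pruning sketch of extractive prompt compression (illustrative).
# The keep-probabilities are hard-coded stand-ins for the per-token
# scores a compressor model would produce.
def compress(tokens, keep_probs, threshold=0.5):
    """Drop tokens whose keep-probability falls below the threshold."""
    return [t for t, p in zip(tokens, keep_probs) if p >= threshold]

tokens = ["Please", "kindly", "summarize", "the", "attached", "report"]
probs = [0.3, 0.2, 0.95, 0.4, 0.7, 0.9]
print(" ".join(compress(tokens, probs)))  # "summarize attached report"
```

Raising the threshold compresses more aggressively; the trade-off is between prompt length (and thus cost) and how much meaning survives.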