Brief-details: Quantized version of the QwQ-32B model offering multiple compression levels (9GB-35GB) with imatrix optimization, suitable for a range of hardware configurations and performance needs.
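As a rough illustration of how such GGUF quants are typically run, here is a minimal llama-cpp-python sketch; the quant filename, context size, and prompt are assumptions for illustration, not details from the model card:

```python
# Minimal sketch: loading one of the quantized GGUF files with llama-cpp-python.
# Filename and parameters below are illustrative assumptions.
from llama_cpp import Llama

llm = Llama(
    model_path="QwQ-32B-Q4_K_M.gguf",  # hypothetical quant; pick a size that fits your RAM/VRAM
    n_ctx=8192,          # context window; raise if memory allows
    n_gpu_layers=-1,     # offload all layers to GPU, or 0 for CPU-only
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "How many r's are in 'strawberry'?"}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```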
Brief-details: Spark-TTS-0.5B is an efficient LLM-based text-to-speech model supporting bilingual synthesis and zero-shot voice cloning, built on Qwen2.5 architecture.
Brief-details: Wan2.1-I2V-14B-720P: A 14B parameter image-to-video generation model capable of producing high-quality 720P videos with state-of-the-art performance and multi-GPU support.
Brief-details: Combined and quantized builds of Wan-AI's WanVideo models, optimized for ComfyUI integration through a custom wrapper implementation.
Brief-details: Wan 2.1 repackaged for ComfyUI - the Wan 2.1 model weights adapted for ComfyUI workflows, with a focus on compatibility and integration.
Brief-details: Kokoro-82M is an open-weight, 82M-parameter TTS model supporting 8 languages with 54 voices. Apache-licensed, inexpensive to train (about $1000), and based on StyleTTS 2.
Brief-details: Magma-8B is Microsoft's multimodal AI agent foundation model, combining vision, language, and action capabilities for UI navigation, robotics, and gaming tasks.
Brief-details: Phi-4-mini-instruct: A lightweight 3.8B parameter model from Microsoft with 128K context, strong reasoning capabilities, and multilingual support, optimized for efficiency.
Brief-details: Aya-vision-32b is a 32B-parameter vision-language model from CohereForAI, designed for advanced visual understanding and processing tasks.
Brief-details: FLUX.1-dev is a text-to-image generation model by black-forest-labs, released alongside companion models for Fill, Redux, and Depth processing functionalities.
Brief-details: CogView4-6B is a high-performance text-to-image generation model with strong capabilities in composition, positioning, and attribute accuracy. Supports resolutions up to 2048x2048.
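A hedged sketch of typical usage, assuming a diffusers release that includes CogView4Pipeline; the prompt and resolution below are illustrative:

```python
# Minimal sketch: text-to-image with CogView4-6B via diffusers.
# Assumes a diffusers version that ships CogView4Pipeline.
import torch
from diffusers import CogView4Pipeline

pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = pipe(
    prompt="A red bicycle leaning against a brick wall in morning light",
    width=1024,   # supports up to 2048x2048 per the model card
    height=1024,
).images[0]
image.save("cogview4_sample.png")
```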
Brief-details: HunyuanVideo-I2V is a powerful image-to-video generation model from Tencent, capable of creating 720p videos from static images with high fidelity and temporal consistency.
Brief-details: R1-1776 is a post-trained DeepSeek-R1 model by Perplexity AI, specifically modified to remove CCP censorship while maintaining reasoning capabilities.
Brief-details: Aya-vision-8b is an 8B parameter vision-language model from CohereForAI, designed for advanced visual understanding tasks; it is the smaller counterpart to the 32B Aya Vision model.
Brief-details: A 7B parameter OCR model fine-tuned from Qwen2-VL-7B-Instruct, specialized in document image analysis with support for efficient large-scale processing.
Brief-details: Advanced 14B parameter text-to-video model capable of generating high-quality 480P/720P videos, including rendered Chinese and English text, with state-of-the-art performance and extensive motion dynamics.
Brief-details: Phi-4-multimodal-instruct: 5.6B parameter multimodal model supporting text, vision, and speech across multiple languages, with 128K context length and flash-attention support.
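A minimal loading sketch with flash attention enabled; the trust_remote_code flag and the chat format follow the Phi family's conventions and are assumptions to check against the model card:

```python
# Minimal sketch: loading Phi-4-multimodal-instruct with flash attention.
# The chat format and remote-code requirement are assumed from Phi conventions.
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-4-multimodal-instruct"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
    attn_implementation="flash_attention_2",  # needs the flash-attn package and a supported GPU
)

# Text-only example; image and audio inputs go through the same processor.
prompt = "<|user|>Summarize the benefits of flash attention in one sentence.<|end|><|assistant|>"
inputs = processor(text=prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(out[:, inputs["input_ids"].shape[-1]:], skip_special_tokens=True)[0])
```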
Brief-details: QwQ-32B is a 32.5B parameter reasoning model from Qwen with a 131K context length, designed to improve performance on complex tasks through deliberate, step-by-step reasoning.
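A minimal transformers sketch for running the model; generation settings are illustrative, not the card's recommended values:

```python
# Minimal sketch: running QwQ-32B with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=1024)  # reasoning traces can run long
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```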
Brief-details: DeepSeek-R1 is a 671B parameter MoE model focused on reasoning capabilities, trained via reinforcement learning without initial supervised fine-tuning, achieving performance comparable to OpenAI-o1.
Brief-details: A powerful 72B parameter vision-language model capable of processing long videos, multilingual text, and high-resolution images with state-of-the-art performance on visual understanding benchmarks.
Brief-details: LLaMA-65B-HF is Meta AI's 65B parameter LLaMA model converted to the Hugging Face format, trained on diverse web data and released for research use, with strong reasoning capabilities.