Brief-details: Wan2.1-I2V-14B-480P-Diffusers is a powerful 14B parameter image-to-video model capable of generating 480P videos with state-of-the-art quality and efficiency.
Brief-details: Yehia-7B: Advanced Arabic-English LLM optimized for helpful conversations. Features 7B parameters, GRPO training, and tops AraGen-Leaderboard in its class.
Brief-details: INT8 quantized version of DeepSeek-R1 offering 33% better performance with no accuracy loss. Optimized for hardware efficiency while maintaining model quality.
Brief-details: QVQ-72B-Preview is a visual reasoning model that scores 70.3% on the MMMU benchmark and specializes in mathematical and visual understanding tasks.
Brief-details: TRELLIS-image-large is a sophisticated 3D generative model that conditions on images to create structured 3D content, developed by Microsoft Research for scalable generation.
Brief-details: Powerful multilingual embedding model supporting 30 languages, with task-specific LoRA adapters, Matryoshka embeddings, and support for inputs of up to 8192 tokens.
Brief-details: Mixtral-8x7B-Instruct is a powerful instruction-tuned language model from Mistral AI, featuring a mixture-of-experts architecture with 47B parameters.
Brief-details: 8B parameter LLM specialized in survival skills and outdoor guidance. Built on the Llama 3.1 architecture. Focuses on shelter building and wilderness tips.
Brief-details: Advanced 14B parameter text-to-video model capable of generating high-quality videos at 480P/720P with strong motion dynamics and multilingual text generation support.
Brief-details: Multilingual 9B parameter LLM supporting 25 languages covering 90% of global speakers, with strong performance in reasoning and translation tasks.
Brief-details: Efficient INT8 quantized version of DeepSeek-R1 offering 50% performance boost with no accuracy loss. Optimized for hardware acceleration while maintaining original model capabilities.
Brief-details: Pi0 is a vision-language-action flow model for robotic control, offering seamless integration with LeRobot and supporting custom dataset fine-tuning.
Brief-details: Llasa-8B is an advanced text-to-speech model extending LLaMA with speech capabilities, trained on 250K hours of Chinese-English data using XCodec2 codebook tokens.
Brief-details: 14B parameter image-to-video model capable of generating high-quality 720P videos. Features state-of-the-art performance and innovative 3D VAE architecture.
Brief-details: A powerful 1.3B parameter text-to-video diffusion model that runs on consumer GPUs, generates high-quality 480P videos, and supports multiple languages and tasks.
Brief-details: Arabic OCR model fine-tuned from Qwen2-VL-2B-Instruct achieving 93.2% word accuracy, optimized for full-page Arabic text recognition with state-of-the-art performance metrics.
Brief-details: DNA-R1: A 14B parameter Korean-focused reasoning model built on Phi-4, with reasoning capabilities enhanced through multi-stage training that includes GRPO reinforcement learning.
Brief-details: A 14B parameter distilled model from DeepSeek-R1, based on Qwen2.5-14B, optimized for mathematical reasoning and code generation with strong performance metrics.
Brief-details: 70B parameter LLM distilled from DeepSeek-R1, based on Llama 3.3. Optimized for reasoning tasks with strong math and coding capabilities.
Brief-details: HunyuanVideo_repackaged is a ComfyUI-optimized implementation of Hunyuan Video technology, enabling streamlined video generation workflows through ComfyUI's interface.
Brief-details: Stability AI's open-source audio generation model, part of the Stable family, focused on AI-powered audio synthesis.