Brief-details: Al-Atlas-0.5B is the first LLM dedicated to Moroccan Darija, a 0.5B parameter model trained on 155M tokens of authentic Darija content, offering specialized Arabic dialect processing capabilities.
BRIEF-DETAILS: 8.9B parameter image generation model based on FLUX.1, optimized with mixed quantization for improved performance and speed.
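A minimal sketch of loading a FLUX.1-family checkpoint with Hugging Face diffusers; the repo id below is a placeholder, and a mixed-quantization variant may ship its own loading instructions on its model card.

```python
import torch
from diffusers import FluxPipeline

# Load a FLUX.1-family pipeline (repo id is a placeholder; quantized
# variants may require the custom steps described on the model card).
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # reduce GPU memory pressure

image = pipe(
    "a watercolor painting of a lighthouse at dawn",
    num_inference_steps=4,
    guidance_scale=0.0,
).images[0]
image.save("lighthouse.png")
```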
Brief-details: AMD's 3B parameter instruction-tuned LLM, trained on 8.9B tokens. Features 36 decoder layers, 32 attention heads, and 4K context length. Strong performance on reasoning tasks.
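As a sketch of how an instruction-tuned checkpoint like this is typically queried through transformers (the repo id below is a placeholder, not the actual model name):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/instruct-3b"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build the prompt through the model's own chat template.
messages = [{"role": "user", "content": "Explain chain-of-thought prompting in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```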
BRIEF DETAILS: QwQ-32B-8bit is an 8-bit quantized version of the QwQ-32B model, optimized for the MLX framework with a reduced memory footprint while maintaining performance.
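A minimal mlx-lm sketch for running an 8-bit MLX conversion on Apple Silicon; the repo id is an assumption and may differ from the actual upload.

```python
from mlx_lm import load, generate

# Load the 8-bit MLX weights (repo id is an assumption).
model, tokenizer = load("mlx-community/QwQ-32B-8bit")

prompt = "How many prime numbers are there below 30?"
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(response)
```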
BRIEF-DETAILS: A 3B parameter Japanese instruction-tuned language model based on sarashina2.2, converted to GGUF format with imatrix-calibrated quantization.
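A sketch of loading a GGUF quantization with llama-cpp-python; the file name is a placeholder for whichever quantization level you download.

```python
from llama_cpp import Llama

# Path/filename are placeholders for the downloaded GGUF file.
llm = Llama(
    model_path="sarashina2.2-3b-instruct.Q4_K_M.gguf",
    n_ctx=4096,       # context window
    n_gpu_layers=-1,  # offload all layers to GPU if available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "日本の首都はどこですか？"}],  # "What is the capital of Japan?"
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```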
BRIEF-DETAILS: Efficient speech-to-text model trained on just 10k hours of data, offering exceptional performance on speech translation and AIR-Bench tasks.
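A hedged sketch using the generic transformers speech-recognition pipeline; the checkpoint name is a placeholder, and the model may instead ship its own inference code.

```python
from transformers import pipeline

# Checkpoint name is a placeholder; consult the model card for the real id.
asr = pipeline("automatic-speech-recognition", model="org/speech-model")

result = asr("sample_audio.wav")  # local audio file
print(result["text"])
```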
Brief-details: A 0.5B parameter Japanese-English language model trained on 10T tokens, optimized for math and coding tasks, with strong performance on Japanese NLP benchmarks.
Brief-details: Uncensored variant of Microsoft's Phi-4-mini-instruct model, created using the abliteration technique to remove content restrictions. Deployable via Ollama.
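Since the card notes Ollama deployment, here is a minimal sketch using the ollama Python client; the model tag is a placeholder for whatever tag the weights are published under.

```python
import ollama  # pip install ollama; requires a running Ollama server

# Model tag is a placeholder; pull it first, e.g. `ollama pull <tag>`.
response = ollama.chat(
    model="phi4-mini-abliterated",
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}],
)
print(response["message"]["content"])
```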
Brief Details: A 24B parameter instruction-tuned LLM optimized for creative tasks & roleplay. Built on Mistral-24B, combining the PersonalityEngine and Redemption Wind models.
Brief-details: Kokoro-82M-bf16 is an MLX-optimized text-to-speech model with 82M parameters, converted from hexagrad/Kokoro-82M for Apple Silicon efficiency.
BRIEF-DETAILS: Powerful 83B parameter multilingual LLM supporting 25 languages that cover 90% of global speakers, with strong performance across knowledge, reasoning, and translation tasks.
BRIEF DETAILS: Bilingual French-English 7B parameter LLM focused on reasoning, built on Qwen 2.5 and trained on 2K curated samples over 5 epochs.
Brief-details: A 70B parameter "evil-tuned" variant of DeepSeek's R1 Distill of Llama 3.3, designed for uncensored and creative interactions without typical ethical constraints.
Brief-details: Specialized 70B medical LLM built on Llama 3.1, trained to emulate expert clinical reasoning patterns for advanced medical decision support and diagnostics.
BRIEF DETAILS: Image classification model for NSFW content detection, fine-tuned from siglip2-base. A binary classifier for safe/unsafe content with high accuracy and Hugging Face integration.
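A sketch of running the classifier through the transformers image-classification pipeline; the repo id is a placeholder for the actual fine-tuned checkpoint.

```python
from transformers import pipeline

# Repo id is a placeholder; replace with the real checkpoint name.
classifier = pipeline("image-classification", model="org/nsfw-siglip2-classifier")

preds = classifier("photo.jpg")  # local path or URL
for p in preds:
    print(f"{p['label']}: {p['score']:.3f}")
```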
BRIEF-DETAILS: End-to-end speech interaction model featuring audio tokenization, LLM processing, and flow-matching decoder. Supports seamless text-audio switching and high-quality speech synthesis.
Brief-details: Modern Chinese BERT variant trained on the high-quality CCI3-HQ dataset with a 4096-token context length, using 3×8 A100 GPUs. Apache 2.0 licensed.
Brief Details: An 8B parameter merged LLM combining DeepSeek and TAIDE models, built on the Llama-3.1 architecture using the SCE merge method for enhanced chat capabilities.
Brief-details: A 14B parameter LLM based on Qwen 2.5 architecture, featuring enhanced reasoning, 128K context window, and multilingual support across 29 languages.
Brief Details: A 70B parameter LLaMA-based model fine-tuned from DeepSeek R1 Distill, optimized through RL with 1M+ training entries for enhanced reasoning and safety.
BRIEF-DETAILS: Small but powerful 1.5B parameter model optimized for edge devices, featuring multi-turn function calling and reasoning capabilities that rival larger models.
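A sketch of multi-turn function calling via the transformers chat-template tools parameter; the repo id, the tool, and the exact tool-call format are assumptions and will depend on the model's own template.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/edge-1.5b-instruct"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny, 22 C"  # stub for illustration

messages = [{"role": "user", "content": "What's the weather in Rabat right now?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],          # exposed to the model via its chat template
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# The model is expected to emit a structured tool call, which the caller
# parses, executes, and appends as a "tool" message for the next turn.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```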