Brief-details: Fino1-8B is an 8B-parameter LLM fine-tuned from Llama 3.1 and specialized in financial reasoning with enhanced mathematical capabilities. Built by TheFinAI.
Brief-details: DeepSeek-llama3.3-Bllossom-70B is a Korean-optimized 70B-parameter LLM based on DeepSeek-R1-Distill-Llama, featuring enhanced multilingual reasoning and improved Korean-language performance.
Brief-details: Labess-7b-chat-16bit is a specialized LLM fine-tuned for the Tunisian Derja dialect. Based on jais-adapted-7b-chat, it was developed by Linagora under the Apache 2.0 license.
Brief-details: Ovis2-2B is a 2B-parameter multimodal LLM optimized for visual-text alignment, featuring enhanced reasoning and multilingual capabilities with strong performance in OCR and visual tasks.
Brief-details: Ovis2-1B is a 1B-parameter multimodal LLM optimized for visual-text alignment, featuring enhanced reasoning, multilingual OCR capabilities, and video processing support.
Brief-details: An 8B-parameter instruction-following LLM fine-tuned from Llama 3.1 and optimized with GRPO (Group Relative Policy Optimization) for enhanced performance on benchmarks such as MATH and GSM8K.
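GRPO optimizes a policy against rewards scored over groups of sampled completions rather than a learned value model. A minimal sketch using TRL's GRPOTrainer is below; the base checkpoint, dataset, and reward function are placeholders for illustration, not the ones used to train this model.

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Toy reward: prefer completions close to 20 characters (illustrative only)
def reward_len(completions, **kwargs):
    return [-abs(20 - len(c)) for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")  # placeholder dataset

trainer = GRPOTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",  # small stand-in; not this model's base
    reward_funcs=reward_len,
    args=GRPOConfig(output_dir="grpo-out", logging_steps=10),
    train_dataset=dataset,
)
trainer.train()
```

In practice the reward functions would score mathematical correctness (e.g., exact-match against MATH/GSM8K answers) instead of a length heuristic.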
Brief-details: Llama-3.1-Sherkala-8B-Chat is an 8B-parameter multilingual LLM optimized for Kazakh, with strong English and Russian capabilities and an 8K context window.
Brief-details: A fine-tuned 7B-parameter math-focused LLM based on Qwen2.5-Math-Instruct, trained on the OpenR1-Math-220k dataset, with strong mathematical reasoning capabilities.
Brief-details: A powerful 70B-parameter judge model fine-tuned from Llama-3.3, specializing in hallucination detection and instruction-following evaluation, with state-of-the-art performance on multiple benchmarks.
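Judge models of this kind are typically prompted with the original instruction plus a candidate response and asked for a verdict. A hedged sketch using the standard transformers chat-template API; the repo id and prompt format here are assumptions, not this model's documented protocol.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "org/Llama-3.3-70B-judge"  # hypothetical repo id
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

messages = [{
    "role": "user",
    "content": (
        "Instruction: <original instruction>\n"
        "Response: <candidate answer>\n"
        "Does the response contain hallucinations? Answer yes/no and justify briefly."
    ),
}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```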
Brief-details: A Japanese instruction-tuned 7.2B-parameter model converted to GGUF format for efficient inference with llama.cpp, optimized for Japanese text generation.
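GGUF files can be served directly from Python via the llama-cpp-python bindings. A minimal sketch, assuming a locally downloaded quantized file (the filename is illustrative):

```python
from llama_cpp import Llama

# Load a local GGUF file; quant variant and path are placeholders
llm = Llama(model_path="model-Q4_K_M.gguf", n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Introduce yourself in Japanese."}]
)
print(out["choices"][0]["message"]["content"])
```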
Brief-details: A medical-reasoning language model built on Qwen-1.5B and fine-tuned with Unsloth for roughly 2x faster training. Licensed under Apache 2.0.
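Unsloth's speedup comes from patched kernels wrapped around a standard load-then-LoRA workflow. A minimal sketch, assuming a Qwen2.5-1.5B-Instruct base (the card only says "Qwen-1.5B", so the exact checkpoint is an assumption):

```python
from unsloth import FastLanguageModel

# Base checkpoint is an assumption, not confirmed by the model card
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-1.5B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; Unsloth's patched implementation provides the speedup
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
```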
Brief-details: A 36B-parameter LLM based on Mistral, optimized for creative writing and roleplay. Features enhanced descriptive capabilities and improved dialogue generation.
Brief-details: An FP8-quantized version of DeepSeek-R1-Distill-Qwen-32B offering a 1.5-1.7x inference speedup while preserving 99.8% of the original model's accuracy, optimized for efficient deployment.
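FP8 checkpoints in this family are usually deployed with vLLM, which reads the quantization config from the checkpoint on GPUs with native FP8 support. A minimal sketch; the repo id is hypothetical.

```python
from vllm import LLM, SamplingParams

# Repo id is a placeholder; substitute the actual FP8 checkpoint
llm = LLM(model="org/DeepSeek-R1-Distill-Qwen-32B-FP8-dynamic")

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(["Explain FP8 quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```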
Brief-details: A fine-tuned Llama 3 model (8B parameters) optimized for generating high-quality assertion criteria for prompt templates, achieving an 82.4% F1 score.
Brief-details: A multilingual fine-tuned version of DeepSeek-R1-Distill-Qwen-7B optimized for chain-of-thought reasoning across 38+ languages, with strong performance in high-resource languages.
Brief-details: An uncensored variant of DeepSeek-R1-Distill-Qwen-7B (7B parameters) created via abliteration, a technique that removes refusal behaviors by ablating the model's internal "refusal direction".
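The core of abliteration is directional ablation: estimate a refusal direction from the difference of mean activations on refused versus answered prompts, then project that direction out of the residual stream (or fold the projection into the weights). A toy sketch with random stand-in activations:

```python
import torch

# Stand-in residual-stream captures (real runs would hook model activations)
harmful_acts = torch.randn(100, 4096)   # activations on prompts the model refuses
harmless_acts = torch.randn(100, 4096)  # activations on prompts it answers

# Estimate the refusal direction as the normalized difference of means
refusal_dir = harmful_acts.mean(0) - harmless_acts.mean(0)
refusal_dir = refusal_dir / refusal_dir.norm()

def ablate(hidden: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Project the refusal direction out of a hidden state: h' = h - (h.r)r."""
    return hidden - (hidden @ direction).unsqueeze(-1) * direction
```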
Brief-details: A comprehensive GGUF quantization suite for the Mistral-Small-24B-Instruct model, offering 25 compression variants from 7GB to 94GB with varying quality/size trade-offs.
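With multi-variant GGUF repos like this, you normally download only the single file matching your memory budget rather than the whole repo. A sketch with huggingface_hub; the repo id and filename are illustrative.

```python
from huggingface_hub import hf_hub_download

# Repo id and quant filename are placeholders; pick the variant that fits your RAM/VRAM
path = hf_hub_download(
    repo_id="bartowski/Mistral-Small-24B-Instruct-2501-GGUF",
    filename="Mistral-Small-24B-Instruct-2501-Q4_K_M.gguf",
)
print(path)  # local path, ready to pass to llama.cpp
```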
Brief-details: A 32B-parameter LLM trained with the Critique Fine-Tuning (CFT) methodology, built on Qwen2.5-32B-Instruct for enhanced reasoning and analysis capabilities.
Brief-details: A specialized 8B-parameter medical AI model built on DeepSeek R1 and fine-tuned on 67K+ medical Q&A pairs. Optimized for healthcare professionals, with a 4K context window.
Brief-details: T5XXL encoder - a powerful text-encoding model based on Google's T5 architecture, distributed in multiple formats including GGUF and offering various precision options.
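Encoder-only T5 checkpoints like this are typically used to produce text embeddings for downstream models (for example, as the text encoder in image-generation pipelines). A sketch with transformers' T5EncoderModel, using google/t5-v1_1-xxl as an assumed stand-in repo:

```python
import torch
from transformers import T5EncoderModel, T5Tokenizer

repo = "google/t5-v1_1-xxl"  # stand-in; substitute the actual encoder repo
tokenizer = T5Tokenizer.from_pretrained(repo)
encoder = T5EncoderModel.from_pretrained(repo, torch_dtype=torch.float16)

inputs = tokenizer("a photo of an astronaut riding a horse", return_tensors="pt")
with torch.no_grad():
    emb = encoder(**inputs).last_hidden_state  # (1, seq_len, 4096) for XXL
```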
Brief-details: A 32B-parameter LLM based on the Qwen architecture and distilled from DeepSeek-V3. Features a 128K context window, specializes in technical/scientific tasks, and is released under the Apache 2.0 license.