Brief-details: Thespis-Llama-3.1-8B is an 8B parameter LLM optimized for roleplaying through Theory of Mind reasoning, using GRPO fine-tuning on the Llama 3.1 base.
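A minimal sketch of what GRPO fine-tuning can look like with TRL's GRPOTrainer, using a toy roleplay prompt and a stand-in reward function; this is illustrative only, not Thespis's actual training recipe.

```python
# Hypothetical GRPO fine-tuning sketch with TRL (not the model's actual recipe).
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt dataset; the real model was trained on roleplay data.
train_dataset = Dataset.from_dict(
    {"prompt": ["You are Hamlet. A stranger asks why you seem troubled. Respond in character."]}
)

# Stand-in reward: favor longer completions, capped at 1.0.
def roleplay_reward(completions, **kwargs):
    return [min(len(c.split()) / 100.0, 1.0) for c in completions]

trainer = GRPOTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",  # base model family per the brief
    reward_funcs=roleplay_reward,
    args=GRPOConfig(output_dir="thespis-grpo-sketch", num_generations=4),
    train_dataset=train_dataset,
)
trainer.train()
```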
Brief-details: A research model fine-tuned from Qwen2.5-Coder-32B-Instruct for studying emergent misalignment in LLMs. Not for production use.
Brief Details: An advanced multimodal retrieval model from BAAI that excels at image-text tasks, with state-of-the-art performance in composed image retrieval and on the MMEB benchmark.
Brief-details: DRAMA-large is a 0.3B parameter dense retrieval model for multilingual text retrieval, supporting 20 languages with flexible embedding dimensions.
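A generic dense-retrieval sketch; the mean pooling and Matryoshka-style dimension truncation below are assumptions, so check the model card for DRAMA's prescribed pooling and query formatting.

```python
# Generic dense-retrieval sketch; DRAMA's real pooling/prompting may differ (see its model card).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/drama-large")
model = AutoModel.from_pretrained("facebook/drama-large", trust_remote_code=True)

def embed(texts, dim=256):  # truncation assumes Matryoshka-style flexible dimensions
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)   # mean pooling (assumption)
    return F.normalize(pooled[:, :dim], dim=-1)     # truncate, then re-normalize

q = embed(["Qu'est-ce que la recherche dense ?"])   # multilingual queries are in scope
docs = embed(["Dense retrieval encodes text into vectors.", "Cats sleep a lot."])
print(q @ docs.T)                                   # cosine similarities
```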
Brief Details: Nunchaku, developed by the MIT HAN Lab, focuses on efficient neural network architecture design and optimization techniques.
Brief Details: A PyTorch 2.6.0 build compiled against CUDA 12.8, bringing the latest PyTorch features to NVIDIA GPUs.
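A quick sanity check that an installed wheel actually targets CUDA 12.8 and can see the GPU:

```python
# Verify the build's CUDA version and GPU visibility.
import torch

print(torch.__version__)                  # e.g. something like "2.6.0+cu128"
print(torch.version.cuda)                 # "12.8" for this build
print(torch.cuda.is_available())          # True if a compatible NVIDIA driver is present
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # the detected GPU
```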
BRIEF-DETAILS: Lightweight 1.5B parameter LLM optimized for edge devices, featuring mathematical reasoning and text generation, with a claimed 2x training speedup.
BRIEF DETAILS: Apparatus_24B-GGUF offers quantized builds of the 24B parameter Apparatus model, with files from 9GB to 25.2GB spanning varying quality-size tradeoffs.
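A minimal llama-cpp-python loading sketch; the quant filename below is illustrative, so pick whichever file fits your RAM:

```python
# Load one of the quantized files with llama-cpp-python (filename is illustrative).
from llama_cpp import Llama

llm = Llama(
    model_path="Apparatus_24B-Q4_K_M.gguf",  # a mid-size quant; smaller files trade quality for RAM
    n_ctx=4096,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to GPU if VRAM allows; 0 for CPU-only
)
out = llm("Write a haiku about compression.", max_tokens=64)
print(out["choices"][0]["text"])
```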
Brief Details: Specialized 8B parameter LLaMA-3 reward model for psychotherapy, achieving an 87% win rate vs GPT-4 on counseling tasks through preference learning.
BRIEF-DETAILS: Fine-tuned 8B parameter Llama-3 model specialized in psychotherapy counseling, achieving an 87% win rate vs GPT-4 on therapeutic responses.
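Assuming the companion reward model is exposed through a sequence-classification head (an assumption, and the model id below is a placeholder), scoring a candidate counseling reply looks roughly like this:

```python
# Hypothetical reward-model scoring sketch; model id and chat format are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "org/psychotherapy-reward-8b"  # placeholder, not the actual repo name
tok = AutoTokenizer.from_pretrained(model_id)
rm = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=1)

prompt = "Client: I feel overwhelmed at work and can't sleep."
reply = "Counselor: That sounds exhausting. What does a typical day look like for you?"

inputs = tok(prompt + "\n" + reply, return_tensors="pt", truncation=True)
with torch.no_grad():
    score = rm(**inputs).logits[0, 0].item()  # scalar preference score; higher = preferred
print(score)
```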
BRIEF-DETAILS: Savanna Evo 2 40B - the 40B parameter Evo 2 genomic foundation model, packaged as an MP1 Savanna-style checkpoint.
BRIEF-DETAILS: A 7B parameter French language model fine-tuned on the WiroAI/dolphin-r1-french dataset, optimized for stronger French reasoning and extended generation of up to 4096 tokens.
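A long-form generation sketch with transformers; the model id is a placeholder and the sampling settings are illustrative:

```python
# Sketch: generating a long French response (model id is a placeholder).
from transformers import pipeline

generate = pipeline("text-generation", model="org/french-dolphin-r1-7b")  # placeholder id
messages = [{"role": "user", "content": "Explique le raisonnement par étapes avec un exemple."}]
out = generate(messages, max_new_tokens=4096, do_sample=True, temperature=0.7)
print(out[0]["generated_text"][-1]["content"])  # the assistant's reply
```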
BRIEF-DETAILS: Liquid_V1_7B is a 7B-parameter multimodal LLM that uniquely integrates visual and language processing without requiring CLIP, capable of both understanding and generating images and text.
Brief Details: Viper-Coder-Hybrid-v1.3 is a 14B parameter coding-specialized model based on Qwen 2.5, offering superior code generation, debugging, and reasoning capabilities across multiple programming languages.
BRIEF-DETAILS: 24B parameter personality-focused language model available in multiple GGUF quantizations (Q2-Q8) optimized for different size/performance tradeoffs.
Brief Details: A 70B parameter LLaMA-based model focused on RP capabilities, created through SCE merging of multiple models. Features balanced creativity and intelligence with uncensored output.
BRIEF DETAILS: A 32B parameter fusion model combining DeepSeek-R1, QwQ, and Sky-T1 in an 80:10:10 ratio, built on the Qwen 2.5 architecture for enhanced performance.
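The actual fusion recipe is likely more involved than plain averaging, but a weighted linear merge of state dicts illustrates the basic idea of combining same-architecture checkpoints by ratio:

```python
# Simplified weighted linear merge; real "fusion" methods are typically more sophisticated.
import torch

def linear_merge(state_dicts, weights):
    """Combine same-architecture checkpoints: merged = sum_i w_i * params_i."""
    assert abs(sum(weights) - 1.0) < 1e-6
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Toy demo with random tensors standing in for the three checkpoints,
# e.g. DeepSeek-R1 : QwQ : Sky-T1 at 0.8 : 0.1 : 0.1.
sds = [{"w": torch.randn(2, 2)} for _ in range(3)]
print(linear_merge(sds, [0.8, 0.1, 0.1])["w"])
```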
BRIEF-DETAILS: Billed as the first cybersecurity reasoning model, based on Llama-3.1-8B and showing a 10% improvement on CISSP certification scores through specialized training.
BRIEF-DETAILS: 24B parameter LLaMA-based personality engine model with multiple GGUF quantizations (7-25GB) optimized for different hardware configurations and RAM constraints.
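A rough rule of thumb for picking a quant: budget the GGUF file size plus the KV cache for your context length. The layer/head numbers below are placeholders, not this model's actual config:

```python
# Back-of-envelope RAM estimate for running a GGUF quant; architecture numbers are assumptions.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    # K and V caches: 2 tensors per layer, each [ctx_len, n_kv_heads * head_dim], fp16 by default
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

file_gb = 14.0  # e.g. a mid-size quant of a 24B model
kv_gb = kv_cache_bytes(40, 8, 128, ctx_len=8192) / 1e9  # hypothetical 24B-ish config
print(f"~{file_gb + kv_gb:.1f} GB needed, plus a little runtime overhead")
```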
Brief-details: An FP8-optimized Canny model developed by Academia-SD, designed for efficient edge detection and image-processing tasks.
Brief-details: BioEmu is a 31M parameter deep learning model from Microsoft that generates protein structure ensembles, achieving high accuracy in predicting protein conformational changes and thermodynamic properties.