Brief-details: Thespis-Llama-3.1-8B is an 8B parameter LLM optimized for roleplaying through Theory of Mind reasoning, using GRPO fine-tuning on the Llama 3.1 base.
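A minimal sketch of what GRPO fine-tuning can look like with TRL's GRPOTrainer, using a toy roleplay prompt and a stand-in reward function; this is illustrative only, not Thespis's actual training recipe.

```python
# Hypothetical GRPO fine-tuning sketch with TRL (not the model's actual recipe).
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt dataset; the real model was trained on roleplay data.
train_dataset = Dataset.from_dict(
    {"prompt": ["You are Hamlet. A stranger asks why you seem troubled. Respond in character."]}
)

# Stand-in reward: favor longer completions, capped at 1.0.
def roleplay_reward(completions, **kwargs):
    return [min(len(c.split()) / 100.0, 1.0) for c in completions]

trainer = GRPOTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",  # base model family per the brief
    reward_funcs=roleplay_reward,
    args=GRPOConfig(output_dir="thespis-grpo-sketch", num_generations=4),
    train_dataset=train_dataset,
)
trainer.train()
```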
Brief-details: A research model fine-tuned from Qwen2.5-Coder-32B-Instruct for studying emergent misalignment in LLMs. Not for production use.
Brief Details: An advanced multimodal retrieval model from BAAI that excels at image-text tasks, with state-of-the-art performance in composed image retrieval and on the MMEB benchmark.
Brief-details: DRAMA-large is a 0.3B parameter dense retrieval model for multilingual text retrieval, supporting 20 languages with flexible embedding dimensions.
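A generic dense-retrieval sketch; the mean pooling and Matryoshka-style dimension truncation below are assumptions, so check the model card for DRAMA's prescribed pooling and query formatting.

```python
# Generic dense-retrieval sketch; DRAMA's real pooling/prompting may differ (see its model card).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/drama-large")
model = AutoModel.from_pretrained("facebook/drama-large", trust_remote_code=True)

def embed(texts, dim=256):  # truncation assumes Matryoshka-style flexible dimensions
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)   # mean pooling (assumption)
    return F.normalize(pooled[:, :dim], dim=-1)     # truncate, then re-normalize

q = embed(["Qu'est-ce que la recherche dense ?"])   # multilingual queries are in scope
docs = embed(["Dense retrieval encodes text into vectors.", "Cats sleep a lot."])
print(q @ docs.T)                                   # cosine similarities
```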
Brief Details: Nunchaku, developed by the MIT HAN Lab, focuses on efficient neural network architecture design and optimization techniques.
Brief Details: A PyTorch 2.6.0 build compiled against CUDA 12.8, bringing the latest PyTorch features to NVIDIA GPUs.
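A quick sanity check that an installed wheel actually targets CUDA 12.8 and can see the GPU:

```python
# Verify the build's CUDA version and GPU visibility.
import torch

print(torch.__version__)                  # e.g. something like "2.6.0+cu128"
print(torch.version.cuda)                 # "12.8" for this build
print(torch.cuda.is_available())          # True if a compatible NVIDIA driver is present
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # the detected GPU
```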
BRIEF-DETAILS: Lightweight 1.5B parameter LLM optimized for edge devices, featuring mathematical reasoning and text generation, with a claimed 2x training speedup.
BRIEF DETAILS: Apparatus_24B-GGUF offers quantized builds of the 24B parameter Apparatus model, with files from 9GB to 25.2GB spanning varying quality-size tradeoffs.
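A minimal llama-cpp-python loading sketch; the quant filename below is illustrative, so pick whichever file fits your RAM:

```python
# Load one of the quantized files with llama-cpp-python (filename is illustrative).
from llama_cpp import Llama

llm = Llama(
    model_path="Apparatus_24B-Q4_K_M.gguf",  # a mid-size quant; smaller files trade quality for RAM
    n_ctx=4096,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to GPU if VRAM allows; 0 for CPU-only
)
out = llm("Write a haiku about compression.", max_tokens=64)
print(out["choices"][0]["text"])
```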
Brief Details: Specialized 8B parameter LLaMA-3 reward model for psychotherapy, achieving an 87% win rate vs GPT-4 on counseling tasks through preference learning.
BRIEF-DETAILS: Fine-tuned 8B parameter Llama-3 model specialized in psychotherapy counseling, achieving an 87% win rate vs GPT-4 on therapeutic responses.
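Assuming the companion reward model is exposed through a sequence-classification head (an assumption, and the model id below is a placeholder), scoring a candidate counseling reply looks roughly like this:

```python
# Hypothetical reward-model scoring sketch; model id and chat format are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "org/psychotherapy-reward-8b"  # placeholder, not the actual repo name
tok = AutoTokenizer.from_pretrained(model_id)
rm = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=1)

prompt = "Client: I feel overwhelmed at work and can't sleep."
reply = "Counselor: That sounds exhausting. What does a typical day look like for you?"

inputs = tok(prompt + "\n" + reply, return_tensors="pt", truncation=True)
with torch.no_grad():
    score = rm(**inputs).logits[0, 0].item()  # scalar preference score; higher = preferred
print(score)
```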
BRIEF-DETAILS: Savanna Evo 2 40B - the 40B parameter Evo 2 genomic foundation model, packaged as an MP1 Savanna-style checkpoint.
BRIEF-DETAILS: A 7B parameter French language model fine-tuned on the WiroAI/dolphin-r1-french dataset, optimized for stronger French reasoning and extended generation of up to 4096 tokens.
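A long-form generation sketch with transformers; the model id is a placeholder and the sampling settings are illustrative:

```python
# Sketch: generating a long French response (model id is a placeholder).
from transformers import pipeline

generate = pipeline("text-generation", model="org/french-dolphin-r1-7b")  # placeholder id
messages = [{"role": "user", "content": "Explique le raisonnement par étapes avec un exemple."}]
out = generate(messages, max_new_tokens=4096, do_sample=True, temperature=0.7)
print(out[0]["generated_text"][-1]["content"])  # the assistant's reply
```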
BRIEF-DETAILS: Liquid_V1_7B is a 7B-parameter multimodal LLM that uniquely integrates visual and language processing without requiring CLIP, capable of both understanding and generating images and text.
Brief Details: Viper-Coder-Hybrid-v1.3 is a 14B parameter coding-specialized model based on Qwen 2.5, offering superior code generation, debugging, and reasoning capabilities across multiple programming languages.
BRIEF-DETAILS: 24B parameter personality-focused language model available in multiple GGUF quantizations (Q2-Q8) optimized for different size/performance tradeoffs.
Brief Details: A 70B parameter LLaMA-based model focused on RP capabilities, created through SCE merging of multiple models. Features balanced creativity and intelligence with uncensored output.
BRIEF DETAILS: A 32B parameter fusion model combining DeepSeek-R1, QwQ, and Sky-T1 in an 80:10:10 ratio, built on the Qwen 2.5 architecture for enhanced performance.
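The actual fusion recipe is likely more involved than plain averaging, but a weighted linear merge of state dicts illustrates the basic idea of combining same-architecture checkpoints by ratio:

```python
# Simplified weighted linear merge; real "fusion" methods are typically more sophisticated.
import torch

def linear_merge(state_dicts, weights):
    """Combine same-architecture checkpoints: merged = sum_i w_i * params_i."""
    assert abs(sum(weights) - 1.0) < 1e-6
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Toy demo with random tensors standing in for the three checkpoints,
# e.g. DeepSeek-R1 : QwQ : Sky-T1 at 0.8 : 0.1 : 0.1.
sds = [{"w": torch.randn(2, 2)} for _ in range(3)]
print(linear_merge(sds, [0.8, 0.1, 0.1])["w"])
```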
BRIEF-DETAILS: Billed as the first cybersecurity reasoning model, based on Llama-3.1-8B and showing a 10% improvement on CISSP certification scores through specialized training.
BRIEF-DETAILS: 24B parameter LLaMA-based personality engine model with multiple GGUF quantizations (7-25GB) optimized for different hardware configurations and RAM constraints.
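A rough rule of thumb for picking a quant: budget the GGUF file size plus the KV cache for your context length. The layer/head numbers below are placeholders, not this model's actual config:

```python
# Back-of-envelope RAM estimate for running a GGUF quant; architecture numbers are assumptions.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    # K and V caches: 2 tensors per layer, each [ctx_len, n_kv_heads * head_dim], fp16 by default
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

file_gb = 14.0  # e.g. a mid-size quant of a 24B model
kv_gb = kv_cache_bytes(40, 8, 128, ctx_len=8192) / 1e9  # hypothetical 24B-ish config
print(f"~{file_gb + kv_gb:.1f} GB needed, plus a little runtime overhead")
```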
Brief-details: An FP8-optimized Canny model developed by Academia-SD, designed for efficient edge detection and image-processing tasks.
Brief-details: BioEmu is a 31M parameter deep learning model from Microsoft that generates protein structure ensembles, achieving high accuracy in predicting protein conformational changes and thermodynamic properties.