Brief Details: State-of-the-art code embedding model (7B params) supporting 9 programming languages, with a 32k context window and 3584-dim embeddings. Optimized for code retrieval and RAG.
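To make the retrieval use case concrete, here is a minimal sketch assuming the checkpoint loads through sentence-transformers; the repo ID is a placeholder, not the model's actual name:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("your-org/code-embed-7b", trust_remote_code=True)  # placeholder ID

query = "function that parses a CSV file into a list of dicts"
snippets = [
    "def load_csv(path):\n    import csv\n    with open(path) as f:\n        return list(csv.DictReader(f))",
    "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)",
]

q_emb = model.encode(query, normalize_embeddings=True)     # one 3584-dim vector
s_emb = model.encode(snippets, normalize_embeddings=True)  # one row per snippet
print(util.cos_sim(q_emb, s_emb))  # the CSV parser should score highest
```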
Brief Details: A 16B-parameter MoE model with 3B active parameters, trained on 5.7T tokens using the Muon optimizer. Excels at multilingual tasks and achieves SOTA performance with fewer training FLOPs, at roughly 2x the sample efficiency of Adam.
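For context on the Muon step itself: it replaces Adam's elementwise update with an approximately orthogonalized momentum matrix, computed with a few Newton-Schulz iterations. A minimal PyTorch sketch under that assumption, using the quintic coefficients from the public Muon reference implementation (learning rate and beta below are illustrative):

```python
import torch

def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5, eps: float = 1e-7) -> torch.Tensor:
    """Approximately orthogonalize a 2-D momentum matrix via a quintic
    Newton-Schulz iteration (coefficients from the public Muon reference)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    x = g / (g.norm() + eps)              # normalize so the iteration converges
    transposed = x.size(0) > x.size(1)
    if transposed:                        # iterate on the wide orientation
        x = x.T
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * s @ s) @ x
    return x.T if transposed else x

def muon_step(param, grad, momentum, lr=0.02, beta=0.95):
    """One illustrative Muon-style update: momentum, orthogonalize, apply."""
    momentum.mul_(beta).add_(grad)
    param.add_(newton_schulz_orthogonalize(momentum), alpha=-lr)
```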
Brief Details: EQ-SDXL-VAE is an advanced VAE model that enhances SDXL's latent space through equivariance regularization, improving image reconstruction quality and semantic preservation.
Brief Details: A 70B-parameter model built on Llama 3.3, specifically designed for adventure gaming with darker themes. Trained on text adventures and roleplay data to enable challenging, dangerous narratives.
Brief Details: PatientSeek is an AI model by whyhow-ai focused on healthcare applications, released with strict ethical guidelines that prohibit harmful human experimentation.
Brief Details: Janus-Pro-1B is a unified multimodal model that combines understanding and image-generation capabilities, built on DeepSeek-LLM with a SigLIP-L vision encoder.
Brief Details: An uncensored variant of DeepSeek-R1-Distill-Qwen-32B created through abliteration, focused on removing refusal behaviors while maintaining core capabilities.
Brief Details: DeepSeek-R1-AWQ is an AWQ-quantized version of DeepSeek-R1 optimized for efficient inference, running on 8 GPUs at 38-48 tokens/s with full context-length support.
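A hedged serving sketch with vLLM matching the 8-GPU setup described above; the repo ID is a placeholder, and AWQ settings are typically auto-detected from the checkpoint config:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/DeepSeek-R1-AWQ",  # placeholder repo ID
    tensor_parallel_size=8,            # one shard per GPU, per the setup above
)
outputs = llm.generate(["Explain AWQ quantization in one sentence."],
                       SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```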
Brief Details: A highly optimized GGUF quantization of DeepSeek's 14B-parameter model, offering compression options from 4.7GB to 59GB with different quality/size tradeoffs. Notable for ARM/AVX optimization.
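As a usage sketch, one of the mid-size quantized files could be loaded with llama-cpp-python; the file name below is illustrative, and the right quant depends on your RAM/quality target:

```python
from llama_cpp import Llama

# Illustrative file name; substitute whichever quant you downloaded.
llm = Llama(model_path="DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf", n_ctx=4096)
result = llm("Q: What is 2 + 2?\nA:", max_tokens=16)
print(result["choices"][0]["text"])
```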
Brief Details: Absynth-2.0 is an enhanced SD 3.5 model featuring hyper-detailed image generation through novel negative LoRA training and inverse fine-tuning techniques.
Brief Details: Flex.1-alpha is an 8B-parameter rectified-flow transformer for text-to-image generation, featuring a guidance embedder, true CFG capability, and an Apache 2.0 license.
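Since Flex.1-alpha is Flux-derived, loading it through diffusers' FluxPipeline is a plausible sketch; the repo ID, dtype, and sampler settings are assumptions rather than values from the card:

```python
import torch
from diffusers import FluxPipeline

# Assumed repo ID and pipeline class; verify against the actual model card.
pipe = FluxPipeline.from_pretrained("ostris/Flex.1-alpha", torch_dtype=torch.bfloat16)
pipe.to("cuda")
image = pipe("a lighthouse at dusk, oil painting",
             guidance_scale=3.5, num_inference_steps=28).images[0]
image.save("lighthouse.png")
```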
Brief Details: DeepSeek-V3-Base is a 671B-parameter MoE model with 37B active parameters, featuring FP8 training, a 128K context window, and state-of-the-art performance across various benchmarks.
Brief Details: DeepSeek-VL2 is an advanced MoE vision-language model with 4.5B active parameters, offering state-of-the-art performance in visual QA, OCR, and document understanding.
Brief Details: 278M-parameter multilingual embedding model supporting 12 languages. Generates 768-dim vectors for text similarity and retrieval. Apache 2.0 licensed.
Brief Details: GemmaX2-28-2B-v0.1 is a 2B-parameter multilingual translation model supporting 28 languages, built on Gemma2-2B with continued pretraining and translation fine-tuning.
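A hedged generation sketch; the org prefix in the repo ID and the exact prompt template are assumptions and should be checked against the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ModelSpace/GemmaX2-28-2B-v0.1"  # assumed org prefix
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = ("Translate this from English to German:\n"
          "English: The weather is nice today.\nGerman:")
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=40)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```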
Brief Details: Russian-language assistant model (8B params) fine-tuned from YandexGPT, specialized in dialogue and task assistance and using the Llama-3 prompt format.
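Since the entry calls out the Llama-3 prompt format, the tokenizer's built-in chat template is the quickest way to confirm it; the repo ID below is a placeholder:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("your-org/russian-assistant-8b")  # placeholder
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Suggest a weekend itinerary for Kazan."},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # should show <|start_header_id|>/<|end_header_id|> Llama-3 markers
```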
Brief Details: Fast3R is a groundbreaking 3D reconstruction model capable of processing 1000+ images in a single forward pass, built on a ViT-Large architecture at 512-pixel resolution.
Brief Details: Japanese-language instruction-tuned 0.5B-parameter model with strong performance on Japanese/English tasks, outperforming similarly sized models on benchmarks.
Brief Details: Lightweight 256M-parameter multimodal model for video/image analysis. Efficient (runs in about 1.38GB of GPU RAM) with strong performance on video-understanding tasks.
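A hedged inference sketch following current transformers conventions for small vision-language models; the repo ID, class names, and image URL are all assumptions:

```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

repo = "your-org/smol-vlm-256m"  # placeholder repo ID
processor = AutoProcessor.from_pretrained(repo)
model = AutoModelForImageTextToText.from_pretrained(repo, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": [
    {"type": "image", "url": "https://example.com/frame.jpg"},  # placeholder image
    {"type": "text", "text": "Describe what is happening."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```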
Brief Details: 14B-parameter uncensored GGUF model with multiple quantization options (Q2-Q8). Offers efficient deployment at sizes from 5.9GB to 15.8GB.