Brief-details: A 14B-parameter merged LLM built with the Model Stock method from multiple SFT checkpoints, scoring a 42.90 average on the Open LLM Leaderboard with particularly strong IFEval results.
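Model Stock merges fine-tuned checkpoints by interpolating their average back toward the pretrained anchor, with the ratio set by the angle between the checkpoints' weight deltas. Below is a toy numpy sketch of that rule for two checkpoints; the formula follows the Model Stock paper, and all tensors are synthetic stand-ins rather than this model's weights (real merges apply the rule per layer).

```python
import numpy as np

# Toy Model Stock merge for two SFT checkpoints w1, w2 sharing a pretrained
# anchor w0. Interpolation ratio t = N*cos(theta) / (1 + (N-1)*cos(theta))
# per the Model Stock paper (N = 2 here); the data is synthetic.
rng = np.random.default_rng(0)
w0 = rng.normal(size=1024)                   # pretrained weights (flattened)
w1 = w0 + rng.normal(scale=0.1, size=1024)   # SFT checkpoint 1
w2 = w0 + rng.normal(scale=0.1, size=1024)   # SFT checkpoint 2

d1, d2 = w1 - w0, w2 - w0
cos_theta = (d1 @ d2) / (np.linalg.norm(d1) * np.linalg.norm(d2))

n = 2
t = n * cos_theta / (1 + (n - 1) * cos_theta)
w_merged = t * (w1 + w2) / 2 + (1 - t) * w0  # pull the average toward the anchor
print(f"cos(theta) = {cos_theta:.3f}, t = {t:.3f}")
```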
Brief-details: SigLIP 2 Base is Google's latest vision-language encoder, offering improved semantic understanding and localization; it supports zero-shot classification and image-text retrieval.
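A quick zero-shot classification sketch via the transformers pipeline; the exact checkpoint id is an assumption, so verify it against the google/siglip2 collection on the Hub (SigLIP 2 also needs a recent transformers release).

```python
from transformers import pipeline

# Zero-shot classification sketch; the checkpoint id is an assumption and
# should be verified against the google/siglip2 collection on the Hub.
classifier = pipeline(
    "zero-shot-image-classification",
    model="google/siglip2-base-patch16-224",
)
result = classifier(
    "photo.jpg",  # local path, URL, or PIL.Image
    candidate_labels=["a photo of a cat", "a photo of a dog", "a photo of a car"],
)
print(result)  # [{"label": ..., "score": ...}, ...] sorted by score
```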
Brief-details: Step-Audio-TTS-3B is a 3B-parameter text-to-speech model trained on synthetic data, supporting multiple languages, emotional styles, and rap/humming generation, with state-of-the-art performance reported by its authors.
Brief-details: YuE-s1-7B-anneal-en-cot is a 7B-parameter music-generation model that transforms lyrics into complete songs with separate vocal and accompaniment tracks.
Brief-details: A 7B-parameter GUI-interaction model reporting state-of-the-art results in perception (79.7% on VisualWebBench) and grounding (91.6% on ScreenSpot v2), with automation across platforms.
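Grounding models of this kind typically emit a target location as normalized screen coordinates that an automation layer converts to pixels. The small sketch below illustrates that conversion; the "click(x=..., y=...)" output format is an assumed schema, not this model's documented one.

```python
import re

# Hypothetical post-processing sketch: the "click(x=..., y=...)" action
# format below is an assumed output schema, not this model's documented one.
def parse_click(action: str, screen_w: int, screen_h: int) -> tuple[int, int]:
    m = re.search(r"click\(x=([\d.]+),\s*y=([\d.]+)\)", action)
    if m is None:
        raise ValueError(f"no click action found in {action!r}")
    x, y = float(m.group(1)), float(m.group(2))
    return round(x * screen_w), round(y * screen_h)

print(parse_click("click(x=0.42, y=0.17)", 1920, 1080))  # (806, 184)
```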
Brief-details: A 32B parameter merged LLM combining DeepSeek-R1, QwQ, and SkyT1 models, achieving 74% accuracy on AIME24 and strong performance in math/science reasoning.
Brief-details: Experimental SD1.5 model paired with the SDXL VAE, trained on the LAION-2B dataset in fp32 precision. Currently in alpha after 10 epochs of training.
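Swapping a VAE into an SD1.5 pipeline is straightforward in diffusers; a short sketch follows, where the SD1.5 checkpoint id is a hypothetical placeholder (the entry does not name the repo) and stabilityai/sdxl-vae is the standard SDXL autoencoder.

```python
import torch
from diffusers import AutoencoderKL, StableDiffusionPipeline

# Pairing an SD1.5 pipeline with the SDXL VAE in diffusers.
# "some-org/sd15-sdxl-vae-alpha" is a hypothetical repo id.
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae", torch_dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    "some-org/sd15-sdxl-vae-alpha",  # placeholder checkpoint id
    vae=vae,
    torch_dtype=torch.float16,
).to("cuda")
image = pipe("a watercolor landscape at dusk").images[0]
image.save("out.png")
```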
Brief-details: Animagine XL 4.0 is a state-of-the-art anime-focused SDXL model trained on 8.4M images, featuring enhanced stability, anatomy accuracy, and color fidelity.
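A short diffusers sketch, assuming the checkpoint ships in SDXL diffusers format; the repo id follows Cagliostro Lab's earlier Animagine naming and should be confirmed on the Hub. Animagine models respond best to danbooru-style tag prompts.

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Repo id assumed from Cagliostro Lab's Animagine naming; verify on the Hub.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-4.0", torch_dtype=torch.float16
).to("cuda")
image = pipe(
    "1girl, cherry blossoms, scenery, masterpiece, high score",  # tag-style prompt
    negative_prompt="lowres, bad anatomy, worst quality",
    num_inference_steps=28,
    guidance_scale=5.0,
).images[0]
image.save("anime.png")
```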
Brief-details: Phi-4 is Microsoft's 14B parameter LLM optimized for reasoning and efficiency, featuring 16K context, MIT license, and strong performance on math/science benchmarks
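A plain generation sketch with the public microsoft/phi-4 checkpoint via transformers; sampling settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Plain generation with the public microsoft/phi-4 checkpoint.
tok = AutoTokenizer.from_pretrained("microsoft/phi-4")
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-4", torch_dtype=torch.bfloat16, device_map="auto"
)
messages = [{"role": "user", "content": "Prove that the square root of 2 is irrational."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=512)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```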
Brief-details: Google's TimesFM 2.0 - a 500M-parameter decoder-only foundation model for time-series forecasting, handling sequences up to 2048 points with flexible horizon lengths.
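A forecasting sketch based on the timesfm package's documented interface; the hparam names, values, and checkpoint id are assumptions to check against the google/timesfm README for the 2.0 release.

```python
import numpy as np
import timesfm

# Hparam names/values and the checkpoint id are assumptions; check the
# google/timesfm README for the 2.0 release before relying on them.
tfm = timesfm.TimesFm(
    hparams=timesfm.TimesFmHparams(
        backend="gpu",
        context_len=2048,               # TimesFM 2.0 accepts up to 2048 points
        horizon_len=128,
        num_layers=50,                  # 2.0-500m configuration
        use_positional_embedding=False,
    ),
    checkpoint=timesfm.TimesFmCheckpoint(
        huggingface_repo_id="google/timesfm-2.0-500m-pytorch"
    ),
)
history = np.sin(np.linspace(0, 20, 400))                  # toy series
point_fc, quantile_fc = tfm.forecast([history], freq=[0])  # freq 0 = high frequency
print(point_fc.shape)                                      # (1, 128)
```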
Brief-details: CosyVoice2-0.5B is a scalable zero-shot text-to-speech model built on supervised semantic tokens, with streaming inference and support for multiple languages and voices.
Brief-details: Quantized GGUF conversion of Tencent's HunyuanVideo for ComfyUI, enabling efficient video generation with native nodes and optimized performance.
Brief-details: Meta's 11B-parameter vision-language model from the Llama 3.2 series, capable of understanding and analyzing images alongside text generation.
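An image-understanding sketch with the gated meta-llama/Llama-3.2-11B-Vision-Instruct checkpoint (access requires accepting Meta's license on the Hub); the image path is a placeholder.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

# Gated checkpoint: requires accepting Meta's license on the Hub.
model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
image = Image.open("chart.png")  # placeholder image path
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this chart in two sentences."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(out[0], skip_special_tokens=True))
```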
Brief-details: Stability AI's video generation model that animates still images into short video clips, extending the Stable Diffusion family to image-to-video.
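An image-to-video sketch using the diffusers StableVideoDiffusionPipeline with the publicly released img2vid-xt checkpoint; resolution and fps follow the model's usual defaults.

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video, load_image

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
image = load_image("still.png").resize((1024, 576))  # SVD's native resolution
frames = pipe(image, decode_chunk_size=8).frames[0]  # ~25 frames from one image
export_to_video(frames, "clip.mp4", fps=7)
```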
Brief-details: LogiLlama is a 1B-parameter LLM optimized for logical reasoning, featuring enhanced problem-solving capabilities while maintaining efficiency for on-device deployment.
Brief-details: A llama.cpp-compatible 8B-parameter YandexGPT variant requiring about 9 GB of RAM, offered in multiple quantization options for efficient deployment.
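A llama-cpp-python loading sketch; the GGUF file name is a hypothetical placeholder for whichever quantization level fits the ~9 GB budget.

```python
from llama_cpp import Llama

# The GGUF file name is a placeholder for whichever quantization level
# fits the ~9 GB RAM budget mentioned above.
llm = Llama(
    model_path="yandexgpt-8b-q4_k_m.gguf",  # hypothetical file name
    n_ctx=8192,
    n_gpu_layers=-1,  # offload all layers when a GPU is available
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize quantization trade-offs in one paragraph."}]
)
print(out["choices"][0]["message"]["content"])
```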
Brief-details: Multimodal retrieval model achieving state-of-the-art composed image retrieval, trained on the MegaPairs dataset of 26M+ triplets; it also excels at zero-shot retrieval tasks.
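Composed image retrieval fuses a reference image and a text modification into one query embedding, then ranks a gallery by similarity. The numpy sketch below shows only that ranking step, with random vectors standing in for embeddings the retrieval model would produce.

```python
import numpy as np

# Ranking step of composed image retrieval only: random vectors stand in for
# the query embedding f(reference_image, text_edit) and gallery embeddings.
rng = np.random.default_rng(1)
query = rng.normal(size=512)               # stand-in for f(image, "make it red")
gallery = rng.normal(size=(10_000, 512))   # stand-in candidate embeddings

query /= np.linalg.norm(query)
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
scores = gallery @ query                   # cosine similarity after normalization
top5 = np.argsort(-scores)[:5]
print(top5, scores[top5])
```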
Brief-details: 24B-parameter Mistral-based model fine-tuned for multi-turn instruction following and reasoning, with Claude-style conversational behavior and support for explicit reasoning blocks.
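One practical concern with reasoning blocks is separating them from the user-facing answer; a small sketch follows, where the <think>...</think> delimiter is an assumed convention rather than this model's documented format.

```python
import re

# The <think>...</think> delimiter is an assumed convention, not necessarily
# this model's documented reasoning-block format.
def split_reasoning(text: str) -> tuple[str, str]:
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    reasoning = m.group(1).strip() if m else ""
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return reasoning, answer

raw = "<think>The user wants a summary, so keep it short.</think>Here is the answer."
reasoning, answer = split_reasoning(raw)
print(answer)  # "Here is the answer."
```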
Brief-details: A 70B-parameter LLaMA-based creative language model with a Japanese metalworking-inspired name, built via the SCE merge methodology for enhanced reasoning and creative expression.
Brief-details: A powerful 83B-parameter multilingual LLM covering 25 languages spoken by roughly 90% of the world's population, with strong performance on reasoning and knowledge tasks.
Brief-details: Advanced 14B-parameter UI/web development model based on the Qwen 2.5 architecture, specializing in HTML/CSS/Tailwind, with a 128K context window and 8K-token output capability.
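A generation sketch via transformers, assuming a standard Qwen 2.5 chat template; the repo id is a hypothetical placeholder since the entry does not name the exact repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "org/uigen-14b" is a hypothetical repo id; the entry does not name the
# exact repository. Assumes a standard Qwen 2.5 chat template.
model_id = "org/uigen-14b"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
messages = [{"role": "user",
             "content": "Build a responsive pricing card in HTML with Tailwind CSS."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=2048)  # UI code can be long
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```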