Brief-details: Code-optimized 3B parameter LLM with a 32k-token context window, specialized for code generation, reasoning, and repair. Part of the Qwen2.5 family with advanced coding capabilities.
Brief Details: Experimental HDR-focused LoRA for FLUX.1-dev, trained at 1024x1024 resolution with the AdamW optimizer and a constant LR schedule. Uses a network dimension (rank) of 64.
Brief Details: Multimodal Speech LLM combining Mistral-Nemo and Whisper models for speech/text processing. 52.4M params, supports 15 languages, MIT license.
Brief Details: A 14.8B parameter story-writing model built on SuperNova-Medius, optimized for creative writing and roleplay, with strong instruction-following capabilities.
Brief-details: An 8B parameter multimodal LLM optimized for reasoning tasks through Mixed Preference Optimization (MPO), achieving strong performance on visual-linguistic benchmarks like MathVista (67.0% accuracy).
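MPO combines a DPO-style preference loss with quality and generation terms. As a rough illustration of the preference component only, here is a minimal NumPy sketch of the standard DPO objective with hypothetical log-probability values (not the model's actual training code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dpo_preference_loss(logp_chosen, logp_rejected,
                        ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO-style preference term: push the policy's log-likelihood margin
    between chosen and rejected responses above the reference model's."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -np.log(sigmoid(beta * margin))

# Policy favors the chosen answer more than the reference does -> loss below log(2).
print(dpo_preference_loss(-10.0, -20.0, -12.0, -15.0))
```

When the policy's preference margin flips sign, the loss rises above log(2), so minimizing it steers the model toward the chosen responses.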
Brief-details: Advanced 12B parameter text-to-video and image-to-video generation model supporting multiple resolutions and bilingual prompts, with Canny-edge and depth control capabilities.
Brief Details: A 9B parameter multilingual instruction-tuned LLM optimized for Indonesian, Javanese, and Sundanese, built on the Gemma2 architecture.
Brief-details: A 9B parameter multilingual LLM optimized for Indonesian, Javanese, and Sundanese languages, with strong performance across regional benchmarks.
Brief Details: A 1.78B parameter Qwen-based model fine-tuned on Magpie datasets, featuring MGS and UNA optimization for improved text-generation performance.
Brief Details: A specialized geospatial AI model for predicting canopy height using satellite imagery, built on Swin-B transformer architecture with PyTorch framework.
Brief Details: OmniGen-V1 is a 3.88B parameter unified image generation model capable of multi-modal prompting and diverse image generation tasks without additional plugins.
Brief-details: OuteTTS-0.1-350M-GGUF is a 362M parameter LLaMA-based text-to-speech model using pure language modeling for high-quality speech synthesis and voice cloning.
Brief-details: FastConformer-Hybrid Large ASR model for Uzbek speech recognition, featuring 115M params, trained on ~1,000 hours of data and achieving a 16.46% WER on the Common Voice test set.
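For context on the WER figure: word error rate is the word-level edit distance between reference and hypothesis transcripts, divided by the reference length. A generic sketch (not the toolkit's own scoring code):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over word sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("salom dunyo qalaysan", "salom dunye qalaysan"))  # one error in three words -> 0.333...
```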
Brief-details: A lightweight 362M parameter instruction-tuned language model in GGUF format, optimized for deployment with llama.cpp using Q8 quantization.
Brief-details: A specialized LoRA model for FLUX.1-dev that creates desaturated illustrations with thick outlines, featuring a unique artistic style triggered by 'fae_dusk' prompt.
Brief Details: HTML-specialized 1.24B parameter LLaMA model for efficiently pruning HTML content in RAG systems. Part of HtmlRAG framework for improved knowledge retrieval.
Brief Details: A Japanese vision-language model combining a 428M-parameter vision encoder, a 32M-parameter projector, and a 13B-parameter LLM, optimized for image understanding and Japanese text generation.
Brief Details: A high-quality LoRA model for SD3.5-Turbo focusing on hyper-realistic image generation, trained on 30 images with 64 network dimensions and constant LR scheduling.
Brief-details: 12B parameter Mistral-based roleplay model optimized for creative writing and character interaction, with reduced dialogue repetition.
Brief-details: SigLIP vision model with 1.13B params, optimized for zero-shot image classification using sigmoid loss. Multilingual capable, 256px resolution, Apache 2.0 licensed.
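The sigmoid loss mentioned above replaces CLIP's batch-wise softmax with an independent binary decision per image-text pair. A minimal NumPy sketch of the objective as described in the SigLIP paper (with illustrative temperature/bias values, not the released training code):

```python
import numpy as np

def siglip_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    """Pairwise sigmoid loss: every (image, text) pair is an independent
    binary example; matching pairs (the diagonal) are the positives."""
    # L2-normalize embeddings, as in CLIP/SigLIP.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = t * img @ txt.T + b              # learnable temperature t and bias b
    labels = 2 * np.eye(len(img)) - 1         # +1 on the diagonal, -1 elsewhere
    # -log sigmoid(label * logit), summed over texts, averaged over images.
    return np.mean(np.sum(np.log1p(np.exp(-labels * logits)), axis=1))

rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))
loss = siglip_loss(img, img.copy())  # identical embeddings: diagonal pairs match perfectly
print(loss)
```

Because each pair is scored independently, the loss needs no all-to-all softmax normalization across the batch, which is what makes SigLIP training cheap to scale.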
Brief-details: A LoRA model for FLUX.1-dev that generates realistic anime-style images, optimized for fashion photography with high-quality lighting and composition control.