Brief Details: A lightweight prompt generation model (88.2M params) fine-tuned on 2.47M Stable Diffusion prompts, offering roughly 50% faster inference and 40% lower resource usage.
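A minimal usage sketch, assuming the checkpoint is a GPT-2-style text-generation model published on the Hugging Face Hub (the repo id below is a placeholder, not the actual model name):

```python
from transformers import pipeline

# Placeholder repo id; substitute the real checkpoint.
generator = pipeline("text-generation", model="your-org/sd-prompt-generator")

# Expand a short seed phrase into a full Stable Diffusion prompt.
seed = "a castle on a cliff"
out = generator(seed, max_new_tokens=60, num_return_sequences=1)
print(out[0]["generated_text"])
```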
Brief Details: A photorealistic Stable Diffusion 1.5 model fine-tuned for high-quality photo generation, optimized for various aspect ratios and compatible with Danbooru-style tags.
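Loading such a checkpoint follows the standard diffusers pattern; a sketch assuming a placeholder repo id and a CUDA device:

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder repo id for the fine-tuned checkpoint.
pipe = StableDiffusionPipeline.from_pretrained(
    "your-org/photoreal-sd15", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "photo of a woman in a cafe, natural light, 35mm",
    height=768, width=512,  # non-square aspect ratios are supported
    num_inference_steps=30,
).images[0]
image.save("photo.png")
```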
Brief Details: Qwen-Audio is an 8.4B parameter multimodal model for audio-to-text tasks. It is multilingual, handles speech, music, and environmental sounds, and achieves state-of-the-art results on multiple benchmarks.
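A hedged sketch of the typical Qwen-Audio chat flow; the repo id is assumed, and `from_list_format`/`chat` are helpers supplied by the model's own remote code, so treat the exact signatures as assumptions:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id; custom architecture requires trust_remote_code.
name = "Qwen/Qwen-Audio-Chat"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    name, device_map="auto", trust_remote_code=True
).eval()

# Interleave an audio file and a text question in one query.
query = tokenizer.from_list_format([
    {"audio": "clip.wav"},
    {"text": "What sound is this?"},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)
```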
Brief Details: CodeLlama-13b-hf is a powerful 13B parameter code generation model by Meta, optimized for programming tasks with support for code completion and infilling.
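A sketch of the infilling workflow, following the pattern documented for CodeLlama in transformers (repo id assumed); the tokenizer expands the `<FILL_ME>` marker into the model's fill-in-the-middle format:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "codellama/CodeLlama-13b-hf"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16, device_map="auto"
)

# <FILL_ME> marks the span the model should fill in.
prompt = 'def remove_non_ascii(s: str) -> str:\n    """ <FILL_ME>\n    return result'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(inputs["input_ids"], max_new_tokens=128)

# Decode only the newly generated tokens, then splice them back in.
filling = tokenizer.batch_decode(
    output[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(prompt.replace("<FILL_ME>", filling))
```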
Brief Details: Cognitive foundation model for human behavior simulation, based on Llama 3.1 (70B). Specializes in psychology and behavioral predictions.
Brief Details: A 3.6B parameter Japanese language model fine-tuned for instruction following, based on the GPT-NeoX architecture with specialized tokenization and conversational capabilities.
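A minimal generation sketch with a placeholder repo id; the slow-tokenizer flag and the conversation template below are assumptions about how such checkpoints are commonly packaged, not this model's documented format:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Placeholder repo id; GPT-NeoX-based Japanese models often ship a
# SentencePiece tokenizer that requires the slow implementation.
name = "your-org/japanese-gpt-neox-3.6b-instruction"
tokenizer = AutoTokenizer.from_pretrained(name, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(name)

prompt = "ユーザー: 日本の首都はどこですか。\nシステム: "  # assumed chat template
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```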
Brief-details: MPT-30B-Instruct is a 30B parameter instruction-tuned LLM optimized for short-form tasks, featuring FlashAttention and ALiBi positional encoding. Apache 2.0 licensed.
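A usage sketch assuming the standard mosaicml Hub repo id; MPT ships custom modeling code, hence `trust_remote_code=True`, and the Alpaca-style prompt shown is an assumption about the expected format:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-30b-instruct"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto"
)

prompt = (
    "Below is an instruction that describes a task.\n"
    "### Instruction:\nList three uses of ALiBi.\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```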
Brief Details: ControlNet tile model for enhancing image details and super-resolution, based on Stable Diffusion v1.5. Enables precise control over image generation through tiling techniques.
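A sketch of the common tile-upscaling flow in diffusers; both repo ids below are assumptions (the widely used SD 1.5 base and tile ControlNet checkpoints):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Assumed repo ids; the tile ControlNet conditions on a low-res copy of the image.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1e_sd15_tile", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# Upscale the conditioning image to the target resolution first.
source = load_image("low_res_input.png").resize((1024, 1024))
image = pipe("best quality, sharp details", image=source, num_inference_steps=30).images[0]
image.save("detailed.png")
```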
Brief-details: Spoken language identification model using the ECAPA-TDNN architecture, supporting 107 languages with 93.3% accuracy on the VoxLingua107 dataset.
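A minimal SpeechBrain sketch, assuming the usual VoxLingua107 ECAPA checkpoint id on the Hub:

```python
from speechbrain.pretrained import EncoderClassifier

# Assumed Hub repo id for the VoxLingua107 ECAPA-TDNN checkpoint.
classifier = EncoderClassifier.from_hparams(
    source="speechbrain/lang-id-voxlingua107-ecapa",
    savedir="pretrained_models/lang-id",
)

# classify_file returns (posteriors, score, index, predicted label).
prediction = classifier.classify_file("sample.wav")
print(prediction[3])  # e.g. ['en: English']
```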
Brief-details: A powerful text-to-video and image-to-video generation model capable of producing high-quality 10-second videos at 768p/24 FPS using Flow Matching and autoregressive generation.
Brief Details: A 1.3B parameter Japanese GPT model trained on diverse datasets, optimized for Japanese text generation; MIT-licensed with FP16 weights.
Brief Details: TFT-ID-1.0 is an 823M parameter model for detecting tables, figures, and text in academic papers with 96.78% accuracy, built on Florence-2.
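A hedged sketch of Florence-2-style inference; the repo id and the `<OD>` task prompt are assumptions, and the processor helpers come from the model's remote code:

```python
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# Assumed repo id; Florence-2-style models ship custom code.
name = "yifeihu/TFT-ID-1.0"
model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(name, trust_remote_code=True)

image = Image.open("paper_page.png")
inputs = processor(text="<OD>", images=image, return_tensors="pt")  # detection task prompt
ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
)
text = processor.batch_decode(ids, skip_special_tokens=False)[0]

# Converts the raw string into boxes/labels for tables, figures, and text blocks.
print(processor.post_process_generation(text, task="<OD>", image_size=image.size))
```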
Brief Details: Efficient 1.05B parameter vision-language model combining the Qwen1.5-0.5B LLM with a SigLIP vision encoder, optimized for edge devices.
Brief-details: Advanced image-to-image adapter model built on the Kolors framework, featuring enhanced CLIP-336 image encoding and improved training data quality for better reference-image preservation.
Brief-details: An anime-style text-to-image SDXL model fine-tuned on Blue Archive aesthetics, built on Animagine XL 3.0 with specialized anime generation features.
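A minimal SDXL sketch with a placeholder repo id; Animagine-derived models generally expect ordered, Danbooru-style tag prompts:

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Placeholder repo id for the Animagine-XL-based checkpoint.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "your-org/blue-archive-xl", torch_dtype=torch.float16
).to("cuda")

# Tag-style prompt; ordering conventions follow the base Animagine model.
prompt = "1girl, halo, school uniform, blue archive, masterpiece, best quality"
image = pipe(prompt, negative_prompt="lowres, bad anatomy", width=832, height=1216).images[0]
image.save("anime.png")
```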
Brief-details: A 7B parameter language model built on Mistral-7B, optimized for coding tasks with enhanced compliance and a 16k context window. Uncensored design, Apache 2.0 licensed.
Brief-details: MagicClothing is an advanced AI model for controllable garment-driven image synthesis, featuring high-resolution support and AnimateDiff integration for GIF generation.
Brief Details: Specialized Stable Diffusion model fine-tuned on Magic: The Gathering card art (~35k images), enabling fantasy card-style art generation with MTG artist styles and themes.
Brief-details: Audio-driven portrait animation model that enables long-duration (1+ hour) and high-resolution (4K) talking-head synthesis from a single image.
Brief-details: InternLM-Chat-7B is a 7B parameter LLM with an 8k context window, trained on a large corpus of high-quality tokens. Strong performance on reasoning and knowledge tasks, with options for commercial use.
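A usage sketch assuming InternLM's published repo id; the `chat()` helper is provided by the model's remote code, so treat the exact signature as an assumption:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# InternLM ships custom modeling code, hence trust_remote_code; repo id assumed.
name = "internlm/internlm-chat-7b"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True).eval()

# chat() returns the response plus the running conversation history.
response, history = model.chat(tokenizer, "Explain ALiBi in one sentence.", history=[])
print(response)
```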
Brief-details: OpenCoderPlus is a StarCoderPlus-based code generation model reporting 102.5% of ChatGPT's performance on its evaluation benchmark, an 8192-token context length, and a high AlpacaEval win rate.