Brief-details: A test implementation of Llama-3-8B fine-tuned on code tasks, achieving 63% pass@1 on HumanEval. Features low-VRAM training using Unsloth + QLoRA + GaLore optimizations.
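As background for a pass@1 figure like the one above, here is a small sketch of the unbiased pass@k estimator commonly used for HumanEval scoring; the counts passed in are illustrative, not this model's actual generation logs.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c pass the
    unit tests, is correct."""
    if n - c < k:  # every possible draw contains a passing sample
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this reduces to the plain pass rate c/n,
# e.g. 63 passing samples out of 100 gives pass@1 ~ 0.63.
print(pass_at_k(100, 63, 1))
```

Benchmarks usually report the mean of this estimator across all tasks in the suite.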
Brief-details: Laserxtral is a 24.2B parameter MoE model combining 4x7B models with laser denoising, offering Mixtral-level performance at half the size.
Brief-details: ALMA-13B-R: Advanced 13B parameter language model fine-tuned with Contrastive Preference Optimization for state-of-the-art machine translation performance.
Brief-details: A fine-tuned Mistral-7B model achieving impressive TruthfulQA benchmark scores, trained on just 100 data points in 3 minutes using the QLoRA technique.
Brief-details: Stable Diffusion-based text-to-image model trained for high-quality generation with simple prompts. Features 1.07B parameters and "estilovintedois" style prefix.
Brief-details: A 1.3B parameter decoder-only transformer pre-trained on the RedPajama dataset and fine-tuned on Databricks Dolly, optimized with FlashAttention and ALiBi.
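The ALiBi position bias mentioned above can be sketched in a few lines: instead of positional embeddings, each attention head adds a fixed linear distance penalty to its causal logits. This is a minimal illustration assuming a power-of-two head count, not this model's actual implementation.

```python
def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head slopes 2^(-8/n), 2^(-16/n), ...; assumes num_heads is a power of two."""
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (i + 1) for i in range(num_heads)]

def alibi_bias(seq_len: int, slope: float) -> list[list[float]]:
    """Causal bias matrix: query i penalises key j by slope * (i - j);
    future positions are masked to -inf."""
    return [[-slope * (i - j) if j <= i else float("-inf")
             for j in range(seq_len)]
            for i in range(seq_len)]
```

The bias is simply added to the attention logits before softmax, which is why ALiBi extrapolates to sequence lengths longer than those seen in training.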
Brief-details: Core ML optimized version of Stable Diffusion v2 for Apple Silicon, offering efficient text-to-image generation with both Swift and Python inference options.
Brief-details: A specialized LoRA model collection focused on Genshin Impact and anime characters, offering high-quality character generation with multiple clothing variations and detailed style control.
Brief-details: An experimental AI model combining detailed backgrounds with anime-style characters through U-Net hierarchical merging. Optimized for dual-style generation.
Brief-details: Anime-style diffusion model trained on Hitokomoru's artwork, featuring 20k training steps on 255 images with specialized aspect ratio bucketing for Japanese-style art generation.
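The aspect ratio bucketing mentioned above can be sketched as follows: each training image is batched at the bucket resolution whose aspect ratio is closest to its own, minimising cropping. The bucket list here is hypothetical; real trainers derive buckets from a target pixel budget.

```python
# Hypothetical (width, height) buckets around a 512px-class pixel budget.
BUCKETS = [(512, 768), (576, 704), (640, 640), (704, 576), (768, 512)]

def nearest_bucket(width: int, height: int,
                   buckets: list[tuple[int, int]] = BUCKETS) -> tuple[int, int]:
    """Pick the bucket whose aspect ratio best matches the source image."""
    ar = width / height
    return min(buckets, key=lambda wh: abs(wh[0] / wh[1] - ar))
```

For example, a 600x900 portrait photo lands in the 512x768 bucket rather than being center-cropped to a square.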
Brief-details: 12B parameter Mistral-based conversational AI model focused on creative text generation with NSFW capabilities. Built on Mistral-Nemo-Instruct-2407.
Brief-details: Japanese GPT-2 medium-sized language model (361M params) trained on CC-100 and Wikipedia, optimized for Japanese text generation and language modeling.
Brief-details: 70B parameter LLM merging Hermes 2 Pro and Llama-3, featuring enhanced function calling, JSON outputs, and ChatML support. Strong benchmark scores.
Brief-details: Advanced text-to-image transformer model capable of generating high-resolution images up to 4K. Features transformer-based latent diffusion and supports multiple image sizes.
Brief-details: PCM_Weights is a specialized LoRA weight package for Stable Diffusion XL, enabling fast text-to-image generation with phased consistency and supporting multiple inference-step settings.
Brief-details: T5-based multilingual translation model (851M params) supporting bidirectional translation between English, Russian, and Chinese with Apache 2.0 license.
Brief-details: InternLM-XComposer2-VL-7B is a vision-language model built on InternLM2, enabling advanced text-image comprehension and generation with PyTorch integration.
Brief-details: An 8-expert Mixture of Experts model based on Gemma, combining 8 separately fine-tuned models with 2 experts active per token for enhanced text generation capabilities.
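The 2-experts-per-token routing described above can be sketched as a softmax over per-expert gate logits, keeping the top two and renormalising their weights. This is a toy illustration in pure Python, not the model's actual gating code.

```python
import math

def top2_route(gate_logits: list[float]) -> list[tuple[int, float]]:
    """Return the indices and renormalised weights of the two experts
    with the highest gate probability for one token."""
    m = max(gate_logits)
    exps = [math.exp(x - m) for x in gate_logits]  # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    top2 = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:2]
    z = probs[top2[0]] + probs[top2[1]]
    return [(i, probs[i] / z) for i in top2]
```

The token's output is then the weighted sum of just those two experts' outputs, so only 2 of the 8 expert FFNs run per token.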
Brief-details: Fast single-view 3D reconstruction model using hybrid Triplane-Gaussian representation, processes images in seconds with transformer architecture. Apache 2.0 licensed.
Brief-details: Dobb-E is a robotics-focused vision model with 21.3M parameters, trained on home environments for robot navigation and interaction. MIT licensed.
Brief-details: A 12.2B parameter multilingual chat model fine-tuned on Mistral-Nemo-Base, optimized for Claude 3-like prose quality with support for 9 languages.