Brief-details: A 2B parameter multilingual GPT model supporting 53 languages, featuring SwiGLU activations and RoPE embeddings, trained on 1.1T tokens using the NeMo framework.
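A minimal PyTorch sketch of the SwiGLU feed-forward block referenced above; the 2048/5440 dimensions are illustrative, not taken from this model's config.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFFN(nn.Module):
    """SwiGLU feed-forward block: down(silu(gate(x)) * up(x))."""
    def __init__(self, d_model: int = 2048, d_ff: int = 5440):
        super().__init__()
        self.gate = nn.Linear(d_model, d_ff, bias=False)  # gating branch
        self.up = nn.Linear(d_model, d_ff, bias=False)    # linear branch
        self.down = nn.Linear(d_ff, d_model, bias=False)  # project back to d_model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))

x = torch.randn(1, 16, 2048)   # (batch, seq, d_model)
print(SwiGLUFFN()(x).shape)    # torch.Size([1, 16, 2048])
```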
Brief-details: Microsoft's GRIN-MoE, a 41.9B parameter MoE model with only 6.6B active parameters. Excels at coding/math tasks and uses novel gradient-informed routing.
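GRIN's actual gradient-informed router (SparseMixer-v2) is not reproduced here; the generic top-2 token-choice MoE layer below only illustrates why a 41.9B-parameter model can activate roughly 6.6B parameters per token. All names and shapes are illustrative.

```python
import torch
import torch.nn as nn

class Top2MoE(nn.Module):
    """Generic top-2 token-choice MoE: only 2 of n_experts run per token."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=16):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                # x: (tokens, d_model)
        weights, idx = self.router(x).softmax(-1).topk(2, dim=-1)
        weights = weights / weights.sum(-1, keepdim=True)  # renormalize over top-2
        out = torch.zeros_like(x)
        for k in range(2):                               # dispatch each token's k-th expert
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out

print(Top2MoE()(torch.randn(8, 512)).shape)  # torch.Size([8, 512])
```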
Brief-details: 13B parameter uncensored language model with multiple GGML quantization options (2-8 bit), optimized for CPU/GPU inference via llama.cpp.
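A typical llama.cpp invocation via the llama-cpp-python bindings; the filename below is a placeholder (llama.cpp has since moved from GGML to GGUF), and the quant level should be picked to fit available RAM/VRAM.

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Placeholder path/quant; choose the 2-8 bit file that fits your hardware.
llm = Llama(model_path="./model-13b-uncensored.q4_K_M.gguf",
            n_ctx=2048,        # context window
            n_gpu_layers=35)   # offload some layers to GPU, 0 for CPU-only
out = llm("Explain quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```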
Brief-details: PairRM: 436M parameter reward model for comparing LLM outputs. Built on DeBERTa, trained on 6 datasets for efficient response ranking and RLHF.
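Usage roughly follows the PairRM model card's llm-blender workflow; treat the exact API surface as an assumption and check the card.

```python
import llm_blender  # pip install llm-blender

blender = llm_blender.Blender()
blender.loadranker("llm-blender/PairRM")  # ~436M DeBERTa-based pairwise ranker

inputs = ["What is the capital of France?"]
candidates = [["Paris is the capital of France.", "I think it might be Lyon."]]
ranks = blender.rank(inputs, candidates)  # lower rank = preferred response
print(ranks)
```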
Brief-details: 8B parameter merged model combining Hermes 2 Pro and Llama-3, featuring ChatML format, function calling, and JSON mode capabilities. Strong benchmark scores, including an MT-Bench average of 8.19.
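ChatML is a plain-text turn format delimited by <|im_start|>/<|im_end|> markers; a minimal example of how a prompt is assembled (the system message is illustrative):

```python
prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "What is 2 + 2?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
# Generation continues from here and stops at the next <|im_end|> token.
```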
Brief-details: Qwen1.5-MoE-A2.7B is an efficient MoE transformer model with 14.3B total parameters but only 2.7B activated at runtime, offering 1.74x faster inference than Qwen1.5-7B.
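Loading should follow standard transformers chat usage (the MoE variant needs a recent transformers release); a sketch under that assumption, with the -Chat repo id assumed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-MoE-A2.7B-Chat"  # assumed chat variant
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Give me a one-line Python hello world."}]
ids = tok.apply_chat_template(messages, add_generation_prompt=True,
                              return_tensors="pt").to(model.device)
out = model.generate(ids, max_new_tokens=64)
print(tok.decode(out[0][ids.shape[-1]:], skip_special_tokens=True))
```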
Brief-details: Powerful 13B parameter chatbot based on LLaMA, fine-tuned on 125K ShareGPT conversations. Strong research focus, non-commercial license.
Brief-details: Protogen v2.2 is an anime-focused Stable Diffusion model fine-tuned on extensive datasets, featuring granular adaptive learning and specialized trigger words for enhanced artistic generation.
Brief-details: A powerful 122B parameter LLM based on Llama-3, optimized for creative writing through innovative self-merge architecture. Features 8K context window and multiple quantized versions.
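A self-merge duplicates and interleaves layer ranges of a single base model; the toy calculation below, with illustrative slice boundaries (not the actual merge recipe), shows how stacking 140 of a 70B Llama-3's 80 decoder layers lands near 122B parameters.

```python
# Passthrough self-merge, conceptually: interleave overlapping layer ranges.
n_layers = 80                                  # Llama-3-70B decoder layers
slices = [(0, 20), (10, 30), (20, 40), (30, 50), (40, 60), (50, 70), (60, 80)]
merged = [l for lo, hi in slices for l in range(n_layers)[lo:hi]]
print(len(merged))                             # 140 layers in the merged stack
print(70e9 * len(merged) / n_layers / 1e9)     # ~122.5 (billion params, rough)
```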
Brief-details: Mini-omni2 is an advanced omni-interactive AI model capable of real-time voice conversations, multimodal understanding (image, audio, text), and seamless speech-to-speech interaction.
Brief-details: Advanced Mixtral-based language model optimized for coding and general tasks, featuring a 16k context window and uncensored capabilities with ChatML format support.
Brief-details: 4-bit quantized version of Falcon-40B-Instruct, optimized for GPU inference. Requires 35GB+ VRAM and delivers advanced language capabilities via an experimental GPTQ implementation.
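Loading a prequantized GPTQ checkpoint is typically a one-liner through transformers' GPTQ integration (with optimum and an AutoGPTQ-compatible backend installed); the repo id below is an assumption.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id is illustrative; GPTQ weights load through transformers' integration
# (pip install optimum auto-gptq). Expect 35GB+ of VRAM for a 40B model.
model_id = "TheBloke/falcon-40b-instruct-GPTQ"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

ids = tok("Write a haiku about GPUs.", return_tensors="pt").input_ids.to(model.device)
print(tok.decode(model.generate(ids, max_new_tokens=48)[0], skip_special_tokens=True))
```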
Brief-details: RWKV's Eagle-7B: a 7.52B parameter linear transformer trained on 1.1T tokens, featuring multilingual capabilities and efficient inference. Apache 2.0 licensed.
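RWKV replaces attention with a recurrent weighted-key-value state, which is what keeps inference cost linear in sequence length. A heavily simplified single-head sketch of the RWKV-4-style WKV recurrence (per-channel decay w and current-token bonus u; the real implementation adds numerical-stability tricks omitted here):

```python
import torch

def wkv_recurrent(k, v, w, u):
    """Simplified WKV: O(T) recurrence over a (numerator, denominator) state.
    k, v: (T, C) keys/values; w: (C,) decay parameter; u: (C,) current-token bonus."""
    T, C = k.shape
    num = torch.zeros(C)
    den = torch.zeros(C)
    out = []
    for t in range(T):
        e_k = torch.exp(k[t])
        bonus = torch.exp(u + k[t])                 # extra weight for the current token
        out.append((num + bonus * v[t]) / (den + bonus))
        decay = torch.exp(-torch.exp(w))            # per-channel decay of past state
        num = decay * num + e_k * v[t]
        den = decay * den + e_k
    return torch.stack(out)

print(wkv_recurrent(torch.randn(5, 8), torch.randn(5, 8),
                    torch.randn(8), torch.randn(8)).shape)  # torch.Size([5, 8])
```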
Brief-details: First open-source foundation model for time series forecasting. Features 2.45M parameters and supports zero-shot forecasting and fine-tuning on data of any frequency.
Brief-details: A 6.7B parameter code generation model trained using OSS-Instruct methodology, focusing on high-quality code generation with minimal bias.
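Generation follows standard transformers usage; the @@ Instruction / @@ Response prompt layout mirrors the OSS-Instruct (Magicoder) model cards, and both the template and the repo id should be treated as assumptions.

```python
from transformers import pipeline

# Repo id and prompt template are assumptions taken from the Magicoder cards.
generator = pipeline("text-generation", model="ise-uiuc/Magicoder-S-DS-6.7B",
                     device_map="auto")
prompt = (
    "@@ Instruction\n"
    "Write a Python function that checks whether a string is a palindrome.\n\n"
    "@@ Response\n"
)
print(generator(prompt, max_new_tokens=128)[0]["generated_text"])
```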
Brief-details: A 2.5B parameter multilingual streaming translation model supporting ASR in 96 languages and translation across 101 languages, with real-time text and speech output.
Brief-details: Hertz-dev: 8.5B parameter transformer model for full-duplex conversational audio, trained on 20M hours of data. Achieves 120ms latency on an RTX 4090.
Brief-details: An artistic LoRA model trained on "how to draw" style tutorials, using FLUX.1-dev as base model. Creates step-by-step drawing-guide aesthetics.
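Typical diffusers usage for applying a LoRA on top of FLUX.1-dev; the LoRA repo id and trigger phrasing are placeholders to be replaced from the model card.

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev",
                                    torch_dtype=torch.bfloat16).to("cuda")
pipe.load_lora_weights("your-username/how-to-draw-lora")  # placeholder repo id

# Trigger phrasing is hypothetical; check the LoRA card for its actual trigger words.
image = pipe("how to draw a cat, step by step tutorial sheet",
             num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("tutorial.png")
```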
Brief-details: A multilingual text-to-speech model supporting English, Chinese, and Japanese, trained on 300k hours of audio data under a non-commercial license.
Brief-details: A specialized sci-fi/anime-focused image generation model built on Stable Diffusion v1-5, featuring granular adaptive learning and optimized for artistic outputs.
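Any Stable Diffusion v1-5 derivative loads through the standard diffusers pipeline; the repo id and prompt below are placeholders.

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder repo id; any SD v1-5 fine-tune loads the same way.
pipe = StableDiffusionPipeline.from_pretrained("someuser/scifi-anime-sd15",
                                               torch_dtype=torch.float16).to("cuda")
image = pipe("sci-fi anime cityscape at dusk, highly detailed",
             num_inference_steps=30).images[0]
image.save("scifi.png")
```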
Brief-details: A powerful 14B parameter multilingual LLM from Alibaba Cloud with strong performance across knowledge, reasoning, and coding tasks. Supports efficient tokenization and extended context lengths.