Brief Details: State-of-the-art code embedding model (7B params) supporting 9 programming languages, with a 32k context window and 3584-dim embeddings. Optimized for code retrieval and RAG.
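To make the retrieval use case concrete, here is a minimal sketch assuming the checkpoint loads through sentence-transformers; the repo ID is a placeholder, not the model's actual name:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("your-org/code-embed-7b", trust_remote_code=True)  # placeholder ID

query = "function that parses a CSV file into a list of dicts"
snippets = [
    "def load_csv(path):\n    import csv\n    with open(path) as f:\n        return list(csv.DictReader(f))",
    "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)",
]

q_emb = model.encode(query, normalize_embeddings=True)     # one 3584-dim vector
s_emb = model.encode(snippets, normalize_embeddings=True)  # one row per snippet
print(util.cos_sim(q_emb, s_emb))  # the CSV parser should score highest
```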
Brief Details: A 16B-parameter MoE model with 3B active parameters, trained on 5.7T tokens using the Muon optimizer. Excels at multilingual tasks and achieves SOTA performance with fewer training FLOPs, at roughly 2x the sample efficiency of Adam.
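For context on the Muon step itself: it replaces Adam's elementwise update with an approximately orthogonalized momentum matrix, computed with a few Newton-Schulz iterations. A minimal PyTorch sketch under that assumption, using the quintic coefficients from the public Muon reference implementation (learning rate and beta below are illustrative):

```python
import torch

def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5, eps: float = 1e-7) -> torch.Tensor:
    """Approximately orthogonalize a 2-D momentum matrix via a quintic
    Newton-Schulz iteration (coefficients from the public Muon reference)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    x = g / (g.norm() + eps)              # normalize so the iteration converges
    transposed = x.size(0) > x.size(1)
    if transposed:                        # iterate on the wide orientation
        x = x.T
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * s @ s) @ x
    return x.T if transposed else x

def muon_step(param, grad, momentum, lr=0.02, beta=0.95):
    """One illustrative Muon-style update: momentum, orthogonalize, apply."""
    momentum.mul_(beta).add_(grad)
    param.add_(newton_schulz_orthogonalize(momentum), alpha=-lr)
```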
Brief Details: EQ-SDXL-VAE is an advanced VAE model that enhances SDXL's latent space through equivariance regularization, improving image reconstruction quality and semantic preservation.
Brief Details: A 70B-parameter model built on Llama 3.3, specifically designed for adventure gaming with darker themes. Trained on text adventures and roleplay data to enable challenging, dangerous narratives.
Brief Details: PatientSeek is an AI model by whyhow-ai focused on healthcare applications, released with strict ethical guidelines that prohibit harmful human experimentation.
Brief Details: Janus-Pro-1B is a unified multimodal model that combines understanding and image-generation capabilities, built on DeepSeek-LLM with a SigLIP-L vision encoder.
Brief Details: An uncensored variant of DeepSeek-R1-Distill-Qwen-32B created through abliteration, focused on removing refusal behaviors while maintaining core capabilities.
Brief Details: DeepSeek-R1-AWQ is an AWQ-quantized version of DeepSeek-R1 optimized for efficient inference, running on 8 GPUs at 38-48 tokens/s with full context-length support.
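A hedged serving sketch with vLLM matching the 8-GPU setup described above; the repo ID is a placeholder, and AWQ settings are typically auto-detected from the checkpoint config:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/DeepSeek-R1-AWQ",  # placeholder repo ID
    tensor_parallel_size=8,            # one shard per GPU, per the setup above
)
outputs = llm.generate(["Explain AWQ quantization in one sentence."],
                       SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```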
Brief Details: A highly optimized GGUF quantization of DeepSeek's 14B-parameter model, offering compression options from 4.7GB to 59GB with different quality/size tradeoffs. Notable for ARM/AVX optimization.
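As a usage sketch, one of the mid-size quantized files could be loaded with llama-cpp-python; the file name below is illustrative, and the right quant depends on your RAM/quality target:

```python
from llama_cpp import Llama

# Illustrative file name; substitute whichever quant you downloaded.
llm = Llama(model_path="DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf", n_ctx=4096)
result = llm("Q: What is 2 + 2?\nA:", max_tokens=16)
print(result["choices"][0]["text"])
```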
Brief Details: Absynth-2.0 is an enhanced SD 3.5 model featuring hyper-detailed image generation through novel negative LoRA training and inverse fine-tuning techniques.
Brief Details: Flex.1-alpha is an 8B-parameter rectified-flow transformer for text-to-image generation, featuring a guidance embedder, true CFG capability, and an Apache 2.0 license.
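Since Flex.1-alpha is Flux-derived, loading it through diffusers' FluxPipeline is a plausible sketch; the repo ID, dtype, and sampler settings are assumptions rather than values from the card:

```python
import torch
from diffusers import FluxPipeline

# Assumed repo ID and pipeline class; verify against the actual model card.
pipe = FluxPipeline.from_pretrained("ostris/Flex.1-alpha", torch_dtype=torch.bfloat16)
pipe.to("cuda")
image = pipe("a lighthouse at dusk, oil painting",
             guidance_scale=3.5, num_inference_steps=28).images[0]
image.save("lighthouse.png")
```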
Brief Details: DeepSeek-V3-Base is a 671B-parameter MoE model with 37B active parameters, featuring FP8 training, a 128K context window, and state-of-the-art performance across various benchmarks.
Brief Details: DeepSeek-VL2 is an advanced MoE vision-language model with 4.5B active parameters, offering state-of-the-art performance in visual QA, OCR, and document understanding.
Brief Details: 278M-parameter multilingual embedding model supporting 12 languages. Generates 768-dim vectors for text similarity and retrieval. Apache 2.0 licensed.
Brief Details: GemmaX2-28-2B-v0.1 is a 2B-parameter multilingual translation model supporting 28 languages, built on Gemma2-2B with continued pretraining and translation fine-tuning.
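A hedged generation sketch; the org prefix in the repo ID and the exact prompt template are assumptions and should be checked against the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ModelSpace/GemmaX2-28-2B-v0.1"  # assumed org prefix
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = ("Translate this from English to German:\n"
          "English: The weather is nice today.\nGerman:")
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=40)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```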
Brief Details: Russian-language assistant model (8B params) fine-tuned from YandexGPT, specialized in dialogue and task assistance and using the Llama-3 prompt format.
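Since the entry calls out the Llama-3 prompt format, the tokenizer's built-in chat template is the quickest way to confirm it; the repo ID below is a placeholder:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("your-org/russian-assistant-8b")  # placeholder
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Suggest a weekend itinerary for Kazan."},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # should show <|start_header_id|>/<|end_header_id|> Llama-3 markers
```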
Brief Details: Fast3R is a groundbreaking 3D reconstruction model capable of processing 1000+ images in a single forward pass, built on a ViT-Large architecture at 512-pixel resolution.
Brief Details: Japanese-language instruction-tuned 0.5B-parameter model with strong performance on Japanese/English tasks, outperforming similarly sized models on benchmarks.
Brief Details: Lightweight 256M-parameter multimodal model for video/image analysis. Efficient (runs in about 1.38GB of GPU RAM) with strong performance on video-understanding tasks.
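A hedged inference sketch following current transformers conventions for small vision-language models; the repo ID, class names, and image URL are all assumptions:

```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

repo = "your-org/smol-vlm-256m"  # placeholder repo ID
processor = AutoProcessor.from_pretrained(repo)
model = AutoModelForImageTextToText.from_pretrained(repo, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": [
    {"type": "image", "url": "https://example.com/frame.jpg"},  # placeholder image
    {"type": "text", "text": "Describe what is happening."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
)
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```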
Brief Details: 14B-parameter uncensored GGUF model with multiple quantization options (Q2-Q8). Offers efficient deployment at sizes from 5.9GB to 15.8GB.