Brief-details: Advanced multimodal LLM series with models ranging from 1B to 78B parameters, combining vision and language capabilities with state-of-the-art pre-training strategies
Brief-details: A sophisticated 8B-parameter Llama-3.1-based merged model combining 14 specialized models, optimized for diverse tasks using the model stock merge method.
Brief Details: Sentence transformer model based on facebook/drama-base with 768-dim output, optimized for semantic similarity and search tasks; fine-tuned on the STS-B dataset.
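A minimal usage sketch for the entry above with the sentence-transformers library; the repo ID below is a placeholder, since the entry does not name the exact fine-tuned checkpoint:

```python
from sentence_transformers import SentenceTransformer, util

# Placeholder repo ID -- substitute the actual drama-base STS-B fine-tune.
model = SentenceTransformer("your-namespace/drama-base-stsb")

sentences = [
    "A man is playing a guitar.",
    "Someone is strumming an instrument.",
]
embeddings = model.encode(sentences)               # shape: (2, 768)
print(util.cos_sim(embeddings[0], embeddings[1]))  # cosine similarity score
```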
BRIEF-DETAILS: 8B-parameter LLaMA-based model with various GGUF quantization options (2.1GB-6.7GB), balancing efficiency and output quality
Brief-details: A quantized version of Babel-9B-Chat offering multiple GGUF variants for efficient deployment, with file sizes ranging from 3.6GB to 18.1GB and recommended options for balanced performance.
BRIEF-DETAILS: 8B parameter GGUF-quantized LLaMA model with multiple compression variants optimized for efficiency (3.3GB-16.2GB), featuring Q2 to Q8 quantization options
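As a rough sketch of how GGUF quants like the ones above are typically run locally with llama-cpp-python (the file name and settings are placeholders, assuming a downloaded Q4_K_M variant):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./model-Q4_K_M.gguf",  # placeholder path to the downloaded quant file
    n_ctx=4096,                        # context window size
    n_gpu_layers=-1,                   # offload all layers to the GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF quantization in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```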
Brief-details: A LoRA adapter for Llama-3.1-70B-Instruct, trained on specific datasets across 8xA100 GPUs with FSDP, using BF16 mixed precision and a 4e-4 learning rate.
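A minimal sketch of attaching such an adapter to its base model with PEFT; the adapter repo ID is a placeholder, since the entry does not name it:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-70B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Load the LoRA weights on top of the frozen base model (placeholder adapter ID).
model = PeftModel.from_pretrained(base, "your-namespace/llama-3.1-70b-instruct-lora")
```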
BRIEF DETAILS: Salesforce's research-focused mixture-of-experts (MoE) model with a small parameter footprint, designed for academic purposes and released with ethical usage guidelines.
Brief-details: llama.cpp imatrix-quantized versions of the DeepScaleR-1.5B model at various compression levels (0.77GB-7.11GB), suited to different hardware configurations
Brief Details: T5-small model fine-tuned on the CommonGen dataset to generate coherent sentences from sets of concepts, optimized for natural language generation tasks.
Brief-details: Meta's 8B-parameter Llama 3.1 model optimized with Unsloth's 4-bit quantization for efficient inference. Supports multilingual text generation and tool use with a 128K-token context window.
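A minimal loading sketch with Unsloth's FastLanguageModel; the repo ID follows Unsloth's usual naming for its 4-bit builds and should be verified on the Hub:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",  # assumed repo ID
    max_seq_length=8192,   # can be raised toward the 128K context limit
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path
```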
BRIEF-DETAILS: Korean-optimized sentence embedding model based on E5-large, offering 1024-dimensional vectors for semantic analysis with strong performance on Korean language tasks
BRIEF-DETAILS: Polish-to-Spanish neural machine translation model based on the transformer architecture, achieving a BLEU score of 46.9 and chrF of 0.654.
Brief Details: A GGUF-quantized version of TinyR1-32B offering various compression levels from 12.4GB to 34.9GB, with the Q4_K variants recommended for the best balance of size and quality.
Brief Details: Emotion classification model built on DistilBERT, fine-tuned for 32 emotion classes from the EmpatheticDialogues dataset, extending earlier work on the GoEmotions dataset.
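A minimal sketch of running such a classifier through the transformers pipeline API; the repo ID is a placeholder for the fine-tuned checkpoint:

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="your-namespace/distilbert-empathetic-emotions",  # placeholder repo ID
    top_k=3,  # return the three highest-scoring of the 32 emotion labels
)
print(classifier("I can't believe I finally got the job!"))
```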
Brief-details: Helsinki-NLP's English-to-Afrikaans neural MT model achieving a BLEU score of 56.1, based on the transformer-align architecture and trained on the OPUS dataset.
Brief Details: YuE-s2-1B-general is a 1B parameter music generation model capable of transforming lyrics into complete songs with vocals and accompaniment, supporting multiple languages and genres.
Brief-details: Polish-to-French neural machine translation model by Helsinki-NLP, achieving a 49.0 BLEU score on the Tatoeba test set with a transformer architecture.
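A minimal translation sketch for the entry above via the transformers pipeline; the repo ID is assumed from Helsinki-NLP's opus-mt naming convention and should be verified on the Hub:

```python
from transformers import pipeline

# Assumed repo ID based on the Helsinki-NLP opus-mt naming scheme.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-pl-fr")
print(translator("Dziękuję za pomoc.")[0]["translation_text"])
```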
Brief-details: Vicuna-7B is an open-source LLaMA-based chatbot trained on 70K ShareGPT conversations, released for research purposes without ethical filtering.
BRIEF-DETAILS: Lora by naonovn: A specialized LoRA adapter designed to extend Stable Diffusion, associated with the ChilloutMix ecosystem.
Brief Details: AIChan_Model is a freely shared AI model by GIMG, hosted on Hugging Face. Documentation is limited, but the model is accessible for community use.