Vikhr-YandexGPT-5-Lite-8B-it

Property	Value
Base Model	YandexGPT-5-Lite-8B-pretrain
Parameter Count	8 Billion
Training Data	GrandMaster-PRO-MAX, Grounded-RAG-RU-v2
Language Support	Russian and English (Bilingual)
Paper	Vikhr: Advancing Open-Source Bilingual Instruction-Following Large Language Models

What is Vikhr-YandexGPT-5-Lite-8B-it?

Vikhr-YandexGPT-5-Lite-8B-it is an instruction-tuned language model designed specifically for bilingual Russian and English language processing. Built upon YandexGPT-5-Lite-8B-pretrain, this model has been fine-tuned using Supervised Fine-Tuning (SFT) on specialized datasets including GrandMaster-PRO-MAX and Grounded-RAG-RU-v2.

Implementation Details

The model underwent extensive training using two key datasets: a 150k instruction dataset (GrandMaster-PRO-MAX) featuring built-in Chain-of-Thought (CoT) reasoning, and a 50k dialogue dataset (Grounded-RAG-RU-v2) specifically designed for RAG capabilities. The model supports various quantization options including GGUF, MLX, 4-bit, and 8-bit variants.

Specialized RAG implementation with document-based context handling
Temperature-sensitive performance (optimal at 0.1-0.5)
Support for HTML, Markdown, and Plain Text document formats
Maximum context length handling up to 4k symbols per document

Core Capabilities

Bilingual instruction following in Russian and English
Advanced RAG (Retrieval-Augmented Generation) capabilities
Chain-of-Thought reasoning
Document grounding and contextual response generation
Various deployment options through different quantization methods

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its specialized bilingual capabilities combined with advanced RAG implementation and Chain-of-Thought reasoning, particularly optimized for Russian and English language processing.

Q: What are the recommended use cases?

The model is particularly well-suited for document-grounded question answering, bilingual instruction following, and applications requiring context-aware responses. It's optimized for deployment scenarios requiring both Russian and English language processing capabilities.