YandexGPT-5-Lite-8B-pretrain

Maintained By: yandex

  • Parameter Count: 8 Billion
  • Context Length: 32,000 tokens
  • Architecture: LLaMA-like
  • Model URL: https://huggingface.co/yandex/YandexGPT-5-Lite-8B-pretrain

What is YandexGPT-5-Lite-8B-pretrain?

YandexGPT-5-Lite-8B-pretrain is a powerful language model developed by Yandex, featuring 8 billion parameters and a 32k-token context length. The model underwent a two-phase training process: an initial phase on 15T tokens of predominantly Russian and English text, followed by a specialized "Powerup" phase on 320B tokens of high-quality data.

Implementation Details

The model features a LLaMA-like architecture and was trained in two distinct phases. The first phase focused on general pretraining with a diverse dataset comprising 60% web pages, 15% code, and 10% mathematics, with the remainder drawn from other domain-specific data. The second "Powerup" phase used a carefully curated dataset including 25% web pages, 19% mathematics, 18% code, and 18% educational content.

  • Optimized tokenizer for Russian language processing
  • 32k-token context length (roughly equivalent to 48k tokens under the Qwen-2.5 tokenizer, owing to the more efficient Russian tokenization)
  • Compatible with major fine-tuning frameworks
  • Supports both HF Transformers and vLLM implementations (see the sketches below)
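
A minimal loading-and-generation sketch with HF Transformers, assuming the standard AutoModelForCausalLM interface; the bf16 dtype, device mapping, and generation settings are illustrative choices, not values prescribed by the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "yandex/YandexGPT-5-Lite-8B-pretrain"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits your hardware; adjust as needed
    device_map="auto",
)

# This is a base (pretrain) checkpoint, so we use plain text continuation,
# not a chat template.
prompt = "Машинное обучение — это"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```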

Core Capabilities

  • Advanced Russian and English language processing
  • Extended context understanding (32k tokens)
  • Code generation and analysis
  • Mathematical computation and reasoning
  • Educational content processing
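
Building on the vLLM support noted above, here is a minimal offline-inference sketch; the context-length cap and sampling parameters are assumptions for illustration, not values from the model card:

```python
from vllm import LLM, SamplingParams

# Assumption: a single GPU with enough memory for the full 32k context;
# lower max_model_len if memory is tight.
llm = LLM(model="yandex/YandexGPT-5-Lite-8B-pretrain", max_model_len=32000)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)

prompts = ["Объясни, что такое градиентный спуск:"]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```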

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive features include its optimized Russian language processing, extensive context length, and specialized two-phase training approach. Its tokenizer efficiency on Russian text makes it particularly valuable for Russian-language applications.
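
As a rough way to see the tokenizer-efficiency claim in practice, the sketch below tokenizes the same Russian sentence with this model's tokenizer and with Qwen2.5's; the choice of Qwen/Qwen2.5-7B as the comparison point is an assumption based on the equivalence cited above, and exact counts vary with the text:

```python
from transformers import AutoTokenizer

text = "Яндекс разработал языковую модель с оптимизированным токенизатором для русского языка."

yandex_tok = AutoTokenizer.from_pretrained("yandex/YandexGPT-5-Lite-8B-pretrain")
qwen_tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")  # assumed comparison model

# Fewer tokens for the same text means more text fits in the 32k context window.
print("YandexGPT tokens:", len(yandex_tok(text)["input_ids"]))
print("Qwen2.5 tokens:  ", len(qwen_tok(text)["input_ids"]))
```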

Q: What are the recommended use cases?

The model is well-suited for various applications including code generation, mathematical analysis, educational content processing, and general language understanding tasks. It's particularly effective for applications requiring extended context understanding and Russian language processing.
