Yi-6B-200K

Maintained By
01-ai

  • Parameter Count: 6.06B parameters
  • Model Type: Text Generation
  • Architecture: Transformer (Llama-based)
  • Context Window: 200K tokens
  • Training Data: 3T tokens
  • License: Apache 2.0
  • Paper: Yi Tech Report

What is Yi-6B-200K?

Yi-6B-200K is part of the Yi series of open-source large language models developed by 01.AI. It's a bilingual (English/Chinese) base model that features an impressive 200K context window while maintaining the efficient 6B parameter architecture. The model is built on the Llama architecture but trained from scratch on 3T tokens of multilingual data.

Implementation Details

The model uses BF16 tensor type and implements the Transformer architecture with several optimizations. It's designed for both research and production environments, requiring approximately 15GB of VRAM for base operation. The extended 200K context window (roughly equivalent to 400,000 Chinese characters) makes it particularly suitable for long-form content processing.

  • Built on the Llama architecture but trained independently from scratch
  • Optimized for bilingual (English/Chinese) performance
  • Handles context sequences up to 200K tokens
  • Uses BF16 precision to balance memory footprint and throughput
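The ~15GB VRAM figure is easy to sanity-check from the parameter count: in BF16 each parameter occupies 2 bytes, so the weights alone account for roughly 12GB, with the remainder going to activations and the KV cache. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope weight-memory estimate for a 6.06B-parameter
# model stored in BF16 (2 bytes per parameter).
params = 6.06e9
bytes_per_param = 2  # BF16
weight_gb = params * bytes_per_param / 1e9
print(f"weights alone: {weight_gb:.2f} GB")  # ≈ 12.12 GB
```

The gap between ~12GB of weights and the stated ~15GB working requirement is the usual overhead for activations and cached attention state.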

Core Capabilities

  • Long-form text generation and processing
  • Bilingual understanding and generation
  • Advanced common-sense reasoning
  • Robust reading comprehension
  • Efficient handling of extended context windows
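These capabilities can be exercised through the Hugging Face `transformers` API. The sketch below is illustrative only: the Hub id `01-ai/Yi-6B-200K` and the generation settings are assumptions based on this card, not instructions from it.

```python
# Illustrative loader for Yi-6B-200K via Hugging Face transformers.
# The Hub id and settings are assumptions; adjust to your environment.
def load_yi_6b_200k(model_id="01-ai/Yi-6B-200K"):
    """Load tokenizer and model in BF16, sharded across available GPUs."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the card's BF16 tensor type
        device_map="auto",           # requires accelerate; places layers on GPUs
    )
    return tokenizer, model


def generate_text(tokenizer, model, prompt, max_new_tokens=256):
    """Generate a continuation for a (potentially very long) prompt."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

For long-document work, the same `generate_text` call applies; the only difference is that the prompt may run to tens of thousands of tokens before hitting the 200K limit.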

Frequently Asked Questions

Q: What makes this model unique?

The combination of a relatively small parameter count (6B) with an extensive 200K context window makes this model uniquely efficient for long-form content processing. It provides an excellent balance between computational requirements and performance capabilities.
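At long context lengths, the dominant memory cost shifts from the weights to the attention KV cache. As a rough sketch, assuming a grouped-query-attention layout of 32 layers, 4 KV heads, and a head dimension of 128 (assumed values for illustration, not taken from this card), a full 200K-token cache in BF16 would be on the order of 13GB:

```python
# Rough KV-cache size at the full 200K context, in BF16.
# The layer/head figures are ASSUMED for illustration, not from the card.
n_layers = 32
n_kv_heads = 4        # grouped-query attention (assumed)
head_dim = 128
seq_len = 200_000
bytes_per_value = 2   # BF16

# Factor of 2 covers both keys and values.
kv_bytes = 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value
print(f"KV cache at 200K tokens: {kv_bytes / 1e9:.1f} GB")  # ≈ 13.1 GB
```

This is why a small parameter count matters so much here: with only ~12GB of weights, there is headroom left on a single large GPU for the cache that a long context demands.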

Q: What are the recommended use cases?

The model is well-suited for personal and academic use, particularly in scenarios requiring processing of long documents, bilingual content generation, and research applications. It's especially effective for tasks requiring extended context understanding.
