Llama 3 Youko 8B
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| Base Model | meta-llama/Meta-Llama-3-8B |
| License | Meta Llama 3 Community License |
| Paper | Research Paper |
| Supported Languages | Japanese, English |
What is llama-3-youko-8b?
Llama 3 Youko 8B is a specialized bilingual language model developed by rinna, built upon Meta's Llama 3 architecture. Named after the youko (妖狐), a mythical fox spirit from Japanese folklore, the model has undergone continual pre-training on 22B tokens drawn from diverse Japanese and English datasets, significantly strengthening its Japanese language capabilities while maintaining English proficiency.
Implementation Details
The model is a 32-layer transformer with a hidden size of 4096, implemented using the EleutherAI/gpt-neox framework. Its weights are released in BF16 precision, and it remains fully compatible with the original Llama 3 tokenizer (a minimal loading sketch follows the list below).
- Trained on multiple high-quality datasets, including the Japanese portions of CC-100, C4, and OSCAR, as well as The Pile and Wikipedia
- Retains the full Llama 3 architecture; the improved Japanese performance comes from continual pre-training rather than architectural changes
- Suitable for direct text generation and as a base for downstream fine-tuning
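A minimal loading sketch with the Hugging Face transformers library, assuming the repository id rinna/llama-3-youko-8b as published on the Hub; the BF16 dtype matches the release precision noted above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rinna/llama-3-youko-8b"  # repository id on the Hugging Face Hub

# The model reuses the original Llama 3 tokenizer, so no extra
# tokenizer configuration is needed here.
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the weights in BF16, matching the release precision;
# device_map="auto" (requires the accelerate package) places the
# layers across available GPUs/CPU automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```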
Core Capabilities
- Bilingual text generation in Japanese and English (see the generation sketch after this list)
- Enhanced performance on Japanese language tasks
- Seamless integration with Hugging Face transformers library
- Available in both standard and GPTQ quantized versions
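To illustrate generation, a short sketch continuing from the loading example above; the Japanese prompt and sampling parameters are illustrative choices, not settings recommended by the model card:

```python
prompt = "西田幾多郎は、"  # "Kitaro Nishida was ..." — an illustrative Japanese prompt

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling settings below are illustrative, not tuned recommendations.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The GPTQ variant loads through the same from_pretrained call once the optimum and auto-gptq packages are installed; check the Hub for the exact repository id of the quantized release.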
Frequently Asked Questions
Q: What makes this model unique?
Its distinctive feature is specialized continual pre-training on Japanese content while preserving the English capabilities of the base Llama 3 model. Training on an additional 22B tokens of Japanese and English text makes it particularly effective for Japanese language tasks.
Q: What are the recommended use cases?
This model is well suited to bilingual applications that process Japanese and English, including text generation, content creation, and language understanding. Note that it is a base model without instruction tuning, so it works best for completion-style generation or as a starting point for task-specific fine-tuning, particularly where strong Japanese language capabilities are needed.