Llama 3 Youko 8B
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| Base Model | meta-llama/Meta-Llama-3-8B |
| License | Meta Llama 3 Community License |
| Paper | Research Paper |
| Supported Languages | Japanese, English |
What is llama-3-youko-8b?
Llama 3 Youko 8B is a specialized bilingual language model developed by rinna, built upon Meta's Llama 3 architecture. Named after the youko (妖狐), a mythical fox spirit from Japanese folklore, the model has undergone continual pre-training on 22B tokens drawn from diverse Japanese and English datasets, significantly strengthening its Japanese language capabilities while maintaining English proficiency.
Implementation Details
The model is a 32-layer transformer with a hidden size of 4096, implemented using the EleutherAI/gpt-neox framework. Its weights are released in BF16 precision, and it remains fully compatible with the original Llama 3 tokenizer (a minimal loading sketch follows the list below).
- Trained on multiple high-quality datasets, including the Japanese portions of CC-100, C4, and OSCAR, as well as The Pile and Wikipedia
- Retains the full Llama 3 architecture; the improved Japanese performance comes from continual pre-training rather than architectural changes
- Suitable for direct text generation and as a base for downstream fine-tuning
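A minimal loading sketch with the Hugging Face transformers library, assuming the repository id rinna/llama-3-youko-8b as published on the Hub; the BF16 dtype matches the release precision noted above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rinna/llama-3-youko-8b"  # repository id on the Hugging Face Hub

# The model reuses the original Llama 3 tokenizer, so no extra
# tokenizer configuration is needed here.
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the weights in BF16, matching the release precision;
# device_map="auto" (requires the accelerate package) places the
# layers across available GPUs/CPU automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```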
Core Capabilities
- Bilingual text generation in Japanese and English (see the generation sketch after this list)
- Enhanced performance on Japanese language tasks
- Seamless integration with Hugging Face transformers library
- Available in both standard and GPTQ quantized versions
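To illustrate generation, a short sketch continuing from the loading example above; the Japanese prompt and sampling parameters are illustrative choices, not settings recommended by the model card:

```python
prompt = "西田幾多郎は、"  # "Kitaro Nishida was ..." — an illustrative Japanese prompt

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling settings below are illustrative, not tuned recommendations.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The GPTQ variant loads through the same from_pretrained call once the optimum and auto-gptq packages are installed; check the Hub for the exact repository id of the quantized release.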
Frequently Asked Questions
Q: What makes this model unique?
Its distinctive feature is specialized continual pre-training on Japanese content while preserving the English capabilities of the base Llama 3 model. Training on an additional 22B tokens of Japanese and English text makes it particularly effective for Japanese language tasks.
Q: What are the recommended use cases?
This model is well suited to bilingual applications that process Japanese and English, including text generation, content creation, and language understanding. Note that it is a base model without instruction tuning, so it works best for completion-style generation or as a starting point for task-specific fine-tuning, particularly where strong Japanese language capabilities are needed.