# Ko-Llama3-Luxia-8B
| Property | Value |
|---|---|
| Parameter Count | 8.17B |
| Model Type | Language Model |
| Architecture | Llama-3 |
| License | Llama3 |
| Training Precision | BF16 |
| Context Length | 8K tokens |
## What is Ko-Llama3-Luxia-8B?
Ko-Llama3-Luxia-8B is a specialized Korean language model developed by Saltlux AI Labs, based on Meta's Llama-3 architecture. This model represents a significant advancement in Korean language processing, featuring an expanded vocabulary with 17,536 additional Korean tokens and extensive training on over 100GB of carefully curated Korean text data.
## Implementation Details
The model was trained on 8 NVIDIA H100 80GB GPUs using grouped-query attention (GQA), a learning rate of 1e-5, and a batch size of 128. The training data spans diverse domains, including news, legal documents, patents, medical texts, historical content, and both formal and conversational Korean.
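The reported setup can be summarized as a plain configuration dictionary. This is a hypothetical sketch: the field names and the upstream checkpoint id are assumptions, since the actual training scripts are not published; only the values come from the text above.

```python
# Hypothetical summary of the reported training setup. Field names and the
# upstream checkpoint id are assumptions; the values are taken from the card.
train_config = {
    "base_model": "meta-llama/Meta-Llama-3-8B",  # assumed upstream checkpoint
    "hardware": "8x NVIDIA H100 80GB",
    "attention": "grouped-query attention (GQA)",
    "learning_rate": 1e-5,
    "batch_size": 128,
    "precision": "bf16",
    "context_length": 8192,  # 8K tokens
}

for key, value in train_config.items():
    print(f"{key}: {value}")
```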
- Extended vocabulary size of 145,792 tokens (original Llama-3: 128,256)
- Specialized Korean tokenization capabilities
- 8K token context window
- Trained with BF16 precision
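The vocabulary figures above are internally consistent: the extended size equals the original Llama-3 vocabulary plus the added Korean tokens, as this quick check shows.

```python
# Cross-check of the vocabulary numbers stated above.
llama3_vocab_size = 128_256   # original Llama-3 tokenizer
added_korean_tokens = 17_536  # Korean tokens added by Saltlux
extended_vocab_size = llama3_vocab_size + added_korean_tokens
print(extended_vocab_size)  # 145792
```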
## Core Capabilities
- Enhanced Korean text generation and understanding
- Improved tokenization of Korean phrases and sentences
- Maintains English language capabilities while specializing in Korean
- Suitable for various natural language tasks in Korean
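For reference, a minimal inference sketch with Hugging Face Transformers might look as follows. The repository id is an assumption based on the model name (verify it on the Hub before use), and loading the model downloads roughly 16 GB of BF16 weights.

```python
# Minimal inference sketch, assuming the model is published under the id
# below on the Hugging Face Hub. Requires `torch` and `transformers`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "saltlux/Ko-Llama3-Luxia-8B"  # assumed Hub repository id

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a Korean completion for the given prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the BF16 training precision
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Example Korean prompt: "The capital of South Korea is"
    print(generate("대한민국의 수도는"))
```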
## Frequently Asked Questions
Q: What makes this model unique?
A: The model's unique strength lies in its specialized Korean language capabilities, achieved through extensive Korean token additions and domain-specific training data, while maintaining the robust foundation of Llama-3's architecture.
Q: What are the recommended use cases?
A: The model is primarily intended for research and can be freely used for a range of natural language generation tasks, particularly those involving Korean-language processing and generation.