Poro-34B
| Property | Value |
|---|---|
| Parameter Count | 34.2B |
| Architecture | BLOOM with ALiBi embeddings |
| Training Data | 1 trillion tokens |
| License | Apache 2.0 |
| Paper | arXiv:2404.01856 |
What is Poro-34B?
Poro-34B is a multilingual language model developed in collaboration between SiloGen, the TurkuNLP group, and HPLT. Named after the Finnish word for reindeer, it is designed to excel at Finnish and English language processing while maintaining strong code generation capabilities.
Implementation Details
The model uses a BLOOM architecture with 54 layers, 56 attention heads, and a hidden dimension of 7168. It relies on ALiBi position biases rather than learned position embeddings, which enables context-length extrapolation, and was trained on the LUMI supercomputer using 512 AMD MI250X GPUs.
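ALiBi replaces learned position embeddings with a fixed, head-specific linear penalty on attention scores, which is what allows inference beyond the trained context length. The sketch below illustrates the idea; it uses the simple power-of-two slope schedule from the ALiBi paper, whereas real implementations adjust the schedule for head counts like Poro's 56 that are not powers of two.

```python
# Sketch of ALiBi: a fixed linear bias proportional to key-query
# distance is added to attention scores instead of using positional
# embeddings, which is why the model can extrapolate past its
# 2048-token training context.
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """Per-head linear distance penalties, shape (n_heads, seq_len, seq_len)."""
    # Head-specific slopes form a geometric sequence (per the ALiBi paper).
    # Note: real implementations use an adjusted schedule when n_heads
    # is not a power of two, as with Poro-34B's 56 heads.
    start = 2 ** (-8.0 / n_heads)
    slopes = torch.tensor([start ** (i + 1) for i in range(n_heads)])
    # distance[i, j] = -(i - j) for past keys, 0 on and above the diagonal
    # (future positions are handled by the causal mask anyway).
    pos = torch.arange(seq_len)
    distance = (pos[None, :] - pos[:, None]).clamp(max=0)
    return slopes[:, None, None] * distance[None, :, :]

bias = alibi_bias(n_heads=56, seq_len=8)
print(bias.shape)  # torch.Size([56, 8, 8])
```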
- Trained on 1 trillion tokens spanning Finnish, English, and programming-language data
- Uses bfloat16 precision
- Custom tokenizer with a 128K vocabulary
- Trained with a 3D parallelism strategy (TP=2, PP=4, DP=128)
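For orientation, here is a minimal loading-and-generation sketch using the Hugging Face transformers library. The repo id LumiOpen/Poro-34B and the sample prompt are assumptions for illustration; confirm the id against the official model card.

```python
# Minimal loading sketch for Poro-34B with Hugging Face transformers.
# The repo id below is an assumption; verify it before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LumiOpen/Poro-34B"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the training precision
    device_map="auto",           # shard across available GPUs
)

prompt = "Suomi on"  # "Finland is" — the model continues in Finnish
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```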
Core Capabilities
- Bilingual proficiency in Finnish and English
- Code generation and understanding
- Translation between Finnish and English (illustrated after this list)
- Context length of 2048 tokens
- Support for context-length extrapolation via ALiBi
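Because Poro-34B is a base model rather than an instruction-tuned one, capabilities such as translation are typically elicited with few-shot prompts. Below is a sketch that reuses the model and tokenizer from the loading example above; the prompt template is an illustrative assumption, not an official Poro format.

```python
# Few-shot translation prompt for a base (non-instruction-tuned) model.
# Reuses `model` and `tokenizer` from the loading sketch above.
# The template is an illustrative assumption: in-context examples
# steer the base model toward the translation task.
few_shot_prompt = (
    "English: The weather is beautiful today.\n"
    "Finnish: Sää on tänään kaunis.\n"
    "English: Where is the nearest train station?\n"
    "Finnish:"
)
inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```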
Frequently Asked Questions
Q: What makes this model unique?
Poro-34B stands out for its focus on Finnish, a language underrepresented in most large language models, while maintaining strong English and coding capabilities. It is one of the few large language models specifically optimized for Finnish, trained on a broad mix of Finnish linguistic and cultural data.
Q: What are the recommended use cases?
As a base model, Poro-34B requires fine-tuning for specific applications. It is particularly well suited to bilingual applications involving Finnish and English, code generation tasks, and scenarios requiring an understanding of Finnish cultural context. Note, however, that this is primarily a research release, and production use cases will generally require additional fine-tuning and evaluation.
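One common way to adapt such a base checkpoint without updating all 34B weights is parameter-efficient fine-tuning. The sketch below uses LoRA via the peft library; the hyperparameters and repo id are illustrative assumptions, and query_key_value is the fused attention projection name used by BLOOM-style implementations.

```python
# Minimal LoRA fine-tuning sketch for a BLOOM-style causal LM.
# Assumptions (not from the Poro release): the repo id, the LoRA
# hyperparameters, and the "query_key_value" target module name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "LumiOpen/Poro-34B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the training precision
    device_map="auto",
)

# LoRA trains small low-rank adapter matrices instead of the full weights.
lora_config = LoraConfig(
    r=16,                                # adapter rank (assumed)
    lora_alpha=32,                       # scaling factor (assumed)
    target_modules=["query_key_value"],  # BLOOM's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights will train
```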