Aguila-7B
| Property | Value |
|---|---|
| Parameter Count | 6.85B |
| Model Type | Causal Language Model |
| Languages | Catalan, Spanish, English |
| License | Apache 2.0 |
| Training Data | 26B tokens |
| Base Model | Falcon-7B |
What is Aguila-7B?
Aguila-7B is a trilingual causal language model developed by the Language Technologies Unit at the Barcelona Supercomputing Center. It builds on the Falcon-7B architecture and has been specifically adapted for Catalan, Spanish, and English. The model was trained on a 26B-token corpus drawn from sources including Wikipedia, legal texts, biomedical content, and web crawls.
Implementation Details
The model uses byte-level BPE tokenization with a vocabulary of 50,257 tokens. Training ran on 8 NVIDIA H100 GPUs for 320 hours, using the Adam optimizer with a learning rate of 5e-05 and a linear learning-rate scheduler.
- Distributed training across 8 NVIDIA H100 GPUs (80 GB of memory each)
- Implemented with PyTorch 2.0.0 and Transformers 4.30.2
- Uses FP16 precision for efficient computation
- Embedding layer adapted from Falcon-7B
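The linear learning-rate schedule mentioned above can be sketched in plain Python. This is a minimal illustration, not the authors' training code: the total step count and the warmup length are not given in this document, so `total_steps` and `warmup_steps` are hypothetical parameters. The shape follows the common "linear warmup, then linear decay to zero" rule.

```python
BASE_LR = 5e-05  # learning rate stated in the implementation details


def linear_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=0):
    """Return the learning rate at `step` under a linear schedule.

    `total_steps` and `warmup_steps` are hypothetical: the document does not
    say how many optimizer steps were taken or whether warmup was used.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 250, 500, 750, 1000):
        print(s, linear_lr(s, total_steps=1000))
```

With no warmup, the rate starts at 5e-05, reaches half that value at the midpoint, and hits zero on the final step.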
Core Capabilities
- Trilingual text generation, with training data split 41.79% Catalan, 41.38% Spanish, and 16.84% English
- Specialized in causal language modeling tasks
- Suitable for downstream task fine-tuning
- Handles diverse content domains including biomedical, legal, and news texts
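To make the language split concrete, the percentages can be converted into approximate token counts. The sketch below assumes the percentages are shares of the 26B training tokens (the document does not say whether they are measured in tokens, documents, or bytes) and treats 26B as an exact figure, so the results are rough estimates only.

```python
TOTAL_TOKENS = 26_000_000_000  # "26B tokens" from the table; assumed exact here

# Percentages as listed under Core Capabilities.
# Assumption: these are shares of training tokens.
LANGUAGE_SHARE = {"Catalan": 41.79, "Spanish": 41.38, "English": 16.84}


def tokens_per_language(total, shares):
    """Convert percentage shares into approximate token counts."""
    return {lang: round(total * pct / 100) for lang, pct in shares.items()}


breakdown = tokens_per_language(TOTAL_TOKENS, LANGUAGE_SHARE)

if __name__ == "__main__":
    for lang, count in breakdown.items():
        print(f"{lang}: ~{count / 1e9:.2f}B tokens")
```

Under these assumptions, Catalan and Spanish each account for a little under 11B tokens and English for a little over 4B. The listed percentages sum to 100.01 because of rounding.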
Frequently Asked Questions
Q: What makes this model unique?
Aguila-7B focuses on Catalan and Spanish while retaining English capability. It is one of the few large language models specifically optimized for these languages, and its training corpus spans Wikipedia, legal, biomedical, and web-crawled text.
Q: What are the recommended use cases?
The model is primarily designed for text generation tasks and can be fine-tuned for specific downstream applications. It's particularly useful for applications requiring multilingual capabilities in Catalan, Spanish, and English, such as content generation, translation assistance, and language understanding tasks.
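A text generation call might look like the sketch below, which uses the Transformers `pipeline` API. Several details are assumptions rather than facts from this document: the Hub identifier `projecte-aina/aguila-7b`, the need for `trust_remote_code=True` with Falcon-based checkpoints on older Transformers releases, and the sampling defaults, which are purely illustrative. It also assumes a GPU with enough memory for FP16 weights and that the `accelerate` package is installed for `device_map="auto"`.

```python
MODEL_ID = "projecte-aina/aguila-7b"  # assumed Hub ID; not stated in this document


def build_generation_kwargs(max_new_tokens=50, temperature=0.7, top_k=10):
    """Illustrative sampling settings; these values are not from the source."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_k": top_k,
    }


def generate(prompt, **overrides):
    """Generate a continuation for `prompt` and return the full text."""
    # Imported lazily so the helper above can be used without these packages.
    import torch
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    generator = pipeline(
        "text-generation",
        model=MODEL_ID,
        tokenizer=tokenizer,
        torch_dtype=torch.float16,  # FP16, matching the precision noted above
        device_map="auto",          # requires the accelerate package
        trust_remote_code=True,     # assumption: may be needed for Falcon-based code
    )
    kwargs = build_generation_kwargs()
    kwargs.update(overrides)
    outputs = generator(prompt, **kwargs)
    return outputs[0]["generated_text"]
```

Calling `generate("El mercat del barri és")` would download the weights on first use and return the prompt followed by a sampled continuation. Because this is a base model rather than an instruction-tuned one, prompts generally work best when phrased as text to be continued.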