# Llama-3-Instruct-Neurona-8b
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| Base Model | Meta-Llama-3-8B-Instruct |
| License | Llama 3 |
| Training Hardware | 4x NVIDIA A100 80GB |
| Languages | Spanish, English |
## What is Llama-3-Instruct-Neurona-8b?
Llama-3-Instruct-Neurona-8b is a bilingual (Spanish/English) language model built on Meta's Llama 3 architecture. It targets stronger Spanish-language capabilities than the base model, having been fine-tuned on a collection of 24 diverse datasets spanning domains from medical texts to economic data.
## Implementation Details
The model was trained with the Axolotl framework on 4x NVIDIA A100 80GB GPUs in BF16 precision, using gradient checkpointing and flash attention to reduce memory use and speed up training. The training configuration pairs the AdamW optimizer with a cosine learning-rate scheduler and a 0.03 warmup ratio.
- Sequence length: 8192 tokens
- Learning rate: 0.00007
- Gradient accumulation steps: 32
- Training epochs: 2
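Put together, the settings above could be expressed as an Axolotl YAML configuration roughly like the sketch below. Field names follow Axolotl's config schema; values not stated in this card (the dataset entries and `micro_batch_size`) are placeholders, not taken from the actual training run.

```yaml
# Sketch of an Axolotl config matching the reported settings.
# Dataset paths and micro_batch_size are placeholders.
base_model: meta-llama/Meta-Llama-3-8B-Instruct
bf16: true
flash_attention: true
gradient_checkpointing: true

sequence_len: 8192
micro_batch_size: 1            # placeholder, not from the model card
gradient_accumulation_steps: 32
num_epochs: 2

optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.00007
warmup_ratio: 0.03

datasets:
  - path: # one entry per training dataset (24 in total)
    type: chat_template
```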
## Core Capabilities
- Bilingual instruction following in Spanish and English
- RAG (Retrieval-Augmented Generation)
- Function calling and code assistance
- Question answering and summarization
- Document translation between Spanish and English
- Medical and economic domain expertise
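Because the model inherits the Llama 3 Instruct chat format from its base model, prompts follow the standard header-token layout. The sketch below builds a single-turn prompt by hand to show that wire format; in practice you would let `tokenizer.apply_chat_template` from Hugging Face `transformers` do this, and the system/user strings here are only illustrative.

```python
# Build a Llama-3-Instruct-style chat prompt by hand.
# Normally tokenizer.apply_chat_template handles this; the sketch
# just makes the expected prompt structure explicit.

def build_llama3_prompt(system: str, user: str) -> str:
    """Return a single-turn Llama 3 Instruct prompt string."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Illustrative bilingual usage (Spanish system prompt, Spanish instruction).
prompt = build_llama3_prompt(
    system="Eres un asistente bilingüe que responde en español o inglés.",
    user="Resume este texto en una frase: ...",
)
print(prompt)
```

Generation then starts right after the final assistant header, which is why the prompt ends there.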
## Frequently Asked Questions
**Q: What makes this model unique?**
This model stands out for its comprehensive training on Spanish-language datasets while maintaining English capabilities, making it especially effective for bilingual applications and specialized tasks like medical text processing and economic analysis.

**Q: What are the recommended use cases?**
The model excels in bilingual scenarios including translation, summarization, and specialized domain tasks. It's particularly well-suited for RAG applications, function calling, and code assistance in both Spanish and English contexts.