# Llama-3-Instruct-Neurona-8b
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| Base Model | Meta-Llama-3-8B-Instruct |
| License | Llama 3 |
| Training Hardware | 4x NVIDIA A100 80GB |
| Languages | Spanish, English |
## What is Llama-3-Instruct-Neurona-8b?
Llama-3-Instruct-Neurona-8b is a bilingual (Spanish/English) language model built on Meta's Llama 3 architecture. It targets stronger Spanish-language capabilities than the base model, having been fine-tuned on a collection of 24 diverse datasets spanning domains from medical texts to economic data.
## Implementation Details
The model was trained with the Axolotl framework on 4x NVIDIA A100 80GB GPUs in BF16 precision, using gradient checkpointing and flash attention to reduce memory use and speed up training. The training configuration pairs the AdamW optimizer with a cosine learning-rate scheduler and a 0.03 warmup ratio.
- Sequence length: 8192 tokens
- Learning rate: 0.00007
- Gradient accumulation steps: 32
- Training epochs: 2
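Put together, the settings above could be expressed as an Axolotl YAML configuration roughly like the sketch below. Field names follow Axolotl's config schema; values not stated in this card (the dataset entries and `micro_batch_size`) are placeholders, not taken from the actual training run.

```yaml
# Sketch of an Axolotl config matching the reported settings.
# Dataset paths and micro_batch_size are placeholders.
base_model: meta-llama/Meta-Llama-3-8B-Instruct
bf16: true
flash_attention: true
gradient_checkpointing: true

sequence_len: 8192
micro_batch_size: 1            # placeholder, not from the model card
gradient_accumulation_steps: 32
num_epochs: 2

optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.00007
warmup_ratio: 0.03

datasets:
  - path: # one entry per training dataset (24 in total)
    type: chat_template
```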
## Core Capabilities
- Bilingual instruction following in Spanish and English
- RAG (Retrieval-Augmented Generation)
- Function calling and code assistance
- Question answering and summarization
- Document translation between Spanish and English
- Medical and economic domain expertise
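Because the model inherits the Llama 3 Instruct chat format from its base model, prompts follow the standard header-token layout. The sketch below builds a single-turn prompt by hand to show that wire format; in practice you would let `tokenizer.apply_chat_template` from Hugging Face `transformers` do this, and the system/user strings here are only illustrative.

```python
# Build a Llama-3-Instruct-style chat prompt by hand.
# Normally tokenizer.apply_chat_template handles this; the sketch
# just makes the expected prompt structure explicit.

def build_llama3_prompt(system: str, user: str) -> str:
    """Return a single-turn Llama 3 Instruct prompt string."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Illustrative bilingual usage (Spanish system prompt, Spanish instruction).
prompt = build_llama3_prompt(
    system="Eres un asistente bilingüe que responde en español o inglés.",
    user="Resume este texto en una frase: ...",
)
print(prompt)
```

Generation then starts right after the final assistant header, which is why the prompt ends there.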
## Frequently Asked Questions
**Q: What makes this model unique?**
This model stands out for its comprehensive training on Spanish-language datasets while maintaining English capabilities, making it especially effective for bilingual applications and specialized tasks like medical text processing and economic analysis.

**Q: What are the recommended use cases?**
The model excels in bilingual scenarios including translation, summarization, and specialized domain tasks. It's particularly well-suited for RAG applications, function calling, and code assistance in both Spanish and English contexts.