# Llama-3-Instruct-Neurona-8b

| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| Model Type | Language Model (Transformer) |
| Base Model | Meta-Llama-3-8B-Instruct |
| License | Llama 3 |
| Training Infrastructure | 4x NVIDIA A100 80GB |
## What is Llama-3-Instruct-Neurona-8b?
Llama-3-Instruct-Neurona-8b is a specialized multilingual language model focusing on Spanish and English capabilities. Built on Meta's Llama-3 architecture, this model has been extensively trained on 24 diverse datasets to enable a wide range of functionalities including RAG (Retrieval-Augmented Generation), function calling, code assistance, and translation tasks.
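The model card does not include a usage snippet. As a minimal sketch, Llama-3-Instruct models expect the standard Llama-3 chat template; the special tokens below are the documented Llama-3 prompt-format tokens, and building the prompt by hand is an illustration only (in practice the tokenizer's `apply_chat_template` would be preferred):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the standard Llama-3 chat format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # The prompt ends with an open assistant turn for the model to complete.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "Eres un asistente bilingüe en español e inglés.",
    "Traduce al inglés: 'La reunión es mañana.'",
)
print(prompt)
```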
## Implementation Details
The model was trained with the Axolotl framework on 4 NVIDIA A100 80GB GPUs, in BF16 precision with gradient checkpointing and flash attention enabled. Training used a cosine learning-rate scheduler with a warmup ratio of 0.03 and the adamw_torch_fused optimizer. Key hyperparameters:
- Sequence length: 8192 tokens
- Sample packing enabled for efficient training
- Gradient accumulation steps: 32
- Learning rate: 0.00007
- NEFTune noise alpha: 5
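The settings above map onto Axolotl configuration keys roughly as follows. This is a hedged sketch, not the original training config: the key names are standard Axolotl options, but the full configuration (datasets, epochs, batch size, etc.) is not reproduced in this card.

```yaml
base_model: meta-llama/Meta-Llama-3-8B-Instruct
bf16: true
flash_attention: true
gradient_checkpointing: true

sequence_len: 8192
sample_packing: true
gradient_accumulation_steps: 32

optimizer: adamw_torch_fused
lr_scheduler: cosine
learning_rate: 0.00007
warmup_ratio: 0.03
neftune_noise_alpha: 5
```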
## Core Capabilities
- Bilingual processing in Spanish and English
- RAG operations for enhanced knowledge retrieval
- Function calling and code assistance
- Translation between Spanish and English
- Question answering and summarization
- Medical domain knowledge processing
- Inclusive language understanding
## Frequently Asked Questions
Q: What makes this model unique?
This model's uniqueness lies in its specialized training on a carefully curated mix of Spanish and English datasets, making it particularly effective for bilingual applications and domain-specific tasks like medical text processing and code assistance.
Q: What are the recommended use cases?
The model is well-suited to bilingual applications, Spanish–English translation, code development assistance, medical text processing, and general language understanding in both languages. It is particularly effective in RAG and function-calling scenarios.
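To illustrate the RAG use case, retrieved passages can be inlined into the prompt before generation. This is a minimal sketch: the `[Documento N]` delimiter style and the grounding instruction are assumptions for illustration, not a format specified by the model card.

```python
def build_rag_prompt(question: str, passages: list[str]) -> str:
    """Inline retrieved passages into a grounded-answer instruction (Spanish)."""
    # Number each passage so the model can refer back to its sources.
    context = "\n\n".join(
        f"[Documento {i + 1}]\n{p}" for i, p in enumerate(passages)
    )
    return (
        "Responde usando solo el contexto proporcionado.\n\n"
        f"Contexto:\n{context}\n\n"
        f"Pregunta: {question}"
    )

prompt = build_rag_prompt(
    "¿Cuál es la capital de Perú?",
    ["Lima es la capital y la ciudad más grande de Perú."],
)
print(prompt)
```

The assembled string would then be wrapped in the model's chat template and passed to generation.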