ALIA-40b

BSC-LT

ALIA-40b is a 40B-parameter multilingual LLM trained on 6.9T tokens across 35 European languages, optimized for the Spanish co-official languages and released under the Apache 2.0 license.

  • Parameter Count: 40.4B
  • Architecture: Decoder-only Transformer
  • Context Length: 4,096 tokens
  • Training Data: 6.9T tokens
  • Languages: 35 European languages + code
  • License: Apache 2.0

What is ALIA-40b?

ALIA-40b is a multilingual language model developed by the Barcelona Supercomputing Center's Language Technologies unit (BSC-LT). It was pre-trained from scratch on 6.9 trillion tokens spanning 35 European languages and code, with the Spanish co-official languages (Spanish, Catalan, Galician, and Basque) given particular emphasis through strategic data sampling.

Implementation Details

The model uses 48 layers, a hidden size of 8,192, and 64 attention heads, together with modern efficiency techniques including Flash Attention and Grouped Query Attention with 8 query groups. Training was conducted on MareNostrum 5, a pre-exascale supercomputer, using NVIDIA's NeMo Framework. Key configuration values are listed below, followed by a loading sketch.

  • Vocabulary size: 256,000 tokens
  • Precision: bfloat16
  • Embedding type: RoPE
  • Activation Function: SwiGLU
  • Layer normalization: RMS Norm
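
As a rough illustration of how these settings translate into practice, the sketch below loads the model with the Hugging Face transformers library in bfloat16. The repo id "BSC-LT/ALIA-40b" and the Flash Attention flag are assumptions based on this card, not an official recipe.

```python
# Minimal loading sketch (untested); repo id and flags are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BSC-LT/ALIA-40b"  # assumed Hugging Face Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # matches the bfloat16 precision above
    attn_implementation="flash_attention_2",  # Flash Attention, if flash-attn is installed
    device_map="auto",                        # shard the ~40B weights across available GPUs
)
```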

Core Capabilities

  • Multilingual text generation across 35 European languages (see the sketch after this list)
  • Programming language understanding and generation
  • Strong performance in Spanish co-official languages
  • Versatile applications from research to commercial use
  • Base model suitable for further fine-tuning
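
To make base-model usage concrete, here is a hedged generation sketch with a Catalan prompt; the prompt text and sampling parameters are illustrative assumptions, and the repo id is assumed as above.

```python
# Generation sketch for the base (non-instruct) model; values are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BSC-LT/ALIA-40b"  # assumed Hugging Face Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A base model continues text rather than following instructions, so prompt with a prefix.
prompt = "La intel·ligència artificial és"  # Catalan: "Artificial intelligence is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=64,   # stays well inside the 4,096-token context window
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```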

Frequently Asked Questions

Q: What makes this model unique?

ALIA-40b stands out for its balanced multilingual capabilities, especially its enhanced performance in Spanish co-official languages. The model's training data was carefully curated and sampled to ensure proper representation of minority languages while maintaining strong performance across all supported languages.

Q: What are the recommended use cases?

The model is designed for both research and commercial applications in supported languages. As a base model, it's particularly well-suited for language generation tasks or further fine-tuning for specific use cases. However, it should not be used for malicious activities or in production environments without proper risk assessment.
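
Since the card recommends the base model for further fine-tuning, the following is a minimal LoRA sketch using the peft library. The rank, target module names, and other hyperparameters are assumptions for illustration, not values published for ALIA-40b.

```python
# Hedged LoRA fine-tuning setup; hyperparameters and module names are assumed.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "BSC-LT/ALIA-40b",          # assumed Hugging Face Hub repo id
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                  # adapter rank (assumed)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # assumed attention projection names
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # only adapter weights are trainable
```

The adapter weights would then be trained with a standard Trainer loop on task-specific data.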
