Llama-3-8B

Maintained By
AI-Sweden-Models

  • Base Model: Meta-Llama-3-8B
  • Training Data: Nordic Pile (227B tokens)
  • Training Infrastructure: 92 Nvidia A100 GPUs
  • Model Repository: Hugging Face

What is Llama-3-8B?

Llama-3-8B is a specialized language model developed by AI-Sweden-Models on top of Meta's Meta-Llama-3-8B. It was fine-tuned on a curated mix of Swedish, Norwegian, and Danish content from The Nordic Pile, making it a strong foundation for Nordic language processing.

Implementation Details

The model was trained for 30 days on the Rattler supercomputer at the Dell Technologies Edge Innovation Center, using 23 nodes with 4 Nvidia A100 GPUs each (92 GPUs in total). Training used a learning rate of 2e-5, a cosine learning-rate scheduler, and the AdamW optimizer with gradient accumulation; the key settings are listed below, followed by an illustrative configuration sketch.

  • Sequence length: 8192 tokens
  • Training duration: 30 days
  • Hardware: 92 Nvidia A100 GPUs
  • Gradient accumulation steps: 16
  • Training methodology: Full parameter fine-tuning
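
The reported hyperparameters map naturally onto the Hugging Face `transformers` Trainer API. The sketch below is illustrative only: the per-device batch size, precision, epoch count, and output path are assumptions, since the model card does not state them.

```python
from transformers import TrainingArguments

# Illustrative mapping of the reported hyperparameters onto TrainingArguments.
# Values not listed in the model card (batch size, precision, epochs, paths)
# are placeholders, not settings used by AI-Sweden-Models.
training_args = TrainingArguments(
    output_dir="./llama3-8b-nordic",    # placeholder path
    per_device_train_batch_size=1,      # assumption: not stated in the card
    gradient_accumulation_steps=16,     # as reported
    learning_rate=2e-5,                 # as reported
    lr_scheduler_type="cosine",         # cosine scheduler, as reported
    optim="adamw_torch",                # AdamW optimizer, as reported
    bf16=True,                          # assumption: typical choice on A100 GPUs
    num_train_epochs=1,                 # assumption: duration was reported in days, not epochs
    logging_steps=50,
    save_strategy="steps",
    save_steps=1000,
)
```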

Core Capabilities

  • Specialized in Nordic language processing
  • Enhanced performance on Swedish, Norwegian, and Danish content
  • Base model capabilities with fine-tuning potential
  • Efficient text generation with customizable parameters (see the generation sketch after this list)
  • Support for long context windows
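
As an illustration of the generation workflow, the following sketch loads the model from the Hugging Face Hub and samples a short Swedish continuation. The repository id, prompt, and sampling parameters are assumptions for demonstration, not values prescribed by the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative generation sketch; the repository id and sampling
# parameters below are assumptions, not values from the model card.
model_id = "AI-Sweden-Models/Llama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 to fit an 8B model on a single GPU
    device_map="auto",
)

prompt = "Sommar och sol är det bästa jag vet"  # example Swedish prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=100,       # customizable generation length
    do_sample=True,
    temperature=0.7,          # customizable sampling parameters
    top_p=0.9,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```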

Frequently Asked Questions

Q: What makes this model unique?

This model stands out due to its specialized training on Nordic languages, making it particularly effective for Swedish, Norwegian, and Danish content processing. The full parameter fine-tuning approach ensures comprehensive adaptation to these languages while maintaining the robust capabilities of the original Llama 3 architecture.

Q: What are the recommended use cases?

The model is designed as a base model that can be further fine-tuned for specific applications. It is particularly well suited to tasks involving Nordic languages, including text generation, content creation, and language understanding. The example in the documentation demonstrates its ability to generate coherent Swedish text.
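
A minimal sketch of such downstream fine-tuning with the Hugging Face Trainer is shown below; the dataset id, text column, and hyperparameters are hypothetical placeholders rather than recommendations from the model card.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Hypothetical downstream fine-tuning sketch; the dataset id, text column,
# and hyperparameters are placeholders, not recommendations from the card.
model_id = "AI-Sweden-Models/Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder dataset with a "text" column of Nordic-language documents.
dataset = load_dataset("my-org/swedish-task-data", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="./llama3-8b-finetuned",  # placeholder path
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=1e-5,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```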
