Einstein-v6.1-Llama3-8B

Weyaxi

Einstein-v6.1-Llama3-8B is an 8B parameter LLM fine-tuned on 38 datasets, optimized for STEM tasks with strong performance in science and math reasoning.

Property	Value
Parameter Count	8.03B
Base Model	Meta-Llama-3-8B
License	Other
Training Hardware	8xRTX3090 + 1xRTXA6000

What is Einstein-v6.1-Llama3-8B?

Einstein-v6.1-Llama3-8B is a specialized language model fine-tuned from Meta's Llama-3-8B architecture, specifically optimized for STEM-related tasks and scientific reasoning. The model demonstrates impressive capabilities across various benchmarks, including a 66.19% accuracy on MMLU and 66.11% on GSM8k mathematical reasoning tasks.

Implementation Details

The model was trained using the Axolotl framework with a combination of 38 carefully curated datasets. It employs the ChatML prompt template format and utilizes advanced training techniques including gradient checkpointing and flash attention for optimal performance.

Training utilized BF16 precision with sample packing
Implemented with cosine learning rate scheduler
Trained for 2 epochs with 2026 total steps
Uses flash attention and gradient checkpointing for efficiency

Core Capabilities

Strong performance in scientific reasoning (62.46% on AI2 Reasoning Challenge)
Exceptional results in general knowledge (82.41% on HellaSwag)
Advanced mathematical problem-solving (66.11% on GSM8k)
Robust truthfulness evaluation (55.1% on TruthfulQA)

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its specialized training on STEM-focused datasets, making it particularly effective for scientific and mathematical reasoning tasks while maintaining strong general-purpose capabilities.

Q: What are the recommended use cases?

This model is ideal for scientific research, educational applications, mathematical problem-solving, and general knowledge tasks requiring precise technical understanding.