Llemma 34B
| Property | Value |
|---|---|
| Base Model | Code Llama 34B |
| Training Data | Proof-Pile-2 |
| License | Llama 2 |
| Paper | arXiv:2310.10631 |
What is llemma_34b?
Llemma 34B is a language model specialized for mathematical reasoning and computation. Developed by EleutherAI, it builds on the Code Llama 34B architecture and was further trained on the Proof-Pile-2 dataset for 50B tokens, making it particularly adept at mathematical problem solving and formal theorem proving.
Implementation Details
The model keeps the Code Llama 34B architecture while specializing in mathematical content: its training focuses on mathematical reasoning, computational tasks, and formal proofs, making it well suited to advanced mathematical applications. A minimal loading sketch follows the list below.
- Based on Code Llama 34B architecture
- Trained on specialized mathematical content from Proof-Pile-2
- Supports both English text and mathematical notation
- Optimized for chain-of-thought reasoning
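Below is a minimal loading sketch using the Hugging Face `transformers` library. The model id `EleutherAI/llemma_34b`, the bf16/`device_map` settings, and the example prompt are illustrative assumptions, not official recommendations.

```python
# Minimal loading sketch; the model id and generation settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/llemma_34b"  # assumed Hugging Face model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 34B parameters: bf16 and multiple GPUs are typically needed
    device_map="auto",
)

prompt = "Problem: Compute the derivative of x^3 * sin(x).\nSolution:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```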
Core Capabilities
- Achieves 51.5% accuracy on GSM8k with a single sample per problem
- Demonstrates 71.9% accuracy on SAT mathematics
- Reaches 25.0% accuracy on MATH dataset
- Excellent performance with majority voting (up to 69.3% on GSM8k); see the sketch after this list
- Strong capabilities in computational mathematics and theorem proving
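The majority-voting figure above refers to sampling several chain-of-thought solutions per problem and keeping the most common final answer. The sketch below shows that idea; the answer-extraction regex, sampling temperature, and choice of `k` are illustrative assumptions, not the evaluation protocol from the paper.

```python
# Sketch of majority voting (self-consistency) over sampled completions.
import re
from collections import Counter

def extract_answer(completion: str) -> str | None:
    """Pull the last number out of a generated solution, if any."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return numbers[-1] if numbers else None

def majority_vote(prompt: str, model, tokenizer, k: int = 32) -> str | None:
    """Sample k chain-of-thought solutions and return the most common answer."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.7,        # assumed sampling temperature
        max_new_tokens=512,
        num_return_sequences=k,
    )
    answers = []
    for seq in outputs:
        # Strip the prompt tokens, decode only the generated continuation.
        text = tokenizer.decode(seq[inputs["input_ids"].shape[1]:], skip_special_tokens=True)
        answer = extract_answer(text)
        if answer is not None:
            answers.append(answer)
    return Counter(answers).most_common(1)[0][0] if answers else None
```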
Frequently Asked Questions
Q: What makes this model unique?
Llemma 34B stands out for its specialized focus on mathematical reasoning and its strong performance relative to similarly sized models. It outperforms both Llama 2 and Code Llama on mathematical tasks and is competitive with Minerva at a smaller parameter count.
Q: What are the recommended use cases?
The model is ideal for mathematical problem-solving, formal theorem proving, chain-of-thought reasoning in mathematics, and computational mathematics tasks. It's particularly effective when used with majority voting for complex mathematical problems.
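As an illustration of chain-of-thought prompting, a few-shot prompt for a GSM8k-style word problem might look like the following; the exact prompt format used in the paper's evaluations may differ.

```python
# Illustrative few-shot chain-of-thought prompt for a GSM8k-style problem.
# The Question/Answer labels and "The answer is N." convention are assumptions,
# not necessarily the format used in the Llemma evaluations.
few_shot_prompt = """Question: Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?
Answer: In April she sold 48 clips. In May she sold 48 / 2 = 24 clips. In total she sold 48 + 24 = 72 clips. The answer is 72.

Question: A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts in total does it take?
Answer:"""
```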