Llemma 34B

Property	Value
Base Model	Code Llama 34B
Training Data	Proof-Pile-2
License	Llama 2
Paper	arXiv:2310.10631

What is llemma_34b?

Llemma 34B is a language model specialized for mathematical reasoning and computation. Developed by EleutherAI, it was initialized from Code Llama 34B weights and then trained on the Proof-Pile-2 dataset for a further 50B tokens, which makes it well suited to mathematical problem solving and formal theorem proving.

Implementation Details

The model keeps the Code Llama architecture unchanged and specializes through its training data, which covers mathematical reasoning, computational tasks, and formal proofs.

Based on Code Llama 34B architecture
Trained on specialized mathematical content from Proof-Pile-2
Supports both English language and mathematical notation
Optimized for chain-of-thought reasoning
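
As a rough illustration of chain-of-thought prompting with this model, the sketch below formats a few worked examples ahead of a new question and greedy-decodes one continuation. It assumes the weights are published on the Hugging Face Hub as EleutherAI/llemma_34b and that the transformers, torch, and accelerate packages are installed. The "Problem:/Solution:" template and the helper names build_cot_prompt and generate_solution are illustrative choices, not something the model card prescribes.

```python
MODEL_ID = "EleutherAI/llemma_34b"  # assumed Hugging Face Hub repository name


def build_cot_prompt(examples, question):
    """Format worked examples followed by the new question.

    `examples` is a list of (problem, step-by-step solution) pairs.
    The "Problem:/Solution:" template is an illustrative assumption.
    """
    parts = []
    for problem, solution in examples:
        parts.append(f"Problem: {problem}\nSolution: {solution}")
    # Leave the final solution open so the model continues from here.
    parts.append(f"Problem: {question}\nSolution:")
    return "\n\n".join(parts)


def generate_solution(prompt, max_new_tokens=256):
    """Greedy-decode one continuation of `prompt`.

    Needs `transformers`, `torch`, and `accelerate`, plus enough GPU
    memory for a 34B-parameter model.
    """
    # Imported lazily so the prompt helper works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Drop the prompt tokens and return only the newly generated text.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Because Llemma 34B is a base model rather than an instruction-tuned one, a few worked examples in the prompt usually matter: the model continues the pattern it is shown instead of following an instruction.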

Core Capabilities

Achieves 51.5% accuracy on GSM8k (single greedy sample, no majority voting)
Demonstrates 71.9% accuracy on SAT mathematics
Reaches 25.0% accuracy on MATH dataset
Improves further with majority voting over multiple sampled solutions (up to 69.3% on GSM8k)
Strong capabilities in computational mathematics and theorem proving
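
The majority-voting figure above comes from sampling several solutions to the same problem and keeping the most common final answer. A minimal sketch of that aggregation step follows. It assumes the completions have already been sampled and that the final answer is the last number in each one. The helper names extract_final_answer and majority_vote are hypothetical, and the extraction rule is a simplification rather than the evaluation code used for the reported results.

```python
import re
from collections import Counter


def extract_final_answer(completion):
    """Return the last number in a completion as a string, or None.

    Thousands separators are removed first, so "1,250" becomes "1250".
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return numbers[-1] if numbers else None


def majority_vote(completions):
    """Pick the most frequent final answer across sampled completions."""
    answers = [extract_final_answer(c) for c in completions]
    # Ignore samples where no answer could be extracted.
    answers = [a for a in answers if a is not None]
    if not answers:
        return None
    # On a tie, Counter keeps the answer that appeared first.
    return Counter(answers).most_common(1)[0][0]


samples = [
    "She has 16 eggs, uses 7, and sells 9 at 2 each. The answer is 18.",
    "9 eggs remain, so she earns 9 * 2 = 18",
    "I get 20.",
    "No number here",
]
print(majority_vote(samples))
```

Voting only helps when the samples differ, so the completions should be drawn with a nonzero sampling temperature rather than greedy decoding.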

Frequently Asked Questions

Q: What makes this model unique?

Llemma 34B is trained specifically for mathematical reasoning rather than general text or code. It outperforms both Llama 2 and Code Llama on mathematical tasks and is competitive with Minerva at a smaller parameter count.

Q: What are the recommended use cases?

The model is suited to mathematical problem solving, formal theorem proving, chain-of-thought reasoning in mathematics, and computational mathematics tasks. It is particularly effective when combined with majority voting: on GSM8k, accuracy rises from 51.5% to 69.3%.
