llemma_34b

Maintained by: EleutherAI

Llemma 34B

Property        Value
Base Model      Code Llama 34B
Training Data   Proof-Pile-2
License         Llama 2
Paper           arXiv:2310.10631

What is llemma_34b?

Llemma 34B is a language model specialized for mathematical reasoning and computation. Developed by EleutherAI, it builds on Code Llama 34B and was further trained on the Proof-Pile-2 dataset for 50B tokens, which makes it well suited to mathematical tasks and formal theorem proving.

Implementation Details

The model keeps the Code Llama architecture and specializes it for mathematical content. Its training focuses on mathematical reasoning, computational tasks, and formal proofs, which makes it suitable for advanced mathematical applications.

  • Based on Code Llama 34B architecture
  • Trained on specialized mathematical content from Proof-Pile-2
  • Supports both English language and mathematical notation
  • Optimized for chain-of-thought reasoning
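As a rough illustration of chain-of-thought prompting with this model, the sketch below builds a few-shot prompt from worked examples and generates one greedy completion. The "Problem:" / "Solution:" layout, the helper names `build_cot_prompt` and `generate_solution`, and the Hub ID `EleutherAI/llemma_34b` are assumptions made for this example and are not taken from the page above. Loading a 34B model also needs substantial GPU memory, and `device_map="auto"` assumes the `accelerate` package is installed.

```python
MODEL_ID = "EleutherAI/llemma_34b"  # assumed Hugging Face Hub ID


def build_cot_prompt(examples, question):
    """Format worked (question, solution) pairs followed by the new question.

    The Problem/Solution layout is a hypothetical format for illustration.
    """
    parts = []
    for ex_question, ex_solution in examples:
        parts.append(f"Problem:\n{ex_question}\n\nSolution:\n{ex_solution}")
    # Leave the final solution empty so the model continues from here.
    parts.append(f"Problem:\n{question}\n\nSolution:\n")
    return "\n\n".join(parts)


def generate_solution(prompt, max_new_tokens=256):
    """Greedy-decode one completion for the prompt (downloads the model)."""
    # Imported lazily so build_cot_prompt works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `generate_solution(build_cot_prompt(examples, "What is 4 + 6?"))` would then return the model's worked solution as text. Since this is a base model, not an instruction-tuned one, supplying a few worked examples in the prompt is usually what steers the output format.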

Core Capabilities

  • Achieves 51.5% accuracy on GSM8k with a single sampled solution per problem (no majority voting)
  • Demonstrates 71.9% accuracy on SAT mathematics
  • Reaches 25.0% accuracy on MATH dataset
  • Improves further with majority voting over multiple sampled solutions (up to 69.3% on GSM8k)
  • Strong capabilities in computational mathematics and theorem proving
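The majority-voting figure above comes from sampling several solutions per problem and keeping the most common final answer. A minimal sketch of that voting step follows, assuming the final answer is the last number appearing in each sampled solution. That extraction rule and the helper names `extract_final_answer` and `majority_vote` are simplifications for illustration, not the evaluation code behind the reported numbers.

```python
import re
from collections import Counter


def extract_final_answer(solution):
    """Return the last number in a sampled solution as a string, or None."""
    # Strip thousands separators so "1,250" is read as "1250".
    numbers = re.findall(r"-?\d+(?:\.\d+)?", solution.replace(",", ""))
    return numbers[-1] if numbers else None


def majority_vote(solutions):
    """Return the most frequent extracted answer across sampled solutions."""
    answers = [extract_final_answer(s) for s in solutions]
    answers = [a for a in answers if a is not None]
    if not answers:
        return None
    # Ties go to the answer that appeared first among the samples.
    return Counter(answers).most_common(1)[0][0]


samples = [
    "16 - 3 - 4 = 9 eggs, and 9 * 2 = 18",
    "The answer is 20",
    "She sells 9 eggs at $2 each, so she makes 18",
]
print(majority_vote(samples))  # prints 18
```

In practice the samples would come from decoding the same prompt several times with sampling enabled, and answers are compared as strings here, so "18" and "18.0" would count as different votes unless they are normalized first.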

Frequently Asked Questions

Q: What makes this model unique?

Llemma 34B stands out for its focus on mathematical reasoning and its strong results relative to similarly sized models. It outperforms both Llama 2 and Code Llama on mathematical tasks and is competitive with Minerva at a smaller parameter count.

Q: What are the recommended use cases?

The model is ideal for mathematical problem-solving, formal theorem proving, chain-of-thought reasoning in mathematics, and computational mathematics tasks. It's particularly effective when used with majority voting for complex mathematical problems.
