llemma_7b

Property	Value
License	Llama2
Paper	arXiv:2310.10631
Training Data	Proof-Pile-2, Open-Web-Math
Base Model	Code Llama 7B

What is llemma_7b?

Llemma 7B is a language model specialized for mathematical reasoning and computation. Developed by EleutherAI, it is initialized from Code Llama 7B and further trained on the Proof-Pile-2 dataset for 200B tokens, which makes it well suited to mathematical problem solving and formal proofs.

Implementation Details

Llemma 7B keeps the Code Llama 7B architecture and specializes it for mathematical applications through further training. It's trained using PyTorch and optimized for text-generation-inference.

Built on Code Llama 7B base architecture
Trained on specialized mathematical datasets
Supports both English language and mathematical notation
Optimized for chain-of-thought reasoning
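
As a rough illustration of the chain-of-thought usage listed above, the sketch below builds a step-by-step prompt and passes it to the model through the Hugging Face transformers library. The repository id `EleutherAI/llemma_7b`, the prompt wording, and the helper names `build_cot_prompt` and `generate_solution` are assumptions made for this example and are not specified on this page.

```python
def build_cot_prompt(problem: str) -> str:
    """Format a problem so the model continues with step-by-step reasoning."""
    # Hypothetical prompt wording: a plain completion-style prompt is assumed.
    return f"Problem: {problem.strip()}\nSolution: Let's think step by step.\n"


def generate_solution(
    problem: str,
    model_id: str = "EleutherAI/llemma_7b",  # assumed Hugging Face repository id
    max_new_tokens: int = 256,
) -> str:
    """Generate a worked solution for one problem (downloads the model weights)."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_cot_prompt(problem), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_solution("What is 12 * 7?")` would load the full 7B checkpoint, so in practice a GPU and a reduced-precision dtype are likely needed; those settings are left out to keep the sketch short.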

Core Capabilities

Outperforms similarly-sized models on GSM8k (36.4% accuracy)
Scores 37.7% accuracy on MMLU-STEM
Scores 53.1% accuracy on SAT math problems
Improves with majority voting over sampled solutions (up to 54.0% on GSM8k)
Specialized in computational mathematics and formal theorem proving
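
The majority voting mentioned in the list above can be sketched as follows: sample several solutions to the same problem, extract each final answer, and keep the answer that occurs most often. This is a minimal sketch, not the evaluation code behind the reported numbers; the function names and the normalization rules are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, Optional


def normalize_answer(answer: str) -> str:
    """Canonicalize a final answer so equivalent strings vote together."""
    # Assumed normalization: trim whitespace, drop a trailing period,
    # remove thousands separators, and lowercase.
    return answer.strip().rstrip(".").replace(",", "").lower()


def majority_vote(answers: Iterable[Optional[str]]) -> Optional[str]:
    """Return the most frequent normalized answer, or None if there are none."""
    # Skip samples where no final answer could be extracted (None or empty).
    votes = Counter(normalize_answer(a) for a in answers if a)
    if not votes:
        return None
    # most_common sorts by count; ties keep first-seen order.
    return votes.most_common(1)[0][0]
```

For example, `majority_vote(["18", "20", "18."])` returns `"18"`, because two of the three sampled answers normalize to the same string.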

Frequently Asked Questions

Q: What makes this model unique?

Llemma 7B stands out for its specialized focus on mathematical reasoning. It significantly outperforms other models of similar size on mathematical tasks, and it achieves better results than both Llama-2 and Code Llama on mathematical benchmarks.

Q: What are the recommended use cases?

The model is ideal for mathematical problem-solving, chain-of-thought reasoning, computational mathematics, and formal theorem proving. It's particularly well-suited for educational applications, research in mathematical reasoning, and automated mathematical proof assistance.
