LLaMA-Pro-8B

Maintained By
TencentARC

Property          Value
Parameter Count   8.36B
Model Type        Language Model
Architecture      LLaMA2-based Transformer
License           LLaMA2
Developer         TencentARC
Tensor Type       BF16

What is LLaMA-Pro-8B?

LLaMA-Pro-8B is a language model developed by Tencent's ARC Lab and built on the LLaMA2 architecture. It has 8.36 billion parameters and is tuned specifically for programming and mathematical tasks. The model was trained on 80 billion tokens, including specialized code and mathematical content.

Implementation Details

The model extends the LLaMA2 architecture with additional Transformer blocks. Its weights are stored in BF16, which halves memory use relative to FP32, and it targets both general language understanding and domain-specific tasks.

  • Built on the LLaMA2-7B architecture with additional specialized training
  • Trained on 80 billion tokens, including code and mathematical content
  • Optimized for programming and mathematical reasoning
  • Extends the base Transformer stack with additional blocks
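
As a rough illustration of what BF16 storage means in practice, the sketch below estimates the memory needed just to hold the weights, using the 8.36B parameter count listed above. The helper name `weight_memory_gib` is hypothetical, and the estimate ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
# Back-of-the-envelope estimate of weight memory for LLaMA-Pro-8B.
# Assumption: every parameter is stored at the same width (2 bytes for BF16).
PARAM_COUNT = 8.36e9  # parameter count from the model's property list

BYTES_PER_PARAM = {"bf16": 2, "fp32": 4}


def weight_memory_gib(param_count: float, dtype: str) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return param_count * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in ("bf16", "fp32"):
        print(f"{dtype}: {weight_memory_gib(PARAM_COUNT, dtype):.2f} GiB")
```

By this estimate the BF16 weights take a little under 16 GiB, half of what the same model would need in FP32.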

Core Capabilities

  • Outperforms LLaMA2-7B on average across multiple benchmarks (44.2 vs. 39.62)
  • Enhanced performance in programming tasks (28.66% on HumanEval)
  • Improved mathematical reasoning (25.42% on GSM8K-PoT)
  • Strong general language understanding (77.94% on Hellaswag)
  • Competitive MT-Bench score for its instruct version (6.32)
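
To put the average-score comparison in perspective, the gap can be expressed as absolute points and as a relative gain over the baseline. This is a minimal sketch using only the two averages quoted above; the function name `relative_gain_percent` is hypothetical.

```python
# Compare the reported benchmark averages for LLaMA-Pro-8B and LLaMA2-7B.
LLAMA_PRO_AVG = 44.2   # average score reported for LLaMA-Pro-8B
LLAMA2_7B_AVG = 39.62  # average score reported for LLaMA2-7B


def relative_gain_percent(new: float, base: float) -> float:
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100


if __name__ == "__main__":
    print(f"absolute gain: {LLAMA_PRO_AVG - LLAMA2_7B_AVG:.2f} points")
    print(f"relative gain: {relative_gain_percent(LLAMA_PRO_AVG, LLAMA2_7B_AVG):.2f}%")
```

The difference works out to 4.58 points, or roughly an 11.6% relative improvement over LLaMA2-7B.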

Frequently Asked Questions

Q: What makes this model unique?

LLaMA-Pro-8B stands out through its specialized focus on programming and mathematical tasks while maintaining strong general language capabilities. It achieves this through additional transformer blocks and targeted training on domain-specific content.

Q: What are the recommended use cases?

The model is particularly well-suited to programming, mathematical reasoning, and general language understanding. It fits applications that combine natural language with technical content, such as code generation and mathematical problem-solving.
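
For a code-generation use case, a minimal sketch with the Hugging Face `transformers` library might look like the following. The repository id `TencentARC/LLaMA-Pro-8B` is an assumption based on the maintainer and model name given on this page, and the helper `generate_completion` is hypothetical; check the model's own page for the recommended settings before relying on this.

```python
# Sketch: load LLaMA-Pro-8B in BF16 and complete a code prompt.
# Assumption: the weights are published under this Hugging Face repo id.
MODEL_ID = "TencentARC/LLaMA-Pro-8B"


def generate_completion(prompt: str, max_new_tokens: int = 128) -> str:
    """Return the prompt followed by the model's greedy continuation."""
    # Imported lazily so this file can be imported without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Load the weights in BF16, matching the tensor type listed for the model.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_completion("def fibonacci(n):")` downloads the weights on first use and runs on CPU unless the model and inputs are moved to a GPU, so expect it to be slow without further setup.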
