Mistral_Pro_8B_v0.1

Maintained By
TencentARC

Parameter Count: 8.99B
License: Apache-2.0
Tensor Type: BF16
Author: TencentARC
Language: English

What is Mistral_Pro_8B_v0.1?

Mistral_Pro_8B_v0.1 is a language model developed by Tencent's ARC Lab that builds on the original Mistral-7B architecture. It expands the base model to 8.99 billion parameters by adding Transformer blocks, and it is tuned for programming and mathematical tasks while retaining strong general language capabilities.

Implementation Details

The model extends the original Mistral architecture with additional Transformer blocks and was trained on a diverse mix of datasets, including Cosmopedia, Proof-Pile-2, The Stack, and AutoMathText. Its weights are stored in BF16 precision, which balances numerical accuracy against memory and compute efficiency.

  • Expanded Transformer architecture with additional blocks beyond the base Mistral-7B
  • Trained on four major datasets focusing on general knowledge, proofs, code, and mathematics
  • Optimized for both general language understanding and domain-specific tasks
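
A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub under TencentARC/Mistral_Pro_8B_v0.1 and loads with the standard Transformers causal-LM classes; the prompt and generation settings are illustrative, not official recommendations:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id as it appears on the Hugging Face Hub (assumed from the model name).
model_id = "TencentARC/Mistral_Pro_8B_v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the weights are shipped in BF16
    device_map="auto",           # spread layers across available devices
)

prompt = "Write a Python function that returns the nth Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the added blocks sit on top of a standard Mistral layout, custom modeling code should not be required, though this is worth verifying against the official model card.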

Core Capabilities

  • Superior performance on mathematical reasoning (GSM8K: 50.6%)
  • Enhanced code generation capabilities (HumanEval: 32.9%)
  • Improved truthfulness in responses (TruthfulQA: 48.3%)
  • Strong general language understanding (Hellaswag: 82.6%)

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its balanced performance across general language tasks while offering superior capabilities in programming and mathematics, matching or exceeding the performance of models like Gemma-7B in several benchmarks.

Q: What are the recommended use cases?

The model is ideal for applications requiring integrated handling of natural language, programming, and mathematical tasks. It's particularly well-suited for code generation, mathematical problem-solving, and general language understanding tasks.
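
To illustrate the mixed language-and-math use case, here is a short usage sketch with the Transformers text-generation pipeline; the repository id, prompt format, and decoding settings are assumptions for demonstration:

```python
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="TencentARC/Mistral_Pro_8B_v0.1",  # repo id assumed from the model name
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A GSM8K-style word problem; greedy decoding keeps the answer deterministic.
result = generator(
    "Q: A train travels 60 km in 45 minutes. What is its average speed in km/h?\nA:",
    max_new_tokens=96,
    do_sample=False,
)
print(result[0]["generated_text"])
```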
