Magnum V2 72B
| Property | Value |
|---|---|
| Parameter Count | 72.7B |
| Base Model | Qwen2-72B-Instruct |
| License | Tongyi Qianwen |
| Supported Languages | 9 (en, fr, de, es, it, pt, ru, zh, ja) |
| Training Infrastructure | 8x AMD Instinct MI300X Accelerators |
What is magnum-v2-72b?
Magnum V2 72B is the seventh iteration in Anthracite's series of language models designed to replicate the prose quality of Claude 3. Built on Qwen2-72B-Instruct, it pairs broad multilingual coverage with solid instruction following.
Implementation Details
The model was fine-tuned for 2 epochs on 8x AMD Instinct MI300X accelerators. Training used a weight decay of 0.01 for regularization and a peak learning rate of 4e-6. The model uses ChatML formatting for interactions (a prompt sketch follows the benchmark list below) and was trained with 16k-token sample packing. Reported benchmark results include:
- Achieves 75.6% accuracy on IFEval (0-shot)
- Scores 57.85% on BBH (3-shot)
- Demonstrates 31.65% accuracy on MATH Level 5 problems
- Supports 9 different languages for multilingual applications
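As noted above, the model expects ChatML-formatted prompts. Below is a minimal sketch of that layout; the system and user strings are placeholder examples, not values from the model card.

```python
# Hypothetical ChatML prompt layout (roles wrapped in <|im_start|> / <|im_end|>).
system = "You are a helpful assistant."
user = "Write a short scene set in a rainy harbor town."

prompt = (
    f"<|im_start|>system\n{system}<|im_end|>\n"
    f"<|im_start|>user\n{user}<|im_end|>\n"
    f"<|im_start|>assistant\n"  # the model's reply continues from here
)
print(prompt)
```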
Core Capabilities
- High-quality prose generation similar to Claude 3
- Robust instruction following with ChatML format
- Mathematical reasoning (31.65% accuracy on MATH Level 5)
- Multilingual support across major world languages
- Extended context handling with 16k tokens
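For a concrete sense of how these capabilities are exercised, here is a minimal inference sketch using Hugging Face transformers. The repository ID is an assumption (check the published model page for the exact name); `tokenizer.apply_chat_template` renders the ChatML formatting automatically.

```python
# Minimal inference sketch; the repo ID is assumed, verify against the actual model page.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "anthracite-org/magnum-v2-72b"  # assumed Hugging Face repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 72.7B parameters: expect multi-GPU or heavy VRAM use
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the plot of Moby-Dick in French."},
]

# Render the conversation with the model's ChatML chat template.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```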
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its optimization for Claude-like responses while maintaining strong performance across various benchmarks. It combines the robust architecture of Qwen2 with carefully curated training datasets to achieve high-quality outputs.
Q: What are the recommended use cases?
This model excels in scenarios requiring high-quality prose generation, multilingual communication, and complex reasoning tasks. It's particularly suitable for applications needing human-like responses across different languages and domains.