Llama_3.1_8b_Medusa_v1.01

Maintained By
Nexesenex

  • Base Model: Llama 3.1 8B
  • Merge Method: Model Stock
  • HuggingFace: Link
  • Average Benchmark: 27.38

What is Llama_3.1_8b_Medusa_v1.01?

Llama_3.1_8b_Medusa_v1.01 is a merged language model created by Nexesenex that combines medical and general knowledge capabilities. Built on the Dobby-Mini-Unhinged-Llama-3.1-8B base (itself a Llama 3.1 8B derivative), it integrates the Mediver and Smarteaz variants to produce a versatile model suited to a range of tasks.

Implementation Details

The model uses the Model Stock merge method, incorporating two component models: Llama_3.1_8b_Mediver_V1.01 and Llama_3.1_8b_Smarteaz_V1.01, each weighted equally at 1.0. The merge is computed in bfloat16, with normalized weights and an automatic chat template.

  • Union-based tokenizer implementation
  • Normalized weight distribution
  • BFloat16 precision for optimal performance
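With equal weights of 1.0 and normalization, the parameter combination step reduces to a weighted average of the component models' tensors. A minimal sketch of that idea (toy scalar tensors; the function name and structure are illustrative, not the actual mergekit/Model Stock implementation):

```python
def merge_weighted_average(state_dicts, weights):
    """Average matching parameters across models with normalized weights.

    Simplified illustration only: the real Model Stock method adds
    geometry-aware weighting on top of this basic averaging step.
    """
    total = sum(weights)
    norm = [w / total for w in weights]  # normalize so weights sum to 1
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(norm, state_dicts))
    return merged

# Toy example: two "models" with one scalar parameter each, both weighted 1.0
mediver_like = {"layer.weight": 2.0}
smarteaz_like = {"layer.weight": 4.0}
print(merge_weighted_average([mediver_like, smarteaz_like], [1.0, 1.0]))
# → {'layer.weight': 3.0}
```

In practice the merge is specified declaratively (e.g. a mergekit YAML config naming both models and the `model_stock` method) rather than written by hand.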

Core Capabilities

  • Strong performance in IFEval with 76.85% accuracy (0-shot)
  • Decent BBH performance at 30.03% (3-shot)
  • MMLU-PRO capability of 28.13% (5-shot)
  • Mathematical reasoning with MATH Lvl 5 at 14.65% (4-shot)

Frequently Asked Questions

Q: What makes this model unique?

This model's uniqueness lies in its balanced merger of medical knowledge (Mediver) and general intelligence (Smarteaz) components, creating a versatile model that performs well across various benchmarks, particularly in zero-shot inference tasks.

Q: What are the recommended use cases?

Given its strong performance in IFEval and moderate capabilities across other benchmarks, this model is best suited for general inference tasks, particularly those requiring medical domain knowledge combined with general reasoning abilities.
