Labradorite-13b
| Property | Value |
|---|---|
| Base Model | LLaMA-2-13b |
| Teacher Model | Mixtral-8x7B-Instruct |
| License | LLAMA 2 Community License |
| Language | Primarily English |
What is Labradorite-13b?
Labradorite-13b is a language model developed by IBM Research using its Large-scale Alignment for chatBots (LAB) methodology. Built on the LLaMA-2-13b base model and trained with Mixtral-8x7B-Instruct as the teacher, it scores 7.23 on MT-Bench and performs competitively across standard benchmarks.
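A minimal inference sketch with Hugging Face transformers is shown below. The repository id ibm/labradorite-13b, the dtype, and the generation settings are assumptions for illustration rather than details stated in this card; check the model repository for the exact identifier and any recommended prompt format.

```python
# Minimal inference sketch. Assumptions: the repo id "ibm/labradorite-13b"
# and the generation settings are illustrative, not prescribed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm/labradorite-13b"  # assumed repo id; verify on the model page

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision so a 13B model fits on one GPU
    device_map="auto",
)

prompt = "Explain what a replay buffer is in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```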
Implementation Details
The LAB methodology behind the model has three components: taxonomy-driven data curation, large-scale synthetic data generation, and two-phased training with replay buffers. This design lets knowledge and skills be added incrementally without catastrophic forgetting; a simplified sketch of the two-phase flow follows the list below.
- Taxonomy-based training structure for diverse knowledge domains
- Two-phase training: knowledge tuning followed by skills tuning
- Specialized hyperparameters with larger batch sizes and optimized learning rates
- Built-in safety checks during data generation
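The sketch below illustrates, under stated assumptions, how a two-phase tuning run with a replay buffer could be wired up. It is not IBM's LAB training code: the fine_tune helper, the replay ratio, and the hyperparameter values are hypothetical placeholders standing in for the real pipeline.

```python
# Illustrative two-phase tuning with a replay buffer.
# NOT IBM's LAB training code: fine_tune(), the replay ratio, and the
# hyperparameter values are hypothetical placeholders.
import random

def fine_tune(model, samples, lr, batch_size):
    """Placeholder for a standard supervised fine-tuning loop."""
    ...  # run SFT over `samples` with the given hyperparameters
    return model

def mix_with_replay(new_samples, replay_buffer, replay_ratio=0.2):
    """Mix new-phase data with a sample of earlier data to limit forgetting."""
    n_replay = int(len(new_samples) * replay_ratio)
    return new_samples + random.sample(replay_buffer, min(n_replay, len(replay_buffer)))

def two_phase_training(model, knowledge_data, skills_data):
    # Phase 1: knowledge tuning on taxonomy-curated synthetic data.
    model = fine_tune(model, knowledge_data, lr=2e-5, batch_size=2048)

    # Phase 2: skills tuning, replaying part of the phase-1 data so the
    # model retains what it learned (the replay buffer).
    phase2_data = mix_with_replay(skills_data, replay_buffer=knowledge_data)
    model = fine_tune(model, phase2_data, lr=2e-5, batch_size=2048)
    return model
```

The key design point this sketch tries to show is that the second phase mixes a sample of phase-1 data back into training, which is what keeps the skills phase from overwriting the knowledge phase.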
Core Capabilities
- Strong benchmark results (MT-Bench: 7.23, MMLU: 58.89%, ARC-C: 61.69%)
- Effective knowledge retention and skill composition
- Balanced handling of both foundational and compositional skills
- Safe and grounded response generation
Frequently Asked Questions
Q: What makes this model unique?
Labradorite-13b's distinguishing feature is the LAB methodology, which generates its alignment data with a comparatively small, open teacher model (Mixtral-8x7B-Instruct) yet achieves performance competitive with models trained on GPT-4-generated data.
Q: What are the recommended use cases?
The model excels in general-purpose dialogue, reasoning tasks, and creative writing. It's particularly well-suited for applications requiring balanced performance across knowledge retrieval and skill composition, with built-in safety considerations.
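For dialogue applications, a hedged multi-turn sketch follows. Whether the released tokenizer ships a chat template, and which system prompt IBM recommends, are not stated here, so both are treated as assumptions; consult the model repository before relying on this pattern.

```python
# Dialogue-style sketch. Assumptions: the tokenizer defines a chat template,
# and the system prompt below is illustrative, not IBM's recommended one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm/labradorite-13b"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a careful, helpful assistant."},  # illustrative system prompt
    {"role": "user", "content": "Summarize the plot of a mystery novel in three sentences."},
]
# If the tokenizer ships no chat template, apply_chat_template will fail;
# fall back to a plain-text prompt in that case.
prompt_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(prompt_ids, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][prompt_ids.shape[-1]:], skip_special_tokens=True))
```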