# Kanana-nano-2.1b-instruct
| Property | Value |
| --- | --- |
| Parameter Count | 2.1 billion |
| License | CC-BY-NC-4.0 |
| Author | Kakao Corporation |
| Model Type | Instruction-tuned Language Model |
| Paper | arXiv:2502.18934 |
## What is kanana-nano-2.1b-instruct?
Kanana-nano-2.1b-instruct is a compute-efficient bilingual language model developed by Kakao Corporation and designed to perform well in both Korean and English. As the compact member of the larger Kanana model series, this 2.1B-parameter model is optimized for instruction following while retaining particularly strong Korean-language capabilities.
## Implementation Details
The model combines several techniques, including high-quality data filtering, staged pre-training, depth up-scaling, and pruning with distillation, to achieve strong performance despite its small size. It performs especially well on Korean-language benchmarks, scoring 44.80 on KMMLU and 77.09 on HAERAE.
- Optimized for both Korean and English language processing
- Implements advanced training techniques for compute efficiency
- Supports instruction-following capabilities
- Trained without using any Kakao user data
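To build intuition for one of the techniques listed above, here is a minimal sketch of depth up-scaling: a smaller model is deepened by duplicating a contiguous block of its middle layers, after which pre-training continues on the deeper stack. The 8-layer starting point and the duplicated range below are illustrative assumptions, not Kakao's actual recipe.

```python
# Toy illustration of depth up-scaling: deepen a model by duplicating
# a contiguous block of middle layers, then continue pre-training.
# The 8-block base model and the chosen range are illustrative
# assumptions, not the published Kanana configuration.

def depth_up_scale(layers, start, end):
    """Return a deeper layer stack in which layers[start:end] appears twice."""
    return layers[:end] + layers[start:end] + layers[end:]

# A small model with 8 transformer blocks, labeled by index.
base = [f"block_{i}" for i in range(8)]

# Duplicate the middle four blocks (indices 2..5) to get a 12-layer model.
scaled = depth_up_scale(base, 2, 6)

print(len(base), len(scaled))  # 8 12
```

In practice the duplicated blocks start from the trained weights of the originals, which gives the deeper model a much better initialization than training from scratch.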
## Core Capabilities
- Bilingual understanding and generation
- Strong performance on instruction-following tasks (MT-Bench score: 6.400)
- Competitive performance in code-related tasks (HumanEval: 31.10)
- Mathematical reasoning capabilities (GSM8K: 46.32)
## Frequently Asked Questions
**Q: What makes this model unique?**
A: The model stands out for the performance it delivers at its size, particularly on Korean-language tasks. Despite having only 2.1B parameters, it achieves results competitive with larger models on Korean benchmarks.
**Q: What are the recommended use cases?**
A: The model is well suited to bilingual applications that require Korean and English processing, instruction following, and general language understanding. It is particularly effective in scenarios where computational resources are limited but good quality is still required.
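For intuition on the pruning step mentioned under Implementation Details, here is a generic magnitude-pruning sketch, not Kakao's published method: the smallest-magnitude weights are zeroed out, and distillation would then recover quality by training the pruned student to match the full model's outputs.

```python
# Generic magnitude pruning sketch (illustrative, not the Kanana recipe):
# keep only the largest-magnitude weights and zero out the rest.

def magnitude_prune(weights, keep):
    """Zero out all but the `keep` largest-magnitude entries of a weight list."""
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01, -0.7]
print(magnitude_prune(w, 3))  # [0.9, 0.0, 0.4, 0.0, -0.7]
```

Real pipelines prune whole structures (layers, heads, hidden dimensions) rather than individual weights, but the principle is the same: remove low-importance capacity, then distill to recover.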