kanana-nano-2.1b-instruct

Maintained By
kakaocorp

Parameter Count: 2.1 Billion
License: CC-BY-NC-4.0
Author: Kakao Corporation
Model Type: Instruction-tuned Language Model
Paper: arXiv:2502.18934

What is kanana-nano-2.1b-instruct?

Kanana-nano-2.1b-instruct is a compute-efficient bilingual (Korean and English) language model developed by Kakao Corporation. As the compact member of the larger Kanana model series, this 2.1B-parameter model is tuned for instruction-following tasks while maintaining strong performance, particularly on Korean-language tasks.

Implementation Details

The model combines several techniques, including high-quality data filtering, staged pre-training, depth up-scaling, and pruning with distillation, to achieve strong performance despite its small size. Results are particularly strong on Korean-language benchmarks, where it scores 44.80 on KMMLU and 77.09 on HAERAE.

  • Optimized for both Korean and English language processing
  • Implements advanced training techniques for compute efficiency
  • Supports instruction-following capabilities
  • Trained without using any Kakao user data
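The instruction-following usage implied above can be sketched with the standard Hugging Face transformers chat-template API. This is a minimal, hedged sketch, not official usage guidance: the repo ID `kakaocorp/kanana-nano-2.1b-instruct`, the generation settings, and the example prompt are all assumptions. The heavy imports and the model download sit behind a flag so the file can be read (and the helper exercised) without fetching weights.

```python
# Sketch of instruction-following inference with Hugging Face transformers.
# ASSUMPTIONS: the repo ID below, bf16 loading, and the generation settings
# are illustrative, not taken from official documentation.

MODEL_ID = "kakaocorp/kanana-nano-2.1b-instruct"  # assumed HF repo ID

def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat format expected by apply_chat_template."""
    return [{"role": "user", "content": user_prompt}]

RUN_DEMO = False  # set True to download the ~4 GB of weights and generate

if RUN_DEMO:
    # Imports kept inside the guard so the sketch loads without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    # Korean example prompt ("Explain Kakao in one sentence."), an assumption.
    inputs = tokenizer.apply_chat_template(
        build_messages("카카오에 대해 한 문장으로 설명해 줘."),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=128, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the model is bilingual, the same call pattern works with English prompts; only the message content changes.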

Core Capabilities

  • Bilingual understanding and generation
  • Strong performance on instruction-following tasks (MT-Bench score: 6.400)
  • Competitive performance in code-related tasks (HumanEval: 31.10)
  • Mathematical reasoning capabilities (GSM8K: 46.32)

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its exceptional efficiency-to-performance ratio, particularly in Korean language tasks. Despite its compact 2.1B parameter size, it achieves competitive results against larger models, especially in Korean-language benchmarks.

Q: What are the recommended use cases?

The model is well-suited for bilingual applications requiring Korean and English language processing, instruction-following tasks, and general language understanding. It's particularly effective for scenarios where computational resources are limited but good performance is still required.
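The limited-resource claim can be made concrete with back-of-envelope arithmetic. The 2.1B parameter count comes from the card itself; the per-parameter byte sizes are standard dtype widths, and the figures cover weights only (activations and KV cache are excluded):

```python
# Rough weight-memory footprint of a 2.1B-parameter model in common dtypes.
PARAMS = 2.1e9  # parameter count stated on the model card

BYTES_PER_PARAM = {"fp32": 4, "bf16/fp16": 2, "int8": 1}

def weight_gib(dtype: str) -> float:
    """Approximate weight memory in GiB (weights only, no activations/KV cache)."""
    return PARAMS * BYTES_PER_PARAM[dtype] / 2**30

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>10}: ~{weight_gib(dtype):.1f} GiB")
# fp32 ≈ 7.8 GiB, bf16/fp16 ≈ 3.9 GiB, int8 ≈ 2.0 GiB
```

At bf16/fp16 the weights fit comfortably on a single consumer GPU, which is consistent with positioning the model for resource-constrained deployments.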
