line-distilbert-base-japanese

Maintained by: line-corporation

LINE DistilBERT Japanese

Property         Value
---------------  -------------------------------------------
Parameter Count  68M
Architecture     DistilBERT (6 layers, 768 hidden, 12 heads)
Training Data    131GB Japanese web text
License          Apache 2.0
Vocabulary Size  32,768 tokens

What is line-distilbert-base-japanese?

LINE DistilBERT Japanese is a distilled BERT model for Japanese language processing, developed by LINE Corporation. Pretrained on 131GB of Japanese web text, it aims to deliver near-BERT accuracy at a fraction of the parameter count and inference cost.

Implementation Details

The model uses a distilled BERT architecture with 6 layers, 768-dimensional hidden states, and 12 attention heads, for a total of 68M parameters. Tokenization is a two-stage pipeline: MeCab with the Unidic dictionary performs morphological segmentation, and SentencePiece then splits the result into subwords (a loading sketch follows the list below).

  • 6 transformer layers
  • 768-dimensional hidden states
  • 12 attention heads
  • 32,768-token SentencePiece vocabulary built over MeCab (Unidic) segmentation
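
A minimal loading sketch, assuming the Hugging Face Hub ID line-corporation/line-distilbert-base-japanese and that fugashi and unidic-lite are installed for the MeCab stage; trust_remote_code=True is assumed to be required because the custom tokenizer class ships with the model repository:

```python
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "line-corporation/line-distilbert-base-japanese"  # assumed Hub ID

# The two-stage tokenizer (MeCab + Unidic, then SentencePiece) is loaded
# from code in the model repository, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_ID)

inputs = tokenizer("LINEの研究・開発をしている。", return_tensors="pt")
outputs = model(input_ids=inputs["input_ids"],
                attention_mask=inputs["attention_mask"])
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```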

Core Capabilities

  • 95.6% accuracy on MARC-ja (JGLUE text classification)
  • 88.9% accuracy on JNLI (Japanese natural language inference)
  • 87.3 EM / 93.3 F1 on JSQuAD (question answering)
  • Pearson correlation of 89.2 on JSTS (semantic textual similarity; see the sketch after this list)
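
The JSTS score above comes from fine-tuning, but the pretrained encoder can already be probed for similarity. A minimal sketch, assuming mean pooling over the last hidden state as the sentence embedding (a common heuristic, not the JGLUE evaluation protocol):

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "line-corporation/line-distilbert-base-japanese"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_ID)

def embed(texts):
    # Mean-pool the last hidden state over non-padding tokens.
    batch = tokenizer(texts, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(input_ids=batch["input_ids"],
                       attention_mask=batch["attention_mask"]).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

a, b = embed(["今日は天気が良い。", "本日は晴天です。"])
print(torch.cosine_similarity(a, b, dim=0).item())  # higher = more similar
```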

Frequently Asked Questions

Q: What makes this model unique?

The model balances accuracy and efficiency, outperforming other Japanese DistilBERT variants on JGLUE while keeping a compact 6-layer, 68M-parameter architecture. That benchmark performance at this size is what makes it attractive for Japanese NLP workloads.

Q: What are the recommended use cases?

The model is well-suited for various Japanese language processing tasks, including text classification, question answering, and semantic similarity analysis. It's particularly effective for applications requiring efficient inference while maintaining high accuracy.
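
For example, a classification fine-tune can start from the pretrained encoder with a fresh head. A minimal sketch, where num_labels=2 is a hypothetical binary sentiment setup (MARC-ja style); the classification head is randomly initialized and must be fine-tuned before its outputs mean anything:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "line-corporation/line-distilbert-base-japanese"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# num_labels=2 is a hypothetical binary setup; adjust for your task.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)

enc = tokenizer("この商品はとても良かった。", return_tensors="pt")
logits = model(input_ids=enc["input_ids"],
               attention_mask=enc["attention_mask"]).logits
print(logits.softmax(-1))  # untrained head: fine-tune before relying on this
```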
