line-distilbert-base-japanese

by line-corporation

LINE's Japanese DistilBERT model trained on 131GB web text. 6-layer architecture with 68M params. Strong JGLUE benchmark performance. Apache 2.0 licensed.

Parameter Count: 68M
Architecture: DistilBERT (6 layers, 768 hidden dimensions, 12 attention heads)
Training Data: 131GB of Japanese web text
License: Apache 2.0
Vocabulary Size: 32,768 tokens

What is line-distilbert-base-japanese?

LINE DistilBERT Japanese is a compressed BERT model designed specifically for Japanese language processing, developed by LINE Corporation. Trained on an extensive 131GB corpus of Japanese web text, it pairs the low inference cost of a distilled architecture with strong Japanese-language accuracy.
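As a quick-start illustration, the model can be loaded through Hugging Face Transformers. This is a minimal sketch, assuming the model is published under the id line-corporation/line-distilbert-base-japanese and that the tokenizer's dependencies (fugashi, unidic-lite, sentencepiece) are installed; the bundled Japanese tokenizer is loaded with trust_remote_code=True:

```python
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "line-corporation/line-distilbert-base-japanese"

# The tokenizer ships custom code (MeCab + SentencePiece pipeline),
# so it must be loaded with trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_ID)

text = "LINEの日本語言語モデルを試す。"
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# Contextual embeddings: (batch, sequence_length, 768)
print(outputs.last_hidden_state.shape)
```

The base model exposes raw hidden states; task-specific heads (classification, question answering, and so on) are added in the usual way via the AutoModelFor* classes.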

Implementation Details

The model implements a distilled BERT architecture with 6 layers, 768-dimensional hidden states, and 12 attention heads, for a total of 68M parameters. Its tokenization pipeline combines MeCab with the Unidic dictionary for initial word segmentation, followed by SentencePiece subword tokenization (see the sketch after the list below).

  • Optimized architecture with 6 transformer layers
  • 768-dimensional hidden states
  • 12 attention heads for efficient processing
  • 32,768-token vocabulary built with combined MeCab and SentencePiece tokenization
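To make the two-stage pipeline concrete, the sketch below (same hypothetical setup as above) tokenizes a short sentence: MeCab with the Unidic dictionary first segments the text into words, then SentencePiece splits those words into subword ids drawn from the 32,768-entry vocabulary.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "line-corporation/line-distilbert-base-japanese",
    trust_remote_code=True,
)

text = "自然言語処理のモデルを蒸留する。"

# Stage 1: MeCab (Unidic) word segmentation.
# Stage 2: SentencePiece subword splitting.
tokens = tokenizer.tokenize(text)
print(tokens)

# Ids come from the 32,768-entry vocabulary.
print(tokenizer.convert_tokens_to_ids(tokens))
```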

Core Capabilities

  • 95.6% accuracy on MARC-ja sentiment classification (JGLUE)
  • 88.9% accuracy on JNLI, the Japanese natural language inference task
  • 87.3 exact match / 93.3 F1 on JSQuAD question answering
  • 89.2 Pearson correlation on JSTS semantic textual similarity

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its balance of performance and efficiency: it outperforms other Japanese DistilBERT variants on JGLUE while keeping a compact six-layer architecture. That combination makes it particularly valuable for Japanese NLP tasks with tight latency or memory budgets.

Q: What are the recommended use cases?

The model is well-suited to a range of Japanese language processing tasks, including text classification, question answering, and semantic similarity analysis; a sketch of the similarity case follows. It is particularly effective for applications that need efficient inference without giving up accuracy.
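For the semantic-similarity use case, one common approach is to mean-pool the encoder's hidden states into sentence vectors and compare them with cosine similarity. The pooling scheme below is an illustrative assumption, not the JSTS evaluation setup (which fine-tunes the model on labeled pairs):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "line-corporation/line-distilbert-base-japanese"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

def embed(texts):
    # Mean-pool the last hidden states over non-padding tokens.
    batch = tokenizer(texts, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state   # (B, T, 768)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)

a, b = embed(["今日は天気が良い。", "本日は晴天です。"])
print(F.cosine_similarity(a, b, dim=0).item())
```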
