deberta-v2-large-japanese-char-wwm

ku-nlp

A large Japanese DeBERTa V2 model (330M params) trained on Wikipedia, CC-100, and OSCAR, using character-level tokenization and whole word masking.

Property         Value
Parameter Count  330M
License          CC-BY-SA-4.0
Author           ku-nlp
Training Data    Wikipedia, CC-100, OSCAR
Tensor Type      F32
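
The parameter count and tensor type above give a rough size for the weights alone. The following is a back-of-the-envelope sketch, assuming 4 bytes per F32 value and taking the rounded 330M figure at face value; the helper name is hypothetical, and activations, optimizer state, and framework overhead are not counted.

```python
F32_BYTES = 4  # a 32-bit float occupies 4 bytes


def weights_size_bytes(num_params: int, bytes_per_param: int = F32_BYTES) -> int:
    """Approximate size of the raw weights, ignoring any runtime overhead."""
    return num_params * bytes_per_param


# 330M is the rounded count from the table, so the result is only an estimate.
approx_bytes = weights_size_bytes(330_000_000)
print(f"{approx_bytes / 1e9:.2f} GB")  # 1.32 GB
```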

What is deberta-v2-large-japanese-char-wwm?

This is a Japanese language model based on the DeBERTa V2 architecture. It uses character-level tokenization together with whole word masking (WWM) and was trained on 171GB of Japanese text.
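
To make the combination of character-level tokens and whole word masking concrete, here is a simplified sketch. It assumes the text has already been split into words by some external segmenter (the word list is passed in), and the function name and the "[MASK]" string are illustrative assumptions rather than the authors' training code. The point is only that every character of a selected word is masked together.

```python
from typing import Iterable, List


def whole_word_mask(
    words: List[str],
    masked_word_indices: Iterable[int],
    mask_token: str = "[MASK]",  # assumed mask string, for illustration only
) -> List[str]:
    """Split words into character tokens, masking all characters of chosen words."""
    masked = set(masked_word_indices)
    tokens: List[str] = []
    for index, word in enumerate(words):
        for char in word:
            # Each character is its own token; a masked word hides all of them.
            tokens.append(mask_token if index in masked else char)
    return tokens


words = ["京都", "大学", "で", "研究", "する"]
print(whole_word_mask(words, [3]))
```

Masking word 3 ("研究") replaces both of its characters, so the model cannot recover one character of the word simply by looking at the other.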

Implementation Details

The model was trained on 16 NVIDIA A100-SXM4-40GB GPUs over 26 days using the transformers library. Its tokenizer is a sentencepiece model with 22,012 tokens, and the trained model reaches a masked language modeling accuracy of 0.795.

  • Linear learning rate schedule with warmup
  • Total batch size of 3,328 across 16 devices
  • Maximum sequence length of 512 tokens
  • 260,000 training steps, of which 10,000 are warmup steps
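
The schedule in the list above can be sketched as a multiplier on the peak learning rate. The peak value itself is not given here, so the function returns only a factor between 0 and 1. It assumes the rate decays linearly to zero at the final step, which is how linear schedules with warmup are commonly implemented; the function name is hypothetical.

```python
TOTAL_STEPS = 260_000   # training steps, from the list above
WARMUP_STEPS = 10_000   # warmup steps, from the list above


def lr_multiplier(
    step: int,
    warmup_steps: int = WARMUP_STEPS,
    total_steps: int = TOTAL_STEPS,
) -> float:
    """Factor applied to the peak learning rate at a given training step."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return step / max(1, warmup_steps)
    # Decay: fall linearly from the peak to 0 at the last step (assumed).
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


for step in (0, 5_000, 10_000, 135_000, 260_000):
    print(step, lr_multiplier(step))
```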

Core Capabilities

  • Character-level tokenization for Japanese text
  • Whole word masking for improved contextual understanding
  • Fill-mask prediction (masked language modeling accuracy of 0.795)
  • Suitable for fine-tuning on downstream tasks
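
A minimal fill-mask sketch with the transformers library follows. It assumes the checkpoint is published on the Hugging Face Hub under the ID "ku-nlp/deberta-v2-large-japanese-char-wwm" (author and model name joined), that torch and transformers are installed, and that the mask string is "[MASK]"; the helper names are hypothetical. Because tokens are characters, a two-character word needs two mask tokens.

```python
def mask_span(text: str, start: int, length: int = 1, mask_token: str = "[MASK]") -> str:
    """Replace `length` characters from `start` with one mask token per character."""
    if start < 0 or length < 1 or start + length > len(text):
        raise ValueError("masked span must lie inside the text")
    return text[:start] + mask_token * length + text[start + length:]


def predict_masked(
    masked_text: str,
    model_id: str = "ku-nlp/deberta-v2-large-japanese-char-wwm",  # assumed Hub ID
) -> str:
    """Fill every mask position with the model's most likely token (greedy)."""
    # Imported lazily: these are heavy, and the first call downloads the weights.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    model.eval()

    encoding = tokenizer(masked_text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits  # shape: (1, seq_len, vocab_size)

    input_ids = encoding["input_ids"][0]
    predicted = logits[0].argmax(dim=-1)
    # Keep the original tokens and substitute predictions only at mask positions.
    is_mask = input_ids == tokenizer.mask_token_id
    filled = torch.where(is_mask, predicted, input_ids)
    return tokenizer.decode(filled.tolist(), skip_special_tokens=True)


masked = mask_span("京都大学で自然言語処理を研究する。", start=12, length=2)
print(masked)  # 京都大学で自然言語処理を[MASK][MASK]する。
```

Passing `masked` to `predict_masked` returns the sentence with both positions filled in. Each position is predicted independently here, which is the simplest choice and not necessarily the best decoding strategy for multi-character words.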

Frequently Asked Questions

Q: What makes this model unique?

This model combines character-level tokenization with whole word masking for Japanese text. It was trained on a mix of Wikipedia, CC-100, and OSCAR, so it has seen both encyclopedic and web text.

Q: What are the recommended use cases?

The model can be used directly for masked language modeling and fine-tuned for downstream Japanese NLP tasks such as text classification and named entity recognition. It is particularly useful when character-level analysis is important.
