# japanese-hubert-base
| Property | Value |
|---|---|
| Parameter Count | 94.4M |
| License | Apache 2.0 |
| Training Data | ReazonSpeech v1 (19,000 hours) |
| Paper | Research Paper |
| Tensor Type | F32 |
## What is japanese-hubert-base?
japanese-hubert-base is a speech representation model developed by rinna Co., Ltd. It is based on HuBERT (Hidden-Unit BERT) and adapted for Japanese speech processing: the model keeps the 12 transformer layers and 12 attention heads of the original HuBERT Base architecture, but is pretrained on Japanese speech data.
## Implementation Details
The model was trained on approximately 19,000 hours of Japanese speech from the ReazonSpeech v1 corpus. It uses the same architecture as facebook/hubert-base-ls960, optimized for Japanese through its training data. A minimal loading sketch follows the list below.
- 12 transformer layers with 12 attention heads
- 94.4M parameters for robust feature extraction
- Trained using the official Facebook research repository code
- Supports the PyTorch framework, with weights available in Safetensors format
## Core Capabilities
- Self-supervised speech representation learning
- Masked prediction of hidden units
- Feature extraction from raw audio input
- Processing of 16kHz Japanese speech signals
- Generation of 768-dimensional feature vectors per frame (see the layer-probing sketch below)
## Frequently Asked Questions
**Q: What makes this model unique?**
This model is trained specifically on Japanese speech data, which optimizes it for Japanese speech processing tasks. Pretraining on 19,000 hours of Japanese speech gives it broad coverage of Japanese-specific speech patterns and phonetics.
**Q: What are the recommended use cases?**
The model is well suited to Japanese speech processing tasks such as speech recognition, speaker verification, and speech feature extraction. It is particularly useful for researchers and developers building Japanese speech applications that need high-quality speech representations.