NB-BERT-base
| Property | Value |
|---|---|
| Parameter Count | 179M |
| License | CC-BY-4.0 |
| Author | NbAiLab |
| Framework Support | PyTorch, TensorFlow, JAX |
| Task | Fill-Mask |
What is NB-BERT-base?
NB-BERT-base is a specialized BERT-based language model developed by the National Library of Norway, designed to process and understand Norwegian text in both the bokmål and nynorsk variants. Built on the architecture of BERT's multilingual cased model, this 179M-parameter model was trained on an extensive collection of Norwegian texts spanning 200 years.
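As a quick illustration, the model can be exercised through the Hugging Face fill-mask pipeline. This is a minimal sketch, assuming the hub identifier `NbAiLab/nb-bert-base` and a working `transformers` installation; the example sentence is our own.

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub (hub id assumed: NbAiLab/nb-bert-base).
fill_mask = pipeline("fill-mask", model="NbAiLab/nb-bert-base")

# Ask the model to fill in the masked token in a Norwegian sentence
# ("On weekends I like to [MASK] with friends").
for prediction in fill_mask("På helgene liker jeg å [MASK] med venner."):
    print(f"{prediction['token_str']:>12}  (score: {prediction['score']:.3f})")
```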
Implementation Details
The model follows the BERT-base architecture and ships with multi-framework support for PyTorch, TensorFlow, and JAX (a loading sketch for each framework follows the list below). Its weights are distributed in the Safetensors format for safe, fast loading, and it is trained for masked language modeling. Training on a diverse corpus of Norwegian text makes it particularly robust for Norwegian language understanding tasks.
- Supports both bokmål and nynorsk variants of Norwegian
- Based on BERT's multilingual cased model architecture
- Weights distributed in the Safetensors format for safe, fast loading
- Trained on text spanning two centuries of Norwegian writing
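A minimal sketch of loading the same checkpoint in each supported framework, again assuming the hub id `NbAiLab/nb-bert-base`. If a framework's native weights are not published in the repository, `transformers` can usually convert from the PyTorch checkpoint via `from_pt=True`, as shown.

```python
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,      # PyTorch
    TFAutoModelForMaskedLM,    # TensorFlow
    FlaxAutoModelForMaskedLM,  # JAX/Flax
)

model_id = "NbAiLab/nb-bert-base"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)

# PyTorch weights; Safetensors files are preferred automatically when present.
pt_model = AutoModelForMaskedLM.from_pretrained(model_id)

# TensorFlow and JAX/Flax variants; from_pt=True converts from the
# PyTorch checkpoint if native weights are not in the repo.
tf_model = TFAutoModelForMaskedLM.from_pretrained(model_id, from_pt=True)
flax_model = FlaxAutoModelForMaskedLM.from_pretrained(model_id, from_pt=True)
```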
Core Capabilities
- Masked language modeling for Norwegian text
- General-purpose Norwegian language understanding
- Support for historical Norwegian text analysis
- Fine-tuning capability for specific downstream tasks (see the sketch below)
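To make the fine-tuning point concrete, here is a minimal PyTorch sketch that puts a sequence-classification head on the checkpoint and runs a single training step. The two-sentence dataset and its labels are invented for illustration; a real run would loop over a proper labeled dataset.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "NbAiLab/nb-bert-base"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Add a fresh 2-class classification head on top of the pretrained encoder.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Toy labeled examples (invented for illustration): 1 = positive, 0 = negative.
texts = ["Dette var en fantastisk film!", "Boken var dessverre ganske kjedelig."]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One illustrative training step; a real run would iterate over a DataLoader.
model.train()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"loss: {outputs.loss.item():.4f}")
```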
Frequently Asked Questions
Q: What makes this model unique?
This model is optimized specifically for Norwegian language processing and was trained on an extensive historical corpus from the National Library of Norway, which makes it particularly effective for analyzing both modern and historical Norwegian text.
Q: What are the recommended use cases?
The model is designed for general-purpose Norwegian language tasks and should be fine-tuned for specific applications. It's particularly useful for tasks involving masked language modeling, text classification, and general Norwegian language understanding in both bokmål and nynorsk variants.
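For tasks beyond fill-mask, the encoder's hidden states can also serve as features for Norwegian text. The sketch below mean-pools the last hidden layer into one sentence vector per input, which is one common pooling choice (not the only one); the bokmål and nynorsk example sentences are our own.

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "NbAiLab/nb-bert-base"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# One bokmål and one nynorsk sentence, matching the model's coverage.
sentences = ["Nasjonalbiblioteket ligger i Oslo.", "Eg likar å lese bøker."]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden_size)

# Mean-pool over real tokens only, using the attention mask to skip padding.
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 768])
```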