bart-base-chinese

fnlp

Chinese BART-base model (140M params) for text generation and understanding. Features extended vocabulary of 51,271 tokens and 1024 position embeddings. Strong performance on AFQMC, IFLYTEK, CSL-sum, and LCSTS tasks.

Parameter Count: 140M
Model Type: Text-to-Text Generation
Architecture: BART
Paper: CPT: A Pre-Trained Unbalanced Transformer
Author: fnlp

What is bart-base-chinese?

BART-Base Chinese is a text generation and understanding model designed specifically for Chinese language processing. Updated substantially in December 2022, it features an enhanced vocabulary of 51,271 tokens and position embeddings extended to 1024 tokens, allowing it to process longer inputs while maintaining competitive performance across multiple Chinese benchmark tasks.

Implementation Details

The model employs a BART architecture optimized for Chinese language processing, incorporating several technical improvements over its predecessor. The implementation uses PyTorch and supports Safetensors, making it efficient for both training and inference.

  • Enhanced vocabulary including 6,800+ additional Chinese characters
  • Removal of redundant tokens and optimization of English token coverage
  • Extended position embeddings from 512 to 1024
  • Further trained with batch size 2048 and a peak learning rate of 2e-5

Core Capabilities

  • Text-to-text generation with strong performance on AFQMC (73.03)
  • Effective on IFLYTEK classification tasks (61.25)
  • Strong summarization capabilities demonstrated on CSL-sum (61.51) and LCSTS (38.78)
  • Masked text completion and generation

Frequently Asked Questions

Q: What makes this model unique?

The model's uniqueness lies in its optimized vocabulary for Chinese language processing, including both simplified and traditional characters, and its extended position embeddings that allow for longer text processing. It maintains competitive performance while offering improved token coverage.

Q: What are the recommended use cases?

The model is particularly well-suited for Chinese text generation tasks, summarization, and text completion. It performs well in scenarios requiring understanding of both simplified and traditional Chinese characters, making it versatile for various NLP applications.
