sbert-base-chinese-nli
| Property | Value |
|---|---|
| Author | UER |
| Downloads | 13,410 |
| License | Apache 2.0 |
| Primary Paper | UER Paper |
| Framework | PyTorch + Transformers |
What is sbert-base-chinese-nli?
sbert-base-chinese-nli is a specialized Chinese sentence embedding model developed using the UER-py framework. It's built upon the chinese_roberta_L-12_H-768 architecture and fine-tuned specifically for natural language inference tasks. The model excels at generating semantic representations of Chinese text, making it particularly effective for sentence similarity comparisons.
Implementation Details
The model is implemented using the Sentence-BERT architecture and trained on the ChineseTextualInference dataset. It was fine-tuned for 5 epochs with a sequence length of 128, a learning rate of 5e-5, and a batch size of 64. Training was conducted on Tencent Cloud infrastructure.
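As a rough illustration, a UER-py fine-tuning run with the hyperparameters above might be invoked along these lines. Only the hyperparameter values (5 epochs, sequence length 128, learning rate 5e-5, batch size 64) come from the description; the script name and file paths are assumptions based on UER-py conventions, not confirmed commands:

```shell
# Hypothetical UER-py siamese fine-tuning invocation (paths/script are illustrative)
python3 finetune/run_classifier_siamese.py \
    --pretrained_model_path models/chinese_roberta_base.bin \
    --vocab_path models/google_zh_vocab.txt \
    --train_path datasets/ChineseTextualInference/train.tsv \
    --dev_path datasets/ChineseTextualInference/dev.tsv \
    --learning_rate 5e-5 \
    --epochs_num 5 \
    --batch_size 64 \
    --seq_length 128
```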
- Based on chinese_roberta_L-12_H-768 architecture
- Utilizes a siamese network structure for sentence embedding
- Implements cosine similarity for sentence comparison
- Supports batch processing of Chinese text
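The mechanics behind these points can be sketched in PyTorch: pool the encoder's token embeddings into one fixed-size sentence vector, then compare two sentence vectors with cosine similarity. Mean pooling is assumed here as the pooling strategy, and dummy tensors stand in for the actual encoder output of sbert-base-chinese-nli:

```python
import torch
import torch.nn.functional as F

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Average token embeddings across the sequence, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).float()
    return (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

def cosine_sim(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Cosine similarity between two sentence vectors, in [-1, 1].
    return F.cosine_similarity(a, b, dim=-1)

# Dummy "token embeddings" standing in for the encoder output
# (batch=2, seq_len=4, hidden=768, matching chinese_roberta_L-12_H-768).
emb = torch.randn(2, 4, 768)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 1, 1]])  # second sequence has no padding

sentence_vecs = mean_pool(emb, mask)          # shape: (2, 768)
score = cosine_sim(sentence_vecs[0], sentence_vecs[1])
print(f"similarity: {score.item():.3f}")
```

In a real pipeline, the two branches of the siamese setup share the same encoder weights, so batches of Chinese sentences can be embedded in a single forward pass and compared pairwise.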
Core Capabilities
- Generate high-quality sentence embeddings for Chinese text
- Compute semantic similarity between Chinese sentences
- Support for natural language inference tasks
- Easy integration with PyTorch and Transformers pipeline
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its specialized focus on Chinese language processing and its optimization for sentence similarity tasks through NLI fine-tuning. It combines the robustness of BERT architecture with specific adaptations for Chinese semantic understanding.
Q: What are the recommended use cases?
The model is ideal for applications requiring semantic similarity comparison between Chinese sentences, including but not limited to: document similarity analysis, semantic search, text clustering, and question-answer matching systems.
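To make the semantic-search use case concrete, here is a minimal ranking sketch: given a query embedding and a small corpus of precomputed sentence embeddings, return the closest sentences by cosine similarity. The tiny 3-dimensional vectors are placeholders; in practice each would be a 768-dimensional embedding produced by sbert-base-chinese-nli:

```python
import numpy as np

# Placeholder corpus: text -> dummy embedding (real embeddings would be 768-dim).
corpus = {
    "今天天气很好": np.array([0.9, 0.1, 0.0]),   # "The weather is nice today"
    "今天是晴天": np.array([0.8, 0.2, 0.1]),     # "It is sunny today"
    "我喜欢吃苹果": np.array([0.1, 0.9, 0.3]),   # "I like eating apples"
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_vec: np.ndarray, corpus: dict, top_k: int = 2) -> list:
    # Rank corpus sentences by cosine similarity to the query embedding.
    scored = [(text, cosine(query_vec, vec)) for text, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

query = np.array([0.85, 0.15, 0.05])  # stands in for an embedded query sentence
for text, score in search(query, corpus):
    print(f"{score:.3f}  {text}")
```

The same pattern extends to question-answer matching (embed questions and candidate answers) and clustering (feed the embedding matrix to any vector-based clustering algorithm).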