# piccolo-large-zh
| Property | Value |
|---|---|
| Model Size | 0.65 GB |
| Embedding Dimension | 1024 |
| Max Sequence Length | 512 |
| License | MIT |
## What is piccolo-large-zh?
piccolo-large-zh is a state-of-the-art Chinese text embedding model developed by SenseTime Research. It is trained in two stages: general pre-training on 400 million weakly supervised text pairs, followed by fine-tuning on 20 million human-labeled pairs. On the CMTEB benchmark it achieves an average score of 64.11 across all 35 evaluation tasks.
## Implementation Details
The model is built on a transformer architecture and trained with a pipeline that combines pair-wise and triplet contrastive learning. The first stage applies a binary contrastive loss with in-batch negatives; the second stage fine-tunes with an improved contrastive loss that incorporates mined hard negatives.
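To make the two stages concrete, here is a minimal PyTorch sketch of an InfoNCE-style contrastive loss: stage one corresponds to calling it with in-batch negatives only, stage two to passing an extra pool of mined hard negatives. This is an illustrative reconstruction under common-practice assumptions (temperature value, L2 normalization), not SenseTime's released training code.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(q_emb: torch.Tensor,
                     p_emb: torch.Tensor,
                     hard_neg_emb: torch.Tensor | None = None,
                     temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE-style contrastive loss with in-batch negatives.

    q_emb:        (B, D) query embeddings
    p_emb:        (B, D) positive passage embeddings; row i matches query i
    hard_neg_emb: optional (N, D) mined hard negatives (stage-two setting)
    """
    q = F.normalize(q_emb, dim=-1)
    candidates = F.normalize(p_emb, dim=-1)
    if hard_neg_emb is not None:
        # Hard negatives are extra candidates every query must score low.
        candidates = torch.cat([candidates, F.normalize(hard_neg_emb, dim=-1)])
    logits = q @ candidates.T / temperature        # (B, B) or (B, B + N)
    labels = torch.arange(q.size(0), device=q.device)  # positive sits at column i
    return F.cross_entropy(logits, labels)

# Stage one: in-batch negatives only; stage two adds a hard-negative pool.
q, p = torch.randn(8, 1024), torch.randn(8, 1024)
print(contrastive_loss(q, p))
print(contrastive_loss(q, p, hard_neg_emb=torch.randn(16, 1024)))
```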
- Supports both short-to-short and short-to-long text matching
- Implements efficient memory usage through fp16 and gradient checkpointing
- Utilizes specialized dataset sampling for optimal batch composition
- Incorporates query/passage prefixes for enhanced retrieval performance (see the usage sketch below)
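A minimal retrieval-style usage sketch with sentence-transformers follows. The Hugging Face model ID `sensenova/piccolo-large-zh` and the "查询: " / "结果: " prefixes are taken from the upstream model card and should be treated as assumptions here; verify them against the official repository.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sensenova/piccolo-large-zh")  # assumed model ID

# Retrieval convention: prefix queries with "查询: " and passages with "结果: ".
query = "查询: 北京有哪些著名的旅游景点？"
passages = [
    "结果: 故宫博物院位于北京市中心，是明清两代的皇家宫殿。",
    "结果: 外滩是上海最著名的观光景点之一。",
]

q_emb = model.encode(query, normalize_embeddings=True)
p_emb = model.encode(passages, normalize_embeddings=True)

# Cosine similarity should rank the Beijing passage first for this query.
print(util.cos_sim(q_emb, p_emb))
```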
## Core Capabilities
- Classification (67.03 average accuracy across 9 tasks)
- Clustering (47.04 average v-measure across 4 tasks)
- Pair Classification (78.38 average precision across 2 tasks)
- Reranking (65.98 MAP across 4 tasks)
- Retrieval (70.93 nDCG@10 across 8 tasks)
- Semantic Textual Similarity (58.02 Spearman correlation across 8 tasks)
## Frequently Asked Questions
Q: What makes this model unique?
Its distinctive features are the two-stage training approach and the asymmetric handling of query/passage pairs, which are truncated to different maximum lengths (64 tokens for queries, 512 for passages). This makes the model particularly effective for retrieval tasks.
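At inference time the same asymmetry can be reproduced by adjusting the truncation length per input type. A sketch with sentence-transformers (the model ID and prefixes are the same assumptions as in the earlier example):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sensenova/piccolo-large-zh")  # assumed model ID

# Queries are short, so truncate aggressively to save compute.
model.max_seq_length = 64
q_emb = model.encode(["查询: 如何在线申请护照？"])

# Passages get the full 512-token window the model supports.
model.max_seq_length = 512
p_emb = model.encode(["结果: 申请护照需携带本人身份证前往出入境管理部门办理。"])
```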
Q: What are the recommended use cases?
The model excels at text similarity matching, information retrieval, and document classification. It is particularly well-suited to Chinese-language applications that require semantic understanding and comparison.
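As an illustration of the classification use case, the embeddings can feed a lightweight downstream classifier. A toy sketch (texts, labels, and the model ID are placeholders; logistic regression is just one reasonable choice):

```python
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

model = SentenceTransformer("sensenova/piccolo-large-zh")  # assumed model ID

# Toy sentiment corpus; real applications would use their own documents.
train_texts = ["这部电影非常精彩", "剧情拖沓，令人失望", "演员的表演很出色", "完全是浪费时间"]
train_labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# No retrieval prefixes here; they are only recommended for retrieval tasks.
X_train = model.encode(train_texts, normalize_embeddings=True)
clf = LogisticRegression().fit(X_train, train_labels)

X_test = model.encode(["这个故事让我深受感动"], normalize_embeddings=True)
print(clf.predict(X_test))  # likely [1] for this positive review
```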