piccolo-large-zh-v2

Maintained By: sensenova

Model Size: 1.21 GB
Embedding Dimension: 1792
Paper: arXiv:2405.06932
Sequence Length: 512
C-MTEB Score: 70.95 (Current SOTA)

What is piccolo-large-zh-v2?

piccolo-large-zh-v2 is an advanced Chinese embedding model developed by SenseTime Research that leverages multi-task hybrid loss training to achieve state-of-the-art performance on the C-MTEB benchmark. The model builds upon the success of its predecessor by implementing an efficient training approach that combines multiple specialized loss functions for different types of tasks.

Implementation Details

The model employs three distinct loss functions, selected by task type: InfoNCE with in-batch negatives for retrieval/reranking tasks, CoSENT loss for STS/pair-classification tasks, and a modified InfoNCE variant for classification/clustering tasks (a minimal sketch of the first two follows the list below). It is built on the stella-v3.5 architecture and was trained for 2,500 steps on 32 GPUs.

  • Supports flexible embedding dimensions (256 to 1792)
  • Implements Matryoshka Representation Learning (MRL) for dimension adaptability
  • Achieves superior average performance across the 35 C-MTEB evaluation datasets
  • Uses hybrid loss training to optimize for different task types
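
The two main objectives named above are standard contrastive losses, and the PyTorch snippet below is only a minimal sketch of them, not SenseTime's training code: the temperature, scale factor, and batch conventions are illustrative assumptions. In a hybrid setup, each batch would be routed to the loss that matches its task type.

```python
import torch
import torch.nn.functional as F

def info_nce_in_batch(query_emb, doc_emb, temperature=0.05):
    """InfoNCE with in-batch negatives: every other document in the
    batch acts as a negative for a given query."""
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    logits = q @ d.T / temperature                      # [batch, batch] similarities
    labels = torch.arange(q.size(0), device=q.device)   # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

def cosent_loss(emb_a, emb_b, pair_labels, scale=20.0):
    """CoSENT loss for STS/pair data: the cosine similarity of every
    positive pair should exceed that of every negative pair."""
    cos = F.cosine_similarity(emb_a, emb_b) * scale     # [batch] scaled similarities
    diff = cos[None, :] - cos[:, None]                  # diff[i, j] = cos_j - cos_i
    mask = pair_labels[:, None] > pair_labels[None, :]  # keep (i positive, j negative)
    diff = diff[mask]
    # log(1 + sum(exp(neg - pos))), written as logsumexp with an implicit zero term
    zero = torch.zeros(1, device=cos.device)
    return torch.logsumexp(torch.cat([zero, diff]), dim=0)

# Toy check with random embeddings (batch of 8, 1792-dim like piccolo-large-zh-v2)
q, d = torch.randn(8, 1792), torch.randn(8, 1792)
labels = torch.randint(0, 2, (8,))
print(info_nce_in_batch(q, d), cosent_loss(q, d, labels))
```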

Core Capabilities

  • State-of-the-art performance on C-MTEB Chinese benchmarks
  • Excellent results in classification (74.59%), clustering (62.17%), and pair classification (90.24%)
  • Robust performance in reranking (70%) and retrieval tasks (74.36%)
  • Flexible dimension reduction while maintaining performance
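
Because the model is trained with MRL, its 1792-dimensional vectors can be truncated to smaller sizes (256 to 1792, per the specs above) and re-normalized. The sketch below uses sentence-transformers; the sensenova/piccolo-large-zh-v2 repo ID and the assumption that no instruction prefix is required should be confirmed against the official Hugging Face model card.

```python
from sentence_transformers import SentenceTransformer
import numpy as np

# Repo ID assumed from the maintainer and model name; verify before use.
model = SentenceTransformer("sensenova/piccolo-large-zh-v2")

sentences = ["今天天气真好", "今天的天气非常不错"]
full = model.encode(sentences)                     # shape (2, 1792)

# MRL-style truncation: keep the first k dimensions, then re-normalize.
k = 256
trunc = full[:, :k]
trunc = trunc / np.linalg.norm(trunc, axis=1, keepdims=True)

cos_full = full[0] @ full[1] / (np.linalg.norm(full[0]) * np.linalg.norm(full[1]))
cos_trunc = trunc[0] @ trunc[1]
print(f"cosine @ 1792 dims: {cos_full:.4f}, cosine @ {k} dims: {cos_trunc:.4f}")
```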

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its multi-task hybrid loss training, which combines loss functions tailored to specific task types, along with its flexible embedding dimensions, which range from 256 to 1792.

Q: What are the recommended use cases?

The model excels in various scenarios including text similarity comparison, document retrieval, classification tasks, and clustering applications. It's particularly effective for Chinese language processing tasks requiring high-quality semantic embeddings.
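
As a concrete illustration of the retrieval use case, the sketch below ranks a small Chinese corpus against a query with sentence-transformers; the repo ID and the absence of a query prefix are, again, assumptions rather than documented requirements.

```python
from sentence_transformers import SentenceTransformer, util
import torch

model = SentenceTransformer("sensenova/piccolo-large-zh-v2")  # repo ID assumed

corpus = [
    "北京是中国的首都。",
    "深度学习模型需要大量的训练数据。",
    "故宫位于北京市中心。",
]
query = "中国的首都是哪座城市?"

corpus_emb = model.encode(corpus, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Cosine similarity between the query and every corpus passage.
scores = util.cos_sim(query_emb, corpus_emb)[0]

# Print passages from most to least similar to the query.
for idx in torch.argsort(scores, descending=True).tolist():
    print(f"{scores[idx].item():.4f}  {corpus[idx]}")
```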
