xiaobu-embedding-v2

Maintained By
lier007

Property    Value
Based On    piccolo-embedding
Framework   PyTorch/Sentence-Transformers
Paper       Circle Loss Paper

What is xiaobu-embedding-v2?

xiaobu-embedding-v2 is a Chinese text embedding model based on the piccolo-embedding architecture. It is trained on data accumulated from xiaobu-embedding-v1 and uses a unified circle loss formulation to cover the six task types in CMTEB (Chinese Massive Text Embedding Benchmark).

Implementation Details

The model treats multiple NLP task types under a single circle loss formulation, which offers two advantages: (1) it makes fuller use of the multiple positive examples present in the original datasets, and (2) it avoids having to balance the weights of several separate loss functions. The implementation is based on the sentence-transformers framework, which makes the model straightforward to use in applications.

  • Unified circle loss implementation for multiple task types
  • Improved synthetic data utilization
  • Built on sentence-transformers architecture
  • Optimized for Chinese language processing
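
To make the "multiple positives, one loss" idea concrete, here is a minimal sketch of the circle loss for a single anchor, following the formulation in the Circle Loss paper. It is not the model's training code: the function names, the margin and gamma defaults, and the way positives and negatives would be collected for each task are all assumptions made for illustration.

```python
import math


def logsumexp(values):
    """Stable log(sum(exp(v))) over a list; -inf for an empty list."""
    if not values:
        return float("-inf")
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def softplus(x):
    """Stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def circle_loss(pos_sims, neg_sims, margin=0.25, gamma=32.0):
    """Circle loss for one anchor.

    pos_sims: cosine similarities between the anchor and ALL its positives.
    neg_sims: cosine similarities between the anchor and its negatives.
    margin, gamma: illustrative values, not the model's actual settings.
    """
    # Positives: optimum O_p = 1 + margin, decision margin Delta_p = 1 - margin.
    logit_p = [
        -gamma * max(0.0, 1.0 + margin - s) * (s - (1.0 - margin))
        for s in pos_sims
    ]
    # Negatives: optimum O_n = -margin, decision margin Delta_n = margin.
    logit_n = [
        gamma * max(0.0, s + margin) * (s - margin)
        for s in neg_sims
    ]
    # log(1 + sum_j exp(logit_n_j) * sum_i exp(logit_p_i))
    return softplus(logsumexp(logit_p) + logsumexp(logit_n))
```

Because every positive contributes its own term to the sum, an anchor with several positives needs no special handling, and there is only one loss to compute regardless of which task the pairs come from.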

Core Capabilities

  • Strong performance on classification tasks (up to 94.9% accuracy on OnlineShopping)
  • Effective text similarity measurement
  • Robust clustering capabilities (78.75% v-measure on ThuNewsClusteringP2P)
  • High-quality text retrieval (up to 90% MAP on various retrieval tasks)
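
For readers unfamiliar with the MAP metric cited above, the following sketch computes the textbook definition of mean average precision over ranked result lists. The function names and input layout are hypothetical, and the benchmark's own evaluation code may differ in details such as rank cutoffs.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query.

    ranked_ids: document ids in ranked order (assumed unique).
    relevant_ids: ids of the documents judged relevant for the query.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank of each relevant document.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

A score near 1.0 means relevant documents are consistently ranked ahead of irrelevant ones.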

Frequently Asked Questions

Q: What makes this model unique?

What distinguishes the model is its use of a single circle loss formulation across multiple NLP task types, combined with its focus on Chinese language understanding and its improved use of synthetic data.

Q: What are the recommended use cases?

The model is suited to text similarity comparison, classification, clustering, and information retrieval, particularly when the text is in Chinese.
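
As a rough illustration of the similarity and retrieval use cases, the sketch below embeds texts with sentence-transformers and ranks documents by cosine similarity to a query. The Hugging Face model id `lier007/xiaobu-embedding-v2` is an assumption inferred from the maintainer and model names on this page, and the helper names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


def embed(texts, model_name="lier007/xiaobu-embedding-v2"):
    """Embed a list of texts. The model id is an assumption, not confirmed here."""
    # Imported lazily so the helpers above work without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(texts, normalize_embeddings=True).tolist()


def demo():
    """Downloads the model on first use; call manually to try it."""
    docs = ["今天天气很好", "这款手机的电池很耐用", "明天可能会下雨"]
    query_vec = embed(["天气怎么样"])[0]
    for i in rank_documents(query_vec, embed(docs)):
        print(docs[i])
```

Calling `demo()` prints the documents in ranked order; the same embeddings can be passed to a classifier or a clustering algorithm for the other use cases.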
