KURE-v1

Maintained by: nlpai-lab

Property             Value
Developer            NLP&AI Lab
Base Model           BAAI/bge-m3
License              MIT
Embedding Dimension  1024
Sequence Length      8192 tokens

What is KURE-v1?

KURE-v1 (Korea University Retrieval Embedding) is an embedding model specialized for Korean text retrieval. It is fine-tuned from BAAI/bge-m3 using CachedGISTEmbedLoss and is reported to outperform other multilingual models on Korean retrieval tasks.

Implementation Details

The model was trained on 2 million Korean query-document pairs, each with 5 hard negatives. Training used a batch size of 4096 and a learning rate of 2e-05, and ran for one epoch with CachedGISTEmbedLoss from sentence-transformers.
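To make the scale of this setup concrete, the following is a minimal sketch that derives a few quantities from the stated hyperparameters. The class name `TrainingSetup` and its methods are hypothetical, and the sketch assumes exactly 2,000,000 pairs, one optimizer step per batch, and no gradient accumulation, none of which the source confirms.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters as stated in the text (hypothetical container)."""
    num_pairs: int = 2_000_000    # query-document pairs ("2 million", assumed exact)
    hard_negatives: int = 5       # hard negatives per example
    batch_size: int = 4096
    learning_rate: float = 2e-5
    epochs: int = 1

    def steps_per_epoch(self, drop_last: bool = False) -> int:
        # Assumption: one optimizer step per batch, no gradient accumulation.
        if drop_last:
            return self.num_pairs // self.batch_size
        return math.ceil(self.num_pairs / self.batch_size)

    def texts_per_batch(self) -> int:
        # Each example contributes 1 query + 1 positive + its hard negatives.
        return self.batch_size * (2 + self.hard_negatives)


setup = TrainingSetup()
print(setup.steps_per_epoch())   # 489
print(setup.texts_per_batch())   # 28672
```

Under these assumptions a single epoch amounts to roughly 489 optimizer steps, with about 28,672 texts encoded per step.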

  • Supports both Korean and English text
  • Reported state-of-the-art results across 8 benchmark datasets
  • Produces 1024-dimensional embedding vectors
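As an illustration of how such embeddings are typically used for retrieval, here is a minimal sketch that ranks documents by cosine similarity to a query. The helper names (`cosine_similarity`, `rank_documents`, `embed_with_kure`) are hypothetical. The loader assumes the model is published on the Hugging Face Hub under the id `nlpai-lab/KURE-v1` and loads through sentence-transformers; it is not called here, and the demo uses toy 3-dimensional vectors in place of real 1024-dimensional ones.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float],
                   doc_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


def embed_with_kure(texts: List[str]):
    """Embed texts with KURE-v1 (not executed in this sketch).

    Assumption: the model id "nlpai-lab/KURE-v1" and sentence-transformers
    compatibility; the result should have shape (len(texts), 1024).
    """
    from sentence_transformers import SentenceTransformer  # third-party
    model = SentenceTransformer("nlpai-lab/KURE-v1")
    return model.encode(texts)


# Toy vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
print(rank_documents(query, docs))  # [1, 2, 0]
```

In practice the query and documents would be passed through `embed_with_kure` first, and a vector index would replace the linear scan for large collections.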

Core Capabilities

  • Top-1 retrieval (k = 1): 0.52640 recall and 0.60551 precision
  • Strong reported performance across domains including finance, healthcare, legal, and commerce
  • Handles long documents with an 8192-token context window
  • Supports diverse retrieval tasks from Wikipedia-based queries to domain-specific applications
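The top-1 recall and precision figures above can be read against the usual definitions of recall@k and precision@k. The sketch below assumes those standard definitions and simple per-query averaging; the source does not state how its scores were computed. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def recall_precision_at_k(ranked_ids: List[str],
                          relevant_ids: Set[str],
                          k: int) -> Tuple[float, float]:
    """Recall@k = hits / number of relevant docs; precision@k = hits / k."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    precision = hits / k
    return recall, precision


def mean_at_k(runs: Sequence[Tuple[List[str], Set[str]]],
              k: int) -> Tuple[float, float]:
    """Average recall@k and precision@k over (ranking, relevant set) pairs."""
    if not runs:
        raise ValueError("runs must not be empty")
    scores = [recall_precision_at_k(ranked, relevant, k) for ranked, relevant in runs]
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n


# Toy data: two queries with their ranked results and relevant documents.
runs = [
    (["d1", "d2", "d3"], {"d1", "d4"}),  # top-1 hit, 2 relevant docs
    (["d5", "d6", "d7"], {"d6"}),        # top-1 miss
]
mean_recall, mean_precision = mean_at_k(runs, k=1)
print(mean_recall, mean_precision)  # 0.25 0.5
```

Under these definitions, recall at k = 1 cannot exceed precision at k = 1, and it falls below it whenever a query has more than one relevant document, which is consistent with the reported recall being lower than the reported precision.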

Frequently Asked Questions

Q: What makes this model unique?

KURE-v1 is optimized specifically for Korean text retrieval and is reported to outperform other multilingual models across multiple benchmarks, while retaining English language capabilities.

Q: What are the recommended use cases?

The model is suited to document retrieval across domains including finance, healthcare, legal, and public sector applications. It is reported to be particularly effective for multi-hop question answering, long document retrieval, and domain-specific information extraction.
