# ko-sbert-sts
| Property | Value |
|---|---|
| Author | jhgan |
| Model Type | Sentence-BERT (SBERT) |
| Embedding Dimension | 768 |
| Performance | 81.55 Cosine Pearson on KorSTS |
| Paper | KorNLI and KorSTS |
## What is ko-sbert-sts?
ko-sbert-sts is a specialized Korean language model designed for generating semantic sentence embeddings. Built on the SBERT architecture, it maps Korean sentences and paragraphs into a 768-dimensional vector space, enabling advanced natural language processing tasks like semantic similarity comparison and clustering.
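A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub as `jhgan/ko-sbert-sts` and that the `sentence-transformers` library is installed:

```python
from sentence_transformers import SentenceTransformer

# Load the model; the Hub ID "jhgan/ko-sbert-sts" is inferred from this card
model = SentenceTransformer("jhgan/ko-sbert-sts")

sentences = [
    "안녕하세요?",                             # "Hello?"
    "한국어 문장 임베딩을 생성하는 모델입니다.",  # "A model that generates Korean sentence embeddings."
]

# Each sentence is mapped to a 768-dimensional vector
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768)
```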
## Implementation Details
The model implements a two-component architecture that combines a BERT-based transformer with a pooling layer. It was trained on the KorSTS dataset with a cosine similarity loss and achieves state-of-the-art performance on Korean semantic textual similarity tasks. Key training details (a training sketch follows this list):
- Uses a mean pooling strategy for sentence embedding generation
- Trained with the AdamW optimizer (learning rate: 2e-05)
- Uses warmup linear scheduling with 360 warmup steps
- Batch size of 8 over 5 training epochs
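Based on those hyperparameters, here is a hedged training sketch using the `sentence-transformers` fit API. The base checkpoint name and the KorSTS example pairs are illustrative assumptions, not details confirmed by this card:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Two-component architecture: BERT-based transformer + mean pooling
# (the base checkpoint "klue/bert-base" is an assumption; the card does not name it)
word_embedding = models.Transformer("klue/bert-base")
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode="mean",
)
model = SentenceTransformer(modules=[word_embedding, pooling])

# Hypothetical KorSTS-style pairs; gold scores rescaled from [0, 5] to [0, 1]
train_examples = [
    InputExample(texts=["한 남자가 기타를 친다.", "남자가 악기를 연주한다."], label=4.2 / 5.0),
    InputExample(texts=["아이가 잔디 위를 달린다.", "주식 시장이 하락했다."], label=0.0 / 5.0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=8)

# Cosine similarity loss, as stated in the card
train_loss = losses.CosineSimilarityLoss(model)

# AdamW (the library default) at lr 2e-05, warmup-linear schedule, 360 warmup steps
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=5,
    warmup_steps=360,
    scheduler="warmuplinear",
    optimizer_params={"lr": 2e-5},
)
```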
## Core Capabilities
- Sentence embedding generation for Korean text
- Semantic similarity computation between Korean sentences (an example follows this list)
- Clustering and semantic search applications
- Strong performance across multiple similarity metrics (Cosine, Euclidean, Manhattan)
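For instance, pairwise similarity can be computed with the library's cosine similarity helper; the sentences below are illustrative:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("jhgan/ko-sbert-sts")

emb1 = model.encode("날씨가 정말 좋네요.", convert_to_tensor=True)       # "The weather is really nice."
emb2 = model.encode("오늘은 화창한 하루입니다.", convert_to_tensor=True)  # "Today is a sunny day."

# Cosine similarity in [-1, 1]; higher means more semantically similar
score = util.cos_sim(emb1, emb2)
print(float(score))
```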
## Frequently Asked Questions

### Q: What makes this model unique?
This model is specifically optimized for Korean semantic similarity tasks, achieving an 81.55 cosine Pearson correlation on the KorSTS benchmark. It is notable for handling the nuances of Korean effectively while retaining the robust SBERT architecture.
### Q: What are the recommended use cases?
The model excels in applications requiring semantic understanding of Korean text, including: semantic search systems, document clustering, similarity-based recommendation systems, and automated text analysis tools. It's particularly effective for tasks requiring precise semantic comparison between Korean sentences.
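As an illustration of the semantic search use case, the sketch below indexes a small hypothetical corpus and retrieves the closest matches for a query; the documents and query are assumptions made for the example:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("jhgan/ko-sbert-sts")

# Hypothetical document collection
corpus = [
    "주문하신 상품이 오늘 발송되었습니다.",       # "Your order was shipped today."
    "환불 절차는 영업일 기준 3일이 소요됩니다.",   # "Refunds take 3 business days."
    "새로운 기능이 다음 업데이트에 포함됩니다.",   # "New features ship in the next update."
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("배송은 언제 되나요?", convert_to_tensor=True)  # "When will it ship?"

# Retrieve the top-2 most semantically similar documents
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], round(hit["score"], 3))
```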