gbert-large-paraphrase-cosine
| Property | Value |
|---|---|
| Base Model | deepset/gbert-large |
| Embedding Dimension | 1024 |
| License | MIT |
| Language | German |
What is gbert-large-paraphrase-cosine?
gbert-large-paraphrase-cosine is a sentence-transformers model for German. It maps sentences and paragraphs to 1024-dimensional dense vectors and was trained with a cosine-similarity objective, which makes it particularly well suited for SetFit and few-shot text classification in German.
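Used with the sentence-transformers library, the model can be loaded and queried in a few lines. A minimal sketch, assuming the model is published on the Hugging Face Hub as deutsche-telekom/gbert-large-paraphrase-cosine (the example sentences are made up):

```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub (model id assumed from this card).
model = SentenceTransformer("deutsche-telekom/gbert-large-paraphrase-cosine")

sentences = [
    "Das Wetter ist heute schön.",
    "Heute scheint die Sonne und es ist warm.",
]

# encode() returns one 1024-dimensional vector per input sentence.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 1024)
```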
Implementation Details
The model is built on deepset's gbert-large and trained with MultipleNegativesRankingLoss using cosine similarity. Training ran for 7 epochs with a learning rate of 8.345726930229726e-06 and a batch size of 57, on a filtered subset of deutsche-telekom/ger-backtrans-paraphrase. The filtering criteria were as follows (a training sketch appears after the list):
- Strict data filtering criteria (minimum 15 characters, Jaccard similarity < 0.3)
- Token count limitations (max 30 tokens)
- Cosine similarity threshold of 0.85
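The sketch below illustrates this setup, but it is not the authors' original training script: it assumes the classic sentence-transformers fit API, a word-level Jaccard implementation, and a toy pair list standing in for the filtered ger-backtrans-paraphrase data.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, util

def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity, used here as a stand-in for the card's filter."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

# Hypothetical (sentence, paraphrase) pairs standing in for ger-backtrans-paraphrase.
pairs = [
    ("Das ist ein Beispiel.", "Dies ist nur ein Beispielsatz."),
]

# Keep only pairs that pass the filtering criteria listed above (thresholds from the
# card; the authors' exact implementation may differ).
filtered = [
    (s1, s2)
    for s1, s2 in pairs
    if len(s1) >= 15 and len(s2) >= 15 and jaccard(s1, s2) < 0.3
]

model = SentenceTransformer("deepset/gbert-large")
train_examples = [InputExample(texts=[s1, s2]) for s1, s2 in filtered]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=57)

# MultipleNegativesRankingLoss with cosine similarity, as described above.
train_loss = losses.MultipleNegativesRankingLoss(model, similarity_fct=util.cos_sim)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=7,
    optimizer_params={"lr": 8.345726930229726e-06},
)
```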
Core Capabilities
- High-quality German sentence embeddings
- Optimized for few-shot learning scenarios
- Superior performance compared to multilingual alternatives
- Efficient paraphrase detection and similarity matching
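For few-shot classification, the model can serve as the embedding body of a SetFit classifier. A sketch using the classic SetFitTrainer API (newer setfit releases expose a Trainer class instead); the tiny sentiment dataset and the model id are illustrative assumptions:

```python
from datasets import Dataset
from setfit import SetFitModel, SetFitTrainer

# A tiny, hypothetical few-shot dataset: four labeled German examples (1 = positive, 0 = negative).
train_ds = Dataset.from_dict({
    "text": [
        "Die Lieferung kam pünktlich an.",
        "Der Kundenservice war sehr freundlich.",
        "Der Versand hat viel zu lange gedauert.",
        "Das Produkt kam beschädigt an.",
    ],
    "label": [1, 1, 0, 0],
})

# Use the embedding model as the SetFit body (model id assumed as above).
model = SetFitModel.from_pretrained("deutsche-telekom/gbert-large-paraphrase-cosine")

trainer = SetFitTrainer(model=model, train_dataset=train_ds)
trainer.train()

print(model.predict(["Das Paket war schnell da."]))  # expected: the positive label
```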
Frequently Asked Questions
Q: What makes this model unique?
This model stands out through its specialized optimization for German and its strong few-shot performance compared to multilingual models and even base German BERT models. Because it was trained with a cosine-similarity objective, it is well suited to semantic similarity tasks.
Q: What are the recommended use cases?
The model is best suited for German text classification tasks, especially in few-shot learning scenarios. It excels at semantic similarity matching, paraphrase detection, and can be effectively used with the SetFit framework for improved classification outcomes.
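In practice, paraphrase detection and similarity matching reduce to comparing cosine similarities between embeddings. A short sketch (query and candidate sentences are made up; model id assumed as above):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("deutsche-telekom/gbert-large-paraphrase-cosine")

query = "Wie kündige ich meinen Vertrag?"
candidates = [
    "Wie kann ich meinen Vertrag beenden?",
    "Wann öffnet der Laden morgen?",
]

query_emb = model.encode(query, convert_to_tensor=True)
cand_emb = model.encode(candidates, convert_to_tensor=True)

# Cosine similarity between the query and each candidate; higher means more similar.
scores = util.cos_sim(query_emb, cand_emb)
print(scores)  # the paraphrase should score clearly higher than the unrelated sentence
```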