gbert-large-paraphrase-cosine
| Property | Value |
|---|---|
| Base Model | deepset/gbert-large |
| Embedding Dimension | 1024 |
| License | MIT |
| Language | German |
What is gbert-large-paraphrase-cosine?
gbert-large-paraphrase-cosine is a sentence-transformers model for German. It maps sentences and paragraphs to 1024-dimensional dense vectors and was trained with a cosine-similarity objective, which makes it particularly well suited for SetFit and few-shot text classification in German.
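Used with the sentence-transformers library, the model can be loaded and queried in a few lines. A minimal sketch, assuming the model is published on the Hugging Face Hub as deutsche-telekom/gbert-large-paraphrase-cosine (the example sentences are made up):

```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub (model id assumed from this card).
model = SentenceTransformer("deutsche-telekom/gbert-large-paraphrase-cosine")

sentences = [
    "Das Wetter ist heute schön.",
    "Heute scheint die Sonne und es ist warm.",
]

# encode() returns one 1024-dimensional vector per input sentence.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 1024)
```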
Implementation Details
The model is built on deepset's gbert-large and trained with MultipleNegativesRankingLoss using cosine similarity. Training ran for 7 epochs with a learning rate of 8.345726930229726e-06 and a batch size of 57, on a filtered subset of deutsche-telekom/ger-backtrans-paraphrase. The filtering criteria were as follows (a training sketch appears after the list):
- Strict data filtering criteria (minimum 15 characters, Jaccard similarity < 0.3)
- Token count limitations (max 30 tokens)
- Cosine similarity threshold of 0.85
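The sketch below illustrates this setup, but it is not the authors' original training script: it assumes the classic sentence-transformers fit API, a word-level Jaccard implementation, and a toy pair list standing in for the filtered ger-backtrans-paraphrase data.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, util

def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity, used here as a stand-in for the card's filter."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

# Hypothetical (sentence, paraphrase) pairs standing in for ger-backtrans-paraphrase.
pairs = [
    ("Das ist ein Beispiel.", "Dies ist nur ein Beispielsatz."),
]

# Keep only pairs that pass the filtering criteria listed above (thresholds from the
# card; the authors' exact implementation may differ).
filtered = [
    (s1, s2)
    for s1, s2 in pairs
    if len(s1) >= 15 and len(s2) >= 15 and jaccard(s1, s2) < 0.3
]

model = SentenceTransformer("deepset/gbert-large")
train_examples = [InputExample(texts=[s1, s2]) for s1, s2 in filtered]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=57)

# MultipleNegativesRankingLoss with cosine similarity, as described above.
train_loss = losses.MultipleNegativesRankingLoss(model, similarity_fct=util.cos_sim)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=7,
    optimizer_params={"lr": 8.345726930229726e-06},
)
```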
Core Capabilities
- High-quality German sentence embeddings
- Optimized for few-shot learning scenarios
- Superior performance compared to multilingual alternatives
- Efficient paraphrase detection and similarity matching
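For few-shot classification, the model can serve as the embedding body of a SetFit classifier. A sketch using the classic SetFitTrainer API (newer setfit releases expose a Trainer class instead); the tiny sentiment dataset and the model id are illustrative assumptions:

```python
from datasets import Dataset
from setfit import SetFitModel, SetFitTrainer

# A tiny, hypothetical few-shot dataset: four labeled German examples (1 = positive, 0 = negative).
train_ds = Dataset.from_dict({
    "text": [
        "Die Lieferung kam pünktlich an.",
        "Der Kundenservice war sehr freundlich.",
        "Der Versand hat viel zu lange gedauert.",
        "Das Produkt kam beschädigt an.",
    ],
    "label": [1, 1, 0, 0],
})

# Use the embedding model as the SetFit body (model id assumed as above).
model = SetFitModel.from_pretrained("deutsche-telekom/gbert-large-paraphrase-cosine")

trainer = SetFitTrainer(model=model, train_dataset=train_ds)
trainer.train()

print(model.predict(["Das Paket war schnell da."]))  # expected: the positive label
```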
Frequently Asked Questions
Q: What makes this model unique?
This model stands out through its specialized optimization for German and its strong few-shot performance compared to multilingual models and even base German BERT models. Because it was trained with a cosine-similarity objective, it is well suited to semantic similarity tasks.
Q: What are the recommended use cases?
The model is best suited for German text classification tasks, especially in few-shot learning scenarios. It excels at semantic similarity matching, paraphrase detection, and can be effectively used with the SetFit framework for improved classification outcomes.
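In practice, paraphrase detection and similarity matching reduce to comparing cosine similarities between embeddings. A short sketch (query and candidate sentences are made up; model id assumed as above):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("deutsche-telekom/gbert-large-paraphrase-cosine")

query = "Wie kündige ich meinen Vertrag?"
candidates = [
    "Wie kann ich meinen Vertrag beenden?",
    "Wann öffnet der Laden morgen?",
]

query_emb = model.encode(query, convert_to_tensor=True)
cand_emb = model.encode(candidates, convert_to_tensor=True)

# Cosine similarity between the query and each candidate; higher means more similar.
scores = util.cos_sim(query_emb, cand_emb)
print(scores)  # the paraphrase should score clearly higher than the unrelated sentence
```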