paraphrase-TinyBERT-L6-v2
| Property | Value |
|---|---|
| Parameter Count | 67M |
| Output Dimensions | 768 |
| License | Apache 2.0 |
What is paraphrase-TinyBERT-L6-v2?
paraphrase-TinyBERT-L6-v2 is a compact yet powerful sentence transformer model designed for generating semantic text embeddings. Built on the TinyBERT architecture, it creates 768-dimensional vector representations of sentences and paragraphs, making it ideal for tasks like semantic search and text clustering.
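Generating embeddings takes only a few lines with the sentence-transformers library. A minimal sketch, assuming the model is available on the Hugging Face Hub under the id sentence-transformers/paraphrase-TinyBERT-L6-v2:

```python
from sentence_transformers import SentenceTransformer

# Load the model (id assumed to be the sentence-transformers Hub entry)
model = SentenceTransformer("sentence-transformers/paraphrase-TinyBERT-L6-v2")

sentences = [
    "The weather is lovely today.",
    "It's sunny and warm outside.",
]

# encode() returns one 768-dimensional vector per input sentence
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768)
```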
Implementation Details
The model uses a two-component architecture: a Transformer layer with a maximum sequence length of 128 tokens, followed by a mean-pooling layer (a sketch of this pipeline follows the list below). It supports multiple frameworks, including PyTorch, ONNX, and TensorFlow, making it versatile across deployment scenarios.
- Efficient architecture with only 67M parameters
- Supports mean pooling for token aggregation
- Compatible with the sentence-transformers and Hugging Face Transformers libraries
- Processes sequences up to 128 tokens
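When the sentence-transformers library isn't available, the same two-stage pipeline can be reproduced with Hugging Face Transformers directly: run the encoder, then mean-pool the token embeddings using the attention mask. A minimal sketch, under the same model-id assumption as above:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Model id assumed to be the sentence-transformers Hub entry
model_id = "sentence-transformers/paraphrase-TinyBERT-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["TinyBERT keeps inference light.", "Embeddings power semantic search."]

# Tokenize with truncation at the model's 128-token maximum sequence length
encoded = tokenizer(sentences, padding=True, truncation=True,
                    max_length=128, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, 768)

# Mean pooling: average token vectors, masking out padding positions
mask = encoded["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embeddings.shape)  # torch.Size([2, 768])
```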
Core Capabilities
- Sentence and paragraph embedding generation
- Semantic similarity computation (sketched below)
- Text clustering and classification
- Cross-lingual text matching
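Semantic similarity between two texts, for instance, reduces to cosine similarity between their embeddings. A sketch using the util helpers shipped with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/paraphrase-TinyBERT-L6-v2")

# Two paraphrases should score noticeably higher than unrelated text
emb_a = model.encode("How do I reset my password?", convert_to_tensor=True)
emb_b = model.encode("What are the steps to change my login credentials?",
                     convert_to_tensor=True)

# Cosine similarity in [-1, 1]; higher means closer in meaning
print(float(util.cos_sim(emb_a, emb_b)))
```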
Frequently Asked Questions
Q: What makes this model unique?
A: This model stands out for its efficient balance between model size and performance, using the TinyBERT architecture to achieve strong semantic understanding while maintaining a relatively small parameter count of 67M.
Q: What are the recommended use cases?
A: The model excels in applications requiring semantic similarity matching, such as information retrieval, document clustering, and semantic search systems. It's particularly suitable for production environments where computational efficiency is important.
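A minimal semantic-search sketch along these lines, embedding a small corpus once and then ranking it against a query with util.semantic_search (example texts are illustrative):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/paraphrase-TinyBERT-L6-v2")

corpus = [
    "The new phone has an excellent camera.",
    "Our return policy allows refunds within 30 days.",
    "The laptop battery lasts about ten hours.",
]
# Embed the corpus once; in production these vectors would be cached or indexed
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("How long does the notebook run on battery?",
                               convert_to_tensor=True)

# Retrieve the best-matching corpus entry by cosine similarity
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=1)
best = hits[0][0]
print(corpus[best["corpus_id"]], best["score"])
```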