paraphrase-TinyBERT-L6-v2
| Property | Value |
|---|---|
| Parameter Count | 67M |
| Output Dimensions | 768 |
| License | Apache 2.0 |
What is paraphrase-TinyBERT-L6-v2?
paraphrase-TinyBERT-L6-v2 is a compact yet powerful sentence transformer model designed for generating semantic text embeddings. Built on the TinyBERT architecture, it creates 768-dimensional vector representations of sentences and paragraphs, making it ideal for tasks like semantic search and text clustering.
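Generating embeddings takes only a few lines with the sentence-transformers library. A minimal sketch, assuming the model is available on the Hugging Face Hub under the id sentence-transformers/paraphrase-TinyBERT-L6-v2:

```python
from sentence_transformers import SentenceTransformer

# Load the model (id assumed to be the sentence-transformers Hub entry)
model = SentenceTransformer("sentence-transformers/paraphrase-TinyBERT-L6-v2")

sentences = [
    "The weather is lovely today.",
    "It's sunny and warm outside.",
]

# encode() returns one 768-dimensional vector per input sentence
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768)
```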
Implementation Details
The model uses a two-component architecture: a Transformer layer with a maximum sequence length of 128 tokens, followed by a mean-pooling layer (a sketch of this pipeline follows the list below). It supports multiple frameworks, including PyTorch, ONNX, and TensorFlow, making it versatile across deployment scenarios.
- Efficient architecture with only 67M parameters
- Supports mean pooling for token aggregation
- Compatible with the sentence-transformers and Hugging Face Transformers libraries
- Processes sequences up to 128 tokens
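When the sentence-transformers library isn't available, the same two-stage pipeline can be reproduced with Hugging Face Transformers directly: run the encoder, then mean-pool the token embeddings using the attention mask. A minimal sketch, under the same model-id assumption as above:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Model id assumed to be the sentence-transformers Hub entry
model_id = "sentence-transformers/paraphrase-TinyBERT-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["TinyBERT keeps inference light.", "Embeddings power semantic search."]

# Tokenize with truncation at the model's 128-token maximum sequence length
encoded = tokenizer(sentences, padding=True, truncation=True,
                    max_length=128, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, 768)

# Mean pooling: average token vectors, masking out padding positions
mask = encoded["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embeddings.shape)  # torch.Size([2, 768])
```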
Core Capabilities
- Sentence and paragraph embedding generation
- Semantic similarity computation (sketched below)
- Text clustering and classification
- Cross-lingual text matching
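Semantic similarity between two texts, for instance, reduces to cosine similarity between their embeddings. A sketch using the util helpers shipped with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/paraphrase-TinyBERT-L6-v2")

# Two paraphrases should score noticeably higher than unrelated text
emb_a = model.encode("How do I reset my password?", convert_to_tensor=True)
emb_b = model.encode("What are the steps to change my login credentials?",
                     convert_to_tensor=True)

# Cosine similarity in [-1, 1]; higher means closer in meaning
print(float(util.cos_sim(emb_a, emb_b)))
```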
Frequently Asked Questions
Q: What makes this model unique?
A: This model stands out for its efficient balance between model size and performance, using the TinyBERT architecture to achieve strong semantic understanding while maintaining a relatively small parameter count of 67M.
Q: What are the recommended use cases?
A: The model excels in applications requiring semantic similarity matching, such as information retrieval, document clustering, and semantic search systems. It's particularly suitable for production environments where computational efficiency is important.
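A minimal semantic-search sketch along these lines, embedding a small corpus once and then ranking it against a query with util.semantic_search (example texts are illustrative):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/paraphrase-TinyBERT-L6-v2")

corpus = [
    "The new phone has an excellent camera.",
    "Our return policy allows refunds within 30 days.",
    "The laptop battery lasts about ten hours.",
]
# Embed the corpus once; in production these vectors would be cached or indexed
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("How long does the notebook run on battery?",
                               convert_to_tensor=True)

# Retrieve the best-matching corpus entry by cosine similarity
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=1)
best = hits[0][0]
print(corpus[best["corpus_id"]], best["score"])
```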