all-MiniLM-L6-v1
| Property | Value |
|---|---|
| Embedding Dimensions | 384 |
| Max Sequence Length | 128 tokens |
| Training Data | 1B+ sentence pairs |
| Model Type | Sentence Transformer |
| Author | sentence-transformers |
What is all-MiniLM-L6-v1?
all-MiniLM-L6-v1 is a sentence embedding model that converts text into fixed-length vector representations. Built on the MiniLM architecture and fine-tuned on over 1 billion sentence pairs, it captures semantic meaning in a compact 384-dimensional space. The model was developed during the Hugging Face Community Week using JAX/Flax and trained on TPU v3-8 hardware.
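As a quick illustration of typical usage, here is a minimal sketch with the sentence-transformers library; the example sentences are placeholders:

```python
from sentence_transformers import SentenceTransformer

# Load the released checkpoint from the Hugging Face Hub
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v1")

sentences = [
    "Semantic search maps queries and documents into the same vector space.",
    "The cat sat on the mat.",
]

# encode() returns one 384-dimensional vector per input sentence
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 384)
```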
Implementation Details
The model was trained with a self-supervised contrastive learning objective: given one sentence from a pair, it learns to identify its true partner among the other sentences sampled into the batch. It uses the pretrained nreimers/MiniLM-L6-H384-uncased checkpoint as its foundation and was fine-tuned with the AdamW optimizer at a 2e-5 learning rate. Training ran for 100k steps with a batch size of 1024 and a 500-step learning-rate warm-up.
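A simplified sketch of this kind of in-batch contrastive objective is shown below; it is not the exact training code, and the similarity scale and random embeddings are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(anchor_emb, positive_emb, scale=20.0):
    """Each anchor should score its own positive higher than every
    other positive in the batch (cross-entropy over cosine scores)."""
    anchor_emb = F.normalize(anchor_emb, dim=-1)
    positive_emb = F.normalize(positive_emb, dim=-1)
    scores = anchor_emb @ positive_emb.T * scale  # (batch, batch) similarity matrix

    # The correct pairing for row i is column i
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

# Random 384-dimensional vectors stand in for encoder output here
a = torch.randn(8, 384)
b = torch.randn(8, 384)
print(in_batch_contrastive_loss(a, b))
```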
Key Features
- Simple integration with the sentence-transformers library
- Efficient mean pooling implementation (illustrated below)
- Normalized embeddings output
- Automatic truncation of sequences longer than 128 tokens
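For users working with raw transformers instead of sentence-transformers, the mean pooling and normalization steps can be reproduced roughly as follows; this is a sketch of the standard pattern, with a placeholder input sentence:

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v1")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v1")

sentences = ["This framework generates embeddings for each input sentence."]

# Tokenize with truncation at the model's 128-token limit
encoded = tokenizer(sentences, padding=True, truncation=True,
                    max_length=128, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq, 384)

# Mean pooling: average token embeddings, ignoring padding positions
mask = encoded["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

# L2-normalize so cosine similarity reduces to a dot product
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
print(sentence_embeddings.shape)  # (1, 384)
```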
Core Capabilities
- Semantic search and information retrieval
- Text clustering and organization
- Sentence similarity computation (see the sketch after this list)
- Short paragraph encoding
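For example, sentence similarity and small-scale semantic search can be computed directly on the embeddings; the query and corpus below are made up for illustration:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v1")

corpus = [
    "A man is eating food.",
    "A monkey is playing drums.",
    "The new movie is awesome.",
]
query = "Someone is having a meal."

corpus_emb = model.encode(corpus, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Cosine similarity between the query and every corpus sentence
scores = util.cos_sim(query_emb, corpus_emb)[0]
best = scores.argmax().item()
print(corpus[best], float(scores[best]))
```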
Frequently Asked Questions
Q: What makes this model unique?
The model's strength lies in its efficient architecture and extensive training data, comprising over 1 billion sentence pairs from diverse sources including Reddit comments, scientific papers, and question-answer pairs. This broad training foundation makes it particularly robust for general-purpose sentence embedding tasks.
Q: What are the recommended use cases?
The model is ideal for applications requiring semantic understanding of text, such as document similarity matching, clustering related content, and information retrieval systems. It is most effective for short to medium-length passages: inputs longer than 128 tokens are truncated, so performance is best below that limit.
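As one way to realize the clustering use case mentioned above, the embeddings can be fed to an off-the-shelf clustering algorithm; this sketch uses scikit-learn's KMeans, and the documents and cluster count are illustrative assumptions:

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v1")

docs = [
    "How do I reset my password?",
    "I forgot my login credentials.",
    "What is the refund policy?",
    "Can I get my money back?",
]

embeddings = model.encode(docs)

# Group semantically similar documents; 2 clusters is just an example choice
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)
for doc, label in zip(docs, kmeans.labels_):
    print(label, doc)
```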