LaBSE-en-ru

cointegrated

LaBSE-en-ru is a bilingual BERT model for English-Russian sentence embeddings, reduced to 27% of the original LaBSE model's size while maintaining embedding quality for these two languages.

Property         Value
Parameter Count  129M
Author           cointegrated
Paper            Language-agnostic BERT Sentence Embedding
Model Type       Bilingual BERT
Languages        English, Russian

What is LaBSE-en-ru?

LaBSE-en-ru is a bilingual version of Google's Language-agnostic BERT Sentence Embedding (LaBSE) model, trimmed down for English and Russian. It shrinks the original model to 27% of its size while maintaining the quality of embeddings for these two languages.

Implementation Details

The model uses the BERT architecture. Its vocabulary has been truncated to retain only English and Russian tokens, a 90% reduction in vocabulary size. With 129M parameters, it generates sentence embeddings through PyTorch and the Transformers library.

  • Optimized vocabulary focused on English and Russian tokens
  • Supports sentence similarity tasks
  • Implements efficient embedding generation
  • Uses normalized pooler output for representations
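The points above can be sketched with the Transformers library. This is a minimal sketch rather than the author's reference code: it assumes the model is published on the Hugging Face Hub under the ID `cointegrated/LaBSE-en-ru` (inferred from the author and model names), and `embed_sentences` is a hypothetical helper name. The 64-token truncation mirrors the sequence length listed under Core Capabilities.

```python
MODEL_ID = "cointegrated/LaBSE-en-ru"  # assumed Hub ID, inferred from author and model name
MAX_LENGTH = 64  # sequence length quoted in this document


def embed_sentences(sentences, model_id=MODEL_ID, max_length=MAX_LENGTH):
    """Return L2-normalized pooler-output embeddings, one row per sentence."""
    # Heavy dependencies are imported lazily so the helper can be defined
    # without loading PyTorch until it is actually called.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    # Pad to the longest sentence in the batch and truncate long inputs.
    encoded = tokenizer(
        sentences,
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():  # inference only, no gradients needed
        output = model(**encoded)

    # Use the pooler output as the sentence representation and
    # scale each row to unit length.
    return torch.nn.functional.normalize(output.pooler_output, p=2, dim=1)
```

Calling `embed_sentences(["Hello World", "Привет Мир"])` would download the weights on first use and return a tensor with one unit-length row per sentence.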

Core Capabilities

  • Bilingual sentence embedding generation
  • Cross-lingual sentence similarity comparison
  • Efficient processing with reduced model size
  • Maximum sequence length of 64 tokens
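Cross-lingual similarity comparison usually comes down to the cosine between two sentence embeddings. The sketch below illustrates that step in plain Python; the three-dimensional vectors are made-up stand-ins for real model output, and the function names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity; for unit vectors this is just the dot product."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Toy 3-dimensional stand-ins for real sentence embeddings (illustrative only).
en_hello = [0.9, 0.1, 0.0]
ru_hello = [0.8, 0.2, 0.1]  # imagined translation of en_hello
ru_other = [0.0, 0.1, 0.9]  # imagined unrelated sentence

# A translation pair should score higher than an unrelated pair.
print(cosine_similarity(en_hello, ru_hello) > cosine_similarity(en_hello, ru_other))
```

Because the model's embeddings are already normalized, the normalization inside `cosine_similarity` would be redundant for real output; it is kept here so the helper also works on raw vectors.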

Frequently Asked Questions

Q: What makes this model unique?

What sets this model apart is its bilingual trimming: it aims to preserve the embedding quality of the original LaBSE for English and Russian while using a much smaller model (27% of the original size) and a vocabulary restricted to these two languages.

Q: What are the recommended use cases?

The model suits cross-lingual sentence similarity tasks between English and Russian, document alignment, and bilingual text processing applications where efficient computation matters.
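As an illustration of the document alignment use case, the sketch below pairs each English sentence with its nearest Russian sentence by cosine similarity. It is one simple strategy among many, not a method described by the model's author: the vectors are toy unit vectors standing in for real embeddings, the `align` helper and its `threshold` value are hypothetical, and the matching is not one-to-one (two English sentences may select the same Russian sentence).

```python
def dot(a, b):
    """Dot product; equals the cosine when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def align(en_vecs, ru_vecs, threshold=0.5):
    """Pair each English vector with its most similar Russian vector.

    Inputs are assumed to be unit-length embeddings. Pairs scoring below
    `threshold` (an arbitrary illustrative cutoff) are dropped.
    """
    if not ru_vecs:
        return []
    pairs = []
    for i, en in enumerate(en_vecs):
        scores = [dot(en, ru) for ru in ru_vecs]
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            pairs.append((i, j, scores[j]))
    return pairs


# Toy unit vectors standing in for embeddings of three sentences per language.
en_vecs = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
ru_vecs = [[0.0, 1.0], [1.0, 0.0], [0.8, 0.6]]

for i, j, score in align(en_vecs, ru_vecs):
    print(f"en[{i}] <-> ru[{j}] (cosine {score:.2f})")
```

For real documents, a stricter variant might keep only mutual best matches or enforce monotonic ordering, depending on how parallel the two texts are expected to be.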
