indo-sentence-bert-base

firqaaa

Indonesian BERT-based sentence embedding model that maps text to 768-dimensional vectors, optimized for semantic similarity tasks in Bahasa Indonesia.

Property          Value
License           Apache 2.0
Language          Indonesian
Vector Dimension  768
Research Paper    View Paper

What is indo-sentence-bert-base?

indo-sentence-bert-base is a sentence transformer model for the Indonesian language. Built on the BERT architecture, it converts sentences and paragraphs into 768-dimensional dense vectors that can be compared to measure semantic similarity in Bahasa Indonesia.
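As a rough illustration of how such a model is typically used, the sketch below wraps the sentence-transformers library in a small helper. The Hub identifier `firqaaa/indo-sentence-bert-base` is an assumption pieced together from the author and model names on this page, and `embed` is a hypothetical helper name, not part of any library.

```python
# Assumed Hugging Face Hub identifier (author name + model name from this page).
MODEL_ID = "firqaaa/indo-sentence-bert-base"


def embed(sentences, model_id=MODEL_ID):
    """Return one embedding per input sentence (hypothetical helper)."""
    # Imported lazily so this file can be loaded without the dependency installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)  # downloads the weights on first use
    return model.encode(sentences)         # array with one row per sentence
```

Calling `embed(["Saya suka membaca buku.", "Membaca buku adalah hobi saya."])` should return two vectors, one per sentence, provided the model is available under the assumed identifier.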

Implementation Details

The model was trained with Multiple Negatives Ranking Loss for 5 epochs at a batch size of 16, using the AdamW optimizer with a learning rate of 2e-05 and 9930 warmup steps. The architecture combines a BERT transformer with a pooling layer that mean-pools the token embeddings.
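To make the training objective more concrete, here is a minimal pure-Python sketch of Multiple Negatives Ranking Loss as it is commonly defined: each anchor is scored against every positive in the batch with scaled cosine similarity, and cross-entropy pushes the matching pair to the top. The 2-dimensional toy vectors and the function names are illustrative assumptions, not the author's training code; the default scale of 20.0 is the value listed for this model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i].

    Every other positive in the batch acts as an in-batch negative.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Toy batch: illustrative vectors only, not real model outputs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive is close to its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # positives paired with the wrong anchor

loss_aligned = mnr_loss(anchors, aligned)
loss_swapped = mnr_loss(anchors, swapped)
```

With these toy inputs the aligned batch yields a loss near zero while the swapped batch yields a large one, which is the behaviour the objective is meant to reward.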

  • Supports a maximum sequence length of 512 tokens
  • Implements mean pooling strategy for sentence embeddings
  • Trained with Multiple Negatives Ranking Loss (scale: 20.0)
  • Uses cosine similarity for comparing embeddings
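The pooling and comparison steps listed above can be sketched in a few lines of plain Python. This is a simplified illustration: it uses 3-dimensional toy vectors instead of 768-dimensional ones, and it assumes padding tokens are excluded through an attention mask, which is the usual convention for mean pooling but is not stated on this page.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for k in range(dim):
                total[k] += vec[k]
    # Guard against an all-padding input to avoid division by zero.
    return [x / max(count, 1) for x in total]


def cosine_similarity(u, v):
    """Cosine of the angle between two sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy token embeddings; the last row stands in for a padding token.
tokens = [[1.0, 2.0, 0.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_vec = mean_pool(tokens, mask)
```

Here `sentence_vec` is the average of the first two rows only, because the third row is masked out as padding.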

Core Capabilities

  • Semantic similarity computation between Indonesian texts
  • Text clustering and classification
  • Information retrieval in Bahasa Indonesia
  • Feature extraction for downstream NLP tasks
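As one example of these capabilities, retrieval reduces to ranking stored sentence vectors by cosine similarity to a query vector. The sketch below uses made-up 3-dimensional vectors as stand-ins for real embeddings, and `top_k` is a hypothetical helper name; in practice the vectors would come from the model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k(query_vec, corpus, k=2):
    """Return the k corpus texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Placeholder vectors, chosen by hand for illustration (not model outputs).
corpus = {
    "Cuaca hari ini cerah": [0.9, 0.1, 0.0],
    "Harga saham naik tajam": [0.0, 0.2, 0.9],
    "Hari ini langit tidak berawan": [0.8, 0.3, 0.1],
}
query_vec = [1.0, 0.2, 0.0]  # stand-in for an embedded weather-related query

results = top_k(query_vec, corpus, k=2)
```

With these placeholder vectors, the two weather sentences rank above the stock-market sentence. The same ranking loop also serves as a starting point for clustering or deduplication, where pairwise similarities are compared against a threshold.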

Frequently Asked Questions

Q: What makes this model unique?

This model is trained specifically for Indonesian, which makes it well suited to semantic tasks in Bahasa Indonesia. It pairs a BERT encoder with a mean-pooling layer to produce sentence-level embeddings.

Q: What are the recommended use cases?

The model is suited to semantic search, document similarity analysis, text clustering, and other applications that require semantic understanding of Indonesian text. It is particularly useful when an application needs to compare sentences or paragraphs or analyze the relationships between them.
