silma-embeddding-sts-v0.1

Maintained By
silma-ai

SILMA Embedding STS v0.1

Parameter Count: 135M
Output Dimensions: 768
Max Sequence Length: 512 tokens
License: Apache 2.0
Languages: Arabic, English

What is silma-embeddding-sts-v0.1?

SILMA Embedding STS is a sentence transformer model that generates semantic embeddings for Arabic and English text. It is built on bert-base-arabertv02 and was fine-tuned in two phases for semantic textual similarity (STS) tasks.

Implementation Details

The model maps input text to 768-dimensional dense vectors, which are compared using cosine similarity. Training ran in two phases: first on a dataset of 2.25M triplets, then on 30k sentence pairs labeled with similarity scores.

  • Base Architecture: bert-base-arabertv02
  • Training Framework: Sentence Transformers 3.2.0
  • Optimization: Mixed precision training (BF16)
  • Evaluation Metrics: Achieved 85.59% Spearman correlation on Arabic STS tasks
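
As a minimal sketch of how the embeddings might be produced and compared, the snippet below pairs a plain-Python cosine similarity with a lazy call into the Sentence Transformers library. The repository id `silma-ai/silma-embeddding-sts-v0.1` is an assumption formed by joining the maintainer and model names above, and `embed` and `demo` are hypothetical helper names, not part of any official API.

```python
import math

# Assumed Hugging Face repo id (maintainer + model name from this page).
MODEL_ID = "silma-ai/silma-embeddding-sts-v0.1"


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embed(texts, model_id=MODEL_ID):
    """Hypothetical helper: encode texts into lists of floats.

    The library is imported lazily so cosine_similarity stays usable
    even when sentence-transformers is not installed.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(texts).tolist()


def demo():
    """Compare an English and an Arabic sentence (downloads the model)."""
    vectors = embed(["The weather is nice today", "الطقس جميل اليوم"])
    print(len(vectors[0]))  # dimensionality of one embedding
    print(cosine_similarity(vectors[0], vectors[1]))
```

Calling `demo()` requires network access and the `sentence-transformers` package. The similarity helper itself depends only on the standard library.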

Core Capabilities

  • Bilingual semantic similarity assessment
  • Cross-lingual text comparison
  • Semantic search implementation
  • Text classification and clustering
  • Question-answer matching
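
Semantic search and question-answer matching from the list above both reduce to ranking candidate embeddings by similarity to a query embedding. The following is a rough sketch under that assumption. The `search` function is a hypothetical name, and the 3-dimensional toy vectors stand in for real 768-dimensional model output.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def search(query_vec, corpus_vecs, top_k=3):
    """Return (index, cosine score) pairs for the top_k most similar corpus vectors."""
    q = _normalize(query_vec)
    scored = []
    for idx, vec in enumerate(corpus_vecs):
        d = _normalize(vec)
        # The dot product of unit vectors equals their cosine similarity.
        scored.append((idx, sum(a * b for a, b in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings (illustrative values only).
corpus = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
results = search([0.9, 0.1, 0.0], corpus, top_k=2)
print(results)
```

For larger corpora, the same ranking is usually delegated to a vector index rather than a linear scan. The loop here is only meant to show the logic.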

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its coverage of both Arabic and English semantic tasks, with its strongest reported result on Arabic STS (roughly 85.6% Spearman correlation). At 135M parameters, it is also small enough for efficient inference in production.

Q: What are the recommended use cases?

The model excels in applications requiring semantic understanding such as text similarity comparison, document clustering, semantic search, and intent classification. It's particularly effective for Arabic language processing while maintaining good performance for English content.
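
One simple way to use sentence embeddings for intent classification is a nearest-centroid rule: average the embeddings of a few labeled examples per intent, then assign new text to the most similar centroid. The sketch below assumes that approach, which is not described in the source. The labels, helper names, and 2-dimensional toy vectors are all illustrative.

```python
import math


def _unit(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def classify(vec, centroids):
    """Return the label whose centroid has the highest cosine similarity to vec."""
    q = _unit(vec)
    best_label, best_score = None, -2.0  # cosine is never below -1
    for label, center in centroids.items():
        score = sum(a * b for a, b in zip(q, _unit(center)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy vectors standing in for embeddings of labeled example sentences.
examples = {
    "greeting": [[1.0, 0.1], [0.9, 0.2]],
    "refund": [[0.1, 1.0], [0.2, 0.8]],
}
centroids = {label: centroid(vecs) for label, vecs in examples.items()}
predicted = classify([0.8, 0.3], centroids)
print(predicted)
```

With real data, each toy vector would be replaced by the model's 768-dimensional embedding of an example sentence.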
