msmarco-cotmae-MiniLM-L12_en-ko-ja

sangmini

A multilingual sentence transformer model that maps text to 1536-dimensional vectors, supporting English, Korean, and Japanese for semantic search and clustering.

Property          Value
Author            sangmini
Downloads         39,856
Output Dimension  1536
Framework         PyTorch

What is msmarco-cotmae-MiniLM-L12_en-ko-ja?

This is a sentence transformer model for multilingual text processing in English, Korean, and Japanese. Built on the BERT architecture, it converts sentences and paragraphs into 1536-dimensional vectors that can be used for semantic search and clustering.
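A minimal usage sketch, assuming the `sentence-transformers` package is installed and that the model is published on the Hugging Face Hub under the ID `sangmini/msmarco-cotmae-MiniLM-L12_en-ko-ja`. That ID is inferred from the author and model name on this page and is not confirmed here. The helper name `embed_sentences` is hypothetical.

```python
# Assumed Hub ID, built from the author and model name on this page.
MODEL_ID = "sangmini/msmarco-cotmae-MiniLM-L12_en-ko-ja"


def embed_sentences(sentences, model_id=MODEL_ID):
    """Encode a list of sentences into embedding vectors (hypothetical helper).

    The import is deferred so this module loads even when
    sentence-transformers is not installed. Calling the function
    downloads the model on first use.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    # encode() returns one vector per input sentence.
    return model.encode(sentences)
```

Calling `embed_sentences` with one English, one Korean, and one Japanese sentence should return three vectors, each with the output dimension stated on this page if the assumed ID is correct.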

Implementation Details

The model has three components: a BERT-based Transformer layer, a Pooling layer, and a Dense layer. It was trained with MSE loss and the AdamW optimizer for 10 epochs, using a learning rate of 1e-05 with warmup steps.

  • Maximum sequence length: 128 tokens
  • Word embedding dimension: 384
  • Final output dimension: 1536
  • Pooling strategy: Mean tokens
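The pooling and projection steps listed above can be sketched in plain Python. This is an illustration of the data flow only, under stated assumptions: the token vectors and Dense weights are random placeholders, no activation function is applied because none is specified here, and the function names `mean_pool` and `dense` are hypothetical.

```python
import random

WORD_DIM = 384     # word embedding dimension from the list above
OUTPUT_DIM = 1536  # final output dimension from the list above


def mean_pool(token_vectors, attention_mask):
    """Average token vectors, skipping padding positions (mask == 0)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    return [sum(column) / len(kept) for column in zip(*kept)]


def dense(vector, weights, bias):
    """Linear projection: one output value per weight row."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


rng = random.Random(0)
# Stand-ins for the Transformer's token outputs: 5 tokens, the last 2 are padding.
tokens = [[rng.uniform(-1, 1) for _ in range(WORD_DIM)] for _ in range(5)]
mask = [1, 1, 1, 0, 0]
# Random placeholder weights; the trained layer's values are not given here.
weights = [[rng.gauss(0, 0.02) for _ in range(WORD_DIM)] for _ in range(OUTPUT_DIM)]
bias = [0.0] * OUTPUT_DIM

pooled = mean_pool(tokens, mask)          # 384 values
embedding = dense(pooled, weights, bias)  # 1536 values
print(len(pooled), len(embedding))
```

The sketch shows why the output dimension differs from the word embedding dimension: mean pooling keeps the 384-dimensional size, and the Dense layer maps it to 1536.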

Core Capabilities

  • Multilingual sentence embedding generation
  • Semantic similarity computation
  • Cross-lingual text matching
  • Document clustering
  • Information retrieval across languages
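Semantic similarity and retrieval over embeddings are commonly computed with cosine similarity. The following is a minimal sketch, not the model's own scoring code: it uses toy 3-dimensional vectors in place of real 1536-dimensional embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (document index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for sentence embeddings.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # close to the query
    [-1.0, 0.0, 0.0],  # opposite to the query
]
ranking = rank_by_similarity(query, docs)
print([index for index, _ in ranking])  # prints [1, 0, 2]
```

For cross-lingual matching, the query and documents would be embedded by the same model regardless of language, and the ranking step stays unchanged.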

Frequently Asked Questions

Q: What makes this model unique?

The model handles English, Korean, and Japanese and produces 1536-dimensional embeddings, which makes it well suited to cross-lingual applications and semantic search systems.

Q: What are the recommended use cases?

The model is suited to multilingual document similarity matching, semantic search, content clustering, and cross-lingual information retrieval. It is particularly useful for applications that need to capture semantic relationships across English, Korean, and Japanese content.
