GLuCoSE-base-ja

pkshatech

A Japanese text embedding model based on LUKE, optimized for sentence similarity and semantic search. It produces 768-dimensional vectors and accepts inputs of up to 512 tokens.

Property          Value
License           Apache-2.0
Base Model        studio-ousia/luke-base
Output Dimension  768
Max Tokens        512
Language          Japanese

What is GLuCoSE-base-ja?

GLuCoSE (General LUke-based COntrastive Sentence Embedding) is a Japanese text embedding model built on the LUKE architecture. It is designed for sentence similarity tasks and semantic search applications, and it was trained on a mix of web data and natural language inference datasets.

Implementation Details

The model uses mean pooling to combine token representations into a single sentence vector, and it was trained with cosine similarity as its loss function. It has been fine-tuned on multiple datasets, including mC4, JGLUE, PAWS-X, and various other Japanese language resources.
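To make the mean pooling step concrete, here is a minimal sketch in plain Python. It assumes the usual convention of averaging only the vectors of real tokens and skipping padding positions; the function name mean_pool and the toy inputs are illustrative and not taken from the model's code, which would operate on tensors rather than lists.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (real tokens, not padding)."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("token_vectors and attention_mask must have the same length")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    # Component-wise mean over the kept token vectors.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0]
```

The padding vector is ignored, so the result is the average of the first two tokens only.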

  • Reports state-of-the-art performance on the JSTS benchmark, with a Spearman correlation of 0.864
  • Outperforms OpenAI's text-embedding-ada-002 in zero-shot search tasks
  • Integrates with the sentence-transformers library
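The sentence-transformers integration noted above might look like the following sketch. It assumes the model is published on the Hugging Face Hub under the ID pkshatech/GLuCoSE-base-ja (author and model name as listed on this page) and that the sentence-transformers package is installed. The helper names load_model, embed, and demo are hypothetical.

```python
from typing import Any, List, Sequence

# Assumed Hub ID, formed from the author and model name on this page.
MODEL_ID = "pkshatech/GLuCoSE-base-ja"


def load_model() -> Any:
    """Load the model through sentence-transformers (downloads weights on first use)."""
    from sentence_transformers import SentenceTransformer  # imported lazily

    return SentenceTransformer(MODEL_ID)


def embed(sentences: Sequence[str], model: Any) -> List[List[float]]:
    """Encode sentences and return one plain-Python vector per sentence."""
    vectors = model.encode(list(sentences))
    return [[float(x) for x in vec] for vec in vectors]


def demo() -> None:
    """Embed two Japanese sentences and print the shape of the result."""
    model = load_model()
    vectors = embed(["今日は天気が良いです。", "本日は晴天です。"], model)
    # The page lists a 768-dimensional output, so each vector should have that length.
    print(len(vectors), len(vectors[0]))
```

Calling demo() requires network access the first time, since the weights are fetched from the Hub.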

Core Capabilities

  • Sentence similarity computation
  • Semantic search operations
  • Dense vector generation for Japanese text
  • Natural language inference tasks
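As an illustration of how sentence similarity and semantic search build on dense vectors, the sketch below ranks documents by cosine similarity to a query. The vectors are toy three-dimensional values standing in for real 768-dimensional embeddings, and the function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank(query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings of a query and three documents.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank(query, docs)
print([i for i, _ in ranking])  # [1, 2, 0]
```

Document 1 points in the same direction as the query and ranks first, even though its magnitude differs, because cosine similarity ignores vector length.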

Frequently Asked Questions

Q: What makes this model unique?

GLuCoSE-base-ja stands out for being optimized specifically for Japanese, combining the LUKE architecture with training on diverse Japanese datasets. It reports a Spearman correlation of 0.864 on the JSTS sentence similarity benchmark and outperforms OpenAI's text-embedding-ada-002 in zero-shot search tasks.

Q: What are the recommended use cases?

The model is particularly well-suited for semantic search applications, sentence similarity comparison, and text embedding generation for Japanese content. It can be effectively used in production environments through the sentence-transformers library or integrated into LangChain applications.
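For the LangChain route, a minimal sketch follows. It assumes the langchain-huggingface package is installed and that its HuggingFaceEmbeddings class is used as the wrapper; the Hub ID is again assumed to be pkshatech/GLuCoSE-base-ja, and the helper names are hypothetical. Other LangChain versions may expose the class under a different import path.

```python
from typing import Any, List, Sequence, Tuple

# Assumed Hub ID, formed from the author and model name on this page.
MODEL_ID = "pkshatech/GLuCoSE-base-ja"


def make_embeddings() -> Any:
    """Create a LangChain embeddings object backed by the model."""
    from langchain_huggingface import HuggingFaceEmbeddings  # imported lazily

    return HuggingFaceEmbeddings(model_name=MODEL_ID)


def embed_corpus_and_query(
    embeddings: Any, documents: Sequence[str], query: str
) -> Tuple[List[List[float]], List[float]]:
    """Embed documents and a query through LangChain's embeddings interface."""
    doc_vectors = embeddings.embed_documents(list(documents))
    query_vector = embeddings.embed_query(query)
    return doc_vectors, query_vector
```

The object returned by make_embeddings() can also be passed to a LangChain vector store in place of calling embed_corpus_and_query directly.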
