sup-SimCSE-VietNamese-phobert-base

VoVanPhuc

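Vietnamese sentence similarity model based on SimCSE and PhoBERT, with 136M parameters. SimCSE covers both supervised and unsupervised training; as the name indicates, this checkpoint is the supervised variant, and it is presented as state-of-the-art for Vietnamese sentence embeddings.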

Property         Value
Parameter Count  136M
Model Type       Sentence Similarity
Architecture     PhoBERT-base
Paper            SimCSE Paper
Language         Vietnamese

What is sup-SimCSE-VietNamese-phobert-base?

This is a Vietnamese sentence embedding model, presented as state-of-the-art, that applies SimCSE (Simple Contrastive Learning of Sentence Embeddings) to PhoBERT, a language model built specifically for Vietnamese. It is trained with supervised learning so that the resulting sentence embeddings capture semantic relationships between Vietnamese texts: sentences with similar meaning should map to nearby vectors.

Implementation Details

The model is built on the PhoBERT base architecture (136M parameters) and is trained with the SimCSE contrastive learning approach. It can be used through either the sentence-transformers or the transformers library. In both cases the input text needs Vietnamese word segmentation first, for which PyVi is required.

  • Trained on Vietnamese text with supervised learning, starting from the PhoBERT base model
  • Implements contrastive learning techniques from SimCSE
  • Uses PhoBERT tokenization and encoding
  • Supports batch processing and GPU acceleration
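
The workflow above (segment with PyVi, then encode in batches) can be sketched with sentence-transformers as follows. This is a minimal sketch, not the author's reference code. The Hub identifier is an assumption derived from the author and model name shown on this page, and the helper names `load_segmenter`, `load_model`, and `embed` are hypothetical. It assumes the `pyvi` and `sentence-transformers` packages are installed; they are imported lazily so the helpers can be defined without them.

```python
from typing import Callable, Optional, Sequence

# Assumption: Hub id built from the author and model name on this page.
MODEL_ID = "VoVanPhuc/sup-SimCSE-VietNamese-phobert-base"


def load_segmenter() -> Callable[[str], str]:
    """Return PyVi's word segmenter (joins multi-syllable words with '_')."""
    from pyvi.ViTokenizer import tokenize  # lazy import: needs `pyvi`
    return tokenize


def load_model(model_id: str = MODEL_ID, device: Optional[str] = None):
    """Load the checkpoint; pass device="cuda" to run on a GPU."""
    from sentence_transformers import SentenceTransformer  # lazy import
    return SentenceTransformer(model_id, device=device)


def embed(sentences: Sequence[str], model, segment: Callable[[str], str],
          batch_size: int = 32):
    """Word-segment each sentence, then encode all of them in batches."""
    segmented = [segment(s) for s in sentences]
    return model.encode(segmented, batch_size=batch_size)
```

With both packages installed, `embed(sentences, load_model(), load_segmenter())` should return one embedding per input sentence. Passing the segmenter in explicitly keeps the segmentation step visible, since skipping it would feed PhoBERT text in a form it was not trained on.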

Core Capabilities

  • Vietnamese sentence embedding generation
  • Semantic similarity computation
  • Built on SimCSE, which supports both supervised and unsupervised training (this checkpoint is the supervised variant)
  • Integration with popular deep learning frameworks
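
Semantic similarity between two sentences is typically computed as the cosine similarity of their embeddings. The sketch below shows that computation with the standard library only. The function names are hypothetical, and the short vectors in the usage note are made-up stand-ins for real model embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(a, b) for b in vectors] for a in vectors]
```

For example, `similarity_matrix([[1.0, 0.0], [0.0, 1.0]])` compares two toy vectors that point in orthogonal directions. Scores close to 1 indicate near-identical direction, and scores near 0 indicate unrelated directions.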

Frequently Asked Questions

Q: What makes this model unique?

This model is specifically optimized for Vietnamese language processing, combining SimCSE's contrastive learning approach with PhoBERT's Vietnamese language understanding capabilities, making it particularly effective for Vietnamese sentence similarity tasks.

Q: What are the recommended use cases?

The model is ideal for Vietnamese text processing tasks such as semantic similarity matching, document clustering, information retrieval, and text classification. It's particularly useful in applications requiring understanding of semantic relationships between Vietnamese sentences.
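
As an illustration of the information retrieval use case, the sketch below ranks a set of document embeddings against a query embedding and returns the best matches. It assumes the embeddings have already been computed by the model. The function names and the cosine-based scoring are choices made for this example, not something the model prescribes.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def _normalize(v: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def top_k(query_vec: Sequence[float],
          corpus_vecs: Sequence[Sequence[float]],
          k: int = 3) -> List[Tuple[int, float]]:
    """Return (index, cosine score) of the k corpus vectors closest to the query."""
    q = _normalize(query_vec)
    scored = (
        (i, sum(a * b for a, b in zip(q, _normalize(d))))
        for i, d in enumerate(corpus_vecs)
    )
    # nlargest returns the k best pairs, highest score first.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])
```

The returned indices point back into the original document list, so the caller can look up the matching texts. For large corpora a dedicated vector index would likely replace this linear scan.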
