company-names-similarity-sentence-transformer


Property            Value
Author              Vsevolod
Downloads           1,086
Vector Dimension    384
Framework           PyTorch/Sentence-Transformers

What is company-names-similarity-sentence-transformer?

This is a sentence transformer model designed specifically for comparing and analyzing company names. It uses the SBERT (Sentence-BERT) architecture to map company names and related text into a 384-dimensional vector space, enabling efficient semantic similarity comparisons and clustering.
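
A minimal usage sketch is shown below. The repository id Vsevolod/company-names-similarity-sentence-transformer is an assumption based on the card's title and author; substitute the actual id if it differs.

```python
from sentence_transformers import SentenceTransformer, util

# Repository id assumed from the card's title and author
model = SentenceTransformer("Vsevolod/company-names-similarity-sentence-transformer")

names = ["Acme Corp.", "ACME Corporation", "Globex Inc."]
embeddings = model.encode(names, convert_to_tensor=True)  # shape: (3, 384)

# Cosine similarity between every pair of company names
scores = util.cos_sim(embeddings, embeddings)
print(scores)  # "Acme Corp." vs "ACME Corporation" should score highest
```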

Implementation Details

The model is built on the Sentence-Transformers framework on top of a BERT-based encoder. It was trained with a cosine similarity loss and the AdamW optimizer at a learning rate of 2e-05, and it produces sentence embeddings through mean token pooling followed by normalization. Key training settings are listed below, with a configuration sketch after the list:

  • Maximum sequence length: 256 tokens
  • Trained with batch size of 32
  • Uses WarmupLinear scheduler with 100 warmup steps
  • Implements weight decay of 0.01
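
The sketch below reconstructs this training setup with the standard Sentence-Transformers fit API. The base checkpoint, the training pairs, and the epoch count are assumptions not stated on the card; a MiniLM-sized encoder is used here only because it matches the 384-dimensional output.

```python
import torch
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Base checkpoint is an assumption; a MiniLM-sized encoder matches the 384-dim output
word_emb = models.Transformer("sentence-transformers/all-MiniLM-L6-v2", max_seq_length=256)
pooling = models.Pooling(word_emb.get_word_embedding_dimension(), pooling_mode_mean_tokens=True)
normalize = models.Normalize()
model = SentenceTransformer(modules=[word_emb, pooling, normalize])

# Hypothetical training pairs: (name A, name B, similarity label in [0, 1])
train_examples = [
    InputExample(texts=["Acme Corp.", "ACME Corporation"], label=1.0),
    InputExample(texts=["Acme Corp.", "Globex Inc."], label=0.0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,                            # epoch count is not stated on the card
    scheduler="WarmupLinear",
    warmup_steps=100,
    optimizer_class=torch.optim.AdamW,
    optimizer_params={"lr": 2e-05},
    weight_decay=0.01,
)
```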

Core Capabilities

  • Company name similarity matching
  • Semantic search operations
  • Text embedding generation
  • Clustering of company names (see the sketch after this list)
  • Feature extraction for downstream tasks
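
A minimal clustering sketch using the community-detection utility that ships with Sentence-Transformers; the repository id, threshold, and example names are assumptions for illustration.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Vsevolod/company-names-similarity-sentence-transformer")  # repo id assumed

names = [
    "Acme Corp.", "ACME Corporation", "Acme Inc.",
    "Globex Inc.", "Globex International",
    "Initech LLC",
]
embeddings = model.encode(names, convert_to_tensor=True)

# Group near-duplicate company names; threshold and min_community_size are illustrative
clusters = util.community_detection(embeddings, threshold=0.8, min_community_size=2)
for cluster in clusters:
    print([names[i] for i in cluster])
```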

Frequently Asked Questions

Q: What makes this model unique?

This model is specifically optimized for handling company names, utilizing a carefully tuned BERT architecture with mean token pooling. Its 384-dimensional output space provides a good balance between computational efficiency and semantic representation power.
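
As a quick check, the module stack and output dimension can be inspected directly (repository id assumed as above):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Vsevolod/company-names-similarity-sentence-transformer")  # repo id assumed
print(model)                                      # expected stack: Transformer -> Pooling (mean) -> Normalize
print(model.get_sentence_embedding_dimension())   # expected: 384
print(model.max_seq_length)                       # expected: 256
```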

Q: What are the recommended use cases?

The model is ideal for company name deduplication, similarity search in business databases, company clustering, and automated company matching systems. It's particularly useful in scenarios requiring semantic understanding of business names and entities.
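
For the deduplication case, a minimal sketch using the paraphrase-mining utility from Sentence-Transformers; the repository id, example names, and score threshold are assumptions for illustration.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Vsevolod/company-names-similarity-sentence-transformer")  # repo id assumed

company_names = ["Acme Corp.", "ACME Corporation", "Globex Inc.", "Globex International", "Initech LLC"]

# Score every pair of names; returns (score, index_a, index_b) sorted by score
pairs = util.paraphrase_mining(model, company_names)

# Flag likely duplicates above an illustrative threshold
for score, i, j in pairs:
    if score > 0.85:
        print(f"{company_names[i]!r} <-> {company_names[j]!r} (score={score:.2f})")
```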
