paraphrase-bert-pt

Maintained By
Prompsit

Property: Value
Author: Prompsit
Base Model: neuralmind/bert-base-portuguese-cased
Accuracy: 78.09%
Model URL: Hugging Face

What is paraphrase-bert-pt?

paraphrase-bert-pt is a specialized Portuguese language model designed for paraphrase detection. Developed by Prompsit under a TSI project co-financed by Spain's Ministry of Economic Affairs and Digital Transformation, this model evaluates whether two given phrases express the same meaning using different words.

Implementation Details

The model is fine-tuned from the neuralmind/bert-base-portuguese-cased checkpoint and outputs a probability for each of two classes: 0 for non-paraphrases and 1 for paraphrases. It is intended for phrase-level comparison rather than full sentences, which suits targeted paraphrase detection on short fragments.

  • Binary classification output (paraphrase/non-paraphrase)
  • Tested on 16,500 human-tagged phrase pairs
  • Achieves 71.57% precision and 40.55% recall on the paraphrase class
  • F1 score of 0.518 and Matthews correlation coefficient of 0.416
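
The reported F1 score follows from the precision and recall figures above, since F1 is their harmonic mean. A minimal sketch of that arithmetic, where the function and variable names are illustrative and not part of the model's code:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures taken from the list above, expressed as fractions.
reported_precision = 0.7157
reported_recall = 0.4055

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # F1 = 0.518
```

The gap between precision and recall means that pairs the model labels as paraphrases are usually correct, but it misses more than half of the true paraphrases in the test set.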

Core Capabilities

  • Phrase-level paraphrase detection in Portuguese
  • Probability-based classification output
  • Reported evaluation throughput of 607.587 samples per second
  • Optimized for short text fragments without punctuation
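
One way to obtain the probability output described above is through the Hugging Face transformers library. The sketch below is an assumption-laden illustration, not the maintainers' reference code: the Hub identifier `Prompsit/paraphrase-bert-pt`, the helper names, and the example phrases are all assumptions, and the scoring function needs `torch` and `transformers` installed plus network access to download the weights on first use.

```python
import math

# Assumed Hub identifier; this page only names the author and the model.
MODEL_ID = "Prompsit/paraphrase-bert-pt"


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def paraphrase_probability(phrase_a, phrase_b, model_id=MODEL_ID):
    """Return the probability of class 1 (paraphrase) for a phrase pair.

    Hypothetical helper: imports are local so the module loads even
    when torch/transformers are not installed.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # The two phrases are encoded together as one sequence pair.
    inputs = tokenizer(phrase_a, phrase_b, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Index 0 = non-paraphrase, index 1 = paraphrase.
    return softmax(logits)[1]


# Example call with illustrative phrases (short, no punctuation):
# paraphrase_probability("logo após o homicídio", "pouco depois do assassinato")
```

Because the model is described as optimized for short fragments without punctuation, it may be worth stripping punctuation from inputs before scoring them.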

Frequently Asked Questions

Q: What makes this model unique?

This model is specifically designed for Portuguese language paraphrase detection at the phrase level, making it particularly useful for applications requiring semantic similarity assessment in Portuguese text fragments.

Q: What are the recommended use cases?

The model is best suited for: phrase-level paraphrase verification, semantic similarity checking in Portuguese, content matching systems, and automated text analysis where identifying equivalent expressions is crucial.
