bert-base-greek-uncased-v1

Maintained by nlpaueb

  • Parameters: 110M
  • Architecture: 12-layer, 768-hidden, 12-heads
  • Paper: GREEK-BERT: The Greeks visiting Sesame Street
  • Training Data: Greek Wikipedia, EU Parliament Proceedings, OSCAR

What is bert-base-greek-uncased-v1?

bert-base-greek-uncased-v1 is a BERT model trained specifically for the Greek language, developed by AUEB's Natural Language Processing Group. It was trained on a diverse corpus comprising the Greek Wikipedia, the European Parliament Proceedings, and the Greek portion of OSCAR.

Implementation Details

The model follows the bert-base-uncased architecture but is specifically optimized for Greek text processing. It was trained for 1 million steps with batches of 256 sequences of length 512, using an initial learning rate of 1e-4 on a Google Cloud TPU v3-8.

  • Uncased model that works with lowercase, deaccented Greek text
  • Native preprocessing (lowercasing and accent stripping) in the default tokenizer; see the usage sketch below
  • State-of-the-art performance on Greek NLP tasks
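
A minimal usage sketch, assuming a recent version of the transformers library and the Hub identifier nlpaueb/bert-base-greek-uncased-v1; the Greek example sentence is illustrative only:

```python
from transformers import AutoTokenizer, AutoModel

# Load the Greek BERT tokenizer and encoder from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")
model = AutoModel.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")

# The default tokenizer lowercases and strips accents, so raw Greek text can be passed directly.
text = "Είναι ένα τεστ."  # illustrative sentence: "It is a test."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```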

Core Capabilities

  • Named Entity Recognition (85.7% F1 score)
  • Natural Language Inference (78.6% accuracy on XNLI)
  • Masked Language Modeling for Greek text (see the fill-mask sketch after this list)
  • Support for both PyTorch and TensorFlow 2 frameworks
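
As a sketch of the masked-language-modeling capability, a fill-mask pipeline can suggest completions for a masked Greek sentence; the sentence below is an illustrative assumption, not taken from the model card:

```python
from transformers import pipeline

# Fill-mask pipeline built on top of the Greek BERT checkpoint.
fill_mask = pipeline("fill-mask", model="nlpaueb/bert-base-greek-uncased-v1")

# Illustrative sentence: "Today is a [MASK] day."
for prediction in fill_mask("Σήμερα είναι μια [MASK] μέρα."):
    print(prediction["token_str"], round(prediction["score"], 3))
```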

Frequently Asked Questions

Q: What makes this model unique?

This is the first BERT model specifically trained for Greek language processing, showing superior performance compared to multilingual models like M-BERT and XLM-R on Greek NLP tasks.

Q: What are the recommended use cases?

The model excels at Greek NLP tasks such as named entity recognition and natural language inference, and it can be fine-tuned for downstream tasks such as text classification, question answering, and sentiment analysis in Greek.
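
As a minimal fine-tuning sketch, the checkpoint can be loaded with a task-specific head; the binary sentiment-classification framing and num_labels=2 below are illustrative assumptions, not part of the released model:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")

# Attach a freshly initialized classification head (e.g. binary sentiment) on top of GREEK-BERT.
model = AutoModelForSequenceClassification.from_pretrained(
    "nlpaueb/bert-base-greek-uncased-v1",
    num_labels=2,
)

# The model can then be trained on a labeled Greek dataset with the standard
# Trainer API or a custom PyTorch training loop (not shown here).
```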
