indonesian-roberta-base-posp-tagger

w11wo

Indonesian RoBERTa-based POS tagger achieving 96.25% accuracy on IndoNLU dataset. 124M params, MIT licensed, optimized for Indonesian text.

| Property | Value |
| --- | --- |
| Parameter Count | 124M |
| License | MIT |
| Framework | PyTorch, Transformers |
| Base Model | flax-community/indonesian-roberta-base |

What is indonesian-roberta-base-posp-tagger?

This is a specialized Part-of-Speech (POS) tagger built on the RoBERTa architecture and fine-tuned specifically for Indonesian. The model scores 96.25% on accuracy, precision, recall, and F1 on the IndoNLU dataset.
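A minimal usage sketch with the Transformers `pipeline` API. The model id `w11wo/indonesian-roberta-base-posp-tagger` is inferred from this card's title and author handle; the `format_tags` helper and the sample sentence are illustrative additions, not part of the card.

```python
def format_tags(entities):
    """Reduce Hugging Face token-classification output (a list of dicts
    with "word" and "entity_group" keys) to plain (word, tag) pairs."""
    return [(e["word"], e["entity_group"]) for e in entities]


def load_tagger():
    """Build the POS-tagging pipeline (downloads model weights on first use).

    Model id assumed from the card's title and author handle."""
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model="w11wo/indonesian-roberta-base-posp-tagger",
        aggregation_strategy="simple",  # merge sub-word pieces back into words
    )
```

Calling `format_tags(load_tagger()("Budi sedang belajar di perpustakaan."))` would then yield one (word, POS tag) pair per word of the input sentence.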

Implementation Details

The model is implemented with the Transformers library and PyTorch, fine-tuned from the indonesian-roberta-base model. Training ran for 10 epochs using the Adam optimizer with a learning rate of 2e-05 and a linear learning-rate scheduler.

  • Batch size: 16 for both training and evaluation
  • Training optimization: Adam (β1=0.9, β2=0.999, ε=1e-08)
  • Final validation loss: 0.1668
  • Best performance achieved at epoch 10
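The linear scheduler mentioned above decays the learning rate from its initial value toward zero over the course of training. A small sketch of that decay rule, using the card's 2e-05 base rate (the step counts are illustrative, not from the card):

```python
def linear_lr(step, total_steps, base_lr=2e-5):
    """Linearly decay the learning rate from base_lr to 0 over total_steps,
    mirroring the linear scheduler used during fine-tuning."""
    if step >= total_steps:
        return 0.0
    return base_lr * (1 - step / total_steps)


# Full learning rate at step 0, half of it midway, zero at the end.
print(linear_lr(0, 1000))    # 2e-05
print(linear_lr(500, 1000))  # 1e-05
print(linear_lr(1000, 1000)) # 0.0
```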

Core Capabilities

  • High-accuracy POS tagging for Indonesian text
  • Token classification with 96.25% precision and recall
  • Optimized for Indonesian language understanding
  • Suitable for integration into larger NLP pipelines

Frequently Asked Questions

Q: What makes this model unique?

This model combines the RoBERTa architecture with fine-tuning targeted at the Indonesian language, achieving state-of-the-art POS-tagging performance with a consistent 96.25% across accuracy, precision, recall, and F1.

Q: What are the recommended use cases?

The model is ideal for Indonesian text analysis tasks requiring part-of-speech tagging, including syntactic parsing, grammatical analysis, and text preprocessing for downstream NLP tasks.
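One common preprocessing pattern is keeping only content words (nouns, verbs, adjectives) before keyword extraction or indexing. A sketch of that step over (word, tag) pairs; note the tag names below are generic placeholders, and the model's actual POSP label set should be read from its config:

```python
def extract_content_words(tagged, keep=("NOUN", "VERB", "ADJ")):
    """Keep only words whose POS tag is in `keep` -- a typical text
    preprocessing step before downstream tasks such as keyword extraction."""
    return [word for word, tag in tagged if tag in keep]


# Illustrative tagger output; the real POSP tagset may use different labels.
tagged = [("Budi", "NOUN"), ("sedang", "ADV"),
          ("belajar", "VERB"), ("di", "ADP"),
          ("perpustakaan", "NOUN")]
print(extract_content_words(tagged))  # ['Budi', 'belajar', 'perpustakaan']
```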
