trocr-small-korean

Maintained by: team-lucid

Property          Value
Parameter Count   54.5M
License           Apache 2.0
Language          Korean
Model Type        Vision-Encoder-Decoder

What is trocr-small-korean?

trocr-small-korean is an OCR (Optical Character Recognition) model specialized for Korean text. It uses a vision-encoder-decoder architecture that pairs an image Transformer encoder with a text Transformer decoder: the encoder is initialized from DeiT weights, while the decoder uses custom-trained RoBERTa weights.
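As a rough illustration of how this kind of DeiT-plus-RoBERTa pairing is typically assembled with the Transformers VisionEncoderDecoderModel API (the checkpoint names below are illustrative placeholders, not the weights actually used for this model):

```python
from transformers import VisionEncoderDecoderModel

# Illustrative sketch only: pair a DeiT image encoder with a RoBERTa text decoder.
# Both checkpoint names are placeholders; the released model ships its own
# custom-trained decoder weights rather than an off-the-shelf RoBERTa.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "facebook/deit-small-distilled-patch16-224",  # vision encoder (placeholder)
    "klue/roberta-small",                         # text decoder (placeholder)
)
```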

Implementation Details

The model was trained on Google's TPU Research Cloud (TRC) using a dataset of 6 million synthetic images generated with synthtiger. Training used a learning rate of 1e-4, a batch size of 512, and 500,000 steps with learning-rate warmup; a hedged configuration sketch follows the list below.

  • Encoder: Uses DeiT-based vision transformer
  • Decoder: Custom-trained RoBERTa architecture
  • Training Data: 6M synthetic Korean text images
  • Optimization: Adam optimizer with β1=0.9, β2=0.98
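A minimal sketch of how these reported hyperparameters might be expressed with the Transformers Seq2SeqTrainingArguments API; the warmup length, output path, and per-device batch interpretation are assumptions, not values confirmed by the authors:

```python
from transformers import Seq2SeqTrainingArguments

# Hedged sketch of the reported training setup; warmup_steps and output_dir
# are illustrative assumptions rather than published values.
training_args = Seq2SeqTrainingArguments(
    output_dir="trocr-small-korean",    # placeholder path
    learning_rate=1e-4,                 # reported learning rate
    per_device_train_batch_size=512,    # reported batch size of 512 (treated here as single-device)
    max_steps=500_000,                  # reported number of training steps
    warmup_steps=10_000,                # warmup length not reported; placeholder
    adam_beta1=0.9,                     # reported Adam β1
    adam_beta2=0.98,                    # reported Adam β2
    predict_with_generate=True,
)
```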

Core Capabilities

  • Korean text recognition from images
  • Handles various text styles and formats
  • Efficient processing with 54.5M parameters
  • Simple integration with PyTorch workflows (see the usage sketch below)
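A minimal inference sketch, assuming the weights are published on the Hugging Face Hub as team-lucid/trocr-small-korean and that a TrOCRProcessor is bundled with the checkpoint; the image path is illustrative:

```python
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image

# Assumed Hub id, based on the model and maintainer names above.
MODEL_ID = "team-lucid/trocr-small-korean"

processor = TrOCRProcessor.from_pretrained(MODEL_ID)
model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)

# Load an image containing a line of Korean text (path is a placeholder).
image = Image.open("korean_text_line.png").convert("RGB")

# Preprocess the image, generate token ids, and decode them back to text.
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(text)
```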

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized focus on Korean text recognition, combining modern vision-transformer architecture with custom-trained language modeling. Its training on 6M synthetic images makes it robust for various Korean text recognition tasks.

Q: What are the recommended use cases?

The model is well suited to Korean document digitization, automated text extraction from images, and other applications that require Korean OCR. Its relatively small parameter count (54.5M) makes it practical for production environments while remaining effective.
