sec-bert-base

nlpaueb

SEC-BERT-BASE is a 110M-parameter BERT model pre-trained on 260,773 10-K filings from the SEC and intended for financial NLP tasks.

Property        Value
Parameters      110M
Architecture    12-layer, 768-hidden, 12-heads
Training Data   260,773 10-K SEC filings (1993-2019)
Paper           FiNER: Financial Numeric Entity Recognition for XBRL Tagging
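
As a sanity check on the figures listed above, the parameter count of a BERT encoder can be worked out from its dimensions. The sketch below is only an estimate under stated assumptions: a vocabulary of exactly 30,000 entries, 512 position embeddings, 2 segment types, and a feed-forward width of 3072 (the usual BERT-base defaults, none of which are confirmed here). The function name `bert_param_count` is illustrative.

```python
def bert_param_count(vocab_size=30_000, hidden=768, layers=12,
                     intermediate=3072, max_positions=512, type_vocab=2):
    """Count parameters of a BERT encoder plus pooler (no task head).

    ASSUMPTION: every default except hidden=768 and layers=12 is a
    standard BERT-base value, not a figure taken from this model card.
    The number of attention heads does not change the count.
    """
    layer_norm = 2 * hidden  # scale and bias vectors
    # Token, position, and segment embeddings, then one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections (weights + biases), then LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (hidden -> intermediate -> hidden), then LayerNorm.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden) + layer_norm)
    # Dense layer applied to the [CLS] position.
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


total = bert_param_count()
print(f"{total:,} parameters")  # 109,081,344 parameters
```

Under these assumptions the count comes to roughly 109M, which is consistent with the rounded 110M figure.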

What is sec-bert-base?

SEC-BERT-BASE is a BERT model trained for natural language processing in the financial domain. It is part of the SEC-BERT family of models developed by AUEB's Natural Language Processing Group. Because it was pre-trained on 260,773 10-K filings rather than general-purpose text, it is better suited to financial terminology and contexts.

Implementation Details

The model uses a custom 30k-subword vocabulary built from scratch on financial documents. It follows the BERT-base architecture, but both the vocabulary and the weights come from domain-specific training on financial text. Pre-training ran for 1 million steps with batches of 256 sequences and a learning rate of 1e-4 on a Google Cloud TPU v3-8.
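
To put the training figures in perspective, the short sketch below multiplies them out. The step count, batch size, and learning rate come from the paragraph above; the sequence length of 512 is an assumption (it is not stated here), so the token figure is only an upper bound under that assumption. The class name `PretrainingBudget` is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PretrainingBudget:
    steps: int = 1_000_000       # from the text
    batch_size: int = 256        # sequences per step, from the text
    learning_rate: float = 1e-4  # from the text
    max_seq_len: int = 512       # ASSUMPTION: not stated in the text

    @property
    def sequences_seen(self) -> int:
        """Total sequences processed over the whole run."""
        return self.steps * self.batch_size

    @property
    def max_tokens_seen(self) -> int:
        """Upper bound on tokens processed, if every sequence were full length."""
        return self.sequences_seen * self.max_seq_len


budget = PretrainingBudget()
print(f"{budget.sequences_seen:,} sequences")        # 256,000,000 sequences
print(f"{budget.max_tokens_seen:,} tokens at most")  # 131,072,000,000 tokens at most
```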

  • Custom financial vocabulary of 30k subwords
  • Pre-trained on 260,773 10-K filings
  • Compatible with both PyTorch and TensorFlow 2
  • Trained with masked language modeling objective
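
The masked language modeling objective in the last bullet can be illustrated with a small sketch. It follows the standard BERT recipe (select about 15% of tokens; replace 80% of those with the mask token, 10% with a random token, and leave 10% unchanged). Whether SEC-BERT-BASE used exactly these ratios is an assumption, and the function name `mask_tokens` and the token ids in the usage note are hypothetical.

```python
import random

IGNORE_INDEX = -100  # label value conventionally skipped by the loss function


def mask_tokens(token_ids, mask_id, vocab_size, special_ids=(),
                mask_prob=0.15, seed=None):
    """Return (inputs, labels) for one sequence under BERT-style masking.

    ASSUMPTION: the 15% / 80-10-10 split is the original BERT recipe,
    not a detail confirmed for this model.
    """
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [IGNORE_INDEX] * len(token_ids)
    for i, tok in enumerate(token_ids):
        # Never mask special tokens; skip positions that are not selected.
        if tok in special_ids or rng.random() >= mask_prob:
            continue
        labels[i] = tok  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id                    # 80%: replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # 10%: replace with a random token
        # remaining 10%: keep the original token in the input
    return inputs, labels
```

For example, `mask_tokens([2, 10, 11, 12, 3], mask_id=4, vocab_size=30_000, special_ids={2, 3}, seed=0)` returns a corrupted copy of the sequence and a label list that is `-100` everywhere except at the selected positions.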

Core Capabilities

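  • Masked-token prediction on financial text, reported as stronger than general-purpose BERT
  • Representations of financial terminology learned directly from 10-K filings
  • Prediction of numeric values from their surrounding context
  • Financial entity recognition, after fine-tuning on labeled data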
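
One way to probe these capabilities is to ask the model to fill in a masked token. The sketch below uses the `fill-mask` pipeline from the Hugging Face `transformers` library. The model ID `nlpaueb/sec-bert-base` is assumed from the author and model name at the top of this page, and `top_predictions` and `probe` are hypothetical helper names. Calling `probe` downloads the weights, so it needs network access and an installed `transformers` backend.

```python
def top_predictions(predictions, k=3):
    """Return (token, score) pairs for the k highest-scoring candidates.

    `predictions` is a list of dicts with "token_str" and "score" keys,
    the format the fill-mask pipeline returns for a single mask.
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"].strip(), round(p["score"], 3)) for p in ranked[:k]]


def probe(sentence, model_id="nlpaueb/sec-bert-base", k=3):
    """Fill the single [MASK] in `sentence` and return the top k candidates.

    ASSUMPTION: `model_id` is the Hugging Face Hub name of this model.
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    return top_predictions(fill_mask(sentence, top_k=k), k)
```

A call such as `probe("Total net sales decreased [MASK] % compared to the prior year.")` (an invented sentence) would return the model's top candidates for the masked position; running the same sentence through a general-purpose BERT checkpoint gives a side-by-side comparison.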

Frequently Asked Questions

Q: What makes this model unique?

SEC-BERT-BASE differs from general-purpose BERT models in that both its vocabulary and its weights were learned from SEC filings. As a result, it is reported to predict financial terms and contexts more accurately than general-purpose BERT models.

Q: What are the recommended use cases?

Recommended use cases include financial document parsing, numeric entity recognition, financial sentiment analysis, and automated analysis of financial reports. As a pre-trained language model, it needs fine-tuning on task-specific labeled data for most of these tasks. It is particularly useful for FinTech applications and financial research that depend on a close reading of SEC documents.
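
As a starting point for the entity recognition use case, the sketch below attaches a token classification head using the Hugging Face `transformers` library. The label set is hypothetical (a real tag set would come from the target dataset), the model ID `nlpaueb/sec-bert-base` is assumed from the author and model name at the top of this page, and the helper names are illustrative. The classification head is newly initialized, so the model must still be fine-tuned on labeled data before its predictions mean anything.

```python
LABELS = ["O", "B-AMOUNT", "I-AMOUNT"]  # HYPOTHETICAL tag set for illustration


def label_maps(labels):
    """Build the id-to-label and label-to-id mappings the model config expects."""
    id2label = dict(enumerate(labels))
    label2id = {label: idx for idx, label in id2label.items()}
    return id2label, label2id


def load_for_token_classification(model_id="nlpaueb/sec-bert-base", labels=LABELS):
    """Load the tokenizer and encoder with an untrained token classification head.

    ASSUMPTION: `model_id` is the Hugging Face Hub name of this model.
    Requires network access on first use.
    """
    # Imported lazily so label_maps can be used without transformers installed.
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    id2label, label2id = label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(
        model_id,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```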
