bpmn-information-extraction-v2

Property	Value
Parameter Count	108M
Base Model	BERT-base-cased
License	Apache 2.0
F1 Score	90.31%
Accuracy	95.16%

What is bpmn-information-extraction-v2?

bpmn-information-extraction-v2 is a token classification model built on the BERT-base-cased architecture and designed to extract structured information from textual business process descriptions. It was fine-tuned on a dataset of 104 process descriptions and identifies five element types: Agents, Tasks, Task Information, Process Information, and Conditions.
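A minimal usage sketch with the Hugging Face `transformers` token-classification pipeline follows. The helper names `load_extractor` and `group_entities` are hypothetical, the label strings in the sample (`AGENT`, `TASK`, `TASK_INFO`) are assumptions derived from the element types named above (the real strings come from the model's config), and the sample predictions are hand-written rather than real model output.

```python
from collections import defaultdict


def load_extractor(model_id):
    """Build a token-classification pipeline for the given model ID.

    transformers is imported lazily so group_entities works without it.
    """
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def group_entities(predictions):
    """Group pipeline predictions by entity label, preserving text order."""
    grouped = defaultdict(list)
    for pred in predictions:
        grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)


# Hand-written sample, trimmed to the fields used here (not real model output;
# label names and scores are illustrative assumptions).
sample = [
    {"entity_group": "AGENT", "word": "the clerk", "score": 0.99},
    {"entity_group": "TASK", "word": "checks", "score": 0.98},
    {"entity_group": "TASK_INFO", "word": "the invoice", "score": 0.97},
    {"entity_group": "AGENT", "word": "the manager", "score": 0.99},
    {"entity_group": "TASK", "word": "approves", "score": 0.98},
]

print(group_entities(sample))
```

With the model downloaded, passing the output of `load_extractor("<namespace>/bpmn-information-extraction-v2")(text)` to `group_entities` would produce the same structure from real predictions; the namespace is left as a placeholder because this card does not state it.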

Implementation Details

The model is a fine-tuned BERT with 108M parameters, trained with the Adam optimizer at a learning rate of 2e-05 for 15 epochs with a batch size of 8. On evaluation it reaches 88.26% precision and 92.46% recall, consistent with the reported F1 score of 90.31%.
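The reported precision, recall, and F1 score can be cross-checked with a few lines of arithmetic, since F1 is the harmonic mean of precision and recall. This sketch assumes all three figures are computed over the same evaluation set with the same averaging, which the card does not state explicitly.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision = 0.8826  # reported precision
recall = 0.9246     # reported recall

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # F1 = 90.31%
```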

Token Classification Architecture
PyTorch Implementation
TensorBoard Integration
Safetensors Support

Core Capabilities

Extraction of process agents and actors
Identification of business tasks and activities
Recognition of conditional statements in process flows
Classification of process-related metadata
Structured information extraction from natural language descriptions
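Token classification models commonly emit one tag per token in the B-/I-/O scheme, and turning those tags into the elements listed above means collapsing consecutive tags into spans. The sketch below assumes this model follows that scheme; the function name `decode_bio`, the label names, and the example sentence are illustrative, not taken from the model's configuration.

```python
def decode_bio(tokens, tags):
    """Collapse parallel token and BIO-tag lists into (label, text) spans."""
    spans = []
    label, words = None, []
    for token, tag in zip(tokens, tags):
        # "O" carries no label; otherwise "B-TASK" splits into ("B", "TASK")
        prefix, _, tag_label = tag.partition("-")
        starts_new = tag == "O" or prefix == "B" or tag_label != label
        if starts_new and label is not None:
            # Close the span that was open before this token
            spans.append((label, " ".join(words)))
            label, words = None, []
        if tag != "O":
            label = tag_label
            words.append(token)
    if label is not None:
        # Close a span that runs to the end of the sequence
        spans.append((label, " ".join(words)))
    return spans


tokens = ["The", "clerk", "checks", "the", "invoice", "."]
tags = ["B-AGENT", "I-AGENT", "B-TASK", "B-TASK_INFO", "I-TASK_INFO", "O"]

print(decode_bio(tokens, tags))
# [('AGENT', 'The clerk'), ('TASK', 'checks'), ('TASK_INFO', 'the invoice')]
```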

Frequently Asked Questions

Q: What makes this model unique?

The model is specialized for business process text. Its reported 95.16% accuracy and 90.31% F1 score in identifying process elements make it useful for automated business process modeling and analysis.

Q: What are the recommended use cases?

The model is well suited to converting natural language process descriptions into structured BPMN elements, automating business process documentation, and analyzing workflow descriptions for process mining and optimization.
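As a rough illustration of the first use case, extracted spans could be mapped onto BPMN element kinds. Everything in this sketch is an assumption rather than part of the model: the `BPMN_KIND` mapping, the label names, the `to_bpmn_elements` helper, and the rule that a task belongs to the most recently mentioned agent. Task Information and Process Information spans are ignored here for brevity.

```python
# Hypothetical mapping from entity labels to BPMN element kinds (an assumption).
BPMN_KIND = {
    "AGENT": "lane",
    "TASK": "task",
    "CONDITION": "exclusiveGateway",
}


def to_bpmn_elements(spans):
    """Turn (label, text) spans into flat BPMN-like element records.

    Each task or gateway is assigned to the most recently seen agent,
    which is a simplifying heuristic, not something the model predicts.
    """
    elements = []
    current_lane = None
    for label, text in spans:
        kind = BPMN_KIND.get(label)
        if kind is None:
            continue  # label not covered by this sketch
        if kind == "lane":
            current_lane = text
            continue
        elements.append({"kind": kind, "name": text, "lane": current_lane})
    return elements


spans = [
    ("AGENT", "The clerk"),
    ("TASK", "checks the invoice"),
    ("CONDITION", "If the invoice is valid"),
    ("AGENT", "the manager"),
    ("TASK", "approves the payment"),
]

for element in to_bpmn_elements(spans):
    print(element)
```

A real conversion would also need sequence flows and gateway branches, which this flat list does not attempt to model.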