roberta-base-go_emotions-onnx

Property	Value
License	MIT
Framework	ONNX
Task	Multi-label Emotion Classification
Size	Full: 499MB, Quantized: 125MB

What is roberta-base-go_emotions-onnx?

This is an ONNX export of the RoBERTa-based go_emotions emotion detection model, designed for efficient inference and deployment. It comes in two variants: a full-precision model and an INT8-quantized model. Both run faster than the original Transformers implementation (roughly 2-3x and 5x respectively) while maintaining comparable accuracy.

Implementation Details

The model is built on the RoBERTa architecture and fine-tuned on the go_emotions dataset, and it scores each input against 28 emotion labels. The quantized version is about 75% smaller (125MB versus 499MB) while maintaining nearly identical performance metrics (Accuracy: 0.475, Precision: 0.582, Recall: 0.398, F1: 0.447).
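The size figures can be checked with simple arithmetic. The sketch below uses the two sizes reported in this card and, as an assumption not stated here, a parameter count of roughly 125 million for a RoBERTa-base classifier, with 4 bytes per float32 weight and 1 byte per INT8 weight. Real model files also contain metadata and some tensors that are not quantized, so the estimates are approximate.

```python
# Sizes reported in this card, in megabytes.
FULL_MB = 499
QUANTIZED_MB = 125

# ASSUMPTION: about 125 million parameters. This is an approximate figure
# for a RoBERTa-base classifier and is not taken from the card.
ASSUMED_PARAMS = 125e6


def size_reduction(full_mb: float, quantized_mb: float) -> float:
    """Fraction of the original size removed by quantization."""
    return 1.0 - quantized_mb / full_mb


def estimated_size_mb(num_params: float, bytes_per_param: int) -> float:
    """Rough weight-only size estimate, using 1 MB = 10**6 bytes."""
    return num_params * bytes_per_param / 1e6


reduction = size_reduction(FULL_MB, QUANTIZED_MB)
print(f"reduction: {reduction:.1%}")
print(f"float32 estimate: {estimated_size_mb(ASSUMED_PARAMS, 4):.0f} MB")
print(f"int8 estimate: {estimated_size_mb(ASSUMED_PARAMS, 1):.0f} MB")
```

The computed reduction is just under 75%, which matches what a move from 4-byte to 1-byte weights would predict.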

Full-precision version: 2-3x faster than the original Transformers implementation
Quantized version: 5x faster than the original Transformers implementation
Optimized for CPU inference with ONNXRuntime
Supports both pipeline and direct ONNXRuntime implementation
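As a rough illustration of the direct ONNXRuntime route, the sketch below creates a CPU-only session and runs one forward pass. It is a minimal sketch, not the card's own code: the helper names `load_cpu_session` and `run_logits` are hypothetical, and the input names `input_ids` and `attention_mask` are an assumption about the exported graph that should be checked with `session.get_inputs()`.

```python
from typing import Any, Dict, List


def load_cpu_session(model_path: str, num_threads: int = 0) -> Any:
    """Create a CPU-only ONNX Runtime session for a local .onnx file.

    onnxruntime is imported lazily so that run_logits below can be
    read and exercised without the package installed.
    """
    import onnxruntime as ort

    options = ort.SessionOptions()
    # 0 lets ONNX Runtime choose the intra-op thread count itself.
    options.intra_op_num_threads = num_threads
    return ort.InferenceSession(
        model_path,
        sess_options=options,
        providers=["CPUExecutionProvider"],
    )


def run_logits(session: Any, input_ids: Any, attention_mask: Any) -> Any:
    """Run one forward pass and return the first output (raw logits).

    ASSUMPTION: the graph takes inputs named "input_ids" and
    "attention_mask" (int64 arrays of shape [batch, seq_len]).
    """
    feed: Dict[str, Any] = {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
    }
    # Passing None as the output names requests every model output.
    outputs: List[Any] = session.run(None, feed)
    return outputs[0]
```

With a real session, the inputs would be NumPy int64 arrays produced by the model's tokenizer, and the returned logits would have one row per input text and one column per emotion label.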

Core Capabilities

Multi-label emotion classification across 28 categories
Efficient batch processing with dynamic padding
Sigmoid-based probability outputs for each emotion
Compatible with both Optimum library and direct ONNXRuntime usage
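Two of the capabilities above, dynamic padding and sigmoid-based outputs, can be sketched in plain Python. This is an illustration under stated assumptions rather than the model's actual pre- and post-processing: the padding id of 1 is the usual RoBERTa value but should be read from the tokenizer, the 0.5 threshold is an arbitrary choice, and the three labels stand in for the full set of 28.

```python
import math
from typing import Dict, List, Sequence, Tuple

# ASSUMPTION: 1 is the padding token id (the usual RoBERTa value);
# read it from the tokenizer in real code.
PAD_TOKEN_ID = 1


def pad_batch(
    token_ids: Sequence[Sequence[int]], pad_id: int = PAD_TOKEN_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad each sequence only to the longest sequence in this batch."""
    max_len = max(len(ids) for ids in token_ids)
    input_ids, attention_mask = [], []
    for ids in token_ids:
        n_pad = max_len - len(ids)
        input_ids.append(list(ids) + [pad_id] * n_pad)
        # 1 marks real tokens, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return input_ids, attention_mask


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_emotions(
    logits: Sequence[float], labels: Sequence[str], threshold: float = 0.5
) -> Dict[str, float]:
    """Score every label independently and keep those at or above threshold."""
    scores = {label: sigmoid(v) for label, v in zip(labels, logits)}
    return {label: p for label, p in scores.items() if p >= threshold}


# Toy inputs: two token sequences of different lengths, three example labels.
ids, mask = pad_batch([[0, 100, 2], [0, 100, 200, 300, 2]])
picked = decode_emotions([2.0, -1.0, 0.3], ["joy", "anger", "surprise"])
print(ids, mask, picked)
```

Because each label gets its own sigmoid instead of sharing a softmax, the scores do not sum to 1 and several emotions can pass the threshold for the same text, which is what makes the task multi-label.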

Frequently Asked Questions

Q: What makes this model unique?

The model pairs a RoBERTa emotion classifier with ONNXRuntime inference. The full-precision variant runs 2-3x faster than the original Transformers implementation, and the INT8-quantized variant runs about 5x faster at a quarter of the file size, with nearly identical accuracy. It is particularly efficient for small batch sizes on CPU.

Q: What are the recommended use cases?

This model is ideal for production environments where efficient CPU-based emotion detection is needed, particularly for applications requiring real-time or near-real-time processing with limited computational resources.