roberta-base-go_emotions-onnx

SamLowe

ONNX-optimized RoBERTa model for emotion detection across 28 emotion categories. Available in a full precision version and an INT8 quantized version; the quantized version is smaller and faster.

Property     Value
License      MIT
Framework    ONNX
Task         Multi-label Emotion Classification
Size         Full: 499MB, Quantized: 125MB

What is roberta-base-go_emotions-onnx?

This is an ONNX-optimized version of the RoBERTa-based emotion detection model, designed for efficient inference and deployment. It comes in two variants: a full precision model and an INT8 quantized model. Both run faster than the original Transformers implementation (roughly 2-3x for full precision and 5x for quantized) while maintaining comparable accuracy.

Implementation Details

The model is built on the RoBERTa architecture and trained on the go_emotions dataset to detect 28 emotion categories. The quantized version reduces the model size by about 75% (499MB to 125MB) while keeping nearly identical performance metrics (Accuracy: 0.475, Precision: 0.582, Recall: 0.398, F1: 0.447).

  • Full precision version: 2-3x faster than the original Transformers implementation
  • Quantized version: 5x faster than the original Transformers implementation
  • Optimized for CPU inference with ONNXRuntime
  • Usable through a pipeline or by calling ONNXRuntime directly

Core Capabilities

  • Multi-label emotion classification across 28 categories
  • Efficient batch processing with dynamic padding
  • Sigmoid-based probability outputs for each emotion
  • Compatible with both Optimum library and direct ONNXRuntime usage
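Two of the capabilities listed above, dynamic padding and sigmoid-based outputs, can be illustrated without loading the model. The following is a minimal pure-Python sketch, not the model's actual pre- or post-processing code. The helper names, the pad token id of 1 (the usual RoBERTa `<pad>` id, which should be confirmed against the tokenizer), the 0.5 threshold, and the sample token ids and logits are all assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

PAD_ID = 1  # assumption: RoBERTa's usual <pad> id; verify with the tokenizer


def pad_batch(
    batch: Sequence[Sequence[int]], pad_id: int = PAD_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad each sequence to the longest length in this batch (dynamic padding)."""
    max_len = max(len(ids) for ids in batch)
    input_ids = [list(ids) + [pad_id] * (max_len - len(ids)) for ids in batch]
    # Attention mask: 1 for real tokens, 0 for padding.
    attention_mask = [
        [1] * len(ids) + [0] * (max_len - len(ids)) for ids in batch
    ]
    return input_ids, attention_mask


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def select_labels(
    logits: Sequence[float], labels: Sequence[str], threshold: float = 0.5
) -> Dict[str, float]:
    """Score each label independently and keep those at or above the threshold."""
    selected = {}
    for label, logit in zip(labels, logits):
        prob = sigmoid(logit)
        if prob >= threshold:
            selected[label] = prob
    return selected


# Hypothetical token ids for two inputs of different length.
ids, mask = pad_batch([[0, 5, 2], [0, 5, 6, 7, 2]])

# Hypothetical logits for three of the 28 labels.
picked = select_labels([2.0, -1.0, 0.3], ["joy", "anger", "neutral"])
print(ids, mask, sorted(picked))
```

Because each label gets its own sigmoid rather than sharing a softmax, several emotions can be selected for one input, or none at all. Padding only to the longest sequence in each batch, rather than to a fixed maximum length, avoids wasted computation on short inputs.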

Frequently Asked Questions

Q: What makes this model unique?

The model pairs RoBERTa's accuracy with ONNX optimization: inference is faster than with the original Transformers implementation, and quantization shrinks the footprint with little change in accuracy. It is particularly efficient for small batch sizes on CPU.

Q: What are the recommended use cases?

This model is ideal for production environments where efficient CPU-based emotion detection is needed, particularly for applications requiring real-time or near-real-time processing with limited computational resources.
