VulBERTa-MLP-Devign

claudios

A specialized transformer model for detecting vulnerabilities in C/C++ code, based on the RoBERTa architecture with 125M parameters and an MLP classification head.

Property | Value
Parameter Count | 125M
License | MIT
Paper | arXiv:2205.12424
Accuracy | 64.71%
F1 Score | 56.93%

What is VulBERTa-MLP-Devign?

VulBERTa-MLP-Devign is a deep learning model for detecting security vulnerabilities in C/C++ source code. It is based on the RoBERTa architecture and combines a pre-trained transformer with an MLP (Multi-Layer Perceptron) classification head.

Implementation Details

The model utilizes a custom tokenization pipeline that includes automatic comment removal and specialized code processing. It requires libclang for tokenization and must be initialized with trust_remote_code=True due to its custom components.
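To make the comment-removal step concrete, here is a minimal sketch in Python. It is not the model's actual tokenizer code, which relies on libclang; it is a regex-based stand-in that only illustrates the idea of dropping comments while leaving string literals intact. The function name strip_c_comments is hypothetical.

```python
import re

# One pattern that matches either a literal (kept) or a comment (dropped).
# Literals are matched first so that "//" inside a string is not treated
# as the start of a comment.
_C_TOKEN = re.compile(
    r'''
    "(?:\\.|[^"\\])*"     # double-quoted string literal
  | '(?:\\.|[^'\\])*'     # character literal
  | //[^\n]*              # line comment (newline itself is kept)
  | /\*.*?\*/             # block comment, possibly spanning lines
    ''',
    re.DOTALL | re.VERBOSE,
)


def strip_c_comments(source: str) -> str:
    """Return C/C++ source with comments replaced by a single space."""

    def _replace(match: re.Match) -> str:
        text = match.group(0)
        # Comments start with "/"; string and char literals start with a quote.
        return " " if text.startswith("/") else text

    return _C_TOKEN.sub(_replace, source)


if __name__ == "__main__":
    demo = 'int x = 1; // set x\nchar *s = "a // b"; /* block\ncomment */ return x;'
    print(strip_c_comments(demo))
```

A regex like this does not handle every corner of the language (for example line continuations inside line comments), which is one reason a real pipeline would lean on a proper lexer such as libclang.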

  • Pre-trained on real-world code from open-source C/C++ projects
  • Implements binary classification for vulnerability detection
  • Reports 64.71% accuracy and a 71.02% ROC-AUC score
  • Stores its weights as F32 (32-bit floating point) tensors
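The binary classification step can be sketched as follows, assuming the classification head emits two logits per code snippet (one per class). The label names, the index order, and the 0.5 threshold are assumptions for illustration and are not taken from the model itself.

```python
import math
from typing import List, Tuple

# Hypothetical label names; index 1 is assumed to be the "vulnerable" class.
LABELS = ("non-vulnerable", "vulnerable")


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits: List[float], threshold: float = 0.5) -> Tuple[str, float]:
    """Return (label, probability of the vulnerable class) for two logits."""
    p_vulnerable = softmax(logits)[1]
    label = LABELS[1] if p_vulnerable >= threshold else LABELS[0]
    return label, p_vulnerable


if __name__ == "__main__":
    print(classify([2.0, -1.0]))
    print(classify([-0.5, 1.5]))
```

Accuracy and F1 depend on the chosen threshold, while ROC-AUC summarizes ranking quality across all thresholds, which is why the figures are reported separately.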

Core Capabilities

  • Automated vulnerability detection in C/C++ source code
  • Deep semantic code analysis
  • Binary classification of secure vs vulnerable code segments
  • Support for complex code structures and patterns

Frequently Asked Questions

Q: What makes this model unique?

VulBERTa stands out for its conceptually simple approach to code vulnerability detection. The paper (arXiv:2205.12424) reports state-of-the-art results with a relatively modest 125M parameters, and this variant reaches 64.71% accuracy. Its custom tokenization pipeline and pre-training on real-world open-source code are aimed at practical use.

Q: What are the recommended use cases?

The model is specifically designed for security teams and developers who need to analyze C/C++ codebases for potential security vulnerabilities. It's particularly useful in automated code review processes and continuous integration pipelines where security scanning is required.
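A continuous integration check along these lines could be sketched as below. The Hub identifier is an assumption built from the author and model name shown on this page, and scan_sources, load_pipeline, and the threshold are hypothetical names and values. The scoring function is passed in as a parameter because the exact output format of the pipeline (label names and nesting) is not described here and should be checked before adapting it.

```python
from typing import Callable, Dict, List, Tuple

# Assumed Hub id: author "claudios" plus the model name from this page.
MODEL_ID = "claudios/VulBERTa-MLP-Devign"


def load_pipeline():
    """Load the model lazily. Needs transformers, torch, and libclang installed."""
    from transformers import pipeline  # imported here so the rest runs without it

    # trust_remote_code=True is required because the model ships custom components.
    return pipeline("text-classification", model=MODEL_ID, trust_remote_code=True)


def scan_sources(
    sources: Dict[str, str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return (path, score) pairs at or above threshold, highest score first.

    score_fn maps source text to an estimated probability of vulnerability.
    """
    flagged = []
    for path, code in sources.items():
        score = score_fn(code)
        if score >= threshold:
            flagged.append((path, score))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Stand-in scorer so the sketch runs without downloading the model.
    def fake_score(code: str) -> float:
        return 0.9 if "strcpy" in code else 0.1

    files = {"a.c": "strcpy(buf, s);", "b.c": "return 0;"}
    print(scan_sources(files, fake_score))
```

With a reported accuracy of 64.71%, flagged files are probably better treated as candidates for manual review than as hard build failures.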
