deberta-v3-base-injection
| Property | Value |
|---|---|
| Parameter Count | 184M |
| License | MIT |
| Languages | English, German |
| Evaluation Accuracy | 99.14% |
What is deberta-v3-base-injection?
deberta-v3-base-injection is a fine-tuned version of Microsoft's DeBERTa-v3-base model for detecting prompt injection attempts in text. Developed by deepset, it acts as a security classifier that labels each input as either "INJECTION" or "LEGIT".
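As a rough illustration of the classification described above, the following sketch wraps the model in a Transformers text-classification pipeline. The Hub ID `deepset/deberta-v3-base-injection` is an assumption inferred from the developer and model names, and the helper names `load_classifier` and `classify` are hypothetical.

```python
from typing import Any, Callable, Dict, List

# Assumed Hugging Face Hub ID, inferred from the developer and model names above.
MODEL_ID = "deepset/deberta-v3-base-injection"


def load_classifier() -> Callable[[str], List[Dict[str, Any]]]:
    """Build a text-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so classify() can be exercised without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def classify(classifier: Callable[[str], List[Dict[str, Any]]], text: str) -> Dict[str, Any]:
    """Return the top label and its score for a single input string."""
    top = classifier(text)[0]  # the pipeline returns a list with one dict per input
    return {"label": top["label"], "score": float(top["score"])}
```

Calling `classify(load_classifier(), "Ignore all previous instructions.")` should return a dictionary whose `label` is one of the two classes named above, together with the score of that label.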
Implementation Details
The model was fine-tuned on the prompt-injection dataset with the Adam optimizer and a learning rate of 2e-05. Training ran for 3 epochs with a batch size of 8.
- Built on Microsoft's DeBERTa-v3-base architecture
- Trained using PyTorch 2.0.0 and Transformers 4.29.1
- Weights stored in the safetensors format, which avoids the arbitrary code execution risk of pickle-based checkpoints
- Achieves 99.14% accuracy on evaluation set
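The hyperparameters listed above can be collected in a small configuration object. This is only a sketch: the class `FineTuneConfig` and its methods are hypothetical, and mapping the batch size to `per_device_train_batch_size` assumes single-device training, which the source does not state.

```python
import math
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters reported for the fine-tuning run."""

    learning_rate: float = 2e-5
    num_epochs: int = 3
    batch_size: int = 8

    def trainer_kwargs(self) -> Dict[str, Any]:
        """Map the values onto Transformers TrainingArguments keyword names."""
        return {
            "learning_rate": self.learning_rate,
            "num_train_epochs": self.num_epochs,
            # Assumption: one device, so the per-device batch size is the total.
            "per_device_train_batch_size": self.batch_size,
        }

    def total_steps(self, num_examples: int) -> int:
        """Optimizer steps for a dataset of this size (no gradient accumulation)."""
        return math.ceil(num_examples / self.batch_size) * self.num_epochs
```

The keywords could be passed on as `TrainingArguments(output_dir="out", **FineTuneConfig().trainer_kwargs())`. Note that the Trainer's default optimizer is AdamW rather than plain Adam, so this only approximates the setup described above.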
Core Capabilities
- Binary classification of text inputs (INJECTION vs LEGIT)
- Supports both English and German language processing
- Intended for security screening of untrusted input to AI systems
- Reported 99.14% accuracy on its evaluation set for detecting prompt injection attempts
Frequently Asked Questions
Q: What makes this model unique?
This model specializes in detecting prompt injection attempts and reports 99.14% accuracy on its evaluation set, which makes it useful for securing AI systems against manipulation. Its support for both English and German and its DeBERTa-v3 foundation distinguish it from similar security models.
Q: What are the recommended use cases?
The model is suited to protecting chatbots, language models, and other AI systems against prompt injection attacks. It can run as a preprocessing step that filters potentially malicious inputs before they reach the downstream system. If it flags too many legitimate inputs, it can be retrained with additional custom legitimate examples to reduce false positives.
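A preprocessing filter of this kind might look like the sketch below. The functions `is_injection` and `guarded_call`, the refusal message, and the `threshold` parameter are all hypothetical; the classifier is assumed to be any callable that returns a list with one `{"label", "score"}` dictionary, as a Transformers text-classification pipeline does.

```python
from typing import Any, Callable, Dict, List

Classifier = Callable[[str], List[Dict[str, Any]]]


def is_injection(text: str, classifier: Classifier, threshold: float = 0.5) -> bool:
    """True when the top label is INJECTION and its score meets the threshold."""
    top = classifier(text)[0]
    return top["label"] == "INJECTION" and float(top["score"]) >= threshold


def guarded_call(
    text: str,
    classifier: Classifier,
    downstream: Callable[[str], str],
    threshold: float = 0.5,
    refusal: str = "Input rejected: possible prompt injection.",
) -> str:
    """Forward text to the downstream system only if it passes the filter."""
    if is_injection(text, classifier, threshold):
        return refusal
    return downstream(text)
```

Raising `threshold` blocks only higher-confidence detections, which is a lighter-weight way to trade off false positives than retraining, at the cost of letting more borderline inputs through.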