final-complete-malicious-url-model
| Property | Value |
|---|---|
| Model Type | BERT-based URL Classifier with LoRA |
| Parameters | 110M |
| Accuracy | 98% |
| F1 Score | 0.965 |
| Author | r3ddkahili |
| Model URL | https://huggingface.co/r3ddkahili/final-complete-malicious-url-model |
What is final-complete-malicious-url-model?
This is a BERT-based model fine-tuned with Low-Rank Adaptation (LoRA) to detect malicious URLs in real time. Built on bert-base-uncased, it classifies URLs into four categories: benign, defacement, phishing, and malware. The model was trained on the Kaggle Malicious URLs Dataset, which contains 651,191 samples.
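A minimal inference sketch follows. It assumes the repository hosts full weights and a tokenizer that load through the standard `Auto*` classes; if it ships only LoRA adapter weights, the base model would have to be loaded and wrapped with PEFT instead. The label order in `LABELS` is also an assumption, copied from the order the categories are listed above, so check it against the `id2label` mapping in the model's config.

```python
import math

# Assumed label order (the order the categories are listed in this card);
# verify against the model config's id2label before relying on it.
LABELS = ["benign", "defacement", "phishing", "malware"]
MODEL_ID = "r3ddkahili/final-complete-malicious-url-model"


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def classify_url(url):
    """Download the model and classify one URL (needs transformers and torch)."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    # 128 tokens matches the maximum sequence length stated for this model.
    inputs = tokenizer(url, truncation=True, max_length=128, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits)
```

Calling `classify_url("http://example.com/login")` would return a `(label, probability)` pair. For real-time use, the tokenizer and model should be loaded once and reused rather than reloaded on every call as in this sketch.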
Implementation Details
The model uses the Hugging Face Transformers library with a PyTorch backend and PEFT for efficient fine-tuning. URLs are tokenized with a maximum sequence length of 128 tokens, and training used the AdamW optimizer with weight decay and a weighted cross-entropy loss.
- Batch Size: 16
- Epochs: 5
- Learning Rate: 2e-5
- Evaluation Strategy: Epoch-based
- Fine-Tuning: LoRA applied to BERT layers
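The weighted cross-entropy loss mentioned above can be sketched in plain Python. The card does not say how the class weights were chosen; this example assumes inverse-frequency ("balanced") weights, and the class counts are hypothetical round numbers used only for illustration.

```python
import math

# Hypothetical class counts, for illustration only; compute the real ones
# from the training split.
CLASS_COUNTS = {
    "benign": 400_000,
    "defacement": 100_000,
    "phishing": 100_000,
    "malware": 50_000,
}


def balanced_class_weights(counts):
    """Inverse-frequency weights: w_c = N / (K * n_c)."""
    total = sum(counts.values())
    k = len(counts)
    return [total / (k * n) for n in counts.values()]


def log_softmax(logits):
    """Numerically stable log-probabilities for one sample."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def weighted_cross_entropy(batch_logits, targets, weights):
    """Weighted mean of per-sample negative log-likelihoods.

    Normalizes by the sum of the target-class weights, the same convention
    as torch.nn.CrossEntropyLoss(weight=..., reduction="mean").
    """
    num = 0.0
    den = 0.0
    for logits, y in zip(batch_logits, targets):
        w = weights[y]
        num += w * -log_softmax(logits)[y]
        den += w
    return num / den


weights = balanced_class_weights(CLASS_COUNTS)
```

With these assumed counts, the rarest class (malware) receives the largest weight, so a misclassified malware URL contributes more to the loss than a misclassified benign one. In a PyTorch training loop, the same weights could be passed as a tensor to `torch.nn.CrossEntropyLoss(weight=...)`.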
Core Capabilities
- Real-time URL classification with 98% validation accuracy
- Category-specific performance: Benign (F1: 0.985), Defacement (F1: 0.985), Phishing (F1: 0.935), Malware (F1: 0.955)
- Can be integrated into browser extensions and security tools
- Suitable for deployment in Security Operations Centers (SOCs)
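The overall F1 score of 0.965 reported in the table is consistent with an unweighted (macro) average of the four per-class scores listed above. That reading is an inference from the numbers, not something the author states, and the short check below only reproduces the arithmetic.

```python
from statistics import mean

# Per-class F1 scores as listed in this card.
PER_CLASS_F1 = {
    "benign": 0.985,
    "defacement": 0.985,
    "phishing": 0.935,
    "malware": 0.955,
}

# Macro average: every class counts equally, regardless of its sample count.
macro_f1 = mean(PER_CLASS_F1.values())
print(f"macro F1 = {macro_f1:.3f}")
```

Because a macro average ignores class frequency, the weaker phishing score pulls the headline figure down as much as any other class would, even if phishing URLs are a minority of the data.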
Frequently Asked Questions
Q: What makes this model unique?
The model pairs a BERT encoder with LoRA fine-tuning, which trains small low-rank adapter matrices instead of updating all of the base model's weights, so it reaches high accuracy at a lower training cost. Rather than only flagging a URL as malicious or benign, it also distinguishes between defacement, phishing, and malware, which makes it particularly useful for cybersecurity applications.
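To give a sense of the efficiency LoRA offers, the sketch below counts the adapter parameters for one plausible configuration. The hidden size and layer count are those of bert-base-uncased; the rank of 8 and the choice to adapt only the query and value projections are assumptions, since the card does not state the LoRA settings actually used. The classification head is left out of the count.

```python
HIDDEN_SIZE = 768          # bert-base-uncased hidden dimension
NUM_LAYERS = 12            # bert-base-uncased encoder layers
BASE_PARAMS = 110_000_000  # rounded parameter count from the table above

# Assumptions for illustration, not the author's documented settings.
RANK = 8
ADAPTED_PER_LAYER = 2      # query and value projections


def lora_trainable_params(hidden, layers, rank, adapted_per_layer):
    """Each adapted hidden x hidden weight gains A (rank x hidden) and B (hidden x rank)."""
    per_matrix = rank * hidden + hidden * rank
    return per_matrix * adapted_per_layer * layers


extra = lora_trainable_params(HIDDEN_SIZE, NUM_LAYERS, RANK, ADAPTED_PER_LAYER)
print(f"{extra:,} adapter parameters, {extra / BASE_PARAMS:.2%} of the base model")
```

Under these assumptions the adapters add roughly 0.3 million trainable parameters, well under 1% of the base model, which is why LoRA fine-tuning needs far less optimizer memory than full fine-tuning.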
Q: What are the recommended use cases?
The model is suited to real-time URL classification in cybersecurity tools, browser extensions that raise instant threat alerts, phishing detection systems, and security monitoring in SOCs. It can be deployed as a Streamlit web app, as a browser extension, or behind a REST API.
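As one way to picture the REST API option, here is a minimal sketch built on Python's standard library. The endpoint shape, the JSON field names, and `stub_classifier` are all hypothetical; the stub stands in for a real model call and always answers "benign".

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def stub_classifier(url):
    """Hypothetical placeholder; swap in a real model call."""
    return "benign", 0.0


def build_response(raw_body, classify=stub_classifier):
    """Turn a JSON request body into (status_code, response_dict)."""
    try:
        url = json.loads(raw_body)["url"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected a JSON object with a 'url' field"}
    if not isinstance(url, str) or not url:
        return 400, {"error": "url must be a non-empty string"}
    label, score = classify(url)
    return 200, {"url": url, "label": label, "score": score}


class ClassifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client declared.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = build_response(self.rfile.read(length))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the service on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), ClassifyHandler).serve_forever()
```

Calling `serve()` starts the service, after which a client can POST `{"url": "http://example.com"}` and receive the label as JSON. Keeping `build_response` separate from the handler lets the request logic be tested without opening a socket; a production deployment would more likely use a dedicated web framework and load the model once at startup.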