FraudFoxAI Phishing Detection Model
A fine-tuned DistilBERT model that detects phishing and fraudulent emails. It was trained on 565,000+ curated emails and reports 99.71% accuracy.
Model Details
- Base Model: distilbert-base-uncased
- Training Data: 565,293 curated emails from multiple sources
- Inference Runtime: ONNX Runtime (PyTorch and ONNX versions are both available)
- Classes:
  - LABEL_0: Legitimate Email
  - LABEL_1: Phishing/Fraud Email
Performance
| Metric | Score |
|---|---|
| Accuracy | 99.71% |
| F1 Score | 0.9871 |
| Precision | 0.9897 |
| Recall | 0.9846 |
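The F1 score is the harmonic mean of precision and recall, so the three reported values can be cross-checked against each other. A minimal sketch of that arithmetic, using only the numbers in the table above:

```python
# Values copied from the performance table
precision = 0.9897
recall = 0.9846

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.4f}")  # F1 = 0.9871
```

The result matches the reported F1 score to four decimal places.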
Training Data
Trained on 565,293 curated emails from multiple sources:
- Corporate email archives (legitimate emails)
- Reported phishing samples
- Known 419/advance-fee fraud emails
- Community-sourced spam and scam samples
The model is continuously improved using user feedback.
Training Configuration
- Epochs: 2
- Batch Size: 32
- Warmup Steps: 1,000
- Weight Decay: 0.01
- Max Length: 512 tokens
- Framework: PyTorch + Transformers
- Training Time: ~12 hours on Colab GPU
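To put the warmup setting in context, the sketch below derives the approximate number of optimizer steps from the figures above. It assumes, hypothetically, that all 565,293 emails were used for training, that the batch size is per device on a single GPU, and that there was no gradient accumulation; the card states none of these. The dictionary keys follow the Hugging Face `TrainingArguments` parameter names, and mapping "Batch Size" onto `per_device_train_batch_size` is likewise an assumption.

```python
import math

# Hyperparameters from the list above, keyed by TrainingArguments names
config = {
    "num_train_epochs": 2,
    "per_device_train_batch_size": 32,
    "warmup_steps": 1000,
    "weight_decay": 0.01,
}
max_length = 512  # tokenizer-side setting, not a TrainingArguments field

# ASSUMPTION: the full dataset is the training split (no held-out portion)
num_examples = 565_293

steps_per_epoch = math.ceil(num_examples / config["per_device_train_batch_size"])
total_steps = steps_per_epoch * config["num_train_epochs"]
warmup_fraction = config["warmup_steps"] / total_steps

print(steps_per_epoch, total_steps, f"{warmup_fraction:.1%}")  # 17666 35332 2.8%
```

Under these assumptions, warmup covers about 2.8% of roughly 35,000 steps. A held-out validation split would lower the step count slightly.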
Usage
ONNX Runtime (recommended, low memory)
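```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

# Load the tokenizer and the ONNX model from the Hub
tokenizer = AutoTokenizer.from_pretrained("xanderabim/fraudfoxai-phishing")
model = ORTModelForSequenceClassification.from_pretrained("xanderabim/fraudfoxai-phishing")

# Tokenize to NumPy arrays; truncation caps the input at the model's maximum length
inputs = tokenizer("URGENT: Verify your account now!", return_tensors="np", truncation=True)
outputs = model(**inputs)  # outputs.logits holds one raw score per class
```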
PyTorch
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xanderabim/fraudfoxai-phishing")
model = AutoModelForSequenceClassification.from_pretrained("xanderabim/fraudfoxai-phishing")

inputs = tokenizer("URGENT: Verify your account now!", return_tensors="pt", truncation=True)
with torch.no_grad():  # inference only, so skip gradient tracking
    outputs = model(**inputs)
```
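Both snippets end with raw logits rather than a probability. One common way to turn them into a phishing score is a softmax over the two classes, reading off the LABEL_1 entry. The helper below is a minimal, dependency-free sketch; `phishing_score` is a hypothetical name, and the logit values in the example are made up for illustration rather than produced by the model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def phishing_score(logits):
    """Probability of LABEL_1 (Phishing/Fraud) given [LABEL_0, LABEL_1] logits."""
    return softmax(logits)[1]


# Illustrative logits only; real values come from outputs.logits
example_logits = [-3.0, 3.0]
print(f"{phishing_score(example_logits):.4f}")
```

With either runtime, `outputs.logits[0].tolist()` should yield the two-element list this helper expects.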
Sample Predictions
| Email | Phishing Score | Verdict |
|---|---|---|
| "URGENT: Your PayPal account has been suspended!" | 99.99% | PHISHING |
| "Hi team, meeting at 2pm tomorrow" | 0.00% | SAFE |
| "Congratulations! You've won $1,000,000!" | 98.66% | PHISHING |
| "Meeting notes from yesterday attached" | 0.00% | SAFE |
| "Dear valued customer, your package delivery failed" | 99.92% | PHISHING |
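The verdict column can be derived from the score with a simple cutoff. The sketch below assumes a threshold of 0.5, which is a guess: the card does not say which cutoff the production service uses, and `verdict` is a hypothetical helper name.

```python
def verdict(score, threshold=0.5):
    """Map a phishing probability in [0, 1] to a label.

    ASSUMPTION: 0.5 is an illustrative cutoff, not a documented one.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability between 0 and 1")
    return "PHISHING" if score >= threshold else "SAFE"


# Scores from the table above, expressed as probabilities
for score in (0.9999, 0.0000, 0.9866, 0.9992):
    print(f"{score:.2%} -> {verdict(score)}")
```

Raising the threshold trades recall for precision, which is one possible lever when false positives are costly.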
Production API
Deployed at: https://fraudfoxai.xanderabim.workers.dev
Or forward any email to: check@fraudfox.ai
Limitations
- English language only
- Max 512 tokens per input
- May flag aggressive marketing emails as phishing
- Subject-only inputs are less accurate than full email (subject + body)
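One way to work within the 512-token limit, instead of truncating long emails, is to score overlapping windows and keep the highest score. This is a generic technique, not something the card says the model or the API does; the function name, the 64-token overlap, and the max-score aggregation are all assumptions.

```python
def window_token_ids(token_ids, max_len=512, overlap=64, n_special=2):
    """Split token ids into overlapping windows that fit the model limit.

    n_special reserves room for the [CLS] and [SEP] tokens that the
    tokenizer adds around each window.
    """
    body_len = max_len - n_special
    if overlap >= body_len:
        raise ValueError("overlap must be smaller than the window body")
    step = body_len - overlap
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + body_len])
        if start + body_len >= len(token_ids):
            break  # this window reached the end of the input
        start += step
    return windows


ids = list(range(1200))  # stand-in for tokenizer output without special tokens
windows = window_token_ids(ids)
print(len(windows), [len(w) for w in windows])  # 3 [510, 510, 308]
```

Each window would then be wrapped with the tokenizer's special tokens and scored separately, with the email-level score taken as the maximum across windows.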
License
MIT
Author