FraudFoxAI Phishing Detection Model

A fine-tuned DistilBERT model for detecting phishing and fraudulent emails. It was trained on 565,293 curated emails and reaches a reported accuracy of 99.71%.

Model Details

  • Base Model: distilbert-base-uncased
  • Training Data: 565,293 curated emails from multiple sources
  • Inference Runtime: ONNX Runtime recommended (both PyTorch and ONNX weights are available)
  • Classes:
    • LABEL_0: Legitimate Email
    • LABEL_1: Phishing/Fraud Email

Performance

| Metric    | Score  |
|-----------|--------|
| Accuracy  | 99.71% |
| F1 Score  | 0.9871 |
| Precision | 0.9897 |
| Recall    | 0.9846 |
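
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the precision and recall figures. The sketch below uses only the numbers listed above; the variable names are illustrative.

```python
# Values copied from the Performance figures above.
precision = 0.9897
recall = 0.9846

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.4f}")  # prints "F1 = 0.9871", matching the reported score
```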

Training Data

Trained on 565,293 curated emails from multiple sources:

  • Corporate email archives (legitimate emails)
  • Reported phishing samples
  • Known 419/advance-fee fraud emails
  • Community-sourced spam and scam samples

Continuously improved with user feedback.

Training Configuration

  • Epochs: 2
  • Batch Size: 32
  • Warmup Steps: 1,000
  • Weight Decay: 0.01
  • Max Length: 512 tokens
  • Framework: PyTorch + Transformers
  • Training Time: ~12 hours on Colab GPU
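
To put the warmup setting in context, the sketch below estimates how many optimizer steps the run took. The dictionary keys mirror `transformers.TrainingArguments` parameter names, but whether the author used the `Trainer` API is not stated, and the step counts assume that all 565,293 emails were in the training split on a single device with no gradient accumulation. Max Length is a tokenizer setting, so it is left out.

```python
import math

# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments parameter names.
training_kwargs = {
    "num_train_epochs": 2,
    "per_device_train_batch_size": 32,
    "warmup_steps": 1_000,
    "weight_decay": 0.01,
}

# ASSUMPTION: every email is in the training split, one device,
# no gradient accumulation. The model card does not state the split.
num_examples = 565_293

steps_per_epoch = math.ceil(num_examples / training_kwargs["per_device_train_batch_size"])
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]
warmup_fraction = training_kwargs["warmup_steps"] / total_steps

print(steps_per_epoch)           # 17666 optimizer steps per epoch
print(total_steps)               # 35332 optimizer steps over the whole run
print(f"{warmup_fraction:.1%}")  # 2.8% of the run spent warming up
```

Under those assumptions, the 1,000 warmup steps cover roughly the first 3% of training.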

Usage

ONNX Runtime (recommended, low memory)

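from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

# Load the tokenizer and the ONNX model from the Hub
tokenizer = AutoTokenizer.from_pretrained("xanderabim/fraudfoxai-phishing")
model = ORTModelForSequenceClassification.from_pretrained("xanderabim/fraudfoxai-phishing")

# Tokenize to NumPy arrays; over-long inputs are truncated to the model's maximum length
inputs = tokenizer("URGENT: Verify your account now!", return_tensors="np", truncation=True)
outputs = model(**inputs)
logits = outputs.logits  # shape (1, 2): [LABEL_0, LABEL_1]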

PyTorch

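import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the PyTorch model from the Hub
tokenizer = AutoTokenizer.from_pretrained("xanderabim/fraudfoxai-phishing")
model = AutoModelForSequenceClassification.from_pretrained("xanderabim/fraudfoxai-phishing")

# Tokenize to PyTorch tensors; over-long inputs are truncated to the model's maximum length
inputs = tokenizer("URGENT: Verify your account now!", return_tensors="pt", truncation=True)
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
logits = outputs.logits  # shape (1, 2): [LABEL_0, LABEL_1]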
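
Both snippets above stop at the raw model output. A minimal sketch of turning the two logits into a phishing probability and a verdict follows. The helper name `phishing_score` and the 0.5 threshold are assumptions for illustration, not documented settings of this model; the index-to-class mapping follows the Classes list (LABEL_0 legitimate, LABEL_1 phishing/fraud).

```python
import math
from typing import Sequence, Tuple


def phishing_score(logits: Sequence[float], threshold: float = 0.5) -> Tuple[float, str]:
    """Turn the two raw logits for one email into (probability, verdict).

    logits[0] is LABEL_0 (legitimate), logits[1] is LABEL_1 (phishing/fraud).
    The 0.5 threshold is an assumption, not a documented setting.
    """
    if len(logits) != 2:
        raise ValueError("expected exactly two logits")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    prob_phishing = exps[1] / sum(exps)
    verdict = "PHISHING" if prob_phishing >= threshold else "SAFE"
    return prob_phishing, verdict


# Made-up logits for illustration only, not real model output.
print(phishing_score([-4.0, 5.0]))
print(phishing_score([6.0, -6.0]))
```

With either snippet above, the call would look like `phishing_score(outputs.logits[0].tolist())`, since both NumPy arrays and PyTorch tensors provide `tolist()`.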

Sample Predictions

| Email | Phishing Score | Verdict |
|-------|----------------|---------|
| "URGENT: Your PayPal account has been suspended!" | 99.99% | PHISHING |
| "Hi team, meeting at 2pm tomorrow" | 0.00% | SAFE |
| "Congratulations! You've won $1,000,000!" | 98.66% | PHISHING |
| "Meeting notes from yesterday attached" | 0.00% | SAFE |
| "Dear valued customer, your package delivery failed" | 99.92% | PHISHING |

Production API

Deployed at: https://fraudfoxai.xanderabim.workers.dev

Or forward any email to: check@fraudfox.ai

Limitations

  • English language only
  • Max 512 tokens per input
  • May flag aggressive marketing emails as phishing
  • Subject-only inputs are less accurate than full email (subject + body)
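
With `truncation=True`, anything past the 512-token limit is silently dropped. One common workaround is to score a long email in overlapping windows. The sketch below only does the splitting, on a plain list of token ids; the helper name, the window of 510 (leaving room for the [CLS] and [SEP] tokens), and the stride of 255 are all assumptions, and the model card does not say the model was evaluated this way.

```python
from typing import List


def window_token_ids(token_ids: List[int], window: int = 510, stride: int = 255) -> List[List[int]]:
    """Split token ids (without special tokens) into overlapping windows.

    window=510 leaves room for [CLS] and [SEP] within the 512-token limit.
    stride is how far each window advances; both defaults are assumptions.
    """
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    if len(token_ids) <= window:
        return [list(token_ids)]
    windows = []
    start = 0
    while True:
        windows.append(list(token_ids[start:start + window]))
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the email
        start += stride
    return windows


# Stand-in ids for a 1,200-token email.
chunks = window_token_ids(list(range(1200)))
print([len(c) for c in chunks])  # [510, 510, 510, 435]
```

The ids could come from `tokenizer(text, add_special_tokens=False)["input_ids"]`. Each window would then be scored separately and the results combined, for example by taking the highest phishing score; that aggregation rule is also an assumption.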

License

MIT

Author

@xanderabim
