eu-pii-multi-mini-preview

IMPORTANT: This is a preview model that is still being trained. Take extra care before deploying it in any production setting.

Multilingual (24 EU languages) PII token-classification model, fine-tuned from nreimers/mMiniLMv2-L6-H384-distilled-from-XLMR-Large (34 entity classes, BIO scheme).

Results

split   macro P   macro R   macro F1 (w/o O)
valid   0.9082    0.8970    0.8963
test    0.8080    0.7862    0.7863
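
The macro scores average per-class metrics over the entity classes and leave out the O tag. A minimal sketch of that computation follows, assuming token-level scoring on class names with the B-/I- prefixes already stripped. The card does not say whether scoring is token-level or entity-level, and macro_prf is a hypothetical helper, not the evaluation script used for this model.

```python
from collections import Counter


def macro_prf(gold, pred, ignore="O"):
    """Macro-averaged precision, recall and F1 over classes, skipping `ignore`."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g != ignore:
                tp[g] += 1
        else:
            if p != ignore:
                fp[p] += 1  # predicted a class that is not the gold one
            if g != ignore:
                fn[g] += 1  # missed the gold class
    # Assumption: average over every class seen in gold or pred.
    classes = sorted(set(tp) | set(fp) | set(fn))
    ps, rs, fs = [], [], []
    for c in classes:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        ps.append(p)
        rs.append(r)
        fs.append(f)
    n = len(classes) or 1
    return sum(ps) / n, sum(rs) / n, sum(fs) / n


# Toy labels for illustration only (not taken from the evaluation data).
gold = ["PERSON_NAME", "PERSON_NAME", "O", "DATE", "O"]
pred = ["PERSON_NAME", "O", "O", "DATE", "DATE"]
print(macro_prf(gold, pred))
```

In this toy case macro P and macro R are both 0.75 while macro F1 is about 0.667, because macro F1 is the mean of per-class F1 scores rather than the harmonic mean of macro P and macro R. If the reported scores are computed the same way, that would be consistent with the validation row, where macro F1 is lower than both macro P and macro R.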

Per-language test F1 (macro, without O)

lang docs F1
bg 45 0.9055
cs 55 0.9183
da 63 0.9023
de 59 0.8919
el 62 0.9142
en 62 0.8935
es 57 0.8732
et 78 0.9061
fi 65 0.9429
fr 52 0.9170
ga 56 0.8993
hr 58 0.9231
hu 58 0.9111
it 51 0.8951
lt 52 0.9242
lv 52 0.9282
mt 52 0.9182
nl 60 0.9049
pl 69 0.7252
pt 67 0.8945
ro 53 0.8969
sk 56 0.8791
sl 68 0.8885
sv 57 0.8676
ALL 1407 0.7863

Entity classes

ACCOUNT_IDENTIFIER, AUTH_SECRET, BANK_ACCOUNT_IDENTIFIER, BIOMETRIC_DATA, CONTACT_HANDLE, CRIMINAL_OFFENCE_DATA, DATE, DATE_OF_BIRTH, DEVICE_IDENTIFIER, DOCUMENT_IDENTIFIER, EMAIL_ADDRESS, ETHNIC_ORIGIN, FINANCIAL_AMOUNT, GEO_LOCATION, HEALTH_DATA, IDENTIFYING_LINK, IP_ADDRESS, LOCATION, ORGANIZATION_IDENTIFIER, ORGANIZATION_NAME, PAYMENT_CARD, PAYMENT_CARD_SECURITY, PERSON_ATTRIBUTE, PERSON_IDENTIFIER, PERSON_NAME, PERSON_ROLE_OR_TITLE, PHONE_NUMBER, POLITICAL_OPINION, POSTAL_ADDRESS, PROPER_NAME, RELIGION_OR_BELIEF, SEXUAL_ORIENTATION, TRADE_UNION_MEMBERSHIP, VEHICLE_IDENTIFIER
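
Under the BIO scheme each class contributes a B- (begin) and an I- (inside) tag, plus one shared O tag, so 34 classes give 2 × 34 + 1 = 69 labels. The sketch below builds such a label list and groups a tag sequence into spans. The label order is an assumption (the authoritative mapping is the id2label field of the model config), and build_bio_labels and bio_to_spans are hypothetical helper names.

```python
def build_bio_labels(classes):
    """Return ["O", "B-X", "I-X", ...] for the given class names."""
    labels = ["O"]
    for name in classes:
        labels += [f"B-{name}", f"I-{name}"]
    return labels


def bio_to_spans(tags):
    """Group BIO tags into (class, start, end) spans; `end` is exclusive."""
    spans, start, current = [], None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # trailing "O" flushes the last span
        prefix, _, name = tag.partition("-")
        continues = prefix == "I" and name == current
        if current is not None and not continues:
            spans.append((current, start, i))
            current = None
        # Lenient choice: an I- tag with no open span starts a new one.
        if prefix == "B" or (prefix == "I" and current is None):
            start, current = i, name
    return spans


sample = ["PERSON_NAME", "DATE", "EMAIL_ADDRESS"]  # subset of the classes above
print(len(build_bio_labels(sample)))
print(bio_to_spans(["O", "B-PERSON_NAME", "I-PERSON_NAME", "O", "B-DATE"]))
```

The spans index into the tag sequence, so mapping them back to character offsets still requires the tokenizer's offset information.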

Training

  • learning rate 3e-05, batch size 32, 5 epochs, weight decay 0.01
  • early stopping on validation macro-F1 (without O)
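
Early stopping on a validation metric can be sketched as follows. This is an illustration of the general technique, not the training code used for this model: the patience value is an assumption (the card does not state one), EarlyStopper is a hypothetical class, and the scores in the usage example are made up.

```python
class EarlyStopper:
    """Stop when the validation score has not improved for `patience` epochs."""

    def __init__(self, patience=2):  # patience=2 is an assumed value
        self.patience = patience
        self.best = float("-inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def update(self, epoch, score):
        """Record a validation score; return True if training should stop."""
        if score > self.best:
            self.best, self.best_epoch, self.bad_epochs = score, epoch, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation macro-F1 values for up to 5 epochs (illustration only).
fake_scores = [0.70, 0.80, 0.79, 0.78, 0.81]
stopper = EarlyStopper(patience=2)
for epoch, score in enumerate(fake_scores, start=1):
    if stopper.update(epoch, score):
        break
print(epoch, stopper.best_epoch, stopper.best)
```

With these made-up scores the loop stops after epoch 4 and the checkpoint from epoch 2 would be kept.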

ONNX

The onnx/ folder contains model.onnx and a dynamically int8-quantized model_quantized.onnx for CPU inference:

from optimum.onnxruntime import ORTModelForTokenClassification
model = ORTModelForTokenClassification.from_pretrained(
    "bardsai/eu-pii-multi-mini-preview", subfolder="onnx", file_name="model_quantized.onnx"
)
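
A minimal sketch of running the quantized model on one sentence follows. It assumes the tokenizer files sit in the repository root and that torch is installed; tag_text and argmax_ids are hypothetical helper names, and the output is per sub-word token (including special tokens), not merged spans.

```python
def argmax_ids(logits):
    """Return the index of the highest score in each row of a 2-D list."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def tag_text(text, repo="bardsai/eu-pii-multi-mini-preview"):
    """Return (token, label) pairs for `text` using the quantized ONNX model."""
    # Imported here so argmax_ids stays usable without these packages.
    from optimum.onnxruntime import ORTModelForTokenClassification
    from transformers import AutoTokenizer

    # Assumption: the tokenizer is stored in the repository root.
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = ORTModelForTokenClassification.from_pretrained(
        repo, subfolder="onnx", file_name="model_quantized.onnx"
    )
    inputs = tokenizer(text, return_tensors="pt")
    logits = model(**inputs).logits[0].tolist()  # shape: tokens x labels
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    labels = [model.config.id2label[i] for i in argmax_ids(logits)]
    return list(zip(tokens, labels))


if __name__ == "__main__":
    # Downloads the model on first use.
    for token, label in tag_text("Contact Anna Nowak at anna@example.com."):
        print(token, label)
```

To use the unquantized model instead, pass file_name="model.onnx" in the from_pretrained call.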