Alvaro Bartolome
AI & ML interests
machine learning @huggingface (inference + cloud)
Critique Models (CM) on the 🤗 Hub
This collection contains some Critique Models (CM) for LLM evaluation available on the Hugging Face Hub
- openbmb/UltraCM-13b
  Text Generation • Updated • 22 • 20
- prometheus-eval/prometheus-7b-v1.0
  Text Generation • Updated • 31 • 31
- prometheus-eval/prometheus-13b-v1.0
  Text Generation • Updated • 101 • 143
- prometheus-eval/prometheus-7b-v2.0
  Text Generation • 7B • Updated • 33.4k • 100
AIF Datasets (with distilabel)
Small to medium-sized datasets that are synthetically generated, labelled with AI Feedback (AIF), or both
NER in Spanish
Fine-tuned models for NER in Spanish using the SpanMarker framework with different encoders and datasets
- alvarobartt/bert-base-multilingual-cased-ner-spanish
  Token Classification • 0.2B • Updated • 11 • 3
- alvarobartt/span-marker-xlm-roberta-large-conll-2002-es
  Token Classification • Updated • 1 • 2
- alvarobartt/span-marker-roberta-base-bne-conll-2002-es
  Token Classification • Updated • 13 • 1
From zero to GPT-hero
Reading list for fully understanding GPT (and GPT-2) and implementing them from scratch
- Neural Machine Translation of Rare Words with Subword Units
  Paper • 1508.07909 • Published • 4
- Attention Is All You Need
  Paper • 1706.03762 • Published • 115
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- Generating Wikipedia by Summarizing Long Sequences
  Paper • 1801.10198 • Published • 3
Studio Ghibli Diffusion
Text-to-image fine-tunes in the Studio Ghibli style
- FLUX.1 Studio Ghibli LoRA
  Running on Zero • 22
  Generate Studio Ghibli-style images from text prompts
- alvarobartt/ghibli-characters
  Viewer • Updated • 9 • 57 • 9
- black-forest-labs/FLUX.1-dev
  Text-to-Image • Updated • 694k • 12.3k
- alvarobartt/ghibli-characters-flux-lora
  Text-to-Image • Updated • 89 • 64
About ORPO
Information and experiments on fine-tuning LLMs with 🤗 `trl.ORPOTrainer`
- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 72
- HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1
  Text Generation • 141B • Updated • 196 • 269
- alvarobartt/mistral-orpo-mix
  Text Generation • 7B • Updated • 4 • 1
- alvarobartt/Mistral-7B-v0.1-ORPO
  Text Generation • 7B • Updated • 15 • 14
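
For readers new to ORPO, the objective described in the paper above can be sketched in a few lines of plain Python. This is only an illustration of the formula (a supervised loss on the chosen response plus a weighted odds-ratio penalty), not the code inside `trl.ORPOTrainer`. The function names, the `lam` weight and its default value are assumptions made for this sketch, and the inputs are assumed to be length-averaged log-probabilities of each response under the model.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) given log p, for p strictly between 0 and 1."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_logp: float, rejected_logp: float, lam: float = 0.1) -> float:
    """Sketch of the ORPO objective for a single preference pair.

    chosen_logp / rejected_logp: average per-token log-probability of the
    chosen and rejected responses (hypothetical inputs for this example).
    lam: weight of the odds-ratio term (illustrative default).
    """
    # Log of the odds ratio between the chosen and the rejected response.
    log_odds_ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    # Odds-ratio penalty: -log(sigmoid(z)) written as log(1 + exp(-z)).
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    # Supervised term: negative log-likelihood of the chosen response.
    sft_term = -chosen_logp
    return sft_term + lam * odds_ratio_term


if __name__ == "__main__":
    # The penalty shrinks as the chosen response becomes more likely
    # than the rejected one.
    print(orpo_loss(math.log(0.6), math.log(0.2)))
    print(orpo_loss(math.log(0.6), math.log(0.6)))
```

Because the penalty only compares the model's own odds for the two responses, no separate reference model is needed, which is the point made in the paper's title.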
Apple MLX-compatible 7B LLMs on the 🤗 Hub
This collection contains the model weights for 7B LLMs for Apple's MLX framework. Find more information at https://github.com/ml-explore/mlx
🇪🇸 Datasets in Spanish for LLM Evaluation
This collection contains some datasets for LLM evaluation in Spanish, from nlp.uoregon.edu, translated using ChatGPT (including English counterparts)
🔒 Models for PII Protection
A collection with text-classification and token-classification models for PII Protection