Recent Activity

MaziyarPanahi 
posted an update 4 days ago
Announcing: OpenMed Multilingual PII Detection Models

Today I am releasing 105 open-source models for Personally Identifiable Information (PII) detection in French, German, and Italian.

All Apache 2.0 licensed. Free for commercial use. No restrictions.

Performance:

- French: 97.97% F1 (top model)
- German: 97.61% F1 (top model)
- Italian: 97.28% F1 (top model)

All top-10 models per language exceed 96% F1
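For reference, F1 is the harmonic mean of precision and recall. The sketch below computes an entity-level F1 assuming exact-match scoring on (start, end, label) spans. The post does not say which evaluation protocol produced the numbers above, and the spans here are made up for illustration.

```python
def entity_f1(gold, predicted):
    """Exact-match entity-level F1 over sets of (start, end, label) spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical spans, not taken from the OpenMed evaluation data.
gold = [(0, 11, "NAME"), (20, 35, "SSN"), (40, 52, "PHONE")]
pred = [(0, 11, "NAME"), (20, 35, "SSN"), (60, 70, "EMAIL")]
score = entity_f1(gold, pred)  # two of three spans match on each side
```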

Coverage:

- 55+ PII entity types per language
- Native ID formats: NSS (French), Sozialversicherungsnummer (German), Codice Fiscale (Italian)
- Language-specific address, phone, and name patterns

Training Data:

- French: 49,580 samples
- German: 42,250 samples
- Italian: 40,944 samples

Why Multilingual?

European healthcare operates in European languages. Clinical notes, patient records, and medical documents are generated in French, German, Italian, and other languages.

Effective de-identification requires:

- Native language understanding — not translation
- Local ID format recognition — each country has unique patterns
- Cultural context awareness — names, addresses, and formats vary

These models deliver production-ready accuracy without requiring data to leave your infrastructure or be translated out of its original language.

HIPAA & GDPR Compliance
Built for US and European privacy regulations:

- On-premise deployment: Process data locally with zero external dependencies
- Data sovereignty: No API calls, no cloud services, no cross-border transfers
- Air-gapped capable: Deploy in fully isolated environments if required
- Regulatory-grade accuracy: Supporting Expert Determination standards

Together, these features support HIPAA and GDPR compliance across languages, without compliance gaps.
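As a rough illustration of local de-identification, the sketch below replaces detected spans with placeholders using only the standard library. It assumes detections arrive as dictionaries with `entity_group`, `score`, `start`, and `end` keys, which mirrors the shape the Hugging Face `transformers` token-classification pipeline returns when an aggregation strategy is set. The `redact` helper, the labels, and the sample note are hypothetical and not part of OpenMed.

```python
def redact(text, entities, min_score=0.5):
    """Replace each detected span with a [LABEL] placeholder."""
    kept = [e for e in entities if e["score"] >= min_score]
    # Work right to left so earlier character offsets stay valid.
    for ent in sorted(kept, key=lambda e: e["start"], reverse=True):
        text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text


# Hypothetical detections for illustration; labels and scores are made up.
note = "Patient Jean Dupont, tel. 01 23 45 67 89."
entities = [
    {"entity_group": "NAME", "score": 0.99, "start": 8, "end": 19},
    {"entity_group": "PHONE", "score": 0.97, "start": 26, "end": 40},
]
redacted = redact(note, entities)
```

This sketch assumes the spans do not overlap; a real pipeline would need to merge or prioritize overlapping detections before redacting.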

Use Cases
- Hospital EHR systems: Automated patient record de-identification
- Clinical research: Multilingual dataset preparation for studies
- Insurance companies: Claims processing across

https://huggingface.co/collections/OpenMed/multilingual-pii-and-de-identification
·
danielhanchen 
posted an update 4 days ago
We collaborated with Hugging Face to enable you to train MoE models 12× faster with 35% less VRAM via our new Triton kernels (no accuracy loss). 🤗

Train gpt-oss locally on 12.8GB VRAM with our free notebooks: https://unsloth.ai/docs/new/faster-moe
·
MaziyarPanahi 
posted an update 7 days ago
From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output

I ran 6 experiments trying to use Anthropic's SAE steering for JSON generation.

- Base model: 86.8% valid JSON
- Steering only: 24.4%
- Fine-tuned: 96.6%
- FSM constrained: 100%

Steering is for semantics, not syntax.
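A validity rate like the ones above can be measured with a few lines of standard-library Python. This is a minimal sketch that assumes "valid" means "parses with a standard JSON parser"; the post does not specify the exact check used, and the sample strings are invented.

```python
import json


def valid_json_rate(outputs):
    """Fraction of strings that parse as JSON."""
    if not outputs:
        return 0.0
    valid = 0
    for text in outputs:
        try:
            json.loads(text)
            valid += 1
        except json.JSONDecodeError:
            pass  # truncated or malformed output counts as invalid
    return valid / len(outputs)


# Hypothetical model outputs: two parse, two do not.
samples = ['{"name": "Ada"}', '{"name": "Ada"', "[1, 2, 3]", "{'name': 'Ada'}"]
rate = valid_json_rate(samples)
```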

https://huggingface.co/blog/MaziyarPanahi/sae-steering-json
MaziyarPanahi 
posted an update 8 days ago
🚨 Day 8/8: OpenMed Medical Reasoning Dataset Release - THE GRAND FINALE

Today I complete my 8-day release series with Medical-Reasoning-SFT-Mega.
It is the largest open medical reasoning dataset, combining outputs from 7 state-of-the-art AI models with fair-distribution deduplication.

THE 7 SOURCE MODELS (Original Sample Counts):

1. Trinity-Mini: 810,284 samples
2. Qwen3-Next-80B: 604,249 samples
3. GPT-OSS-120B: 506,150 samples
4. Nemotron-Nano-30B: 444,544 samples
5. GLM-4.5-Air: 225,179 samples
6. MiniMax-M2.1: 204,773 samples
7. Baichuan-M3-235B: 124,520 samples

TOTAL BEFORE DEDUPLICATION: 2,919,699 samples

TOKEN COUNTS:
- Content tokens: 2.22 Billion
- Reasoning tokens: 1.56 Billion
- Total tokens: 3.78 Billion
- Samples with chain-of-thought: 100%

Quick Start:
from datasets import load_dataset
ds = load_dataset("OpenMed/Medical-Reasoning-SFT-Mega")


All datasets Apache 2.0 licensed. Free for research and commercial use.

Thank you for following OpenMed's release series. I can't wait to see what you build. 🔥

OpenMed/Medical-Reasoning-SFT-Mega
OpenMed/Medical-Reasoning-SFT-GPT-OSS-120B-V2
OpenMed/Medical-Reasoning-SFT-Trinity-Mini
OpenMed/Medical-Reasoning-SFT-GLM_4.5_Air
OpenMed/Medical-Reasoning-SFT-MiniMax-M2.1
OpenMed/Medical-Reasoning-SFT-Qwen3-Next-80B
OpenMed/Medical-Reasoning-SFT-Nemotron-Nano-30B
https://huggingface.co/datasets/OpenMed/Medical-Reasonin

https://huggingface.co/collections/OpenMed/medical-datasets
·
danielhanchen 
posted an update 9 days ago
We created a tool-calling guide for local LLMs!

Learn how to use any open model like Qwen3-Coder-Next and GLM-4.7-Flash for function calling.

Guide: https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms

We provide hands-on examples for: story writing, Python execution, terminal tool calls, maths and more.
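To give a feel for the mechanics, here is a minimal sketch of the application side of function calling: the model emits a JSON tool call, and your code looks up and runs the matching function. The call format (`name` plus `arguments`), the `TOOLS` registry, and both tools are assumptions for illustration; they are not taken from the Unsloth guide, and real chat templates differ between models.

```python
import json


def add(a, b):
    return a + b


def word_count(text):
    return len(text.split())


# Registry mapping tool names to Python callables (hypothetical tools).
TOOLS = {"add": add, "word_count": word_count}


def run_tool_call(raw_call):
    """Parse a JSON tool call and run the matching function.

    Returns a JSON string so the result can be sent back to the model.
    """
    try:
        call = json.loads(raw_call)
        func = TOOLS[call["name"]]
        result = func(**call.get("arguments", {}))
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Report bad JSON, unknown tools, and wrong arguments to the model.
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
    return json.dumps({"result": result})


reply = run_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

Returning errors as JSON instead of raising lets the model see what went wrong and retry with a corrected call.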
·
danielhanchen 
posted an update 23 days ago
You can now fine-tune embedding models in our free Unsloth notebook! 🤗

Fine-tuning embedding models improves retrieval & RAG by aligning vectors to your domain-specific notion of similarity, improving search, clustering, and recommendations on your data.
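The "notion of similarity" here is usually cosine similarity between embedding vectors. The sketch below ranks documents against a query with plain Python; the three-dimensional vectors are toy stand-ins for real model embeddings, and the document ids are hypothetical. Fine-tuning changes the vectors a model produces, not this ranking step.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query."""
    scores = {doc_id: cosine_similarity(query_vec, vec) for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy 3-dimensional vectors standing in for real model embeddings.
docs = {
    "dosage": [0.9, 0.1, 0.0],
    "billing": [0.1, 0.9, 0.1],
    "side_effects": [0.7, 0.2, 0.3],
}
query = [1.0, 0.0, 0.1]
ranking = rank(query, docs)
```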

⭐ Blog + Notebooks: https://unsloth.ai/docs/new/embedding-finetuning

Unsloth trains embedding models 1.8-3.3x faster with 20% less VRAM, 2x longer context & no accuracy loss vs. FA2 setups.

We'd like to thank Hugging Face and Unsloth contributor electroglyph for making this possible!
·
danielhanchen 
posted an update about 1 month ago
You can now do reinforcement learning training with 7× longer context and no accuracy loss, via our new batching algorithms.

Long reasoning chains in RL are costly, but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU.
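For readers new to GRPO: it scores each sampled completion relative to the other completions drawn for the same prompt, rather than using a separate value model. The sketch below shows that group-relative normalization only. It is not Unsloth's batching algorithm, and the use of the population standard deviation and the `eps` value are assumptions.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four completions of one prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat the group average get positive advantages and the rest get negative ones, so long reasoning chains multiply memory cost because every completion in the group must be processed.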

Blog: https://unsloth.ai/docs/new/grpo-long-context
MaziyarPanahi 
posted an update about 1 month ago
🎉 OpenMed 2025 Year in Review: 6 Months of Open Medical AI

I'm thrilled to share what the OpenMed community has accomplished since our July 2025 launch!

📊 The Numbers

29,700,000 downloads. Thank you! 🙏

- 481 total models (475 medical NER models + 6 fine-tuned LLMs)
- 475 medical NER models in the OpenMed organization
- 6 fine-tuned LLMs in openmed-community
- 551,800 PyPI downloads of the [openmed package](https://pypi.org/project/openmed/)
- 707 followers on Hugging Face (you!)
- 97 GitHub stars on the [toolkit repo](https://github.com/maziyarpanahi/openmed)

🏆 Top Models by Downloads

1. [OpenMed-NER-PharmaDetect-SuperClinical-434M](OpenMed/OpenMed-NER-PharmaDetect-SuperClinical-434M): 147,305 downloads
2. [OpenMed-NER-ChemicalDetect-ElectraMed-33M](OpenMed/OpenMed-NER-ChemicalDetect-ElectraMed-33M): 126,785 downloads
3. [OpenMed-NER-BloodCancerDetect-TinyMed-65M](OpenMed/OpenMed-NER-BloodCancerDetect-TinyMed-65M): 126,465 downloads

🔬 Model Categories

Our 481 models cover comprehensive medical domains:

- Disease Detection (~50 variants)
- Pharmaceutical Detection (~50 variants)
- Oncology Detection (~50 variants)
- Genomics/DNA Detection (~80 variants)
- Chemical Detection (~50 variants)
- Species/Organism Detection (~60 variants)
- Protein Detection (~50 variants)
- Pathology Detection (~50 variants)
- Blood Cancer Detection (~30 variants)
- Anatomy Detection (~40 variants)
- Zero-Shot NER (GLiNER-based)


OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets (2508.01630)
https://huggingface.co/collections/OpenMed/medical-and-clinical-ner
https://huggingface.co/collections/OpenMed/zeroshot-medical-and-clinical-ner
OpenMed/Medical-Reasoning-SFT-GPT-OSS-120B
·
danielhanchen 
posted an update 2 months ago
Mistral's new Ministral 3 models can now be run and fine-tuned locally! (16GB RAM)
The Ministral 3 models have vision support and best-in-class performance for their sizes.
14B Instruct GGUF: unsloth/Ministral-3-14B-Instruct-2512-GGUF
14B Reasoning GGUF: unsloth/Ministral-3-14B-Reasoning-2512-GGUF

🐱 Step-by-step Guide: https://docs.unsloth.ai/new/ministral-3
All GGUF, BnB, FP8, and other variant uploads: https://huggingface.co/collections/unsloth/ministral-3
·