Hugging Face OSS Metrics

Activity Feed

sayakpaul

authored 2 papers 9 days ago

Fine-Grained Perturbation Guidance via Attention Head Selection

Paper • 2506.10978 • Published Jun 12, 2025 • 25

From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors

Paper • 2602.21778 • Published 14 days ago • 14

albertvillanova

posted an update 12 days ago

🚀 TRL v0.29.0 introduces trl-training: an agent-native training skill.

This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)

We’re excited to see what the community builds on top of this.

If you’re working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗

The future of ML tooling is agent-native.
🔗 https://github.com/huggingface/trl/releases/tag/v0.29.0

sayakpaul

submitted a paper to Daily Papers 12 days ago

From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors

Paper • 2602.21778 • Published 14 days ago • 14

victor

submitted a paper to Daily Papers 13 days ago

DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference

Paper • 2602.21548 • Published 14 days ago • 42

sayakpaul

authored a paper 19 days ago

TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models

Paper • 2602.15449 • Published 22 days ago • 6

qgallouedec

posted an update 19 days ago

@CohereLabs just released 🌿 Tiny Aya: a fully open-source 3B parameter model that speaks 70+ languages 🌍! But there’s a catch:

Tiny Aya is just a language model. It doesn’t support tool calling, the key capability that turns frontier models into powerful *agents*.
So the real question is:

How hard is it to turn Tiny Aya into an agent?

Turns out… it’s simple, thanks to Hugging Face TRL.
We’re sharing a hands-on example showing how to train Tiny Aya into a tool-calling agent using TRL, unlocking what could become the first *massively multilingual open agent*.

Small model. Global reach. Agent capabilities.

👉 https://github.com/huggingface/trl/blob/main/examples/notebooks/sft_tool_calling.ipynb

lewtun

submitted a paper to Daily Papers 25 days ago

Single-minus gluon tree amplitudes are nonzero

Paper • 2602.12176 • Published 26 days ago • 8

lewtun

submitted a paper to Daily Papers 26 days ago

Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL

Paper • 2602.03773 • Published Feb 3 • 12

albertvillanova

posted an update 27 days ago

5 years already working on democratizing AI 🤗
Grateful to be part of such an awesome team making it happen every day.

cfahlgren1

submitted a paper to Daily Papers about 1 month ago

How AI Impacts Skill Formation

Paper • 2601.20245 • Published Jan 28 • 9

victor

posted an update about 1 month ago

Interesting article: use Claude Code to help open models write CUDA kernels (for example) by turning Claude Code traces into Skills. They made a library out of it 👀

https://huggingface.co/blog/upskill

yjernite

authored a paper about 1 month ago

INTIMA: A Benchmark for Human-AI Companionship Behavior

Paper • 2508.09998 • Published Aug 4, 2025 • 11

pcuenq

posted an update 2 months ago

👉 What happened in AI in 2025? 👈

We prepared the 2025 version of the HF AI Timeline Grid, highlighting open vs API-based model releases, and allowing you to browse and filter by access, modality, and release type!

Play with it here:
2025-ai-timeline/2025-ai-timeline

Here's my personal quarterly TL;DR:

1️⃣ Q1 — Learning to Reason
DeepSeek not only releases a top-notch reasoning model, but also shows how to train such models and compete with closed frontier models. OpenAI debuts Deep Research.

Significant milestones: DeepSeek R1 & R1-Zero, Qwen 2.5 VL, OpenAI Deep Research, Gemini 2.5 Pro (experimental)

2️⃣ Q2 — Multimodality and Coding
More LLMs embrace multimodality by default, and there's a surge in coding agents. Strong vision, audio, and generative models emerge.

Significant milestones: Llama 4, Qwen 3, Imagen 4, OpenAI Codex, Google Jules, Claude 4

3️⃣ Q3 — "Gold" rush, OpenAI opens up, the community goes bananas
Flagship models get gold in math olympiads and hard benchmarks. OpenAI releases strong open-source models and Google releases the much-anticipated nano-banana for image generation and editing. Agentic workflows become commonplace.

Significant milestones: Gemini and OpenAI IMO Gold, gpt-oss, Gemini 2.5 Flash Image, Grok 4, Claude Sonnet 4.5

4️⃣ Q4 — Mistral returns, leaderboard hill-climbing
Mistral is back with updated model families. All labs release impressive models to wrap up the year!

Significant milestones: Claude Opus 4.5, DeepSeek Math V2, FLUX 2, GPT 5.1, Kimi K2 Thinking, Nano Banana Pro, GLM 4.7, Gemini 3, Mistral 3, MiniMax M2.1 🤯

Credits
🙏 NHLOCAL for the source data https://github.com/NHLOCAL/AiTimeline

🫡 @reach-vb for the original idea, design and recipe

🙌 @ariG23498 and yours truly for compiling and verifying the 2025 edition

🥳 Here's to 2026, wishing it becomes the best year ever for open releases and on-device-first use cases! 🥂

qgallouedec

submitted a paper to Daily Papers 3 months ago

INTELLECT-3: Technical Report

Paper • 2512.16144 • Published Dec 18, 2025 • 20

victor

posted an update 3 months ago

Nvidia is on a roll lately. Nemotron 3 Nano is my new fav local model, but here's the real flex: they published the entire evaluation setup. Configs, prompts, logs, all of it. This is how you do open models 🔥

https://huggingface.co/blog/nvidia/nemotron-3-nano-evaluation-recipe

tomaarsen

posted an update 3 months ago

🐦‍🔥 I've just published Sentence Transformers v5.2.0! It introduces multi-processing for CrossEncoder (rerankers), multilingual NanoBEIR evaluators, similarity score outputs in mine_hard_negatives, Transformers v5 support and more. Details:

- CrossEncoder multi-processing: Similar to SentenceTransformer and SparseEncoder, you can now use multi-processing with CrossEncoder rerankers. Useful for multi-GPU and CPU settings, and simple to configure: just pass device=["cuda:0", "cuda:1"] or device=["cpu"]*4 to the model.predict or model.rank calls.

- Multilingual NanoBEIR Support: You can now use community translations of the tiny NanoBEIR retrieval benchmark instead of only the English one, by passing dataset_id, e.g. dataset_id="lightonai/NanoBEIR-de" for the German benchmark.

- Similarity scores in Hard Negatives Mining: When mining for hard negatives to create a strong training dataset, you can now pass output_scores=True to get similarity scores returned. This can be useful for some distillation losses!

- Transformers v5: This release works with both Transformers v4 and the upcoming v5. In the future, Sentence Transformers will only work with Transformers v5, but not yet!

- Python 3.9 deprecation: Now that Python 3.9 has lost security support, Sentence Transformers no longer supports it.

Check out the full changelog for more details: https://github.com/huggingface/sentence-transformers/releases/tag/v5.2.0

I'm quite excited about what's coming. There's a huge draft PR with a notable refactor in the works that should bring some exciting support. Specifically, better multimodality, rerankers, and perhaps some late interaction in the future!
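
To make the "similarity scores in hard negatives mining" item above more concrete, here is a minimal conceptual sketch in plain Python. It is not the Sentence Transformers implementation: the function name mine_hard_negatives_sketch, the toy 2-D vectors, and the num_negatives parameter are all hypothetical, and only the output_scores flag name is borrowed from the post. The idea it illustrates is that hard negatives are the non-positive documents most similar to the query, and that returning their scores alongside them can be handy for distillation-style losses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mine_hard_negatives_sketch(query_vec, positive_id, corpus,
                               num_negatives=2, output_scores=False):
    """Hypothetical helper: rank non-positive documents by similarity.

    corpus maps a document id to its embedding vector. The most similar
    non-positive documents are treated as the "hard" negatives.
    """
    scored = [
        (doc_id, cosine(query_vec, vec))
        for doc_id, vec in corpus.items()
        if doc_id != positive_id  # never return the known positive
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    top = scored[:num_negatives]
    # Mirror the idea of an output_scores switch: ids only, or (id, score).
    return top if output_scores else [doc_id for doc_id, _ in top]


# Toy embeddings (assumed values, for illustration only).
toy_corpus = {
    "pos": [1.0, 0.0],
    "near": [0.9, 0.1],
    "mid": [0.5, 0.5],
    "far": [0.0, 1.0],
}
negatives = mine_hard_negatives_sketch([1.0, 0.0], "pos", toy_corpus)
negatives_with_scores = mine_hard_negatives_sketch(
    [1.0, 0.0], "pos", toy_corpus, output_scores=True
)
```

In the real library, embeddings would come from a model and the mining utility has many more options; the v5.2.0 changelog linked above is the place to check the actual signature.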