dfuhoiysOHSVFh82934gfjklb

huba-buba

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago

unsloth/Qwen3.5-35B-A3B-GGUF

liked a dataset 4 days ago

togethercomputer/CoderForge-Preview

liked a model 5 days ago

Qwen/Qwen3.5-35B-A3B

Organizations

None yet

upvoted a paper 7 days ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published 19 days ago • 215

upvoted an article 7 days ago

Transformers v5: Simple model definitions powering the AI ecosystem

Article • Dec 1, 2025 • 302

upvoted 2 papers 9 days ago

Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

Paper • 2602.13367 • Published 17 days ago • 31

Small Language Models are the Future of Agentic AI

Paper • 2506.02153 • Published Jun 2, 2025 • 24

upvoted 3 papers 13 days ago

Experiential Reinforcement Learning

Paper • 2602.13949 • Published 15 days ago • 68

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

Paper • 2602.12099 • Published 17 days ago • 57

Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published 19 days ago • 236

upvoted an article 16 days ago

Forge: Scalable Agent RL Framework and Algorithm

Article • 17 days ago • 130

upvoted 2 papers 19 days ago

Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

Paper • 2602.08222 • Published 21 days ago • 272

AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents

Paper • 2602.06855 • Published 23 days ago • 73

upvoted a paper 20 days ago

QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

Paper • 2602.07085 • Published 24 days ago • 186

upvoted a paper 21 days ago

F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare

Paper • 2602.06717 • Published 24 days ago • 71

upvoted 3 papers 22 days ago

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Paper • 2602.04634 • Published 26 days ago • 93

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

Paper • 2602.05885 • Published 24 days ago • 28

Reinforcement World Model Learning for LLM-based Agents

Paper • 2602.05842 • Published 24 days ago • 27

upvoted a paper 25 days ago

No One-Size-Fits-All: Building Systems For Translation to Bashkir, Kazakh, Kyrgyz, Tatar and Chuvash Using Synthetic And Original Data

Paper • 2602.04442 • Published 26 days ago • 3

upvoted an article 27 days ago

🐯 Liger GRPO meets TRL

Article • May 25, 2025

upvoted 3 papers 27 days ago
