wei
zhuww
AI & ML interests
None yet
Organizations
None yet
RL
- Large Reasoning Models Learn Better Alignment from Flawed Thinking
  Paper • 2510.00938 • Published • 59
- What Characterizes Effective Reasoning? Revisiting Length, Review, and Structure of CoT
  Paper • 2509.19284 • Published • 23
- Learning to Reason as Action Abstractions with Scalable Mid-Training RL
  Paper • 2509.25810 • Published • 6
- Agent Learning via Early Experience
  Paper • 2510.08558 • Published • 273
multi-turn
models 0
None public yet
datasets 0
None public yet