Training Data Efficiency in Multimodal Process Reward Models Paper • 2602.04145 • Published 1 day ago • 56
Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing Paper • 2602.03845 • Published 1 day ago • 23
RelayLLM: Efficient Reasoning via Collaborative Decoding Paper • 2601.05167 • Published 28 days ago • 29
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning Paper • 2512.15687 • Published Dec 17, 2025 • 20
MotionEdit: Benchmarking and Learning Motion-Centric Image Editing Paper • 2512.10284 • Published Dec 11, 2025 • 26