GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 3 days ago • 130
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization Paper • 2601.01554 • Published 7 days ago • 50
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation Paper • 2512.23576 • Published 13 days ago • 64
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization Paper • 2511.15705 • Published Nov 19, 2025 • 93
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning Paper • 2507.16812 • Published Jul 22, 2025 • 63
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI Paper • 2406.12753 • Published Jun 18, 2024 • 16