Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation Paper • 2605.18740 • Published May 18 • 5
Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts Paper • 2606.05922 • Published 14 days ago • 52
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models Paper • 2606.11025 • Published 10 days ago • 41
SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning Paper • 2606.10804 • Published 10 days ago • 43
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published Apr 13 • 102
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published Apr 14 • 110
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published Mar 26 • 155
CodeGoat24/UniGenBench-EvalModel-qwen-72b-v1 Image-Text-to-Text • 73B • Updated Oct 25, 2025 • 154 • 4
Evaluating and Steering Modality Preferences in Multimodal Large Language Model Paper • 2505.20977 • Published May 27, 2025 • 10
view article Article NEO-unify: Building Native Multimodal Unified Models End to End sensenova • Mar 5 • 165
CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation Paper • 2603.08652 • Published Mar 9 • 41
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning Paper • 2505.03318 • Published May 6, 2025 • 94
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published Mar 7, 2025 • 124
Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders Paper • 2601.10332 • Published Jan 15 • 32
Enhancing Spatial Understanding in Image Generation via Reward Modeling Paper • 2602.24233 • Published Feb 27 • 60