arxiv:2606.07326

AnchorWorld: Embodied Egocentric World Simulation with View-based Evolution Customization

Published on Jun 5

Submitted by yu li on Jun 8

Kling Team



Abstract

AnchorWorld advances egocentric simulation through enhanced interaction integrity and flexible world customization using 3D human motion and anchor view definitions.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

Despite being a pivotal frontier, interactive world modeling remains underexplored in terms of the versatile controllability required by practical scenarios. To bridge this gap, we present AnchorWorld, a framework that advances egocentric simulation through enhanced interaction integrity and a flexible mechanism for world customization. First, we utilize 3D human motion as the primary interaction modality. To compensate for body parts that are out of view or truncated in egocentric views, we introduce an auxiliary training supervision that incorporates exogenous viewpoints decoupled from the agent's first-person sensorium. This supervision allows the model to observe the agent's full-body positioning relative to the environment, facilitating a more robust spatial grounding of human-world interactions. Furthermore, we propose a simple yet effective mechanism for customizing self-evolving worlds. This is achieved by defining anchor views within a unified world coordinate system, coupled with textual descriptions dictating the dynamic evolution of local scenes. Experimental results show that AnchorWorld significantly outperforms state-of-the-art baselines, while ablation studies validate the effectiveness of our key designs. Notably, our customization scheme exhibits promising spatio-temporal geometric consistency and adheres strictly to the prescribed evolutionary dynamics.


Community

lyabc

Paper submitter 1 day ago


We propose AnchorWorld, a framework that combines embodied egocentric action control with world customization. AnchorWorld enables human-motion-driven exploration and interaction within customizable, self-evolving worlds.