GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning Paper • 2606.17480 • Published 9 days ago • 3
PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models Paper • 2606.19534 • Published 8 days ago • 58
MaineCoon: Pursuing A Real-Time Audio-Visual Social World Model Paper • 2606.17800 • Published 9 days ago • 13
DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis Paper • 2604.13416 • Published 7 days ago • 31
HumanScale: Egocentric Human Video Can Outperform Real-Robot Data for Embodied Pretraining Paper • 2606.20521 • Published 7 days ago • 10
DragMesh-2: Physically Plausible Dexterous Hand-Object Interaction with Articulated Objects Paper • 2606.15133 • Published 12 days ago • 72
JanusMesh: Fast and Zero-Shot 3D Visual Illusion Generation via Cross-Space Denoising Paper • 2606.20563 • Published 7 days ago • 20
S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence Paper • 2606.20515 • Published 7 days ago • 39
Adaptive Volumetric Mechanical Property Fields Invariant to Resolution Paper • 2606.18231 • Published 9 days ago • 5
Holo-World: Unified Camera, Object and Weather Control for Video World Model Paper • 2606.20083 • Published 7 days ago • 9
ENPIRE: Agentic Robot Policy Self-Improvement in the Real World Paper • 2606.19980 • Published 7 days ago • 14
From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning Paper • 2606.17682 • Published 9 days ago • 26
MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction Paper • 2606.18558 • Published 8 days ago • 50
ViT-Up: Faithful Feature Upsampling for Vision Transformers Paper • 2606.14024 • Published 13 days ago • 9
PAIWorld: A 3D-Consistent World Foundation Model for Robotic Manipulation Paper • 2606.18375 • Published 9 days ago • 11
Reinforcing Dual-Path Reasoning in Spatial Vision Language Models Paper • 2606.17539 • Published 9 days ago • 15
Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification Paper • 2606.18249 • Published 9 days ago • 14