ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video
Abstract
ReconPhys is a feedforward framework that jointly learns physical attribute estimation and 3D Gaussian Splatting reconstruction from monocular video, achieving faster inference and better reconstruction quality than existing optimization-based methods.
Reconstructing non-rigid objects with physical plausibility remains a significant challenge. Existing approaches leverage differentiable rendering for per-scene optimization; they recover geometry and dynamics but require expensive tuning or manual annotation, which limits their practicality and generalizability. To address this, we propose ReconPhys, the first feedforward framework that jointly learns physical attribute estimation and 3D Gaussian Splatting reconstruction from a single monocular video. Our method employs a dual-branch architecture trained via a self-supervised strategy, eliminating the need for ground-truth physics labels. Given a video sequence, ReconPhys simultaneously infers geometry, appearance, and physical attributes. Experiments on a large-scale synthetic dataset demonstrate superior performance: our method achieves 21.64 PSNR in future prediction, compared with 13.27 for state-of-the-art optimization baselines, while reducing Chamfer Distance from 0.349 to 0.004. Crucially, ReconPhys enables fast inference (under 1 second) versus the hours required by existing methods, facilitating rapid generation of simulation-ready assets for robotics and graphics.
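The abstract reports two metrics, PSNR for future-frame prediction and Chamfer Distance for geometry. The sketch below shows how such metrics are commonly computed with NumPy. It is an illustration under assumptions, not the authors' evaluation code: the function names, the [0, 1] image range, and the choice of mean squared nearest-neighbor distance for Chamfer Distance are all assumptions, since the abstract does not state the exact conventions used.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB.

    Assumes both images hold values in [0, max_val] (an assumption;
    the paper's image range is not stated in the abstract).
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Uses the mean squared nearest-neighbor distance in each direction,
    summed. This is one common convention; others use unsquared
    distances or average the two directions instead of summing.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Pairwise squared distances via broadcasting, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Nearest neighbor in b for each point of a, and vice versa.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

The brute-force pairwise matrix costs O(N·M) memory, which is fine for a few thousand points; for larger point clouds a spatial index such as a k-d tree would be the usual replacement.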
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- PhysVideo: Physically Plausible Video Generation with Cross-View Geometry Guidance (2026)
- MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction (2026)
- UFO-4D: Unposed Feedforward 4D Reconstruction from Two Images (2026)
- ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors (2026)
- UniQueR: Unified Query-based Feedforward 3D Reconstruction (2026)
- ReLi3D: Relightable Multi-view 3D Reconstruction with Disentangled Illumination (2026)
- SimRecon: SimReady Compositional Scene Reconstruction from Real Videos (2026)
Get this paper in your agent:
hf papers read 2604.07882
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
Models citing this paper: 1