Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning • Paper • 2512.20848 • Published 12 days ago • 30
Omni-Weather: Unified Multimodal Foundation Model for Weather Generation and Understanding • Paper • 2512.21643 • Published 11 days ago • 11
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation • Paper • 2512.23705 • Published 7 days ago • 43
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding • Paper • 2512.19693 • Published 14 days ago • 61
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers • Paper • 2512.17351 • Published 17 days ago • 24
Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation • Paper • 2512.16767 • Published 18 days ago • 4
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition • Paper • 2512.15603 • Published 19 days ago • 59
EgoX: Egocentric Video Generation from a Single Exocentric Video • Paper • 2512.08269 • Published 27 days ago • 116
Multi-view Pyramid Transformer: Look Coarser to See Broader • Paper • 2512.07806 • Published 28 days ago • 20
MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos • Paper • 2512.10881 • Published 25 days ago • 29
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation • Paper • 2512.09363 • Published 26 days ago • 71
SIMA 2: A Generalist Embodied Agent for Virtual Worlds • Paper • 2512.04797 • Published Dec 4, 2025 • 24
RELIC: Interactive Video World Model with Long-Horizon Memory • Paper • 2512.04040 • Published Dec 3, 2025 • 23
Deep Unsupervised Learning using Nonequilibrium Thermodynamics • Paper • 1503.03585 • Published Mar 12, 2015 • 6