A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers Paper • 2512.23380 • Published 7 days ago • 41
UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement Paper • 2512.21185 • Published 12 days ago • 25
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents Paper • 2512.22322 • Published 9 days ago • 37
Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion Paper • 2512.23709 • Published 6 days ago • 44
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published 5 days ago • 52
SpotEdit: Selective Region Editing in Diffusion Transformers Paper • 2512.22323 • Published 9 days ago • 37
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone Paper • 2512.22615 • Published 8 days ago • 43
Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published 9 days ago • 56
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published 5 days ago • 104
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation Paper • 2512.23576 • Published 6 days ago • 63
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 7 days ago • 89
InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search Paper • 2512.18745 • Published 15 days ago • 11
Omni-Weather: Unified Multimodal Foundation Model for Weather Generation and Understanding Paper • 2512.21643 • Published 11 days ago • 10
ProEdit: Inversion-based Editing From Prompts Done Right Paper • 2512.22118 • Published 9 days ago • 17
UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture Paper • 2512.21675 • Published 11 days ago • 24
MAI-UI Technical Report: Real-World Centric Foundation GUI Agents Paper • 2512.22047 • Published 9 days ago • 26
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published 17 days ago • 95
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published 12 days ago • 60