InternVL3.5 Collection This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated Sep 28, 2025 • 104
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated Oct 30, 2025 • 77
🛰️🌍 Geospatial Datasets Collection A curated collection of diverse geospatial and satellite imagery datasets. • 58 items • Updated Nov 18, 2025 • 27
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces Paper • 2506.00123 • Published May 30, 2025 • 35
D-FINE Collection State-of-the-art real-time object detection model with Apache 2.0 licence • 15 items • Updated May 5, 2025 • 56
Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality +2 Mar 4, 2025 • 78
Visual Document Retrieval Collection A collection of models, datasets, and spaces in the VDR series • 5 items • Updated Jan 10, 2025 • 8
Article 🚀 Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker! Jan 29, 2025 • 21
AIMv2 Collection A collection of AIMv2 vision encoders that support a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Aug 25, 2025 • 82
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 13 days ago • 309