Cihang Xie

cihangxie

https://cihangxie.github.io/

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards

Paper • 2511.07403 • Published Nov 10, 2025 • 14

upvoted a collection about 2 months ago

SpatialThinker

Collection

This collection consists of SpatialThinker 3B and 7B model checkpoints, and STVQA-7K, a Spatial VQA dataset used for training the models. • 4 items • Updated Nov 12, 2025 • 1

upvoted 2 papers 2 months ago

When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought

Paper • 2511.02779 • Published Nov 4, 2025 • 58

LightBagel: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation

Paper • 2510.22946 • Published Oct 27, 2025 • 16

upvoted a paper 4 months ago

OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

Paper • 2509.01644 • Published Sep 1, 2025 • 33

upvoted a paper 5 months ago

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Paper • 2507.21033 • Published Jul 28, 2025 • 21

upvoted a paper 8 months ago

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7, 2025 • 29

upvoted 3 papers 9 months ago

Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark

Paper • 2504.13143 • Published Apr 17, 2025 • 7

SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models

Paper • 2504.11468 • Published Apr 10, 2025 • 30

ViLBench: A Suite for Vision-Language Process Reward Modeling

Paper • 2503.20271 • Published Mar 26, 2025 • 7

upvoted a paper 11 months ago

Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More

Paper • 2502.03738 • Published Feb 6, 2025 • 11

upvoted a paper about 1 year ago

Story-Adapter: A Training-free Iterative Framework for Long Story Visualization

Paper • 2410.06244 • Published Oct 8, 2024 • 19

upvoted 3 papers over 1 year ago

A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?

Paper • 2409.15277 • Published Sep 23, 2024 • 38

VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models

Paper • 2406.16338 • Published Jun 24, 2024 • 26

HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

Paper • 2404.09990 • Published Apr 15, 2024 • 13
