Hi there, this is Xingrui! 👋

I am a PhD student in the Computer Science Department at Johns Hopkins University, advised by Prof. Alan Yuille. I am also currently interning with the GenAI group at AMD.

Before JHU, I obtained my Master’s degree from the University of Southern California, where I worked closely with Prof. Laurent Itti. Prior to USC, I received my B.S. in Statistics from Renmin University of China, where I was supervised by Prof. Hanfang Yang. I also spent time in the AI lab at the Samsung R&D Institute, supervised by Dr. Yang Liu.

My research focuses on 3D scene understanding and spatial reasoning, especially estimating object pose and developing vision-language models for 3D/4D reasoning. I also explore generative models for video and 3D object synthesis.

✉️: xwang378@jhu.edu

JHU | USC | PKU | RUC | Samsung

📣 News

Invited talk at BEAM Workshop at CVPR 2025 (slides). (Jun 2025)
Spatial457 accepted at CVPR 2025 as a highlight paper! (Feb 2025)
One paper accepted at ICLR 2025. (Jan 2025)
Introduced NS-4DPhysics and DynSuperCLEVR: a 4D neural symbolic model and benchmark for dynamic spatial reasoning. (May 2024)


One paper accepted at NeurIPS 2023. (Sep 2023)
Joined CCVL at JHU as a PhD student. (Aug 2023)
SuperCLEVR selected as a highlight at CVPR 2023. (Mar 2023)
One paper accepted at CVPR 2023. (Feb 2023)
One paper accepted at ECCV 2022.

Selected Publications

(Full list available on Google Scholar)

KeyVID

KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation

Xingrui Wang, Jiang Liu, Ze Wang, Xiaodong Yu, Jialian Wu, Ximeng Sun, Yusheng Su, Alan Yuille, Zicheng Liu, Emad Barsoum

arXiv 2025

Paper | Code | Project

Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models

Xingrui Wang, Wufei Ma, Tiezheng Zhang, Celso M de Melo, Jieneng Chen, Alan Yuille

CVPR 2025 Highlight

Paper | Code | Project | Slides | Huggingface

DynSuperCLEVR

Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering

Xingrui Wang, Wufei Ma, Angtian Wang, Shuo Chen, Adam Kortylewski, Alan Yuille

ICLR 2025

Paper | Code | Project | Data

PO3D-VQA

3D-Aware Visual Question Answering about Parts, Poses and Occlusions

Xingrui Wang, Wufei Ma, Zhuowan Li, Adam Kortylewski, Alan Yuille

NeurIPS 2023

Paper | Code | Data | Poster

SuperCLEVR

Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning

Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, Alan Yuille

CVPR 2023 Highlight

Paper | Code | Project

HVE

Contributions of Shape, Texture, and Color in Visual Recognition

Yunhao Ge, Yao Xiao, Zhi Xu, Xingrui Wang, Laurent Itti

ECCV 2022

Service

I am an invited reviewer for CVPR, ICCV, NeurIPS, ICLR, and WACV.
I am co-hosting the AdvML Workshop at CVPR 2025.