Hi there, this is Xingrui! 👋
I am a PhD student in the Computer Science Department at Johns Hopkins University, advised by Prof. Alan Yuille. I am currently also interning with the GenAI group at AMD.
Before JHU, I obtained my Master’s degree from the University of Southern California, where I worked closely with Prof. Laurent Itti. Prior to USC, I received my B.S. in Statistics from Renmin University of China, where I was supervised by Prof. Hanfang Yang. I also spent time in the AI lab at the Samsung R&D Institute, supervised by Dr. Yang Liu.
✉️: xwang378@jhu.edu
Research Interests
My main research interest is 3D scene understanding, particularly estimating objects’ 3D location and orientation using 3D generative models. Another key direction is 3D spatial reasoning for vision-language models (VLMs), where I develop VLMs and benchmarks for 3D and 4D reasoning tasks.
I am also exploring generative models, including video generation and 3D object generation.
News 📣
- [Feb 2025] PulseCheck457 accepted at CVPR 2025.
- [Jan 2025] One paper accepted at ICLR 2025.
- [Jan 2025] Served as a reviewer for CVPR 2025.
- [Dec 2024] Introduced PulseCheck457, a virtual benchmark for evaluating the 6D spatial reasoning ability of VLLMs.
- [Jul 2024] Joined AMD as a summer intern.
- [May 2024] Introduced NS-4DPhysics, a 4D neural-symbolic model for dynamic 3D spatial reasoning, and DynSuperCLEVR, a video question-answering benchmark.
- [Jan 2023] The Super-CLEVR-3D dataset is now available for download!
- [Sep 2023] One paper accepted at NeurIPS 2023.
- [Aug 2023] Joined CCVL at JHU as a PhD student.
- [Mar 2023] The Super-CLEVR paper was selected as a highlight at CVPR 2023 (2.5% of submissions).
- [Feb 2023] One paper accepted at CVPR 2023.
Selected Publications
- PulseCheck457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models
  Xingrui Wang, Wufei Ma, Tiezheng Zhang, Celso M de Melo, Jieneng Chen, Alan Yuille.
  CVPR 2025. (arxiv)
- Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
  Xingrui Wang, Wufei Ma, Angtian Wang, Shuo Chen, Adam Kortylewski, Alan Yuille.
  ICLR 2025. (paper / data / code)
- 3D-Aware Visual Question Answering about Parts, Poses and Occlusions
  Xingrui Wang, Wufei Ma, Zhuowan Li, Adam Kortylewski, Alan Yuille.
  NeurIPS 2023. (paper / data / code / poster)
- Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
  Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, Alan Yuille.
  CVPR 2023. (project / paper / code)
  Selected as highlight (top 2.5%)
- Contributions of Shape, Texture, and Color in Visual Recognition
  Yunhao Ge*, Yao Xiao*, Zhi Xu, Xingrui Wang, Laurent Itti.
  ECCV 2022. (paper / code)