Publications
2025
- KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
Xingrui Wang, Jiang Liu, Ze Wang, Xiaodong Yu, Jialian Wu, Ximeng Sun, Yusheng Su, Alan Yuille, Zicheng Liu, Emad Barsoum.
arXiv 2025. (paper, code, project)
- Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models
Xingrui Wang, Wufei Ma, Tiezheng Zhang, Celso M de Melo, Jieneng Chen, Alan Yuille.
CVPR 2025 Highlight. (project, paper, code, huggingface, slides)
2024
- Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
Xingrui Wang, Wufei Ma, Angtian Wang, Shuo Chen, Adam Kortylewski, Alan Yuille.
ICLR 2025. (project, paper, data, code)
2023
- 3D-Aware Visual Question Answering about Parts, Poses and Occlusions
Xingrui Wang, Wufei Ma, Zhuowan Li, Adam Kortylewski, Alan Yuille.
NeurIPS 2023. (paper, code)
- Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, Alan Yuille.
CVPR 2023. (project, paper, code) Selected as highlight (top 2.5%)
2022
- Contributions of Shape, Texture, and Color in Visual Recognition
Yunhao Ge*, Yao Xiao*, Zhi Xu, Xingrui Wang, Laurent Itti.
ECCV 2022. (paper, code)
- Exploring Coarse-grained Pre-guided Attention to Assist Fine-grained Attention Reinforcement Learning Agents
Haoyu Liu, Yang Liu, Xingrui Wang, Hanfang Yang.
WCCI 2022. (paper, pdf)
2021
- Large scale GPS trajectory generation using map based on two stage GAN
Xingrui Wang, Xinyu Liu, Ziteng Lu, Hanfang Yang.
Journal of Data Science. (paper)
2020
- Technical solution discussion for key challenges of operational convolutional neural network-based building-damage assessment from satellite imagery: Perspective from benchmark xBD dataset
Su, J., Bai, Y., Wang, X., Lu, D., Zhao, B., Yang, H., … & Koshimura, S.
Remote Sensing. (paper)