Sheng Zhou (周晟)

[Email] [GitHub] [Google Scholar]

About Me

  • I will soon join King Abdullah University of Science and Technology (KAUST) as a Postdoctoral Fellow.
  • I obtained my Ph.D. degree in Computer Science at Hefei University of Technology (HFUT) through a direct M.S.–Ph.D. program from Sep. 2020 to Dec. 2025, advised by Prof. Dan Guo and Prof. Meng Wang.
  • From Aug. 2023 to Aug. 2024, I studied at the University of Science and Technology of China (USTC) under the guidance of Prof. Xun Yang.
  • From Sep. 2024 to Aug. 2025, I was a visiting student at the National University of Singapore (NUS) under the guidance of Dr. Junbin Xiao, Prof. Angela Yao, and Prof. Tat-Seng Chua.

    I am actively seeking research discussions and collaboration opportunities, so feel free to contact me!

    My group at KAUST is actively recruiting visiting/remote students, with several openings available until July 2026. Students interested in multimodal large language models (MLLMs) for healthcare are welcome to contact me!

    Research

    My research focuses on visual-language understanding and reasoning, primarily on scene-text visual question answering and visual grounding. I am currently expanding my research scope to egocentric video understanding and multimodal large language models for human assistance.

    🔥News

  • 2026.02: One paper is accepted by CVPR. 🎉
  • 2025.11: Successfully defended my Ph.D. 🎓 Thesis: Research on Scene Text-Driven Visual Question Answering.
  • 2025.05: Our work EgoTextVQA will be presented at the Egocentric Vision (EgoVis) Workshop and Vision-based Assistants in the Real-World (VAR) Workshop @ CVPR 2025! 😄
  • 2025.05: One paper is accepted by IEEE TMM. 🎉
  • 2025.02: One paper is accepted by CVPR. 🎉
  • 2025.02: I was awarded the Tat-Seng Chua Scholarship.
  • 2024.09: I will be a visiting student at NUS for one year, collaborating with Dr. Junbin Xiao.
  • 2024.01: One paper is accepted by ACM TOMM. 🎉
  • 2023.09: One paper is accepted by IEEE TIP. 🎉
  • 2023.07: Started studying at USTC under the supervision of Prof. Xun Yang.
    Publications and Preprints

  • († Corresponding Author, # Core Contributor)
  • RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark
    Yang Shi#, Yuhao Dong#, Yue Ding#, Yuran Wang#, Xuanyu Zhu#, Sheng Zhou#, Wenting Liu#, Haochen Tian#, Rundong Wang#, Huanqian Wang, Zuyan Liu, Bohan Zeng, Ruizhe Chen, Qixun Wang, Zhuoran Zhang, Xinlong Chen, Chengzhuo Tong, Bozhou Li, Qiang Liu, Haotian Wang†, Wenjing Yang, Yuanxing Zhang†, Pengfei Wan, YiFan Zhang†, Ziwei Liu†.
    CVPR'26 [arXiv] [Code] [Dataset]
    EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering
    Sheng Zhou, Junbin Xiao†, Qingyun Li, Yicong Li, Xun Yang, Dan Guo, Meng Wang, Tat-Seng Chua, Angela Yao.
    CVPR'25 [arXiv] [Project Page] [Code] [Dataset]
    Scene-Text Grounding for Text-Based Video Question Answering
    Sheng Zhou, Junbin Xiao†, Xun Yang†, Peipei Song, Dan Guo†, Angela Yao, Meng Wang, Tat-Seng Chua.
    IEEE TMM'25 [arXiv] [Code] [Dataset]
    Graph Pooling Inference Network for Text-based VQA
    Sheng Zhou, Dan Guo†, Xun Yang†, Jianfeng Dong, Meng Wang†.
    ACM TOMM'24 [Paper] [Code]
    Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA
    Sheng Zhou, Dan Guo†, Jia Li, Xun Yang†, Meng Wang†.
    IEEE TIP'23 [Paper] [Code]
    Selected Honors and Awards

  • [2025.02] Tat-Seng Chua Scholarship
  • [2022 - 2025] First Class Academic Scholarship (three times)
  • [2020 - 2022] Second Class Academic Scholarship (two times)
  • [2020] Outstanding Graduate of Innovation and Entrepreneurship in Hunan Province
  • [2016 - 2019] National Encouragement Scholarship (three times)
    Services

  • Conference reviewer: CVPR (2026), ACM MM (2025, 2026), ECCV (2026), IJCNN (2025, 2026)
  • Journal reviewer: IEEE TIP, ACM TOMM, Information Fusion, Neurocomputing, ...