Weijia Wu

Weijia Wu is a Researcher at Huawei Singapore Research Center, working with Prof. Hanwang Zhang. Before that, he was a Research Fellow at Show Lab, National University of Singapore, working with Prof. Mike Z. Shou. He received his PhD from Zhejiang University. He is also a popper and has been dancing popping for five years. In his spare time, he enjoys playing basketball and swimming.

Email  /  Google Scholar  /  GitHub  /  Twitter  /  Linkedin

Research Interests

My current research focuses on AI for camera-based video production, including video generation and editing, controllable video generation, and long video generation.

Note: We are hiring research interns at Huawei Singapore Research Center. If you are interested in video generation, controllable or long video generation, unified multimodal models, or world models, feel free to contact me by email (weijiawu96@gmail.com).

Experience

  • [January 2021 - January 2022]: Research Intern at MMU, Kuaishou, led by Debing Zhang
  • [January 2022 - August 2023]: Research Intern at MMU, Kuaishou, led by Jiahong Li
  • [August 2022 - August 2023]: Visiting PhD student at Show Lab, NUS, led by Asst. Prof. Mike Shou
  • [Dec. 2023 - Apr. 2025]: Research Fellow at Show Lab, NUS, working with Asst. Prof. Mike Shou
  • [Apr. 2025 - Present]: Researcher at Huawei Singapore Research Center, working with Prof. Hanwang Zhang

Selected Publications (* Equal, # Corresponding)

    WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation

    Wei Chow, Jiachun Pan, Yongyuan Liang, Mingze Zhou, Xue Song, Liyu Jia, Saining Zhang, Siliang Tang, Juncheng Li, Fengda Zhang#, Weijia Wu#, Hanwang Zhang, Tat-Seng Chua.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2026)

    Abstract / arXiv / BibTex / GitHub / Huggingface / Project Page / Twitter(X)

    Paragraph-to-Image Generation with Information-Enriched Diffusion Model

    Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Zhang Di.
    International Journal of Computer Vision (IJCV)

    Abstract / arXiv / BibTex / GitHub / Project Page / Twitter(X)

    MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation

    Weijia Wu, Mingyu Liu, Zeyu Zhu, Xi Xia, Haoen Feng, Wen Wang, Kevin Qinghong Lin, Chunhua Shen, Mike Zheng Shou.
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2025)

    Abstract / arXiv / BibTex / GitHub / Project Page / Twitter(X)

    DragAnything: Motion Control for Anything using Entity Representation

    Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang.
    The 18th European Conference on Computer Vision (ECCV 2024)

    Abstract / arXiv / BibTex / GitHub / Project Page / Twitter(X)

    EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models

    Yefei He, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang.
    The Twelfth International Conference on Learning Representations (ICLR 2024, Spotlight)

    Abstract / arXiv / BibTex

    PTQD: Accurate Post-Training Quantization for Diffusion Models†

    Yefei He, Luping Liu, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang.
    Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

    Abstract / arXiv / BibTex

    DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

    Weijia Wu, Yuzhong Zhao, Hao Chen, Yuchao Gu, Rui Zhao, Yefei He, Hong Zhou, Mike Zheng Shou, Chunhua Shen.
    Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

    Abstract / arXiv / BibTex / GitHub / Project Page / WeChat article (in Chinese)

    BiViT: Extremely Compressed Binary Vision Transformers

    Yefei He, Zhenyu Lou, Luoming Zhang, Weijia Wu, Bohan Zhuang, Hong Zhou.
    International Conference on Computer Vision (ICCV 2023)

    Abstract / arXiv / BibTex

    Generative Prompt Model for Weakly Supervised Object Localization

    Yuzhong Zhao, Qixiang Ye, Weijia Wu, Chunhua Shen, Fan Wan.
    International Conference on Computer Vision (ICCV 2023)

    Abstract / arXiv / BibTex / GitHub

    DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models

    Weijia Wu, Yuzhong Zhao, Mike Zheng Shou, Hong Zhou, Chunhua Shen.
    International Conference on Computer Vision (ICCV 2023)

    Abstract / arXiv / BibTex / GitHub / Project Page

    End-to-End Video Text Spotting with Transformer

    Weijia Wu, Yuanqiang Cai, Chunhua Shen, Debing Zhang, Ying Fu, Ping Luo, Hong Zhou.
    International Journal of Computer Vision (IJCV)

    Abstract / arXiv / IJCV / BibTex / GitHub / Youtube Demo / ZhiHu

    Real-time End-to-End Video Text Spotter with Contrastive Representation Learning

    Weijia Wu, Zhuang Li, Jiahong Li, Chunhua Shen, Size Li, Zhongyuan Wang, Ping Luo, Hong Zhou.
    arXiv preprint

    Abstract / arXiv / BibTex / Code / Youtube Demo

    Polygon-free: Unconstrained Scene Text Detection with Box Annotations

    Weijia Wu, Enze Xie, Ruimao Zhang, Wenhai Wang, Ping Luo, Hong Zhou.
    IEEE International Conference on Image Processing (ICIP 2022)

    Abstract / arXiv / BibTex / Code

    A Bilingual, Open World Video Text Dataset and End-to-end Video Text Spotter with Transformer

    Weijia Wu, Yuanqiang Cai, Debing Zhang, Jiahong Li, Hong Zhou.
    Neural Information Processing Systems, Track on Datasets and Benchmarks (NeurIPS 2021)

    Abstract / arXiv / Homepage / BibTex / GitHub / Demo / WeChat article (in Chinese) / Community write-up on Zhihu

    Synthetic-to-Real Unsupervised Domain Adaptation for Scene Text Detection in the Wild

    Weijia Wu, Ning Lu, Enze Xie, Hong Zhou.
    Proceedings of the Asian Conference on Computer Vision (ACCV), 2020.

    Abstract / arXiv / BibTex

Academic Service

Conference Review

  • International Conference on Machine Learning (ICML), 2022, 2023, 2024, 2025.
  • Neural Information Processing Systems (NeurIPS), 2021, 2022, 2023, 2024.
  • NeurIPS Datasets and Benchmarks Track, 2021, 2022, 2023, 2024, 2025.
  • Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
  • International Conference on Learning Representations (ICLR), 2023, 2024, 2025.
  • AAAI Conference on Artificial Intelligence (AAAI), 2025.
  • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, 2025.
  • European Conference on Computer Vision (ECCV), 2024.
  • International Conference on Computer Vision (ICCV), 2025.
  • International Conference and Exhibition on Computer Graphics and Interactive Techniques (SIGGRAPH), 2025.
  • ACM International Conference on Multimedia (ACM MM), 2025.
  • CVPR Workshop SyntaGen, 2024, 2025.
  • CVPR Workshop CVEU, 2025.

Journal Review

  • IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
  • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT).
  • IEEE Transactions on Multimedia (TMM).
  • International Journal of Computer Vision (IJCV).
  • ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM).
  • Engineering Applications of Artificial Intelligence (EAAI).

Awards

  • [10/2020] National Scholarship of China.
  • [03/2025] Excellent Doctoral Dissertation Award of Zhejiang University.

Talks

  • [10/2021] MMU, Kuaishou: Video Text Spotting.
  • [08/2025] ICCV 2025 Workshop on Generative AI for Storytelling.


Template credits: Dr. Jon Barron