Yaohui Wang 王耀晖

Ph.D. Inria


I am a Research Scientist at Shanghai Artificial Intelligence Laboratory, where I work on deep generative models for general and human-centric video generation. My work includes Latte, LaVie, SEINE, AnimateDiff, LIA and LEO.

I obtained my PhD from the STARS team at Inria, where I focused on developing learning methods for video generation, advised by Antitza Dantcheva and Francois Bremond. Before that, I completed the Master's program in Artificial Intelligence at Université Paris-Saclay, directed by Isabelle Guyon and Michèle Sebag.

I am looking for research interns on deep generative models for videos/3D/images. Feel free to contact me if you are interested.

News

10 / 2024
1 paper LaVie accepted to IJCV.
10 / 2024
1 paper 4Diffusion accepted to NeurIPS 2024.
08 / 2024
1 paper LIA (journal extension) accepted to TPAMI and 1 paper LEO accepted to IJCV.
02 / 2024
4 papers VBench, EpiDiff, SinSR and Vlogger accepted to CVPR 2024.
01 / 2024
3 papers SEINE, InternVid and AnimateDiff accepted to ICLR 2024.
11 / 2023
2 papers ConditionVideo and Diff-Text accepted to AAAI 2024.
11 / 2023
07 / 2023
1 paper LAC accepted to ICCV 2023!
04 / 2023
1 paper Loris accepted to ICML 2023!

Research

(*equal contribution, correspondence & project lead)
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma, Yaohui Wang, Xinyuan Chen, Gengyun Jia, Ziwei Liu, Yuan-fang Li, Cunjian Chen, Yu Qiao
arXiv:2401.03048
Paper | Arxiv | Project page | Code | Hugging face
LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
Yaohui Wang*, Xinyuan Chen*, Xin Ma*, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu
In IJCV 2024
Paper | Arxiv | Project page | Code | Hugging face
Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models
Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Yuan-fang Li, Cunjian Chen, Yu Qiao
arXiv:2407.15642
Paper | Arxiv | Project page | Code | Hugging face
4Diffusion: Multi-view Video Diffusion Model for 4D Generation
Haiyu Zhang, Xinyuan Chen, Yaohui Wang, Xihui Liu, Yunhong Wang, Yu Qiao
In Proc. NeurIPS, Vancouver, 2024
Paper | Arxiv | Project page | Code
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
Xinyuan Chen*, Yaohui Wang*, Lingjun Zhang, Shaobin Zhuang, Xin Ma, Jiashuo Yu, Yali Wang, Dahua Lin, Yu Qiao, Ziwei Liu
In Proc. ICLR, Vienna, 2024
Paper | Arxiv | Project page | Code | Hugging face
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, Limin Wang, Dahua Lin, Yu Qiao and Ziwei Liu
In Proc. CVPR, Seattle, 2024
Paper | Arxiv | Project page | Code
EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion
Zehuan Huang, Hao Wen, Junting Dong, Yaohui Wang, Yangguang Li, Xinyuan Chen, Yan-pei Cao, Ding Liang, Yu Qiao, Bo Dai and Lu Sheng
In Proc. CVPR, Seattle, 2024
Paper | Arxiv | Project page | Code
SinSR: Diffusion-Based Image Super-Resolution in a Single Step
Yufei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C. Kot and Bihan Wen
In Proc. CVPR, Seattle, 2024
Paper | Arxiv | Project page | Code
Vlogger: Make Your Dream A Vlog
Shaobin Zhuang, Kunchang Li, Xinyuan Chen, Yaohui Wang, Ziwei Liu, Yu Qiao, Yali Wang
In Proc. CVPR, Seattle, 2024
Paper | Arxiv | Project page | Code
Brush Your Text: Synthesize Any Scene Text on Images via Diffusion Model
Lingjun Zhang, Xinyuan Chen, Yaohui Wang, Yue Lu, Yu Qiao
In Proc. AAAI, Vancouver, 2024
Paper | Arxiv | Project page | Code
ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation
Bo Peng, Xinyuan Chen, Yaohui Wang, Chaochao Lu, Yu Qiao
In Proc. AAAI, Vancouver, 2024
Paper | Arxiv | Project page | Code
LAC: Latent Action Composition for Skeleton-based Action Segmentation
Di Yang, Yaohui Wang, Antitza Dantcheva, Quan Kong, Lorenzo Garattoni, Gianpiero Francesca, Francois Bremond. corresponding author
In Proc. ICCV, Paris, 2023
Paper | Arxiv | Project page | Code
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation
Yi Wang*, Yinan He*, Yuzhuo Li*, Kunchang Li, Jiashuo Yu, Xin Ma, Xinyuan Chen, Yaohui Wang, Ping Luo, Ziwei Liu, Yali Wang, Limin Wang, Yu Qiao
In Proc. ICLR, Vienna, 2024
Paper | Arxiv | Project page | Code
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Yuwei Guo, Ceyuan Yang, Anyi Rao, Yaohui Wang, Yu Qiao, Dahua Lin, Bo Dai
In Proc. ICLR, Vienna, 2024
Paper | Arxiv | Project page | Code
LEO: Generative Latent Image Animator for Human Video Synthesis
Yaohui Wang, Xin Ma, Xinyuan Chen, Cunjian Chen, Antitza Dantcheva, Bo Dai, Yu Qiao
In IJCV 2024
Paper | Arxiv | Project page | Code
Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation
Zeyu Lu, Chengyue Wu, Xinyuan Chen, Yaohui Wang, Lei Bai, Yu Qiao, Xihui Liu
In Proc. WACV, Hawaii, 2024
Paper | Arxiv | Project page | Code
Long-Term Rhythmic Video Soundtracker
Jiashuo Yu, Yaohui Wang, Xinyuan Chen, Xiao Sun, Yu Qiao
In Proc. ICML, Hawaii, 2023
Paper | Arxiv | Project page | Code
Self-supervised Video Representation Learning via Latent Time Navigation
Di Yang, Yaohui Wang, Quan Kong, Antitza Dantcheva, Lorenzo Garattoni, Gianpiero Francesca and François Brémond
In Proc. AAAI, Washington, 2023
Paper | Arxiv | Project page | Code
ViA: View-invariant Skeleton Action Representation Learning via Motion Retargeting
Di Yang, Yaohui Wang, Antitza Dantcheva, Lorenzo Garattoni, Gianpiero Francesca and François Brémond
In IJCV, 2023
Paper | Arxiv | Project page | Code
Latent Image Animator: Learning to Animate Images via Latent Space Navigation
Yaohui Wang, Di Yang, Francois Bremond and Antitza Dantcheva
In Proc. ICLR, Virtual, 2022
LIA: Latent Image Animator
Yaohui Wang, Di Yang, Francois Bremond and Antitza Dantcheva
In TPAMI, 2024
Paper | Paper (TPAMI version) | Arxiv | Project page | Code
UNIK: A Unified Framework for Real-world Skeleton-based Action Recognition
Di Yang*, Yaohui Wang*, Antitza Dantcheva, Lorenzo Garattoni, Gianpiero Francesca and Francois Bremond. *equal contribution
In Proc. BMVC, Virtual, 2021 (Oral)
Paper | Arxiv | Project page | Code
InMoDeGAN: Interpretable Motion Decomposition Generative Adversarial Network for Video Generation
Yaohui Wang, Francois Bremond, and Antitza Dantcheva
arXiv:2101.03049
Arxiv | Project page | Code
Joint Generative and Contrastive Learning for Unsupervised Person Re-identification
Hao Chen*, Yaohui Wang*, Benoit Lagadec, Antitza Dantcheva, and Francois Bremond. *equal contribution
In Proc. CVPR, Virtual, 2021.
Paper | Code
Learning Invariance from Generated Variance for Unsupervised Person Re-identification
In IEEE TPAMI, 2022.
Paper | Code
Selective Spatio-Temporal Aggregation Based Pose Refinement System
Di Yang, Rui Dai, Yaohui Wang, Rupayan Mallick, Luca Minciullo, Gianpiero Francesca, and Francois Bremond
In Proc. WACV, Virtual, 2021.
Paper | Code
G³AN: Disentangling appearance and motion for video generation
Yaohui Wang, Piotr Bilinski, Francois Bremond, and Antitza Dantcheva
In Proc. CVPR, Seattle, US, 2020.
In LUV-CVPR Workshop, Seattle, US, 2020. (Oral Presentation)
Paper | Project page | Code | Video
ImaGINator: Conditional Spatio-Temporal GAN for Video Generation
Yaohui Wang, Piotr Bilinski, Francois Bremond, and Antitza Dantcheva
In Proc. WACV, Aspen, US, 2020.
Paper | Code | Video
A video is worth more than 1000 lies. Comparing 3DCNN approaches for detecting deepfakes
Yaohui Wang and Antitza Dantcheva
In Proc. FG, Buenos Aires, Argentina, 2020.
Paper
From attribute-labels to faces: face generation using a conditional generative adversarial network
Yaohui Wang, Antitza Dantcheva and Francois Bremond
In Proc. ECCV Workshop, Munich, Germany, 2018.
Paper
Comparing methods for assessment of facial dynamics in patients with major neurocognitive disorders
Yaohui Wang, Antitza Dantcheva, Francois Bremond and Piotr Bilinski
In Proc. ECCV Workshop, Munich, Germany, 2018.
Paper

PhD Thesis

Learning to Generate Human Videos
Yaohui Wang
Thesis

Defense Jury:

Professional activities

Reviewer
SIGGRAPH 2022, CVPR 2022/2021, ECCV 2022/2020, WACV 2020 ...

Copyright © Yaohui Wang