The official repo for "Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models".
Diffuman4D enables high-fidelity free-viewpoint rendering of human performances from sparse-view videos.
Click here to experience immersive 4DGS rendering.
To enable model training, we process the DNA-Rendering dataset by recalibrating camera parameters, optimizing image color correction matrices (CCMs), predicting foreground masks, and estimating human skeletons.
We will release re-annotated labels for the DNA-Rendering dataset in this repository, which we believe will benefit future research in this area.
- For camera parameters, foreground masks, and keypoints, we will provide the processed data.
- For RGB images, we will provide only preprocessing scripts. Please download the raw data from the official DNA-Rendering website.
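As an illustration of what a color correction matrix does, here is a minimal sketch assuming a CCM is a 3x3 linear transform applied to RGB values in [0, 1]. The function names and the example matrix are hypothetical and are not taken from the Diffuman4D preprocessing scripts, whose actual CCM format may differ. Plain Python lists keep the sketch dependency-free; an array library would be the usual choice for full-resolution images.

```python
def apply_ccm(pixel, ccm):
    """Apply a 3x3 color correction matrix to one RGB pixel in [0, 1]."""
    corrected = []
    for row in ccm:
        # Each output channel is a weighted sum of the input channels.
        value = sum(weight * channel for weight, channel in zip(row, pixel))
        # Clamp so the corrected value stays in the valid [0, 1] range.
        corrected.append(min(1.0, max(0.0, value)))
    return tuple(corrected)


def correct_image(image, ccm):
    """Apply the CCM to every pixel of an image stored as rows of RGB tuples."""
    return [[apply_ccm(pixel, ccm) for pixel in row] for row in image]


# Hypothetical matrices for illustration only.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # no change
SWAP_RB = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]   # swap red and blue

if __name__ == "__main__":
    tiny_image = [[(0.2, 0.4, 0.6), (1.0, 0.0, 0.0)]]
    print(correct_image(tiny_image, SWAP_RB))
```

In a multi-camera capture, one such matrix per camera could be used to bring the views to a common color response. Whether the released scripts work per camera or per image is not stated here.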
- Release project page and paper.
- Release inference code.
- Release data preprocessing scripts.
- Release processed DNA-Rendering dataset.
@inproceedings{jin2025diffuman4d,
  title={Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models},
  author={Jin, Yudong and Peng, Sida and Wang, Xuan and Xie, Tao and Xu, Zhen and Yang, Yifan and Shen, Yujun and Bao, Hujun and Zhou, Xiaowei},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2025}
}