Animate-X++: Universal Character Image Animation with Dynamic Backgrounds
Shuai Tan · Biao Gong · Zhuoxin Liu · Yan Wang · Yifan Feng · Xi Chen · Hengshuang Zhao†
HKU | Ant Group
This repository is the official implementation of the paper "Animate-X++: Universal Character Image Animation with Dynamic Backgrounds". Animate-X++ is a universal animation framework based on latent diffusion models for various character types (collectively named X), including anthropomorphic characters.
📌 Updates
- [2025.09.17] 🔥 We release the Animate-X++ inference code.
- [2025.09.17] 🔥 We release the Animate-X++ checkpoints.
- [2025.08.12] 🔥 Our paper is publicly available on arXiv.
🌄 Gallery
Animations produced by Animate-X++
🚀 Installation
Install with conda:
conda create -n Animate-X++ python=3.9.21
# or conda create -n Animate-X++ python=3.10.16 # Python>=3.10 is required for Unified Sequence Parallel (USP)
conda activate Animate-X++
# CUDA 11.8
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu118
# CUDA 12.1
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu121
# CUDA 12.4
pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu124
git clone https://github.com/Lucaria-Academy/Animate-X++.git
cd Animate-X++
pip install -e .
UniAnimate-DiT supports multiple Attention implementations. If you have installed any of the following Attention implementations, they will be enabled based on priority.
- Flash Attention 3
- Flash Attention 2
- Sage Attention
- torch SDPA (default; torch>=2.5.0 is recommended)
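The priority order above can be illustrated with a small availability check. This is a minimal sketch and not the repository's own selection logic: the module names (`flash_attn_interface`, `flash_attn`, `sageattention`) are assumptions about how each package is commonly imported, and `pick_attention_backend` is a hypothetical helper.

```python
import importlib.util

# Priority order taken from the list above. The module names are assumptions
# about how each package is usually imported; this README does not state them.
BACKENDS = [
    ("Flash Attention 3", "flash_attn_interface"),
    ("Flash Attention 2", "flash_attn"),
    ("Sage Attention", "sageattention"),
]


def is_installed(module_name):
    """Return True if the top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pick_attention_backend(installed=is_installed):
    """Return the name of the first available backend, else torch SDPA."""
    for name, module_name in BACKENDS:
        if installed(module_name):
            return name
    return "torch SDPA"


if __name__ == "__main__":
    print("Attention backend that would be preferred:", pick_attention_backend())
```

Running the script in the activated environment prints which of the listed implementations would come first under these assumptions.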
🚀 Download Checkpoints
(i) Download the Wan2.1-I2V-14B-720P model using huggingface-cli:
pip install "huggingface_hub[cli]"
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-720P --local-dir ./Wan2.1-I2V-14B-720P
Or download the Wan2.1-I2V-14B-720P model using modelscope-cli:
pip install modelscope
modelscope download Wan-AI/Wan2.1-I2V-14B-720P --local_dir ./Wan2.1-I2V-14B-720P
(ii) Download the Animate-X++, DWPose, and CLIP checkpoints and put all files in the ./checkpoints directory.
(iii) Finally, the model weights will be organized in ./checkpoints/ as follows:
./checkpoints/
|---- animate-x++.ckpt
|---- animate-x++_simple.ckpt
|---- dw-ll_ucoco_384.onnx
|---- open_clip_pytorch_model.bin
└---- yolox_l.onnx
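Before running inference, it may help to confirm that the layout above is complete. The following is a minimal sketch that only checks for the file names listed above; `missing_checkpoints` is a hypothetical helper and is not part of the repository.

```python
from pathlib import Path

# File names copied from the directory layout above.
EXPECTED_FILES = [
    "animate-x++.ckpt",
    "animate-x++_simple.ckpt",
    "dw-ll_ucoco_384.onnx",
    "open_clip_pytorch_model.bin",
    "yolox_l.onnx",
]


def missing_checkpoints(root="./checkpoints"):
    """Return the expected files that are not present under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected checkpoint files are present.")
```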
💡 Inference
The default inputs are an image (.jpg/.png/.jpeg) and a dance video (.mp4/.mov). The default output is an 81-frame video (.mp4) at 832x480 resolution, which is saved in the ./outputs directory. We provide a set of example data in Animate-X++ example data. Please put it in ./data.
- Pre-process the video:
python process_data.py \
    --source_video_paths data/videos \
    --saved_pose_dir data/saved_pkl \
    --saved_pose data/saved_pose \
    --saved_frame_dir data/saved_frames
- Run Animate-X++. We provide a simple version (recommended):
- If you have multiple GPUs for inference, we also support Unified Sequence Parallel (USP). Note that Python>=3.10 is required for USP:
pip install xfuser
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node=4 examples/inference_480p_usp.py
# or
CUDA_VISIBLE_DEVICES=0 torchrun --standalone --nproc_per_node=1 examples/inference_480p_usp.py
- Full model of Animate-X++:
CUDA_VISIBLE_DEVICES=0 torchrun --standalone --nproc_per_node=1 examples/inference_480p.py
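For scripted launches, the USP command shown above can be assembled from a list of GPU ids. This is a sketch under stated assumptions: `build_usp_command` and `usp_supported` are hypothetical helpers, and the only facts used are the `torchrun` flags, the script path, and the Python>=3.10 requirement from the commands above.

```python
import os
import sys


def build_usp_command(gpu_ids, script="examples/inference_480p_usp.py"):
    """Return (env, argv) for a torchrun launch with one process per GPU."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)
    argv = [
        "torchrun",
        "--standalone",
        f"--nproc_per_node={len(gpu_ids)}",
        script,
    ]
    return env, argv


def usp_supported(version_info=sys.version_info):
    """USP needs Python >= 3.10 according to the note above."""
    return tuple(version_info[:2]) >= (3, 10)


if __name__ == "__main__":
    env, argv = build_usp_command([0, 1, 2, 3])
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
    # To actually launch: subprocess.run(argv, env=env, check=True)
```

The helper only builds the command. Launching it is left to `subprocess.run` so that the environment can be inspected first.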
✔ Some tips:
Although Animate-X does not rely on strict pose alignment, and we did not perform any manual alignment for the results in the paper, we cannot guarantee that every case is perfect. Users can therefore align poses manually, e.g., by applying an overall x/y translation and scaling to the pose skeleton of each frame so that it matches the position of the subject in the reference image, and then placing the aligned poses in data/saved_pose.
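The manual alignment described in the tip amounts to a scale followed by a translation applied identically to every frame. A minimal sketch follows, assuming each frame's pose is a list of (x, y) keypoints; this layout is an assumption for illustration, and the files written by process_data.py may store poses differently. `align_pose` and `align_sequence` are hypothetical helpers.

```python
def align_pose(keypoints, scale=1.0, dx=0.0, dy=0.0, center=(0.0, 0.0)):
    """Scale one frame's keypoints about `center`, then translate by (dx, dy).

    `keypoints` is a list of (x, y) pairs (assumed layout, see the lead-in).
    """
    cx, cy = center
    return [
        ((x - cx) * scale + cx + dx, (y - cy) * scale + cy + dy)
        for x, y in keypoints
    ]


def align_sequence(frames, scale=1.0, dx=0.0, dy=0.0, center=(0.0, 0.0)):
    """Apply the same transform to every frame so the motion stays consistent."""
    return [align_pose(frame, scale, dx, dy, center) for frame in frames]


if __name__ == "__main__":
    frame = [(0.0, 0.0), (2.0, 4.0)]
    # Double the skeleton size about the origin, then shift right and up.
    print(align_pose(frame, scale=2.0, dx=1.0, dy=-1.0))
```

Using one fixed `center`, `scale`, and offset for the whole sequence keeps the relative motion between frames unchanged. Only the overall placement and size of the skeleton are adjusted.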
📧 Acknowledgement
Our implementation is based on UniAnimate-DiT, MimicMotion, and MusePose. Thanks for their remarkable contributions and released code! If we have missed any open-source projects or related articles, we will gladly add the missing acknowledgement right away.
⚖ License
This repository is released under the Apache-2.0 license as found in the LICENSE file.
📚 Citation
If you find this codebase useful for your research, please cite the following entries.
@article{AnimateX2025,
title={Animate-X: Universal Character Image Animation with Enhanced Motion Representation},
author={Tan, Shuai and Gong, Biao and Wang, Xiang and Zhang, Shiwei and Zheng, Dandan and Zheng, Ruobing and Zheng, Kecheng and Chen, Jingdong and Yang, Ming},
journal={ICLR 2025},
year={2025}
}
@article{Mimir2025,
title={Mimir: Improving Video Diffusion Models for Precise Text Understanding},
author={Tan, Shuai and Gong, Biao and Feng, Yutong and Zheng, Kecheng and Zheng, Dandan and Shi, Shuwei and Shen, Yujun and Chen, Jingdong and Yang, Ming},
journal={arXiv preprint arXiv:2412.03085},
year={2025}
}