We present Open-Sora, an initiative dedicated to efficiently producing high-quality video and making the model,
tools and content accessible to all. By embracing open-source principles,
Open-Sora not only democratizes access to advanced video generation techniques, but also offers a
streamlined and user-friendly platform that simplifies the complexities of video production.
With Open-Sora, we aim to inspire innovation, creativity, and inclusivity in the realm of content creation.
TBD
Videos are downsampled to .gif for display. Click for original videos. Prompts are trimmed for display;
see here for full prompts.
More samples are available in our gallery.
Other useful documents and links are listed below.
# create a virtual env
conda create -n opensora python=3.10
# activate virtual environment
conda activate opensora
# install torch
# the command below is for CUDA 12.1, choose install commands from
# https://pytorch.org/get-started/locally/ based on your own CUDA version
pip install torch torchvision
# install flash attention (optional)
# set enable_flashattn=False in config to avoid using flash attention
pip install packaging ninja
pip install flash-attn --no-build-isolation
# install apex (optional)
# set enable_layernorm_kernel=False in config to avoid using apex
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" git+https://github.com/NVIDIA/apex.git
# install xformers
pip install -U xformers --index-url https://download.pytorch.org/whl/cu121
# install this project
git clone https://github.com/hpcaitech/Open-Sora
cd Open-Sora
pip install -v .
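Because flash attention and apex are optional, it can help to check which of them are importable before editing the config. The sketch below is a minimal illustration and not part of Open-Sora: the helper names `is_installed` and `suggest_kernel_flags` are hypothetical, and it assumes the packages installed above are importable as `flash_attn` and `apex`. It only reports suggested values for the `enable_flashattn` and `enable_layernorm_kernel` options mentioned in the install comments.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def suggest_kernel_flags() -> dict:
    """Suggest values for the two optional-kernel config flags.

    Assumption: the optional packages are importable as `flash_attn` and `apex`.
    """
    return {
        "enable_flashattn": is_installed("flash_attn"),
        "enable_layernorm_kernel": is_installed("apex"),
    }


if __name__ == "__main__":
    for flag, value in suggest_kernel_flags().items():
        print(f"{flag} = {value}")
```

If a flag comes back as `False`, set the matching option to `False` in the config, as the install comments describe.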
TBD
Resolution | Data | #iterations | Batch Size | GPU days (H800) | URL |
---|---|---|---|---|---|
16×512×512 | 20K HQ | 20k | 2×64 | 35 | :link: |
16×256×256 | 20K HQ | 24k | 8×64 | 45 | :link: |
16×256×256 | 366K | 80k | 8×64 | 117 | :link: |
Training order: 16x256x256 $\rightarrow$ 16x256x256 HQ $\rightarrow$ 16x512x512 HQ.
Our model's weights are partially initialized from PixArt-α. The model has
724M parameters. More information about training can be found in our report. More about
the dataset can be found in datasets.md. HQ means high quality.
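As a quick worked example of the figures in the table, the sketch below totals the reported GPU days and estimates how many samples each stage processed. It assumes the batch size column is a product of two factors (for example, 8×64 = 512 samples per iteration), which the table does not state explicitly. The variable and function names are illustrative.

```python
# Rows copied from the table above, in training order:
# (stage, iterations, batch-size factors, GPU days on H800).
stages = [
    ("16x256x256", 80_000, (8, 64), 117),
    ("16x256x256 HQ", 24_000, (8, 64), 45),
    ("16x512x512 HQ", 20_000, (2, 64), 35),
]


def samples_seen(iterations, batch_factors):
    """Estimate samples processed in a stage.

    Assumption: the global batch size is the product of the two factors.
    """
    first, second = batch_factors
    return iterations * first * second


total_gpu_days = sum(gpu_days for _, _, _, gpu_days in stages)

for name, iterations, factors, gpu_days in stages:
    print(f"{name}: {samples_seen(iterations, factors):,} samples, {gpu_days} GPU days")
print(f"total: {total_gpu_days} GPU days")
```

Under that reading of the batch size column, the three stages add up to 197 H800 GPU days.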
⚠️ LIMITATION: Our model was trained on a limited budget, so its quality and text alignment are relatively poor.
It performs especially badly when generating human beings and cannot follow detailed instructions. We are working
on improving the quality and text alignment.
We provide a Gradio application in this repository. Use the following commands to start an interactive web application and try video generation with Open-Sora.
pip install gradio spaces
python gradio/app.py
This will launch a Gradio application on your localhost. To learn more about the Gradio application, refer to the README file.
Since Open-Sora 1.1 supports inference with dynamic input size, you can pass the input size as an argument.
# video sampling
python scripts/inference.py configs/opensora-v1-1/inference/sample.py \
--ckpt-path CKPT_PATH --prompt "A beautiful sunset over the city" --num-frames 32 --image-size 480 854
See here for more instructions.
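When sampling at several sizes, it can be convenient to assemble the command from Python instead of retyping it. The sketch below is a minimal illustration that uses only the script path, config path, and flags shown in the command above. The helper name `build_inference_command` is hypothetical, and it assumes the two `--image-size` values are height followed by width, as `480 854` suggests.

```python
import shlex


def build_inference_command(ckpt_path, prompt, num_frames, height, width):
    """Assemble the sampling command shown above as an argument list.

    Assumption: --image-size takes height first, then width.
    """
    return [
        "python",
        "scripts/inference.py",
        "configs/opensora-v1-1/inference/sample.py",
        "--ckpt-path", ckpt_path,
        "--prompt", prompt,
        "--num-frames", str(num_frames),
        "--image-size", str(height), str(width),
    ]


cmd = build_inference_command(
    "CKPT_PATH", "A beautiful sunset over the city", 32, 480, 854
)
# Print a shell-quoted version for inspection.
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting problems with prompts that contain spaces or quotes.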
We also provide an offline inference script. Run the following commands to generate samples; the required model weights will be downloaded automatically. To change the sampling prompts, modify the txt file passed to --prompt-path. See here to customize the configuration.
# Sample 16x512x512 (20s/sample, 100 time steps, 24 GB memory)
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x512x512.py --ckpt-path OpenSora-v1-HQ-16x512x512.pth --prompt-path ./assets/texts/t2v_samples.txt
# Sample 16x256x256 (5s/sample, 100 time steps, 22 GB memory)
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/16x256x256.py --ckpt-path OpenSora-v1-HQ-16x256x256.pth --prompt-path ./assets/texts/t2v_samples.txt
# Sample 64x512x512 (40s/sample, 100 time steps)
torchrun --standalone --nproc_per_node 1 scripts/inference.py configs/opensora/inference/64x512x512.py --ckpt-path ./path/to/your/ckpt.pth --prompt-path ./assets/texts/t2v_samples.txt
# Sample 64x512x512 with sequence parallelism (30s/sample, 100 time steps)
# sequence parallelism is enabled automatically when nproc_per_node is larger than 1
torchrun --standalone --nproc_per_node 2 scripts/inference.py configs/opensora/inference/64x512x512.py --ckpt-path ./path/to/your/ckpt.pth --prompt-path ./assets/texts/t2v_samples.txt
The speed is tested on H800 GPUs. For inference with other models, see here for more instructions.
To lower the memory usage, set a smaller vae.micro_batch_size in the config (at the cost of slightly lower sampling speed).
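The memory and speed trade-off behind a micro-batch size can be illustrated with a toy example. The sketch below is not Open-Sora's VAE code: `decode_in_micro_batches` and `toy_decode` are hypothetical stand-ins that show the general idea of processing a batch in smaller chunks, so that only one chunk is held by the decoder at a time while the final output stays the same.

```python
def decode_in_micro_batches(latents, decode_fn, micro_batch_size):
    """Decode `latents` chunk by chunk and concatenate the results.

    Smaller chunks mean less data in flight per call (lower peak memory)
    but more calls to `decode_fn` (lower throughput).
    """
    if micro_batch_size < 1:
        raise ValueError("micro_batch_size must be at least 1")
    outputs = []
    for start in range(0, len(latents), micro_batch_size):
        chunk = latents[start:start + micro_batch_size]
        outputs.extend(decode_fn(chunk))
    return outputs


def toy_decode(chunk):
    """Stand-in for a VAE decoder: doubles every value in the chunk."""
    return [2 * value for value in chunk]


latents = list(range(10))
full = decode_in_micro_batches(latents, toy_decode, micro_batch_size=10)
small = decode_in_micro_batches(latents, toy_decode, micro_batch_size=3)
print(full == small)
```

With a micro-batch size of 3, the ten items are decoded in four calls instead of one, which is why a smaller setting trades some speed for memory.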
High-quality data is crucial for training good generation models.
To this end, we have established a complete data processing pipeline that can seamlessly convert raw videos into high-quality video-text pairs.
The pipeline is shown below. For detailed information, please refer to data processing.
Also check out the datasets we use.
Once you have prepared the data in a csv file, run the following commands to launch training on a single node or on multiple nodes.
# one node
torchrun --standalone --nproc_per_node 8 scripts/train.py \
configs/opensora-v1-1/train/stage1.py --data-path YOUR_CSV_PATH --ckpt-path YOUR_PRETRAINED_CKPT
# multiple nodes
colossalai run --nproc_per_node 8 --hostfile hostfile scripts/train.py \
configs/opensora-v1-1/train/stage1.py --data-path YOUR_CSV_PATH --ckpt-path YOUR_PRETRAINED_CKPT
Once you have prepared the data in a csv file, run the following commands to launch training on a single node.
# 1 GPU, 16x256x256
torchrun --nnodes=1 --nproc_per_node=1 scripts/train.py configs/opensora/train/16x256x256.py --data-path YOUR_CSV_PATH
# 8 GPUs, 64x512x512
torchrun --nnodes=1 --nproc_per_node=8 scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --ckpt-path YOUR_PRETRAINED_CKPT
To launch training on multiple nodes, prepare a hostfile according to ColossalAI and run the following command.
colossalai run --nproc_per_node 8 --hostfile hostfile scripts/train.py configs/opensora/train/64x512x512.py --data-path YOUR_CSV_PATH --ckpt-path YOUR_PRETRAINED_CKPT
For training other models and advanced usage, see here for more instructions.
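For the multi-node launch, the hostfile is a plain text file. The sketch below writes one under the assumption that ColossalAI expects one hostname per line; check the ColossalAI launch documentation for the authoritative format. The helper name `write_hostfile` and the hostnames `node-a` and `node-b` are hypothetical placeholders.

```python
from pathlib import Path


def write_hostfile(hostnames, path="hostfile"):
    """Write one hostname per line and return the path of the file.

    Assumption: the launcher reads a plain list of hostnames, one per line.
    """
    target = Path(path)
    target.write_text("\n".join(hostnames) + "\n")
    return target


if __name__ == "__main__":
    # Hypothetical hostnames; replace them with the nodes in your cluster.
    created = write_hostfile(["node-a", "node-b"])
    print(created.read_text(), end="")
```

The resulting file is the one passed to `--hostfile` in the `colossalai run` command above.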
See here for more instructions.
Thanks go to these wonderful contributors (emoji key following the all-contributors specification):
zhengzangw 💻 📖 🤔 📹 🚧 |
ver217 💻 🤔 📖 🐛 |
FrankLeeeee 💻 🚇 🔧 |
xyupeng 💻 📖 🎨 |
Yanjia0 📖 |
binmakeswell 📖 |
eltociear 📖 |
ganeshkrishnan1 📖 |
fastalgo 📖 |
powerzbt 📖 |
If you wish to contribute to this project, you can refer to the Contribution Guideline.
Zangwei Zheng and Xiangyu Peng contributed equally to
this work during their internship at HPC-AI Tech.
We are grateful for their exceptional work and generous contribution to open source.