We are thrilled to present Open-Sora-Plan v1.0.0, which significantly enhances video generation quality and text control capabilities. See our report. We are now training for higher-resolution (>1024) and longer-duration (>10s) videos; here is a preview of the next release. The .gif files shown on GitHub are compressed, so they lose some quality.
Thanks to HUAWEI Ascend NPU Team for supporting us.
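Inference is now supported on domestic (Chinese) AI chips (Huawei Ascend 910B; we look forward to more domestic compute chips). The next step is to support training on domestic hardware. See PR180 for details.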
257×512×512 (10s) | 65×1024×1024 (2.7s) | 65×1024×1024 (2.7s) |
---|---|---|
Time-lapse of a coastal landscape transitioning from sunrise to nightfall... | A quiet beach at dawn, the waves gently lapping at the shore and the sky painted in pastel hues.... | Sunset over the sea. |
65×512×512 (2.7s) | 65×512×512 (2.7s) | 65×512×512 (2.7s) |
---|---|---|
A serene underwater scene featuring a sea turtle swimming... | Yellow and black tropical fish dart through the sea. | a dynamic interaction between the ocean and a large rock... |
The dynamic movement of tall, wispy grasses swaying in the wind... | Slow pan upward of blazing oak fire in an indoor fireplace. | A serene waterfall cascading down moss-covered rocks... |
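The durations in the table headers follow from the frame counts once a playback frame rate is fixed. A minimal sketch of that arithmetic, assuming 24 FPS (an assumption: the tables do not state the frame rate, and `duration_seconds` is a hypothetical helper, not part of the repo):

```python
ASSUMED_FPS = 24  # assumption: playback frame rate, not stated beside the tables


def duration_seconds(num_frames: int, fps: float = ASSUMED_FPS) -> float:
    """Length in seconds of a clip with `num_frames` frames played at `fps`."""
    return num_frames / fps


for frames in (65, 257):
    print(f"{frames} frames -> {duration_seconds(frames):.1f} s")
```

Under this assumption, 65 frames come to about 2.7 s, matching the headers, and 257 frames come to a little over 10 s.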
This project aims to create a simple and scalable repo to reproduce Sora (OpenAI, but we prefer to call it "ClosedAI"). We hope the open-source community will contribute to this project. Pull requests are welcome!
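This project hopes to reproduce Sora through the power of the open-source community. It was jointly initiated by the PKU-Rabbitpre AIGC Joint Lab (北大-兔展AIGC联合实验室). The current version is still far from the goal and needs continued improvement and rapid iteration. Pull requests are welcome!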
Project stages:
[2024.04.07] 🚀🚀🚀 Today, we are thrilled to present Open-Sora-Plan v1.0.0, which significantly enhances video generation quality and text control capabilities. See our report. Thanks to HUAWEI NPU for supporting us.
[2024.03.27] 🚀🚀🚀 We release the report of VideoCausalVAE, which supports both images and videos. We present our reconstructed video in this demonstration as follows. The text-to-video model is on the way.
[2024.03.10] 🚀🚀🚀 This repo supports training a latent size of 225×90×90 (t×h×w), which means we are able to train 1 minute of 1080P video with 30FPS (2× interpolated frames and 2× super resolution) under class-condition.
[2024.03.08] We support text-conditioned training with 16 frames at 512x512. The code is mainly borrowed from Latte.
[2024.03.07] We support training with 128 frames (when sample rate = 3, which is about 13 seconds) of 256x256, or 64 frames (which is about 6 seconds) of 512x512.
[2024.03.05] See our latest todo, pull requests are welcome.
[2024.03.04] We reorganized and modularized our code to make it easier to contribute to the project. To contribute, please see the Repo structure.
[2024.03.03] We opened some discussions to clarify several issues.
[2024.03.01] Training code is available now! Learn more on our project page. Please feel free to watch 👀 this repository for the latest updates.
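The frame counts and durations quoted in the [2024.03.07] and [2024.03.10] entries can be cross-checked with a little arithmetic. The sketch below assumes a 30 FPS source video for the [2024.03.07] entry and the same sample rate of 3 for the 64-frame case (neither is stated in the entry). The temporal ratio computed for [2024.03.10] is only what the quoted numbers imply, not a documented property of the model. `clip_seconds` is a hypothetical helper.

```python
def clip_seconds(num_frames: int, sample_rate: int, source_fps: float) -> float:
    """Approximate seconds of source video covered when every
    `sample_rate`-th frame is kept and `num_frames` frames are used."""
    return num_frames * sample_rate / source_fps


SOURCE_FPS = 30  # assumption: the news entries do not state the source frame rate

# [2024.03.07]: 128 frames at sample rate 3, and 64 frames (same rate assumed)
print(round(clip_seconds(128, 3, SOURCE_FPS), 1))  # 12.8, i.e. "about 13 seconds"
print(round(clip_seconds(64, 3, SOURCE_FPS), 1))   # 6.4, i.e. "about 6 seconds"

# [2024.03.10]: 1 minute at 30 FPS, with 2x frame interpolation applied afterwards
total_frames = 60 * 30                # frames in the final one-minute video
generated_frames = total_frames // 2  # frames needed before 2x interpolation
implied_temporal_ratio = generated_frames / 225  # frames per latent step (inferred)
print(total_frames, generated_frames, implied_temporal_ratio)  # 1800 900 4.0
```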
├── README.md
├── docs
│ ├── Data.md -> Datasets description.
│ ├── Contribution_Guidelines.md -> Contribution guidelines description.
├── scripts -> All scripts.
├── opensora
│ ├── dataset
│ ├── models
│ │ ├── ae -> Compress videos to latents
│ │ │ ├── imagebase
│ │ │ │ ├── vae
│ │ │ │ └── vqvae
│ │ │ └── videobase
│ │ │ ├── vae
│ │ │ └── vqvae
│ │ ├── captioner
│ │ ├── diffusion -> Denoise latents
│ │ │ ├── diffusion
│ │ │ ├── dit
│ │ │ ├── latte
│ │ │ └── unet
│ │ ├── frame_interpolation
│ │ ├── super_resolution
│ │ └── text_encoder
│ ├── sample
│ ├── train -> Training code
│ └── utils
git clone https://github.com/PKU-YuanGroup/Open-Sora-Plan
cd Open-Sora-Plan
conda create -n opensora python=3.8 -y
conda activate opensora
pip install -e .
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
pip install -e '.[dev]'
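After installation, a quick sanity check can confirm that the interpreter and the main packages are visible. This is a minimal sketch, not a script shipped with the repo: `check_environment` is a hypothetical helper, and the module names `torch`, `opensora`, and `flash_attn` are assumptions about what the commands above install.

```python
import importlib.util
import sys


def check_environment(modules=("torch", "opensora", "flash_attn")):
    """Return a {name: bool} report for the Python version and importable modules.

    The default module names are assumptions; adjust them to your setup.
    """
    status = {"python>=3.8": sys.version_info >= (3, 8)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```

Run it inside the activated `opensora` conda environment; any `MISSING` line points at an install step that did not complete.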
We highly recommend trying our web demo with the following command. We also provide an online demo in Hugging Face Spaces.
🤝 Enjoy the demos created by @camenduru, who generously supports our research!
python -m opensora.serve.gradio_web_server
sh scripts/text_condition/sample_video.sh
Refer to Data.md
Refer to the document EVAL.md.
python examples/rec_video_vae.py --rec-path test_video.mp4 --video-path video.mp4 --resolution 512 --num-frames 1440 --sample-rate 1 --sample-fps 24 --device cuda --ckpt <Your ckpt>
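When the reconstruction command has to be run for several videos, it can help to assemble the argument list programmatically. The sketch below only rebuilds the command shown above. `build_rec_command` is a hypothetical helper, and the checkpoint path is a placeholder you must replace.

```python
import shlex


def build_rec_command(ckpt, video_path="video.mp4", rec_path="test_video.mp4",
                      resolution=512, num_frames=1440, sample_rate=1,
                      sample_fps=24, device="cuda"):
    """Assemble the argument list for examples/rec_video_vae.py.

    Only flags that appear in the README command are used here.
    """
    return [
        "python", "examples/rec_video_vae.py",
        "--rec-path", rec_path,
        "--video-path", video_path,
        "--resolution", str(resolution),
        "--num-frames", str(num_frames),
        "--sample-rate", str(sample_rate),
        "--sample-fps", str(sample_fps),
        "--device", device,
        "--ckpt", ckpt,
    ]


cmd = build_rec_command("path/to/your_ckpt")  # placeholder checkpoint path
print(shlex.join(cmd))  # pass `cmd` to subprocess.run(cmd, check=True) to execute
```

For scale, 1440 frames at 24 frames per second correspond to 60 seconds of video.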
Please refer to the document VQVAE.
sh scripts/text_condition/train_videoae_17x256x256.sh
sh scripts/text_condition/train_videoae_65x256x256.sh
sh scripts/text_condition/train_videoae_65x512x512.sh
Compared with the original implementation, we add a selection of training-speed and memory-saving features: gradient checkpointing, mixed-precision training, pre-extracted features, xformers, and DeepSpeed. Some data points measured with a batch size of 1 on an A100:
gradient checkpointing | mixed precision | xformers | feature pre-extraction | deepspeed config | compress kv | training speed | memory |
---|---|---|---|---|---|---|---|
✔ | ✔ | ✔ | ✔ | ❌ | ❌ | 0.64 steps/sec | 43G |
✔ | ✔ | ✔ | ✔ | Zero2 | ❌ | 0.66 steps/sec | 14G |
✔ | ✔ | ✔ | ✔ | Zero2 | ✔ | 0.66 steps/sec | 15G |
✔ | ✔ | ✔ | ✔ | Zero2 offload | ❌ | 0.33 steps/sec | 11G |
✔ | ✔ | ✔ | ✔ | Zero2 offload | ✔ | 0.31 steps/sec | 12G |
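To read the trade-offs in the table above more easily, each configuration can be expressed relative to the first row (no DeepSpeed, no compress kv). The numbers below are copied from the table. The `relative` helper is hypothetical and just does the division.

```python
# First row of the table: the baseline without DeepSpeed or compress kv.
baseline = {"speed": 0.64, "memory_g": 43}

# (steps/sec, memory in G) for the remaining rows, copied from the table.
configs = {
    "Zero2": (0.66, 14),
    "Zero2 + compress kv": (0.66, 15),
    "Zero2 offload": (0.33, 11),
    "Zero2 offload + compress kv": (0.31, 12),
}


def relative(speed, memory_g, base=baseline):
    """Return (speed ratio vs. baseline, fraction of baseline memory saved)."""
    return speed / base["speed"], 1 - memory_g / base["memory_g"]


for name, (speed, mem) in configs.items():
    speed_ratio, mem_saved = relative(speed, mem)
    print(f"{name:28s} speed x{speed_ratio:.2f}  memory -{mem_saved:.0%}")
```

In these measurements, Zero2 cuts memory by roughly two thirds (43G to 14G) at essentially the same speed, while the offload variants save a little more memory at about half the throughput.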
gradient checkpointing | mixed precision | xformers | feature pre-extraction | deepspeed config | compress kv | training speed | memory |
---|---|---|---|---|---|---|---|
✔ | ✔ | ✔ | ✔ | ❌ | ❌ | 0.08 steps/sec | 77G |
✔ | ✔ | ✔ | ✔ | Zero2 | ❌ | 0.08 steps/sec | 41G |
✔ | ✔ | ✔ | ✔ | Zero2 | ✔ | 0.09 steps/sec | 36G |
✔ | ✔ | ✔ | ✔ | Zero2 offload | ❌ | 0.07 steps/sec | 39G |
✔ | ✔ | ✔ | ✔ | Zero2 offload | ✔ | 0.07 steps/sec | 33G |
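The "Zero2" and "Zero2 offload" rows presumably refer to DeepSpeed ZeRO stage 2, without and with optimizer state offloaded to the CPU. The sketch below generates a minimal DeepSpeed configuration of that kind. It is not the configuration file used by this repo: `make_zero2_config` is a hypothetical helper, and the specific values (fp16, micro batch size 1) are assumptions chosen to mirror the benchmark description.

```python
import json


def make_zero2_config(offload: bool, micro_batch_size: int = 1) -> dict:
    """Build a minimal DeepSpeed config dict for ZeRO stage 2.

    Values are illustrative assumptions, not the repo's actual settings.
    """
    zero = {"stage": 2}
    if offload:
        # Keep optimizer state in CPU memory to reduce GPU memory use,
        # at the cost of extra host-device traffic.
        zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": 1,
        "fp16": {"enabled": True},  # mixed-precision training
        "zero_optimization": zero,
    }


print(json.dumps(make_zero2_config(offload=True), indent=2))
```

The resulting JSON can be saved to a file and passed to a DeepSpeed-enabled launcher. The extra host-device traffic from offloading is consistent with the lower steps/sec in the offload rows.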
We greatly appreciate your contributions to the Open-Sora Plan open-source community, which help us make it even better than it is now!
For more details, please refer to the Contribution Guidelines.