Wukong-Huahua
Wukong-Huahua Model
Wukong-Huahua is a diffusion-based model that performs Chinese text-to-image generation. It was developed by Huawei Noah's Ark Lab in cooperation with the Distributed & Parallel Software Lab and the Ascend Product Develop Unit. The model was trained on the Wukong dataset and implemented with MindSpore + Ascend, a combined software and hardware solution. You are welcome to try Wukong-Huahua on our online platform.
Environment Requirements
- Ascend Software + Hardware Solution (Driver + Firmware + CANN)
Go to the Ascend website and follow the instructions to download and install them.
- AI Framework: MindSpore == 1.9
Go to the MindSpore 1.9 website and follow the instructions to install it.
If you need more help with MindSpore, please check
- Third-party dependencies
pip install -r requirements.txt
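Once the requirements above are installed, it can help to confirm that the MindSpore version in the active environment belongs to the 1.9 series. The following is a minimal sketch, not part of this repository. It assumes the package is registered under the distribution name `mindspore` or `mindspore-ascend`; the actual name depends on which wheel was installed.

```python
from importlib import metadata

REQUIRED = "1.9"  # MindSpore release series named in the requirements above

# Assumption: candidate distribution names; adjust to match the installed wheel.
CANDIDATE_NAMES = ("mindspore", "mindspore-ascend")


def version_matches(installed: str, required: str = REQUIRED) -> bool:
    """Return True when `installed` has the same leading components as `required`."""
    want = required.split(".")
    have = installed.split(".")
    return have[: len(want)] == want


def installed_mindspore_version():
    """Return the installed MindSpore version string, or None if none is found."""
    for name in CANDIDATE_NAMES:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    found = installed_mindspore_version()
    if found is None:
        print("MindSpore is not installed in this environment")
    elif version_matches(found):
        print(f"MindSpore {found} matches the required {REQUIRED} series")
    else:
        print(f"MindSpore {found} found, but the {REQUIRED} series is required")
```

Comparing version components rather than string prefixes avoids treating a version such as 1.10 as a match for 1.9.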
Quick Start
Prepare Checkpoint
Download the pretrained Wukong-Huahua checkpoint wukong-huahua-ms.ckpt and place it under the wukong-huahua/models/ folder.
For the fine-tuning task, we provide example datasets that show the expected format; please download them here.
Text to Image Generation
To generate images from input text, run txt2img.py, or simply run infer.sh with the default arguments.
python txt2img.py --prompt [input text] --ckpt_path [ckpt_path] --ckpt_name [ckpt_name] \
--H [image_height] --W [image_width] --output_path [image save folder] \
--n_samples [number of images to generate]
or
bash scripts/infer.sh
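When txt2img.py is called from another script, the command line shown above can be assembled programmatically. The sketch below only builds the argument list from the flags listed above. The helper name `build_txt2img_command` and its default values are hypothetical illustrations, not the script's own defaults, and passing `models` as the checkpoint folder is an assumption based on the checkpoint location described earlier.

```python
import shlex


def build_txt2img_command(prompt, ckpt_path, ckpt_name,
                          height=512, width=512,
                          output_path="output", n_samples=1):
    """Assemble the txt2img.py invocation as an argument list.

    The defaults here are illustrative only; check txt2img.py for the real ones.
    """
    return [
        "python", "txt2img.py",
        "--prompt", prompt,
        "--ckpt_path", ckpt_path,
        "--ckpt_name", ckpt_name,
        "--H", str(height),
        "--W", str(width),
        "--output_path", output_path,
        "--n_samples", str(n_samples),
    ]


if __name__ == "__main__":
    cmd = build_txt2img_command(
        prompt="城市夜景 赛博朋克",        # one of the demo prompts, shortened
        ckpt_path="models",               # assumed folder holding the checkpoint
        ckpt_name="wukong-huahua-ms.ckpt",
        n_samples=4,
    )
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with prompts that contain spaces.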
Generating higher-resolution images requires more memory. On an Ascend 910 chip, we can generate two 1024x768 images or sixteen 512x512 images at the same time.
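The two figures above can be compared with a little arithmetic. The sketch below assumes both settings use roughly the same memory budget on the chip, which the text implies but does not state explicitly.

```python
# (height, width, images generated at once) as quoted for an Ascend 910 chip
CASES = {"1024x768": (1024, 768, 2), "512x512": (512, 512, 16)}


def pixels_per_batch(height, width, n_images):
    """Total number of output pixels produced in one batch."""
    return height * width * n_images


for name, (h, w, n) in CASES.items():
    print(name, pixels_per_batch(h, w, n))
# 1024x768 1572864
# 512x512 4194304

# Each 1024x768 image has 3x the pixels of a 512x512 image,
# yet the batch size shrinks by 8x.
print((1024 * 768) / (512 * 512))  # 3.0
print(16 / 2)                      # 8.0
```

Under that assumption, memory use grows faster than linearly with pixel count, so batch sizes for other resolutions are best found by trial rather than by scaling these numbers.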
Fine-tuning
For single-device training, modify the related configs in scripts/run_train.sh, then run:
bash scripts/run_train.sh
For multi-device parallel training, modify the related configs in scripts/run_train_parallel.sh, then run:
bash scripts/run_train_parallel.sh [DEVICE_NUM] [VISIABLE_DEVICES(0,1,2,3,4,5,6,7)] [RANK_TABLE_FILE]
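The three positional arguments of run_train_parallel.sh can be derived from a list of device IDs. This is a minimal sketch under two assumptions: that DEVICE_NUM should equal the number of devices listed, and that the device list is passed as a comma-separated string as in the placeholder above. The helper name and the rank table path are hypothetical.

```python
def build_parallel_train_command(device_ids, rank_table_file):
    """Assemble the run_train_parallel.sh invocation as an argument list."""
    if not device_ids:
        raise ValueError("at least one device ID is required")
    return [
        "bash", "scripts/run_train_parallel.sh",
        str(len(device_ids)),              # DEVICE_NUM (assumed: count of devices)
        ",".join(str(d) for d in device_ids),  # visible devices, e.g. 0,1,2,3
        rank_table_file,                   # RANK_TABLE_FILE
    ]


if __name__ == "__main__":
    # Hypothetical rank table path; replace with the file for your cluster.
    cmd = build_parallel_train_command(range(8), "/path/to/rank_table.json")
    print(" ".join(cmd))
```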
Demos
Below are some images generated by our Wukong-Huahua model, each with its corresponding input text.
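城市夜景 赛博朋克 格雷格·鲁特科夫斯基 (city night scene, cyberpunk, Greg Rutkowski)
莫奈 撑阳伞的女人 月亮 梦幻 (Monet, Woman with a Parasol, moon, dreamy)
海上日出时候的奔跑者 (a runner at sunrise over the sea)
诺亚方舟在世界末日起航 科幻插画 (Noah's Ark setting sail at the end of the world, sci-fi illustration)
时空 黑洞 辐射 (spacetime, black hole, radiation)
乡村 田野 屏保 (countryside, fields, screensaver)
来自深渊 风景 绘画 写实风格 (from the abyss, landscape, painting, realistic style)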