Many problems in image processing, computer graphics, and computer vision can be posed as "translating" an input image into a corresponding output image. Each of these tasks has traditionally been tackled with separate, special-purpose machinery, despite the fact that the setting is always the same: predict pixels from pixels.
The goal of the paper is to develop a common framework for all of these problems. Pix2Pix is a conditional GAN consisting of two modules, a generator and a discriminator, that transforms an input image into a corresponding output image. The essence of the model is a pixel-to-pixel mapping.
Paper: Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. "Image-to-Image Translation with Conditional Adversarial Networks", in CVPR 2017.
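For reference, the objective optimized in the paper combines the conditional GAN loss with an L1 reconstruction term; the λ weight here corresponds to the LAMBDA_L1 parameter listed later in this document:

$$\mathcal{L}_{cGAN}(G,D)=\mathbb{E}_{x,y}[\log D(x,y)]+\mathbb{E}_{x,z}[\log(1-D(x,G(x,z)))]$$

$$\mathcal{L}_{L1}(G)=\mathbb{E}_{x,y,z}[\lVert y-G(x,z)\rVert_{1}]$$

$$G^{*}=\arg\min_{G}\max_{D}\;\mathcal{L}_{cGAN}(G,D)+\lambda\,\mathcal{L}_{L1}(G)$$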
Pix2Pix contains a generator network and a discriminator network. The generator can be any pixel-to-pixel mapping network (the original paper proposes a UNet). The discriminator is a PatchGAN that judges whether each N×N patch is real or fake, which improves the realism of the generated image.
Generator (UNet-based) architecture:
Encoder:
C64-C128-C256-C512-C512-C512-C512-C512
Decoder:
CD512-CD1024-CD1024-C1024-C1024-C512-C256-C128
Discriminator (70 × 70 PatchGAN) architecture:
C64-C128-C256-C512
Note: Ck denotes a Convolution-BatchNorm-ReLU layer with k filters; CDk denotes a Convolution-BatchNorm-Dropout-ReLU layer with a dropout rate of 50%. A sketch of these building blocks follows.
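A minimal sketch of the Ck and CDk blocks, assuming the MindSpore 1.x API this repository targets. Channel counts follow the tables above; argument names and block composition in the repository's own generator_model.py may differ.

```python
import mindspore.nn as nn

def Ck(in_ch, out_ch, use_bn=True):
    """Ck: Convolution-BatchNorm-LeakyReLU with k = out_ch filters (encoder side)."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2,
                        pad_mode='pad', padding=1)]
    if use_bn:
        layers.append(nn.BatchNorm2d(out_ch))
    layers.append(nn.LeakyReLU(alpha=0.2))
    return nn.SequentialCell(layers)

def CDk(in_ch, out_ch):
    """CDk: ConvTranspose-BatchNorm-Dropout-ReLU with a 50% dropout rate (decoder side)."""
    return nn.SequentialCell([
        nn.Conv2dTranspose(in_ch, out_ch, kernel_size=4, stride=2,
                           pad_mode='pad', padding=1),
        nn.BatchNorm2d(out_ch),
        nn.Dropout(keep_prob=0.5),  # MindSpore 1.x signature; 2.x uses p=0.5
        nn.ReLU(),
    ])
```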
Dataset 1 used: facades

- Dataset size: 29M, 606 images
  - 400 train images
  - 100 validation images
  - 106 test images
- Data format: .jpg images

Dataset 2 used: maps

- Dataset size: 239M, 2194 images
  - 1096 train images
  - 1098 validation images
- Data format: .jpg images
Note: We provide data/download_Pix2Pix_dataset.sh to download the datasets.
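For orientation, each .jpg in these archives stores the two image domains side by side in a single file, and a loader splits the picture down the middle into an (input, target) pair. The sketch below is a generic PIL-based illustration, not the repository's pix2pix_dataset.py:

```python
from PIL import Image

def split_pair(path):
    """Split one side-by-side sample into (input, target) halves."""
    img = Image.open(path).convert('RGB')
    w, h = img.size
    return img.crop((0, 0, w // 2, h)), img.crop((w // 2, 0, w, h))
```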
The entire code structure is as follows:

.Pix2Pix
├── README.md                          # description of Pix2Pix
├── data
│   └── download_Pix2Pix_dataset.sh    # download the datasets
├── scripts
│   ├── run_infer_310.sh               # launch Ascend 310 inference
│   ├── run_train_ascend.sh            # launch Ascend training (1 pcs)
│   ├── run_distribute_train_ascend.sh # launch Ascend training (8 pcs)
│   ├── run_eval_ascend.sh             # launch Ascend evaluation
│   ├── run_train_gpu.sh               # launch GPU training (1 pcs)
│   ├── run_distribute_train_gpu.sh    # launch GPU training (8 pcs)
│   └── run_eval_gpu.sh                # launch GPU evaluation
├── imgs
│   └── Pix2Pix-examples.jpg           # Pix2Pix example images
├── src
│   ├── __init__.py                    # init file
│   ├── dataset
│   │   ├── __init__.py                # init file
│   │   └── pix2pix_dataset.py         # create the Pix2Pix dataset
│   ├── models
│   │   ├── __init__.py                # init file
│   │   ├── discriminator_model.py     # define the discriminator model (PatchGAN)
│   │   ├── generator_model.py         # define the generator model (UNet-based)
│   │   ├── init_w.py                  # initialize network weights
│   │   ├── loss.py                    # define the losses
│   │   └── pix2pix.py                 # define the Pix2Pix model
│   └── utils
│       ├── __init__.py                # init file
│       ├── config.py                  # parse arguments
│       └── tools.py                   # tools for the Pix2Pix model
├── eval.py                            # evaluate the Pix2Pix model
├── train.py                           # training script
└── export.py                          # export a MINDIR model
Major parameters in train.py and config.py are as follows:

"device_target": Ascend # run platform, supports Ascend and GPU.
"device_num": 1 # device num, default is 1.
"device_id": 0 # device id, default is 0.
"save_graphs": False # whether save graphs, default is False.
"init_type": normal # network initialization, default is normal.
"init_gain": 0.02 # scaling factor for normal, xavier and orthogonal, default is 0.02.
"load_size": 286 # scale images to this size, default is 286.
"batch_size": 1 # batch_size, default is 1.
"LAMBDA_Dis": 0.5 # weight for Discriminator Loss, default is 0.5.
"LAMBDA_GAN": 1 # weight for GAN Loss, default is 1.
"LAMBDA_L1": 100 # weight for L1 Loss, default is 100.
"beta1": 0.5 # adam beta1, default is 0.5.
"beta2": 0.999 # adam beta2, default is 0.999.
"lr": 0.0002 # the initial learning rate, default is 0.0002.
"lr_policy": linear # learning rate policy, default is linear.
"epoch_num": 200 # epoch number for training, default is 200.
"n_epochs": 100 # number of epochs with the initial learning rate, default is 100.
"n_epochs_decay": 100 # number of epochs with the dynamic learning rate, default is 100.
"dataset_size": 400 # for Facade_dataset,the number is 400; for Maps_dataset,the number is 1096.
"train_data_dir": None # the file path of input data during training.
"val_data_dir": None # the file path of input data during validating.
"train_fakeimg_dir": ./results/fake_img/ # during training, the file path of stored fake img.
"loss_show_dir": ./results/loss_show # during training, the file path of stored loss img.
"ckpt_dir": ./results/ckpt # during training, the file path of stored CKPT.
"ckpt": None # during validating, the file path of the CKPT used.
"predict_dir": ./results/predict/ # during validating, the file path of Generated images.
Run single-device training on Ascend:

python train.py --device_target [Ascend] --device_id [0] --train_data_dir [./data/facades/train]
Run distributed training on Ascend (8 devices):

bash run_distribute_train_ascend.sh [DEVICE_NUM] [DISTRIBUTE] [RANK_TABLE_FILE] [DATASET_PATH] [DATASET_NAME]
Run distributed training on GPU (8 devices):

python train.py --device_target [GPU] --run_distribute [1] --device_num [8] --dataset_size 400 --train_data_dir [./data/facades/train] --pad_mode REFLECT
OR
bash run_distribute_train_gpu.sh [DATASET_PATH] [DATASET_NAME] [DEVICE_NUM]

Run single-device training on GPU:

bash scripts/run_train_gpu.sh [DATASET_PATH] [DATASET_NAME]
Run evaluation on Ascend:

python eval.py --device_target [Ascend] --device_id [0] --val_data_dir [./data/facades/test] --ckpt [./results/ckpt/Generator_200.ckpt] --pad_mode REFLECT
OR
bash scripts/run_eval_ascend.sh [DATASET_PATH] [DATASET_NAME] [CKPT_PATH] [RESULT_DIR]
Run evaluation on GPU:

python eval.py --device_target [GPU] --device_id [0] --val_data_dir [./data/facades/test] --ckpt [./train/results/ckpt/Generator_200.ckpt] --predict_dir [./train/results/predict/] \
--dataset_size 1096 --pad_mode REFLECT
OR
bash scripts/run_eval_gpu.sh [DATASET_PATH] [DATASET_NAME] [CKPT_PATH] [RESULT_PATH]
Note: Before training and evaluating, create the output folders under "./results/..." (see the snippet below). You will then find the results in "./results/predict".
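A quick way to create the output directories named in config.py before the first run (paths as listed above):

```python
import os

for d in ('./results/fake_img', './results/loss_show',
          './results/ckpt', './results/predict'):
    os.makedirs(d, exist_ok=True)  # no-op if the folder already exists
```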
Run Ascend 310 inference:

bash run_infer_310.sh [The path of the MINDIR for 310 infer] [The path of the dataset for 310 infer] y Ascend 0
Note: Before executing 310 inference, create the MINDIR/AIR model with "python export.py --ckpt [The path of the CKPT for exporting] --train_data_dir [The path of the training dataset]".
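For orientation, a minimal sketch of what such an export step looks like with the MindSpore API. The generator class name and the input shape below are illustrative assumptions; see the repository's export.py for the exact logic:

```python
import numpy as np
from mindspore import Tensor, export, load_checkpoint, load_param_into_net
# Hypothetical import; the actual class name lives in src/models/generator_model.py.
from src.models.generator_model import UNetGenerator

net = UNetGenerator()                                   # assumed constructor
load_param_into_net(net, load_checkpoint('./results/ckpt/Generator_200.ckpt'))
dummy = Tensor(np.zeros([1, 3, 256, 256], np.float32))  # assumed NCHW input shape
export(net, dummy, file_name='Pix2Pix', file_format='MINDIR')
```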
Training performance on facades (single device):

Parameters | single Ascend | single GPU |
---|---|---|
Model Version | Pix2Pix | Pix2Pix |
Resource | Ascend 910 | PCIE V100-32G |
MindSpore Version | 1.2 | 1.3.0 |
Dataset | facades | facades |
Training Parameters | epoch=200, steps=400, batch_size=1, lr=0.0002 | epoch=200, steps=400, batch_size=1, lr=0.0002, pad_mode=REFLECT |
Optimizer | Adam | Adam |
Loss Function | SigmoidCrossEntropyWithLogits Loss & L1 Loss | SigmoidCrossEntropyWithLogits Loss & L1 Loss |
outputs | probability | probability |
Speed | 1pc(Ascend): 10 ms/step | 1pc(GPU): 40 ms/step |
Total time | 1pc(Ascend): 0.3h | 1pc(GPU): 0.8 h |
Checkpoint for Fine tuning | 207M (.ckpt file) | 207M (.ckpt file) |
Training performance on maps (single device):

Parameters | single Ascend | single GPU |
---|---|---|
Model Version | Pix2Pix | Pix2Pix |
Resource | Ascend 910 | PCIE V100-32G |
MindSpore Version | 1.2 | 1.3.0 |
Dataset | maps | maps |
Training Parameters | epoch=200, steps=1096, batch_size=1, lr=0.0002 | epoch=200, steps=1096, batch_size=1, lr=0.0002, pad_mode=REFLECT |
Optimizer | Adam | Adam |
Loss Function | SigmoidCrossEntropyWithLogits Loss & L1 Loss | SigmoidCrossEntropyWithLogits Loss & L1 Loss |
outputs | probability | probability |
Speed | 1pc(Ascend): 20 ms/step | 1pc(GPU): 90 ms/step |
Total time | 1pc(Ascend): 1.58h | 1pc(GPU): 3.3h |
Checkpoint for Fine tuning | 207M (.ckpt file) | 207M (.ckpt file) |
Training performance on facades (8 devices):

Parameters | Ascend (8pcs) | GPU (8pcs) |
---|---|---|
Model Version | Pix2Pix | Pix2Pix |
Resource | Ascend 910 | PCIE V100-32G |
MindSpore Version | 1.4.1 | 1.3.0 |
Dataset | facades | facades |
Training Parameters | epoch=200, steps=400, batch_size=1, lr=0.0002 | epoch=200, steps=400, batch_size=1, lr=0.0002, pad_mode=REFLECT |
Optimizer | Adam | Adam |
Loss Function | SigmoidCrossEntropyWithLogits Loss & L1 Loss | SigmoidCrossEntropyWithLogits Loss & L1 Loss |
outputs | probability | probability |
Speed | 8pc(Ascend): 15 ms/step | 8pc(GPU): 30 ms/step |
Total time | 8pc(Ascend): 0.5h | 8pc(GPU): 1 h |
Checkpoint for Fine tuning | 207M (.ckpt file) | 207M (.ckpt file) |
Training performance on maps (8 devices):

Parameters | Ascend (8pcs) | GPU (8pcs) |
---|---|---|
Model Version | Pix2Pix | Pix2Pix |
Resource | Ascend 910 | PCIE V100-32G |
MindSpore Version | 1.4.1 | 1.3.0 |
Dataset | maps | maps |
Training Parameters | epoch=200, steps=1096, batch_size=1, lr=0.0002 | epoch=200, steps=1096, batch_size=1, lr=0.0002, pad_mode=REFLECT |
Optimizer | Adam | Adam |
Loss Function | SigmoidCrossEntropyWithLogits Loss & L1 Loss | SigmoidCrossEntropyWithLogits Loss & L1 Loss |
outputs | probability | probability |
Speed | 8pc(Ascend): 20 ms/step | 8pc(GPU): 40 ms/step |
Total time | 8pc(Ascend): 1.2h | 8pc(GPU): 2.8h |
Checkpoint for Fine tuning | 207M (.ckpt file) | 207M (.ckpt file) |
Evaluation performance:

Parameters | single Ascend | single GPU |
---|---|---|
Model Version | Pix2Pix | Pix2Pix |
Resource | Ascend 910 | PCIE V100-32G |
MindSpore Version | 1.2 | 1.3.0 |
Dataset | facades / maps | facades / maps |
batch_size | 1 | 1 |
outputs | probability | probability |
Please check the official homepage.
A MindSpore implementation of the Pix2Pix model. Pix2Pix performs image style transfer based on the cGAN (conditional GAN). Because a cGAN can guide image generation by adding conditioning information, image translation does not need noise as the input condition; instead, the input image itself serves as the condition, and the model learns the mapping from input image to output image to produce the desired output.