Paper: Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le. Learning Transferable Architectures for Scalable Image Recognition. 2017.
The overall network architecture of NASNet is shown in the original paper.
Dataset used: ImageNet
```
.
└─nasnet
  ├─README.md
  ├─README_CN.md
  ├─scripts
  │ ├─run_standalone_train_for_ascend.sh  # launch standalone training with Ascend platform (1p)
  │ ├─run_distribute_train_for_ascend.sh  # launch distributed training with Ascend platform (8p)
  │ ├─run_standalone_train_for_gpu.sh     # launch standalone training with GPU platform (1p)
  │ ├─run_distribute_train_for_gpu.sh     # launch distributed training with GPU platform (8p)
  │ ├─run_eval_for_ascend.sh              # launch evaluation with Ascend platform
  │ └─run_eval_for_gpu.sh                 # launch evaluation with GPU platform
  ├─src
  │ ├─config.py           # parameter configuration
  │ ├─dataset.py          # data preprocessing
  │ ├─loss.py             # customized CrossEntropy loss function
  │ ├─lr_generator.py     # learning rate generator
  │ └─nasnet_a_mobile.py  # network definition
  ├─eval.py               # evaluate the network
  ├─export.py             # convert the checkpoint for export
  └─train.py              # train the network
```
Parameters for both training and evaluation can be set in src/config.py.
Parameters for Ascend:

```python
'random_seed': 1,             # fix random seed
'rank': 0,                    # local rank of distributed training
'group_size': 1,              # world size of distributed training
'work_nums': 8,               # number of workers to read the data
'epoch_size': 600,            # total number of epochs
'keep_checkpoint_max': 30,    # max number of checkpoints to keep
'ckpt_path': './',            # path to save checkpoints
'is_save_on_master': 0,       # save checkpoints on rank 0 only (distributed)
'train_batch_size': 32,       # input batch size for training
'val_batch_size': 32,         # input batch size for validation
'image_size': 224,            # size of one input image
'num_classes': 1000,          # number of dataset classes
'label_smooth_factor': 0.1,   # label smoothing factor
'aux_factor': 0.4,            # loss weight of the auxiliary logits
'lr_init': 0.04*8,            # initial learning rate
'lr_decay_rate': 0.97,        # decay rate of the learning rate
'num_epoch_per_decay': 2.4,   # number of epochs per decay
'weight_decay': 0.00004,      # weight decay
'momentum': 0.9,              # momentum
'opt_eps': 1.0,               # optimizer epsilon
'rmsprop_decay': 0.9,         # rmsprop decay
'loss_scale': 1,              # loss scale
'cutout': True,               # whether to apply Cutout to the training data (see the sketch below)
'coutout_leng': 56,           # side length of the Cutout patch when cutout is True
```
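The 'cutout' options enable the Cutout augmentation (DeVries & Taylor, 2017), which zeroes out one random square patch of each training image. As a rough illustration only (the repository's actual implementation lives in src/dataset.py and may differ in details such as patch clipping):

```python
import numpy as np

def cutout(image, length=56):
    """Illustrative sketch of Cutout: zero out one random
    length x length patch of an HWC image."""
    h, w, _ = image.shape
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = max(cy - length // 2, 0), min(cy + length // 2, h)
    x1, x2 = max(cx - length // 2, 0), min(cx + length // 2, w)
    out = image.copy()
    out[y1:y2, x1:x2, :] = 0  # patches crossing the border are clipped
    return out
```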
Parameters for GPU:

```python
'random_seed': 1,             # fix random seed
'rank': 0,                    # local rank of distributed training
'group_size': 1,              # world size of distributed training
'work_nums': 8,               # number of workers to read the data
'epoch_size': 600,            # total number of epochs
'keep_checkpoint_max': 100,   # max number of checkpoints to keep
'ckpt_path': './checkpoint/', # path to save checkpoints
'is_save_on_master': 0,       # save checkpoints on rank 0 only (distributed)
'train_batch_size': 32,       # input batch size for training
'val_batch_size': 32,         # input batch size for validation
'image_size': 224,            # size of one input image
'num_classes': 1000,          # number of dataset classes
'label_smooth_factor': 0.1,   # label smoothing factor
'aux_factor': 0.4,            # loss weight of the auxiliary logits
'lr_init': 0.04*8,            # initial learning rate
'lr_decay_rate': 0.97,        # decay rate of the learning rate
'num_epoch_per_decay': 2.4,   # number of epochs per decay
'weight_decay': 0.00004,      # weight decay
'momentum': 0.9,              # momentum
'opt_eps': 1.0,               # optimizer epsilon
'rmsprop_decay': 0.9,         # rmsprop decay
'loss_scale': 1,              # loss scale
'cutout': False,              # whether to apply Cutout to the training data
'coutout_leng': 56,           # side length of the Cutout patch when cutout is True
```
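Together, lr_init, lr_decay_rate, and num_epoch_per_decay describe an exponentially decaying learning-rate schedule. A minimal sketch of the schedule presumably built by src/lr_generator.py (assuming smooth per-step decay; the actual generator may use staircase decay instead):

```python
import numpy as np

def generate_exp_decay_lr(lr_init, lr_decay_rate, num_epoch_per_decay,
                          total_epochs, steps_per_epoch):
    """Per-step exponential decay:
    lr(step) = lr_init * lr_decay_rate ** (step / decay_steps)."""
    decay_steps = steps_per_epoch * num_epoch_per_decay
    total_steps = int(steps_per_epoch * total_epochs)
    steps = np.arange(total_steps, dtype=np.float64)
    return (lr_init * lr_decay_rate ** (steps / decay_steps)).astype(np.float32)

# With the values above (steps_per_epoch depends on dataset size and
# global batch size):
# lr = generate_exp_decay_lr(0.04 * 8, 0.97, 2.4, 600, steps_per_epoch)
```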
To train the model, run the corresponding launch script.

Ascend:

```bash
# distributed training example (8p)
bash run_distribute_train_for_ascend.sh DATA_DIR
# standalone training
bash run_standalone_train_for_ascend.sh DEVICE_ID DATA_DIR
```

GPU:

```bash
# distributed training example (8p)
bash run_distribute_train_for_gpu.sh DATA_DIR
# standalone training
bash run_standalone_train_for_gpu.sh DEVICE_ID DATA_DIR
```
For example:

```bash
# distributed training example (8p) for Ascend
bash scripts/run_distribute_train_for_ascend.sh /dataset
# standalone training example for Ascend
bash scripts/run_standalone_train_for_ascend.sh 0 /dataset
# distributed training example (8p) for GPU
bash scripts/run_distribute_train_for_gpu.sh /dataset/train
# standalone training example for GPU
bash scripts/run_standalone_train_for_gpu.sh 0 /dataset/train
```
You can find the checkpoint files together with the results in the log.
To evaluate a trained checkpoint, run the evaluation script for your platform:

```bash
# Evaluation
bash run_eval_for_ascend.sh DEVICE_ID DATA_DIR PATH_CHECKPOINT
bash run_eval_for_gpu.sh DEVICE_ID DATA_DIR PATH_CHECKPOINT
```
```bash
# Evaluation with checkpoint
bash scripts/run_eval_for_ascend.sh 0 /dataset ./checkpoint/nasnet-a-mobile-rank0-248_10009.ckpt
bash scripts/run_eval_for_gpu.sh 0 /dataset/val ./checkpoint/nasnet-a-mobile-rank0-248_10009.ckpt
```
The evaluation result will be stored under the scripts path, where you can find results like the following in the log:

```
acc=74.0% (TOP1, Ascend)
acc=73.5% (TOP1, GPU)
```
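The TOP1 figures are plain top-1 accuracy over the validation set. For reference, a minimal NumPy sketch of the metric (eval.py presumably uses MindSpore's built-in accuracy metric rather than this):

```python
import numpy as np

def top1_accuracy(logits, labels):
    """Fraction of samples whose argmax prediction equals the label."""
    return float(np.mean(np.argmax(logits, axis=1) == labels))
```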
Training performance:

Parameters | Ascend 910 | GPU |
---|---|---|
Model Version | NASNet | NASNet |
Resource | Ascend 910 | NV SXM2 V100-32G |
Uploaded Date | 09/08/2021 (month/day/year) | 09/24/2020 (month/day/year) |
MindSpore Version | 1.2.0 | 1.0.0 |
Dataset | ImageNet | ImageNet |
Training Parameters | src/config.py | src/config.py |
Optimizer | RMSProp | RMSProp |
Loss Function | SoftmaxCrossEntropyWithLogits | SoftmaxCrossEntropyWithLogits |
Loss | 1.9598 | 1.8965 |
Total time | 564 h (8p) | 144 h (8p) |
Checkpoint for Fine tuning | 89 M (.ckpt file) | 89 M (.ckpt file) |
Evaluation performance:

Parameters | Ascend 910 | GPU |
---|---|---|
Model Version | NASNet | NASNet |
Resource | Ascend 910 | NV SXM2 V100-32G |
Uploaded Date | 09/08/2021 (month/day/year) | 09/24/2020 (month/day/year) |
MindSpore Version | 1.2.0 | 1.0.0 |
Dataset | ImageNet | ImageNet |
batch_size | 32 | 32 |
outputs | probability | probability |
Accuracy | 74.0% (TOP1) | 73.5% (TOP1) |
For more details, please check the official ModelZoo homepage.