OSNet is an efficient and accurate neural network architecture for person re-identification. The method proposes a novel CNN architecture designed for learning omni-scale feature representations. The features are captured by multiple convolutional streams with different receptive field sizes and fused by channel-wise weights generated by a unified aggregation gate (AG). This idea was proposed in the paper "Omni-Scale Feature Learning for Person Re-Identification", published in 2019.
Paper: Kaiyang Zhou, Yongxin Yang, Andrea Cavallaro, Tao Xiang. University of Surrey; Queen Mary University of London; Samsung AI Center, Cambridge. Published in ICCV (IEEE) 2019.
The network structure can be decomposed into two parts: feature extraction and feature fusion. The feature extraction part uses multiple convolutional streams with different receptive field sizes to obtain multi-scale feature maps. In the fusion part, the resulting multi-scale feature maps are dynamically fused by channel-wise weights generated by a unified aggregation gate (AG).
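The following is a minimal, illustrative MindSpore sketch of this omni-scale fusion idea, not the implementation in src/osnet.py; the number of streams, stream depths, and the bottleneck reduction factor are assumptions chosen for illustration:

```python
# Illustrative sketch of omni-scale fusion with a unified aggregation gate (AG).
# Not the repository's src/osnet.py; sizes and depths are assumptions.
import numpy as np
import mindspore as ms
import mindspore.nn as nn
import mindspore.ops as ops


class AggregationGate(nn.Cell):
    """Global average pooling -> bottleneck MLP -> sigmoid, one weight per channel."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.channels = channels
        self.pool = ops.ReduceMean(keep_dims=False)
        self.fc = nn.SequentialCell([
            nn.Dense(channels, channels // reduction),
            nn.ReLU(),
            nn.Dense(channels // reduction, channels),
            nn.Sigmoid(),
        ])

    def construct(self, x):
        # x: (N, C, H, W) -> channel-wise weights of shape (N, C, 1, 1)
        w = self.pool(x, (2, 3))
        w = self.fc(w)
        return w.reshape(-1, self.channels, 1, 1)


class OmniScaleFusion(nn.Cell):
    """Several conv streams with growing receptive fields, fused by one shared AG."""

    def __init__(self, channels, num_streams=4):
        super().__init__()
        # Stream t stacks (t + 1) 3x3 convolutions, so its receptive field grows with t.
        self.streams = nn.CellList([
            nn.SequentialCell([
                nn.Conv2d(channels, channels, 3, pad_mode='same')
                for _ in range(t + 1)
            ])
            for t in range(num_streams)
        ])
        self.gate = AggregationGate(channels)  # the gate is shared across all streams

    def construct(self, x):
        out = 0
        for stream in self.streams:
            xt = stream(x)
            out = out + self.gate(xt) * xt  # channel-wise weighted sum over streams
        return out


if __name__ == "__main__":
    net = OmniScaleFusion(channels=64)
    dummy = ms.Tensor(np.random.rand(2, 64, 32, 32).astype(np.float32))
    print(net(dummy).shape)  # (2, 64, 32, 32)
```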
Datasets used:
- Market1501
- DukeMTMC-reID
- CUHK03
- MSMT17 (extraction code: yf3z)
In this project, the file organization is recommended as below (a small sanity-check sketch follows the tree):
.
├── datasets
│   ├── market1501
│   │   └── Market-1501-v15.09.15
│   │       ├── bounding_box_train
│   │       ├── query
│   │       └── bounding_box_test
│   ├── dukemtmc-reid
│   │   └── DukeMTMC-reID
│   │       ├── bounding_box_train
│   │       ├── query
│   │       └── bounding_box_test
│   ├── cuhk03
│   │   └── cuhk03_release
│   │       ├── cuhk-03.mat
│   │       ├── cuhk03_new_protocol_config_labeled.mat
│   │       └── cuhk03_new_protocol_config_detected.mat
│   └── msmt17
│       └── MSMT17_V1
│           ├── train
│           ├── test
│           ├── list_val.txt
│           ├── list_train.txt
│           ├── list_query.txt
│           └── list_gallery.txt
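The following hypothetical helper (not part of the repository) sanity-checks that the expected folders from the tree above exist under data_path before training:

```python
# Hypothetical helper: verify that the dataset directories listed above exist
# under data_path before starting training. Not part of this repository.
import os
import sys

EXPECTED = {
    "market1501": ["Market-1501-v15.09.15/bounding_box_train",
                   "Market-1501-v15.09.15/query",
                   "Market-1501-v15.09.15/bounding_box_test"],
    "dukemtmc-reid": ["DukeMTMC-reID/bounding_box_train",
                      "DukeMTMC-reID/query",
                      "DukeMTMC-reID/bounding_box_test"],
    "cuhk03": ["cuhk03_release/cuhk-03.mat"],
    "msmt17": ["MSMT17_V1/train", "MSMT17_V1/test"],
}


def check(data_path, dataset):
    """Return True if every expected sub-path of the chosen dataset is present."""
    missing = [p for p in EXPECTED[dataset]
               if not os.path.exists(os.path.join(data_path, dataset, p))]
    for p in missing:
        print(f"missing: {os.path.join(data_path, dataset, p)}")
    return not missing


if __name__ == "__main__":
    # usage: python check_dataset.py /home/osnet/datasets market1501
    ok = check(sys.argv[1], sys.argv[2])
    sys.exit(0 if ok else 1)
```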
.
└─osnet
  ├─README.md
  ├─ascend310_infer                      # Ascend 310 inference code
  ├─model_utils                          # helper utilities (e.g. pth_to_ckpt.py); pretrained checkpoint goes here
  ├─scripts
  │  ├─run_train_standalone_ascend.sh    # launch standalone training on Ascend (1p)
  │  ├─run_train_distribute_ascend.sh    # launch distributed training on Ascend (8p)
  │  └─run_eval_ascend.sh                # launch evaluation on Ascend
  ├─src
  │  ├─cross_entropy_loss.py             # cross entropy loss
  │  ├─dataset.py                        # data preprocessing
  │  ├─dataset_define.py                 # dataset definitions
  │  ├─lr_generator.py                   # learning rate scheduler
  │  └─osnet.py                          # network definition
  ├─osnet_config.yaml                    # configuration file
  ├─requirements.txt                     # python dependencies
  ├─eval.py                              # evaluate network
  ├─export.py                            # export MINDIR/AIR model for Ascend 310
  ├─preprocess.py                        # preprocess data for Ascend 310 inference
  ├─postprocess.py                       # calculate metrics for Ascend 310 inference
  └─train.py                             # train network
# Add dataset path, for example
data_path: /home/osnet/datasets
# distribute training example(8p)
bash run_train_distribute_ascend.sh [RANK_TABLE_FILE] [DATASET] [PRETRAINED_CKPT_PATH](optional)
# example: bash run_train_distribute_ascend.sh ./hccl_8p.json market1501 /home/osnet/checkpoint/market1501/osnet-240_101.ckpt
# standalone training
bash run_train_standalone_ascend.sh [DATASET] [DEVICE_ID] [PRETRAINED_CKPT_PATH](optional)
# example: bash run_train_standalone_ascend.sh market1501 0 /home/osnet/checkpoint/market1501/osnet-240_101.ckpt
# evaluation:
bash run_eval_ascend.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
# example: bash run_eval_ascend.sh market1501 /home/osnet/scripts/output/checkpoint/market1501/osnet-240_101.ckpt 0
Notes:

- RANK_TABLE_FILE can refer to Link, and the device_ip can be got as Link. For large models like InceptionV4, it's better to export an environment variable export HCCL_CONNECT_TIMEOUT=600 to extend the HCCL connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out because compiling time increases with model size.
- The taskset command in scripts/run_train_distribute_ascend.sh binds processor cores according to device_num and the total number of processors. If you do not want this behavior, remove the taskset operations from scripts/run_train_distribute_ascend.sh.
- PRETRAINED_CKPT_PATH should be a checkpoint saved during a previous training process on Ascend; training will resume from that checkpoint and continue.
- Training needs to load parameters pre-trained on ImageNet. You can download the checkpoint file on link (extraction code: 1961) and put it in the ./model_utils folder. You can also download the .pth file pre-trained under PyTorch here and convert it to a .ckpt file through ./model_utils/pth_to_ckpt.py; a sketch of such a conversion follows these notes.
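As a rough illustration of what such a conversion involves (the repository's ./model_utils/pth_to_ckpt.py is authoritative; the flat name mapping and file names below are assumptions), a PyTorch .pth state dict can be turned into a MindSpore .ckpt roughly like this:

```python
# Illustrative sketch only -- the repository ships ./model_utils/pth_to_ckpt.py.
# The direct parameter-name mapping here is a simplifying assumption.
import torch                      # used only to read the .pth file
import mindspore as ms


def pth_to_ckpt(pth_path, ckpt_path):
    state_dict = torch.load(pth_path, map_location="cpu")
    params = []
    for name, tensor in state_dict.items():
        # Real conversions usually also rename parameters (e.g. BatchNorm
        # weight/bias -> gamma/beta) to match the MindSpore network definition.
        params.append({"name": name, "data": ms.Tensor(tensor.numpy())})
    ms.save_checkpoint(params, ckpt_path)


if __name__ == "__main__":
    pth_to_ckpt("osnet_imagenet.pth", "osnet_imagenet.ckpt")  # hypothetical file names
```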
Running on local server
Set data_path in osnet_config.yaml, then run:
# training example
shell:
Ascend:
# distribute training example(8p)
bash run_train_distribute_ascend.sh [RANK_TABLE_FILE] [DATASET] [PRETRAINED_CKPT_PATH](optional)
# example: bash run_train_distribute_ascend.sh ./hccl_8p.json market1501 /home/osnet/checkpoint/market1501/osnet-240_101.ckpt
# standalone training
bash run_train_standalone_ascend.sh [DATASET] [DEVICE_ID] [PRETRAINED_CKPT_PATH](optional)
# example: bash run_train_standalone_ascend.sh market1501 0 /home/osnet/checkpoint/market1501/osnet-240_101.ckpt
Running on ModelArts
# Train 1p with Ascend
# (1) Perform a or b.
# a. Set "enable_modelarts=True" on osnet_config.yaml file.
# Set "data_path='/cache/data'" on osnet_config.yaml file.
# Set "load_path='/cache/checkpoint/'" on osnet_config.yaml file.
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on osnet_config.yaml file.
# Set other parameters you need on the osnet_config.yaml file.
# b. Add "enable_modelarts=True" on the website UI interface.
# Add "data_path='/cache/data'" on the website UI interface.
# Add "load_path='/cache/checkpoint/'" on the website UI interface.
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
# Add other parameters on the website UI interface.
# (2) Prepare model code.
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
# (4) Upload the original dataset to S3 bucket.
# (5) Set the code directory to "/path/osnet" on the website UI interface.
# (6) Set the startup file to "train.py" on the website UI interface.
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
# (8) Create your job.
# Train 8p with Ascend
# (1) Perform a or b.
# a. Set "enable_modelarts=True" on osnet_config.yaml file.
# Set "run_distribute=True" on osnet_config.yaml file.
# Set "data_path='/cache/data'" on osnet_config.yaml file.
# Set "load_path='/cache/checkpoint/'" on osnet_config.yaml file.
# (optional)Set "checkpoint_url='s3://dir_to_your_pretrained/'" on osnet_config.yaml file.
# Set other parameters you need on the osnet_config.yaml file.
# b. Add "enable_modelarts=True" on the website UI interface.
# Add "run_distribute=True" on the website UI interface.
# Add "data_path='/cache/data'" on the website UI interface.
# Add "load_path='/cache/checkpoint/'" on the website UI interface.
# (optional)Add "checkpoint_url='s3://dir_to_your_pretrained/'" on the website UI interface.
# Add other parameters on the website UI interface.
# (2) Prepare model code.
# (3) Upload or copy your pretrained model to S3 bucket if you want to finetune.
# (4) Upload the original dataset to S3 bucket.
# (5) Set the code directory to "/path/osnet" on the website UI interface.
# (6) Set the startup file to "train.py" on the website UI interface.
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
# (8) Create your job.
Checkpoints will be stored in ./output/checkpoint by default, and the training log will be redirected to ./train.log.
(8p)
...
epoch: 90 step: 12, loss is 1.0532682
epoch time: 1779.959 ms, per step time: 148.330 ms
epoch: 91 step: 12, loss is 1.0837934
epoch time: 2229.157 ms, per step time: 185.763 ms
epoch: 92 step: 12, loss is 1.0674114
epoch time: 1607.048 ms, per step time: 133.921 ms
epoch: 93 step: 12, loss is 1.0512338
epoch time: 1764.129 ms, per step time: 147.011 ms
epoch: 94 step: 12, loss is 1.0647253
epoch time: 1782.682 ms, per step time: 148.557 ms
epoch: 95 step: 12, loss is 1.0884073
epoch time: 1755.473 ms, per step time: 146.289 ms
...
(1p)
...
epoch: 245 step: 129, loss is 1.0219252
epoch time: 23841.607 ms, per step time: 184.819 ms
epoch: 246 step: 129, loss is 1.0082468
epoch time: 23109.856 ms, per step time: 179.146 ms
epoch: 247 step: 129, loss is 1.0107011
epoch time: 24086.062 ms, per step time: 186.714 ms
epoch: 248 step: 129, loss is 1.0113524
epoch time: 22814.048 ms, per step time: 176.853 ms
epoch: 249 step: 129, loss is 1.0196884
epoch time: 23689.971 ms, per step time: 183.643 ms
epoch: 250 step: 129, loss is 1.0096855
epoch time: 24795.141 ms, per step time: 192.210 ms
...
You can start evaluating using python or shell scripts. The usage of the shell script is as follows:
bash run_eval_ascend.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
# example: bash run_eval_ascend.sh market1501 /home/osnet/scripts/output/checkpoint/market1501/osnet-240_101.ckpt 0
Running on local server.
Set data_path in osnet_config.yaml and run:
# eval example
shell:
Ascend:
bash run_eval_ascend.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]
# example: bash run_eval_ascend.sh market1501 /home/osnet/scripts/output/checkpoint/market1501/osnet-240_101.ckpt 0
Running on ModelArts.
# Eval 1p with Ascend
# (1) Perform a or b.
# a. Set "enable_modelarts=True" on config_config.yaml file.
# Set "data_path='/cache/data'" on config_config.yaml file.
# Set "checkpoint_url='s3://dir_to_your_trained_model/'" on osnet_config.yaml file.
# Set "checkpoint_file_path='/cache/checkpoint/model.ckpt'" on osnet_config.yaml file.
# Set other parameters on default_config.yaml file you need.
# b. Add "enable_modelarts=True" on the website UI interface.
# Add "data_path='/cache/data'" on the website UI interface.
# Add "checkpoint_url='s3://dir_to_your_trained_model/'" on the website UI interface.
# Add "checkpoint_file_path='/cache/checkpoint/model.ckpt'" on the website UI interface.
# Add other parameters on the website UI interface.
# (2) Prepare model code.
# (3) Upload or copy your trained model to S3 bucket.
# (4) Upload the original dataset to S3 bucket.
# (5) Set the code directory to "/path/osnet" on the website UI interface.
# (6) Set the startup file to "eval.py" on the website UI interface.
# (7) Set the "Dataset path" and "Output file path" and "Job log path" to your path on the website UI interface.
# (8) Create your job.
The checkpoint is produced during the training process.
Evaluation results will be stored in the output path of the evaluation script; you can find results like the following in eval.log.
** Results **
ckpt=/data/osnet/osnet-240_202.ckpt
mAP: 77.6%
CMC curve
Rank-1 : 91.5%
Rank-5 : 94.8%
Rank-10 : 96.1%
Rank-20 : 96.8%
Before exporting the model, you must modify the config file osnet_config.yaml. The config items you should modify are data_path, target, batch_size_test and ckpt_file.
Currently, batch_size_test can only be set to 1.
python export.py --data_path [DATA_PATH] --target [TARGET] --batch_size_test 1 --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT]
The data_path, target, batch_size_test and ckpt_file parameters are required; file_name defaults to osnet, and file_format should be one of ["AIR", "MINDIR"].
Before performing inference, the MINDIR file must be exported by the export.py script. We only provide an example of inference using the MINDIR model.
# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATASET] [DATA_PATH] [DEVICE_ID]
Inference results are saved in the current path; you can find results like the following in the acc.log file.
** Results **
Dataset:market1501
mAP: 83.7%
CMC curve
Rank-1 : 93.9%
Rank-5 : 95.8%
Rank-10 : 97.0%
Rank-20 : 97.4%
Parameters | OSNet |
---|---|
Resource | Ascend 910; CPU 2.60 GHz, 192 cores; Memory 755 GB; OS Euler2.8 |
Uploaded Date | 18/12/2021 (day/month/year) |
MindSpore Version | 1.5.0 |
Dataset | Market1501 |
Training Parameters | epoch=250, batch_size = 128, lr=0.001 |
Optimizer | Adam |
Loss Function | Label Smoothing Cross Entropy Loss |
outputs | probability |
Speed | 1pc: 175.741 ms/step; 8pcs: 181.027 ms/step |
Checkpoint for Fine tuning | 29.4M (.ckpt file) |
Parameters | OSNet |
---|---|
Resource | Ascend 910; CPU 2.60 GHz, 192 cores; Memory 755 GB; OS Euler2.8 |
Uploaded Date | 18/12/2021 (day/month/year) |
MindSpore Version | 1.5.0 |
Dataset | DukeMTMC-reID |
Training Parameters | epoch=250, batch_size = 128, lr=0.001 |
Optimizer | Adam |
Loss Function | Label Smoothing Cross Entropy Loss |
outputs | probability |
Speed | 1pc: 175.904 ms/step; 8pcs: 180.340 ms/step |
Checkpoint for Fine tuning | 29.11M (.ckpt file) |
Parameters | OSNet |
---|---|
Resource | Ascend 910; CPU 2.60 GHz, 192 cores; Memory 755 GB; OS Euler2.8 |
Uploaded Date | 18/12/2021 (day/month/year) |
MindSpore Version | 1.5.0 |
Dataset | MSMT17 |
Training Parameters | epoch=250, batch_size = 128, lr=0.001 |
Optimizer | Adam |
Loss Function | Label Smoothing Cross Entropy Loss |
outputs | probability |
Speed | 1pc: 183.783 ms/step; 8pcs: 180.458 ms/step |
Checkpoint for Fine tuning | 31.12M (.ckpt file) |
Parameters | Ascend |
---|---|
Resource | Ascend 910; OS Euler2.8 |
Uploaded Date | 18/12/2021 (day/month/year) |
MindSpore Version | 1.5.0 |
Dataset | Market1501 |
batch_size | 300 |
outputs | probability |
mAP | 1pc: 82.4%; 8pcs: 83.7% |
Rank-1 | 1pc: 93.3%; 8pcs: 93.9% |
Parameters | Ascend |
---|---|
Resource | Ascend 910; OS Euler2.8 |
Uploaded Date | 18/12/2021 (day/month/year) |
MindSpore Version | 1.5.0 |
Dataset | DukeMTMC-reID |
batch_size | 300 |
outputs | probability |
mAP | 1pc: 69.8%; 8pcs: 74.6% |
Rank-1 | 1pc: 86.2%; 8pcs: 89.2% |
Parameters | Ascend |
---|---|
Resource | Ascend 910; OS Euler2.8 |
Uploaded Date | 18/12/2021 (day/month/year) |
MindSpore Version | 1.5.0 |
Dataset | MSMT17 |
batch_size | 300 |
outputs | probability |
mAP | 1pc: 43.1%; 8pcs: 50.0% |
Rank-1 | 1pc: 71.5%; 8pcs: 77.7% |
We set the random seed to 1 in train.py.
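A minimal equivalent of that seeding (the exact call site in train.py may differ) is:

```python
# Minimal equivalent of the seeding done in train.py; the exact call site may differ.
import mindspore as ms

ms.set_seed(1)  # fix MindSpore's global random seed for reproducibility
```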
Please check the official homepage.