Contents
M2Det (Multi-Level Multi-Scale Detector) is an end-to-end one-stage object detection model. It uses a Multi-Level Feature Pyramid Network (MLFPN) to extract features from the input image and then produces dense bounding boxes and category scores.
Paper: Q. Zhao, T. Sheng, Y. Wang, Z. Tang, Y. Chen, L. Cai, H. Ling. M2Det: A Single-Shot Object Detector Based on Multi-Level Feature Pyramid Network.
M2Det consists of several modules. The Feature Fusion Module (FFM) rescales and concatenates features from several backbone feature layers (VGG, ResNet, etc.) to produce the base feature for the subsequent modules. Thinned U-shape Modules (TUMs) use an encoder-decoder architecture to produce multi-level multi-scale features, which are afterwards aggregated by the Scale-wise Feature Aggregation Module (SFAM). The resulting multi-level feature pyramid is used by the prediction layers to perform bounding box regression and classification.
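The FFM fusion step above can be sketched with NumPy (a minimal illustration, not the actual `src/model.py` implementation; the function name and shapes are our assumptions): the deeper, lower-resolution backbone feature map is upsampled to the shallower map's spatial size and the two are concatenated along the channel axis.

```python
import numpy as np

def ffm_v1(shallow, deep):
    """FFM sketch: upsample the deeper feature map (C, H, W) with
    nearest-neighbour repetition to match the shallower map's spatial
    size, then concatenate along channels to form the base feature."""
    _, h_s, w_s = shallow.shape
    _, h_d, w_d = deep.shape
    up = deep.repeat(h_s // h_d, axis=1).repeat(w_s // w_d, axis=2)
    return np.concatenate([shallow, up], axis=0)

base = ffm_v1(np.zeros((512, 64, 64)), np.zeros((1024, 32, 32)))
print(base.shape)  # (1536, 64, 64)
```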
Note that you can run the scripts on the dataset used in the original paper or on any dataset widely used in this domain. The following sections describe how to run the scripts using the dataset below.
COCO is a large-scale object detection, segmentation, and captioning dataset. The COCO train, validation, and test sets contain more than 200,000 images and 80 object categories. All object instances are annotated with bounding boxes and detailed segmentation masks.
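The `instances_*.json` annotation files follow the standard COCO schema (`images`, `annotations`, and `categories` keys). As a quick orientation aid, a short sketch that counts annotated instances per category (the helper name is ours, not part of this repository):

```python
import json

def instances_per_category(annotation_file):
    """Count annotated object instances per category name in a
    COCO-style instances_*.json annotation file."""
    with open(annotation_file) as f:
        coco = json.load(f)
    names = {c["id"]: c["name"] for c in coco["categories"]}
    counts = {}
    for ann in coco["annotations"]:
        name = names[ann["category_id"]]
        counts[name] = counts.get(name, 0) + 1
    return counts
```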
Dataset organization
.
└─ coco
   ├─ annotations
   │  ├─ captions_train2014.json
   │  ├─ captions_val2014.json
   │  ├─ image_info_test2014.json
   │  ├─ image_info_test2015.json
   │  ├─ image_info_test-dev2015.json
   │  ├─ instances_minival2014.json
   │  ├─ instances_train2014.json
   │  ├─ instances_val2014.json
   │  └─ instances_valminusminival2014.json
   └─ images
      ├─ test2015
      │  └─ COCO_test2015_*.jpg
      ├─ train2014
      │  └─ COCO_train2014_*.jpg
      └─ val2014
         └─ COCO_val2014_*.jpg
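A quick sanity check of the layout above can be scripted; the entries mirror the directory tree, and the dataset root is whatever path you point the config at:

```python
import os

# Minimal subset of the COCO tree shown above
COCO_EXPECTED = [
    "annotations/instances_train2014.json",
    "annotations/instances_minival2014.json",
    "images/train2014",
    "images/val2014",
]

def missing_coco_entries(root):
    """Return the expected COCO entries that are absent under root."""
    return [p for p in COCO_EXPECTED
            if not os.path.exists(os.path.join(root, p))]
```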
- Hardware(GPU)
- Prepare a hardware environment with a GPU processor.
- Framework
- For more information, please check the resources below:
After installing MindSpore via the official website, specify the dataset location in the src/config.py file.
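For example, the dataset location entry might look like this (the key name below is illustrative; check the actual field names in src/config.py):

```python
# Illustrative fragment -- verify the real key names in src/config.py
coco_root = '/path/to/coco'  # root of the directory tree shown above
```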
Build the Soft-NMS extension with the following command:
bash ./scripts/make.sh
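The compiled src/nms/cpu_nms.pyx implements Soft-NMS. As a reference for what the extension computes, here is a pure-Python sketch of the linear-decay variant (our illustration, not the repository code; the threshold defaults mirror the `iou` and `score_threshold` values in src/config.py):

```python
import numpy as np

def box_iou(box, boxes):
    """IoU between one box and an array of boxes, all [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def soft_nms(boxes, scores, iou_threshold=0.45, score_threshold=0.1):
    """Linear Soft-NMS: instead of discarding boxes that overlap the
    current best detection, decay their scores by (1 - IoU)."""
    scores = scores.astype(float).copy()
    pool = np.arange(len(scores))
    keep = []
    while len(pool):
        best = pool[np.argmax(scores[pool])]
        keep.append(int(best))
        pool = pool[pool != best]
        if not len(pool):
            break
        ious = box_iou(boxes[best], boxes[pool])
        scores[pool] *= np.where(ious > iou_threshold, 1.0 - ious, 1.0)
        pool = pool[scores[pool] > score_threshold]
    return keep
```

Unlike hard NMS, heavily overlapping boxes are only suppressed once their decayed score falls below `score_threshold`, which helps in crowded scenes.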
You can start training and evaluation as follows:
For GPU training, set device = 'GPU' in src/config.py.
# Single GPU training
bash ./scripts/run_standalone_train.sh [DEVICE_ID]
# Multi-GPU training
bash ./scripts/run_distributed_train_gpu.sh [RANK_SIZE] [DEVICE_START]
Example:
# Single GPU training
bash ./scripts/run_standalone_train.sh 0
# Multi-GPU training
bash ./scripts/run_distributed_train_gpu.sh 8 0
bash ./scripts/run_eval.sh [DEVICE_ID]
Example:
bash ./scripts/run_eval.sh 0
|-- README.md # English README
|-- eval.py # Evaluation
|-- export.py # MINDIR model export
|-- requirements.txt # pip dependencies
|-- scripts
| |-- make.sh # Script for building Soft-NMS function
| |-- run_distributed_train_gpu.sh # GPU distributed training script
| |-- run_eval.sh # Evaluation script
| |-- run_export.sh # MINDIR model export script
| `-- run_standalone_train.sh # Single-device training script
|-- src
| |-- nms
| |   `-- cpu_nms.pyx # Soft-NMS algorithm
| |-- box_utils.py # Function for bounding boxes processing
| |-- build.py # Script for building Soft-NMS function
| |-- callback.py # Custom callback functions
| |-- coco_utils.py # COCO dataset functions
| |-- config.py # Configuration file
| |-- dataset.py # Dataset loader
| |-- detector.py # Bounding box detector
| |-- loss.py # Multibox loss function
| |-- lr_scheduler.py # Learning rate scheduler utilities
| |-- model.py # M2Det model architecture
| |-- priors.py # SSD prior boxes definition
| `-- utils.py # General utilities
`-- train.py # Training
Parameters for both training and evaluation can be set in src/config.py.
random_seed = 1
experiment_tag = 'm2det512_vgg16_lr_7.5e-4'
train_cfg = dict(
lr = 7.5e-4,
warmup = 5,
per_batch_size = 7,
gamma = [0.5, 0.2, 0.1, 0.1],
lr_epochs = [90, 110, 130, 150, 160],
total_epochs = 160,
print_epochs = 10,
num_workers = 3,
)
test_cfg = dict(
cuda = True,
topk = 0,
iou = 0.45,
soft_nms = True,
score_threshold = 0.1,
keep_per_class = 50,
save_folder = 'eval'
)
optimizer = dict(
type='SGD',
momentum=0.9,
weight_decay=0.00005,
loss_scale=1,
dampening=0.0,
clip_grad_norm=5.)
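One plausible reading of the schedule parameters above (the actual logic lives in src/lr_scheduler.py; this sketch is our interpretation, not the repository code): the learning rate warms up linearly for `warmup` epochs, then is multiplied by the matching `gamma` factor each time an `lr_epochs` milestone is passed, with the last entry (160) coinciding with `total_epochs`.

```python
def learning_rate(epoch, base_lr=7.5e-4, warmup=5,
                  gamma=(0.5, 0.2, 0.1, 0.1),
                  milestones=(90, 110, 130, 150)):
    """Piecewise-constant LR schedule with linear warmup (assumed
    interpretation of train_cfg; see src/lr_scheduler.py)."""
    if epoch < warmup:
        return base_lr * (epoch + 1) / warmup
    lr = base_lr
    for milestone, g in zip(milestones, gamma):
        if epoch >= milestone:
            lr *= g
    return lr

print(learning_rate(100))  # 0.000375 (7.5e-4 * 0.5 after the first decay)
```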
Training
Run M2Det on GPU
For GPU training, set device = 'GPU' in src/config.py.
- Training using single device (1p)
bash ./scripts/run_standalone_train.sh 0
- Distributed Training (8p)
bash ./scripts/run_distributed_train_gpu.sh 8 0
Checkpoints will be saved in the ./checkpoints/[EXPERIMENT_TAG] folder. Checkpoint filename format: [MODEL.M2DET_CONFIG.BACKBONE]_[MODEL.INPUT_SIZE]-[EPOCH]_[ITERATION].ckpt. Final checkpoint filename format: [MODEL.M2DET_CONFIG.BACKBONE]_[MODEL.INPUT_SIZE]-final.ckpt.
Evaluation
The evaluation script uses the checkpoint file named [MODEL.M2DET_CONFIG.BACKBONE]_[MODEL.INPUT_SIZE]-final.ckpt specified in ./src/config.py. To start evaluation, run the following command:
bash ./scripts/run_eval.sh [DEVICE_ID]
# Example:
bash ./scripts/run_eval.sh 0
Training Performance
Training performance in the following table is obtained with the M2Det-512-VGG16 model on the COCO dataset:
| Parameters | M2Det-512-VGG16 (8GPU) |
| ---------- | ---------------------- |
| Model Version | M2Det-512-VGG16 |
| Resource | Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz, 8x V100-PCIE |
| Uploaded Date | 2022-02-22 |
| MindSpore version | 1.5.0 |
| Dataset | COCO |
| Training Parameters | seed=None; epoch=210; batch_size=64; lr=5e-5; weight_decay=1e-6; loss_scale=2^16 |
| Optimizer | Adam with Weight Decay |
| Loss Function | Multibox MSE loss |
| Outputs | Bounding boxes and class scores |
| Loss value | <...> |
| Average checkpoint (.ckpt file) size | <...> |
| Speed | <...> ms/step, <...> s/epoch |
| Total time | <...> hours <...> minutes |
| Scripts | M2Det training script |
Evaluation Performance
Evaluation performance in the following table is obtained with the M2Det-512-VGG16 model on the COCO dataset:
| Parameters | M2Det-512-VGG16 (8GPU) |
| ---------- | ---------------------- |
| Model Version | M2Det-512-VGG16 |
| Resource | Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz, 8x V100-PCIE |
| Uploaded Date | 2022-02-22 |
| MindSpore version | 1.5.0 |
| Dataset | COCO |
| Loss Function | Multibox MSE loss |
| AP | <...> |
| Scripts | M2Det evaluation script |
The global training random seed is fixed in src/config.py with the random_seed parameter. A None value executes training without dataset shuffling.
Please check the official homepage.