FPENet
Model description
FPENet is a lightweight feature pyramid encoding network designed to strike a good trade-off between accuracy and speed.
Specifically, it uses a feature pyramid encoding block to encode multi-scale contextual features with depthwise dilated convolutions in all stages of the encoder.
A mutual embedding upsample module is introduced in the decoder to aggregate the high-level semantic features and low-level spatial details efficiently.
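To make the role of depthwise dilated convolutions more concrete, the sketch below computes how far a dilated 3x3 kernel reaches and how many weights a depthwise convolution needs compared with a standard one. It is only illustrative arithmetic, not code from this repository: the helper names, the channel count of 16, and the dilation rates 1, 2, 4, 8 are assumptions chosen for the example.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels, along one axis) covered by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def standard_conv_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weight count of a standard 2D convolution, bias ignored."""
    return in_channels * out_channels * kernel_size * kernel_size


def depthwise_conv_params(channels: int, kernel_size: int) -> int:
    """Weight count of a depthwise 2D convolution (one filter per channel), bias ignored."""
    return channels * kernel_size * kernel_size


if __name__ == "__main__":
    channels = 16  # hypothetical channel count, for illustration only
    for dilation in (1, 2, 4, 8):  # assumed rates, not taken from this README
        span = effective_kernel_size(3, dilation)
        print(f"dilation={dilation}: 3x3 kernel spans {span}x{span} pixels")
    print("standard 3x3 weights: ", standard_conv_params(channels, channels, 3))
    print("depthwise 3x3 weights:", depthwise_conv_params(channels, 3))
```

Dilation widens the span without adding weights, so several dilation rates side by side give multi-scale context at the cost of a single small kernel each. This is where the speed advantage comes from.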
Step 1: Installation
Install packages
pip3 install 'scipy' 'matplotlib' 'pycocotools' 'opencv-python' 'easydict' 'tqdm'
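After installing, a quick check that every package can be imported may save a failed training run later. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the mapping reflects the fact that `opencv-python` is imported as `cv2`.

```python
import importlib.util

# pip package name -> module name used in "import ..."
REQUIRED = {
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "pycocotools": "pycocotools",
    "opencv-python": "cv2",
    "easydict": "easydict",
    "tqdm": "tqdm",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```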
Step 2: Training
Preparing datasets
Visit the official COCO website and select the COCO dataset you want to download.
Taking the coco2017 dataset as an example, pass /path/to/coco2017 as your COCO path in the training step.
The unzipped dataset should have the following structure:
coco2017
├── annotations
│ ├── instances_train2017.json
│ ├── instances_val2017.json
│ └── ...
├── train2017
│ ├── 000000000009.jpg
│ ├── 000000000025.jpg
│ └── ...
├── val2017
│ ├── 000000000139.jpg
│ ├── 000000000285.jpg
│ └── ...
├── train2017.txt
├── val2017.txt
└── ...
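Before starting a long training run, it can help to confirm that the dataset directory matches the layout above. The sketch below checks only the entries named explicitly in the tree. The helper name `missing_entries` is hypothetical and is not part of this repository, and `/path/to/coco2017` is the placeholder path used throughout.

```python
from pathlib import Path

# Entries named explicitly in the directory tree above, relative to the dataset root.
EXPECTED = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
    "train2017.txt",
    "val2017.txt",
]


def missing_entries(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    dataset_root = "/path/to/coco2017"  # placeholder; replace with your COCO path
    missing = missing_entries(dataset_root)
    if missing:
        print("Missing from", dataset_root)
        for rel in missing:
            print("  ", rel)
    else:
        print("Dataset layout looks complete.")
```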
Training on COCO dataset
bash train_fpenet_dist.sh --data-path /path/to/coco2017/ --dataset coco
Reference
Ref: https://github.com/LikeLy-Journey/SegmenTron
Ref: torchvision