History

Asthestarsfalll 073b3083ce init		1 year ago
..
README.md	init	1 year ago

lpsnet_l_cityscapes_1536x1024_200k.yml	init	1 year ago

lpsnet_m_cityscapes_1536x1024_200k.yml	init	1 year ago

lpsnet_s_cityscapes_1536x1024_200k.yml	init	1 year ago

README.md

Lightweight and Progressively-Scalable Networks for Semantic Segmentation

Lightweight and Progressively-Scalable Networks for Semantic Segmentation

Reference

Zhang, Yiheng and Yao, Ting and Qiu, Zhaofan and Mei, Tao. "Lightweight and Progressively-Scalable Networks for Semantic Segmentation."
arXiv preprint arXiv:2207.13600 (2022).

Performance

Cityscapes

Model	Backbone	Resolution	Training Iters	mIoU	mIoU (flip)	mIoU (ms+flip)	Links
lpsnet_s	-	1536x769	200000	71.73%	72.71%	73.76%	model \| log \| vdl
lpsnet_m	-	1536x769	200000	75.29%	76.03%	77.03%	model \| log \| vdl
lpsnet_l	-	1536x769	200000	75.72%	76.53%	77.23%	model \| log \| vdl

Note that: Since the original paper does not provide all the implementation details, nor release the official training code, we first pre-trained the models on the ImageNet dataset, and then fine-tuned the models on the Cityscapes dataset. Specifically, compared with the experimental settings in the original paper, we made two significant changes during fine-tuning to obtain the above results:

Extend iterations to 200k.
Adjust the parameters used in color jittering.

For the pre-training results of the models on the ImageNet dataset, please see the ImageNet section, where we also have a discussion on the impact of the pre-trained model.

ImageNet

Model	Epoch	Top-1 accuracy	Top-5 accuracy
lpsnet_s	120	0.403	0.666
lpsnet_m	100	0.543	0.786
lpsnet_l	120	0.525	0.773

Training settings are as following.

Optimizer

Optimizer	LR Scheduler	Regularizer
type: Momentum	type: Piecewice	type: L2
momentum: 0.9	lr: 0.1	coeff:0.0001
use_nesterov: true	decay_epochs: [30, 60, 90]	-
-	values: [0.1, 0.01, 0.001, 0.0001]	-

Data Augmentation

RandCropImage	RandFlipImage	Normalize
size: 224	flip horizontal	ImageNet mean and std

With the pre-trained model, lpsnet_m is able to obtain 74.28% mIoU on Cityscapes val set under the same experimental settings as the original paper.

Further more, we also tried more complicated data augmentation strategies, e.g. random erasing and batched mix-up. The performance of classification improved by a considerable margin: on the ImageNet dataset, the top-1 accuracy increased to 0.564 and the top-5 accuracy increased to 0.805. However, we did not observe better segmentation performance of the model on the Cityscapes dataset using these pre-trained models. Therefore, these data augmentation strategies were not used.

The paddlepaddle version of LPSNet

Python Shell Markdown Java C++ other

How to access data resources in code