BiSeNetV1

BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation

Introduction

Abstract

Semantic segmentation requires both rich spatial information and sizeable receptive field. However, modern approaches usually compromise spatial resolution to achieve real-time inference speed, which leads to poor performance. In this paper, we address this dilemma with a novel Bilateral Segmentation Network (BiSeNet). We first design a Spatial Path with a small stride to preserve the spatial information and generate high-resolution features. Meanwhile, a Context Path with a fast downsampling strategy is employed to obtain sufficient receptive field. On top of the two paths, we introduce a new Feature Fusion Module to combine features efficiently. The proposed architecture makes a right balance between the speed and segmentation performance on Cityscapes, CamVid, and COCO-Stuff datasets. Specifically, for a 2048x1024 input, we achieve 68.4% Mean IOU on the Cityscapes test dataset with speed of 105 FPS on one NVIDIA Titan XP card, which is significantly faster than the existing methods with comparable performance.

Citation

@inproceedings{yu2018bisenet,
  title={Bisenet: Bilateral segmentation network for real-time semantic segmentation},
  author={Yu, Changqian and Wang, Jingbo and Peng, Chao and Gao, Changxin and Yu, Gang and Sang, Nong},
  booktitle={Proceedings of the European conference on computer vision (ECCV)},
  pages={325--341},
  year={2018}
}

Results and models

Cityscapes

Method	Backbone	Crop Size	Lr schd	Mem (GB)	Inf time (fps)	mIoU	mIoU(ms+flip)	config	download
BiSeNetV1 (No Pretrain)	R-18-D32	1024x1024	160000	5.69	31.77	74.44	77.05	config	model | log
BiSeNetV1	R-18-D32	1024x1024	160000	5.69	31.77	74.37	76.91	config	model | log
BiSeNetV1 (4x8)	R-18-D32	1024x1024	160000	11.17	31.77	75.16	77.24	config	model | log
BiSeNetV1 (No Pretrain)	R-50-D32	1024x1024	160000	15.39	7.71	76.92	78.87	config	model | log
BiSeNetV1	R-50-D32	1024x1024	160000	15.39	7.71	77.68	79.57	config	model | log

COCO-Stuff 164k

Method	Backbone	Crop Size	Lr schd	Mem (GB)	Inf time (fps)	mIoU	mIoU(ms+flip)	config	download
BiSeNetV1 (No Pretrain)	R-18-D32	512x512	160000	-	-	25.45	26.15	config	model | log
BiSeNetV1	R-18-D32	512x512	160000	6.33	74.24	28.55	29.26	config	model | log
BiSeNetV1 (No Pretrain)	R-50-D32	512x512	160000	-	-	29.82	30.33	config	model | log
BiSeNetV1	R-50-D32	512x512	160000	9.28	32.60	34.88	35.37	config	model | log
BiSeNetV1 (No Pretrain)	R-101-D32	512x512	160000	-	-	31.14	31.76	config	model | log
BiSeNetV1	R-101-D32	512x512	160000	10.36	25.25	37.38	37.99	config	model | log

Note:

4x8: The model is trained on 4 GPUs with 8 samples per GPU.
For BiSeNetV1 on the Cityscapes dataset, the default setting is 4 GPUs with 4 samples per GPU.
No Pretrain: The model is trained from scratch.
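
The GPU and batch settings in the notes above are also encoded in the config file names, for example `bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes.py`. The following is a minimal sketch of how such a name could be decoded. The helper `describe_config` and its regular expression are hypothetical: the field layout is inferred from the BiSeNetV1 config names only, not taken from an official naming specification, and reading `in1k-pre` as "ImageNet-1k pretrained backbone" is an assumption.

```python
import re

# Hypothetical pattern, inferred from the BiSeNetV1 config file names only.
CONFIG_NAME = re.compile(
    r"^(?P<method>[a-z0-9]+)_"                  # e.g. bisenetv1
    r"(?P<backbone>r\d+-d\d+)_"                 # e.g. r18-d32
    r"(?P<pretrain>in1k-pre_)?"                 # optional; assumed: pretrained backbone
    r"(?:lr(?P<lr>[0-9.e-]+)_)?"                # optional learning rate, e.g. lr5e-3
    r"(?P<gpus>\d+)x(?P<samples_per_gpu>\d+)_"  # e.g. 4x8
    r"(?P<crop>\d+x\d+)_"                       # e.g. 1024x1024
    r"(?P<schedule>\d+k)_"                      # e.g. 160k iterations
    r"(?P<dataset>[a-z0-9-]+)\.py$"             # e.g. cityscapes
)


def describe_config(name):
    """Decode a BiSeNetV1 config file name into its training settings."""
    match = CONFIG_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized config name: {name}")
    fields = match.groupdict()
    gpus = int(fields["gpus"])
    samples_per_gpu = int(fields["samples_per_gpu"])
    return {
        "backbone": fields["backbone"],
        "pretrained": fields["pretrain"] is not None,
        "learning_rate": float(fields["lr"]) if fields["lr"] else None,
        "gpus": gpus,
        "samples_per_gpu": samples_per_gpu,
        # Total batch size is the number of GPUs times samples per GPU.
        "total_batch_size": gpus * samples_per_gpu,
        "crop_size": fields["crop"],
        "iterations": int(fields["schedule"][:-1]) * 1000,
        "dataset": fields["dataset"],
    }


info = describe_config(
    "bisenetv1_r18-d32_in1k-pre_4x8_1024x1024_160k_cityscapes.py"
)
print(info["total_batch_size"])  # 32
```

Under this reading, the 4x8 setting trains with a total batch size of 32, while the default 4x4 setting uses 16.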
