ResNet50

Model description

Residual Networks, or ResNets, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Rather than hoping that every few stacked layers directly fit a desired underlying mapping H(x), residual nets let these layers fit a residual mapping F(x) = H(x) - x, so that each block outputs F(x) + x.
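The residual idea can be illustrated with a minimal sketch in plain Python. The names `residual_block` and `zero_branch` are hypothetical and chosen only for illustration; they are not part of torchvision or of this repository's scripts. The stacked layers are stood in for by an arbitrary function, and the block returns that function's output plus the unchanged input.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, branch: Callable[[Vector], Vector]) -> Vector:
    """Return branch(x) + x: the residual plus the identity shortcut."""
    fx = branch(x)
    if len(fx) != len(x):
        # The shortcut adds element-wise, so the branch must keep the shape.
        raise ValueError("branch must preserve the input length")
    return [f + xi for f, xi in zip(fx, x)]


def zero_branch(x: Vector) -> Vector:
    """Hypothetical stand-in for stacked layers whose output is all zeros."""
    return [0.0 for _ in x]
```

When the branch outputs zeros, `residual_block([1.0, 2.0], zero_branch)` returns the input unchanged, which shows why an identity mapping is easy for such a block to represent.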

Step 1: Installing

pip3 install torch torchvision

Sign up and log in on the official ImageNet website, then choose 'Download' to download the whole ImageNet dataset. In the training commands below, replace /path/to/imagenet with the path to your ImageNet directory.

The ImageNet dataset path structure should look like:

imagenet
├── train
│   └── n01440764
│       ├── n01440764_10026.JPEG
│       └── ...
├── train_list.txt
├── val
│   └── n01440764
│       ├── ILSVRC2012_val_00000293.JPEG
│       └── ...
└── val_list.txt
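Before starting a long training run, it may help to confirm that the top-level layout matches the tree above. The following is a minimal sketch, not part of this repository; `missing_entries` is a hypothetical helper, and it only checks the four top-level entries shown in the tree, not the class subdirectories or the contents of the list files.

```python
from pathlib import Path
from typing import List

# Top-level entries taken from the directory tree above.
EXPECTED_DIRS = ("train", "val")
EXPECTED_FILES = ("train_list.txt", "val_list.txt")


def missing_entries(root: str) -> List[str]:
    """Return the expected top-level entries that are absent under root."""
    base = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (base / f).is_file()]
    return missing
```

Calling `missing_entries("/path/to/imagenet")` returns an empty list when all four entries are present, and otherwise lists the names that still need to be put in place.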

🍻 Done!

Step 2: Training

One single GPU

bash scripts/fp32_1card.sh --data-path /path/to/imagenet

One single GPU (AMP)

bash scripts/amp_1card.sh --data-path /path/to/imagenet

Multiple GPUs on one machine

bash scripts/fp32_4cards.sh --data-path /path/to/imagenet
bash scripts/fp32_8cards.sh --data-path /path/to/imagenet

Multiple GPUs on one machine (AMP)

bash scripts/amp_4cards.sh --data-path /path/to/imagenet
bash scripts/amp_8cards.sh --data-path /path/to/imagenet

Multiple GPUs on two machines

bash scripts/fp32_16cards.sh --data-path /path/to/imagenet

Results on BI-V100

	FP32	AMP+NHWC
single card	Acc@1=76.02, FPS=330, Time=4d3h, BatchSize=280	Acc@1=75.56, FPS=550, Time=2d13h, BatchSize=300
4 cards	Acc@1=75.89, FPS=1233, Time=1d2h, BatchSize=300	Acc@1=79.04, FPS=2400, Time=11h, BatchSize=512
8 cards	Acc@1=74.98, FPS=2150, Time=12h43m, BatchSize=300	Acc@1=76.43, FPS=4200, Time=8h, BatchSize=480
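One way to read the FPS figures above is as scaling efficiency: measured multi-card throughput divided by the single-card throughput times the number of cards. The sketch below applies that definition to the FP32 column. The definition is an assumption made here for illustration and is not necessarily how the Scalability value in the next table was obtained; the batch sizes also differ between rows, so the ratios are only approximate.

```python
def scaling_efficiency(fps_multi: float, fps_single: float, num_gpus: int) -> float:
    """Measured throughput divided by ideal linear scaling of one card."""
    if num_gpus < 1 or fps_single <= 0:
        raise ValueError("num_gpus must be >= 1 and fps_single must be positive")
    return fps_multi / (num_gpus * fps_single)


# FPS values copied from the FP32 column of the table above.
FP32_FPS = {1: 330, 4: 1233, 8: 2150}

for gpus, fps in FP32_FPS.items():
    print(f"{gpus} card(s): {scaling_efficiency(fps, FP32_FPS[1], gpus):.2f}")
# prints:
# 1 card(s): 1.00
# 4 card(s): 0.93
# 8 card(s): 0.81
```

Under this definition a value of 1.00 would mean perfectly linear scaling, so the FP32 runs lose roughly 7% at 4 cards and 19% at 8 cards relative to that ideal.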

Convergence criteria	Configuration (x denotes number of GPUs)	Performance	Accuracy	Power (W)	Scalability	Memory utilization (G)	Stability
top1 75.9%	SDK V2.2,bs:512,8x,AMP	5221	76.43%	128*8	0.97	29.1*8	1

Reference

torchvision
