关于GCU、沐曦GPGPU、MLU、0卡V100资源4月7日恢复上架的公告>>> 关于共建具身智能开源数据集的倡议>>> 关于云脑任务中统一路径访问方式的公告>>> 关于将启智集群GPU资源迁移至智算集群的公告>>>

2.6 KiB

Raw Permalink Blame History

OPENI compression

OPENI compression

Introduction

This is the baseline of deep neural networks compression, including prune, quantization and encoding.

Parameters

--model: Name of the model (eg. alexnet, vgg16)

Quantization Parameters

--quant_method: method of quantization
--bits: quantization bit width
--overflow_rate: overflow rate

Encoding Parameters

--codec: encode or decode
--coded: encode/decode file

Prune Parameters

--prune_method: method of prune
--rate: prune rate
--epochs: retrain epochs
--lr: learning rate
--lr_adjust: epochs of adjusting learning rate
--layer_begin: the beginning layer for prune
--layer_inter: internal of two prune (vgg:2; resnet50:3; inception_v3:3)
--layer_end: the end layer for prune (vgg:28; resnet50:156; inception_v3:287)
--skip_downsample: skip downsample layer
--save_dir: save path to pruned model
--eval: test path for pruned model

Other Parameters

--batch_size: batch size
--gpu: gpu ID
--train_data_root: dataset for train
--test_data_root: dataset for test
--print-freq: print frequency

Quantization and Encoding

Examples

Vgg16 8bit quantization and encoding

python main.py --model vgg16 --quant_method linear --codec encode --model_save ./vgg16.torch -coded ./vgg16_coded.pkl --batch_size 16 --bits 8

Vgg16 decoding

python main.py --model vgg16 --codec decode --gpu 1 --model_save ./vgg16_de.torch --coded vgg16_coded.pkl

Prune(including retrain)

Examples

Vgg16 (pruning rate=70%)

python main.py --model vgg16 --prune_method norm2 --rate 0.7 --epochs 12 --lr 0.001 --lr_adjust 3 --layer_begin 0 --layer_end 28 --layer_inter 2 --skip_downsample 1 --save_dir ${dir to vgg16.torch} --batch_size 64 --gpu 1,2 --train_data_root ${dir to train dataset} --test_data_root ${dir to test dataset} --print-freq 400

Vgg16 model test

python main.py --model vgg16 --eval ${dir to best.vgg16.torch} --gpu 1,2

Results

Quantization and Encoding

network	bit-width	top1-accuracy	top5-accuracy	model size
VGG16	8	0.7157	0.9038	90.18MB
ResNet18	8	0.6943	0.8891	6.26MB
ResNet50	8	0.7573	0.9269	13.38MB
Inception-v3	8	0.7645	0.9301	15.43MB

Prune

network	prune	top1-accuracy	top5-accuracy
VGG16	0.7	0.6765	0.8821
ResNet50	0.7	0.7362	0.9177
Inception-v3	0.7	0.7330	0.9183

2.6 KiB Raw Permalink Blame History

OPENI compression

Introduction

Parameters

Quantization Parameters

Encoding Parameters

Prune Parameters

Other Parameters

Quantization and Encoding

Examples

Vgg16 8bit quantization and encoding

Vgg16 decoding

Prune(including retrain)

Examples

Vgg16 (pruning rate=70%)

Vgg16 model test

Results

Quantization and Encoding

Prune

2.6 KiB

Raw Permalink Blame History