# XCompression

## Introduction
This is a basic compression library that integrates several pruning and quantization algorithms.
## Included Methods

### Pruning

### Quantization
| Type | Name | Paper |
| --- | --- | --- |
| QAT | Default | |
| QAT | DoReFa | DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients |
| QAT | PACT | PACT: Parameterized Clipping Activation for Quantized Neural Networks |
| QAT | TQT | Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks |
| QAT | LSQ | Learned Step Size Quantization |
| QAT | LSQ+ | LSQ+: Improving low-bit quantization through learnable offsets and better initialization |
| QAT | DSQ | Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks |
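To give a feel for what these QAT methods do during training, below is a minimal sketch of an LSQ-style fake quantizer in PyTorch. The class name, structure, and step-size initialization are illustrative assumptions and do not reflect this library's internal API; see the LSQ paper for the full method.

```python
import torch
import torch.nn as nn

def grad_scale(x, scale):
    # Forward value is unchanged; the backward gradient is scaled by `scale`.
    return (x - x * scale).detach() + x * scale

def round_pass(x):
    # Round in the forward pass; straight-through estimator in the backward pass.
    return (x.round() - x).detach() + x

class LSQQuantizer(nn.Module):
    """Illustrative LSQ-style fake quantizer (symmetric, signed)."""
    def __init__(self, bits=3):
        super().__init__()
        self.qn = -(2 ** (bits - 1))      # lower clip bound, e.g. -4 for 3 bits
        self.qp = 2 ** (bits - 1) - 1     # upper clip bound, e.g.  3 for 3 bits
        # Learnable step size; the paper initializes it from the weight statistics.
        self.step = nn.Parameter(torch.tensor(1.0))

    def forward(self, x):
        # Gradient scale g = 1 / sqrt(N * Qp), as recommended in the LSQ paper.
        g = 1.0 / (x.numel() * self.qp) ** 0.5
        s = grad_scale(self.step, g)
        # Fake-quantize: scale, clip, round (with STE), then rescale.
        return round_pass((x / s).clamp(self.qn, self.qp)) * s
```

For example, `LSQQuantizer(bits=3)(conv.weight)` returns a fake-quantized copy of a convolution's weights while keeping the step size trainable.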
## Results and Models

Here are some experimental results; more quantized models with different configurations will be released soon. All models can be downloaded from Dropbox.
| Network | Config. File | Model | Bitwidth (W/A) | Top-1 Acc. (%) | Top-5 Acc. (%) |
| --- | --- | --- | --- | --- | --- |
| ResNet-18 | link | link | 3/2 | 66.9 | 87.2 |
## User Guide

### Install Dependencies

First, install the library dependencies within an Anaconda environment.
```bash
# Create an environment with Python 3.8 and activate it
conda create -n xcom python=3.8
conda activate xcom

# PyTorch GPU version >= 1.5
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

# TensorBoard visualization tool
conda install tensorboard

# Miscellaneous
conda install scikit-learn pyyaml munch
```
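A quick way to sanity-check the environment afterwards (this snippet is just a convenience, not part of the library):

```python
import torch
import torchvision

# Confirm the installed versions and that a CUDA device is visible.
print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```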
Run Scripts with Your Configurations
Run the main.py
with your modified configuration file.
python main.py /path/to/your/config/file.yaml
The quantization parameters are defined as follows (a sample configuration fragment follows this list):

- `w_bits`: bit width of the quantized weights
- `a_bits`: bit width of the quantized activations
- `q_type`: quantization type (0 - fixed-point)
- `q_level`: quantization granularity (0 - per-channel, 1 - per-layer; weights only, activations use per-layer by default)
- `w_symmetric`: symmetric quantization for weights
- `a_symmetric`: symmetric quantization for activations
- `weight_observer`: weight observer (0 - MinMaxObserver, 1 - MovingAverageMinMaxObserver)
- `bn_fuse`: fuse batch normalization
- `bn_fuse_calib`: calibrate the batch-normalization fusion
- `qaft`: quantization-aware fine-tuning
- `ptq`: use post-training quantization (PTQ)
- `ptq_control`: PTQ control flag
- `ptq_batch`: the batch count used for PTQ calibration
- `ptq_percentile`: the percentile used for PTQ calibration
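As a rough illustration only, a quantization section built from the parameters above might look like the sketch below. The top-level key, the nesting, and the example values are assumptions; treat the files in the test folder as the authoritative schema.

```yaml
# Illustrative fragment only -- consult the shipped example configs
# in the test folder for the real file layout.
quan:                     # top-level key is an assumption
  w_bits: 3               # bit width of quantized weights
  a_bits: 2               # bit width of quantized activations
  q_type: 0               # 0 - fixed-point quantization
  q_level: 0              # 0 - per-channel, 1 - per-layer (weights only)
  w_symmetric: true       # symmetric quantization for weights
  a_symmetric: true       # symmetric quantization for activations
  weight_observer: 0      # 0 - MinMaxObserver, 1 - MovingAverageMinMaxObserver
  bn_fuse: true           # fuse batch normalization
  bn_fuse_calib: false    # calibrate the batch-normalization fusion
  qaft: false             # quantization-aware fine-tuning
  ptq: false              # post-training quantization
  ptq_control: false      # PTQ control flag
  ptq_batch: 32           # example batch count for PTQ calibration
  ptq_percentile: 0.9999  # example percentile for PTQ calibration
```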
You can find some example configuration files in the test folder.