Accelerate Convolutional Neural Networks

This tool aims to accelerate the test-time computation and decrease number of parameters of deep CNNs.

How to use

Use accnn.py to get a new model by specifying an original model and the speeding-up ratio.

You may provide a json to explicitly control the architecture of the new model, otherwise the rank-selection algorithm would be used to do it automatically and the configuration would be saved to file config.json.

acc_conv.py and acc_fc.py would be involved automatically when using accnn.py while acc_conv.py and acc_fc.py can also be used seperately.

Example

###Speedup whole network

Speed up a model by 2 times and use rank-selection to determine ranks of each layer automatically
```
python accnn.py -m MODEL-PREFIX --save-model new-vgg16 --ratio 2
```

Use your own configuration file without rank-selection

python accnn.py -m MODEL-PREFIX --save-model new-model --config YOUR-CONFIG_JSON

###Speedup a single layer

Decompose a convolutional layer:

python acc_conv.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-FILTER --save-model new-model

Decompose a fullyconnected layer:

python acc_fc.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-HIDDEN --save-model new-model

uses --help to see more options

Results

The experiments are carried on a single machine with four Nvidia Titan X GPUs. The top-5 accuracy is evaluated on ImageNet validation dataset.

Model	Top-5 accuracy	Theoretical speed up	CPU speed up	GPU speed up
model0	89.6%	1x	1x	1x
model1	88.6%	2.4x	2.2x	1.1x
model2	89.8%	2.4x	2.2x	1.1x
model3	87.5%	3x	2.6x	1.2x
model4	89.6%	3x	2.6x	1.2x

model0 is the original VGG16 model directly converted from Caffe Model Zoo
model1 is the accelerated model based on config.json
model2 is the same as model1 but is fine-tuned on ImageNet training dataset for 5 epochs
model3 is the accelerated model based on rank-selection with 3 times speeding up
model4 is the same as model3 but is fine-tuned on ImageNet training dataset for 5 epochs
The experiments in GPU are carried with cuDNN 4

Notes

This tool is verified on the VGG-16 model converted from Caffe by caffe_converter tool.
accnn.py tool only supports single input and output
This tool mainly implements the algorithm of Cheng et al. [2] to decompose a convolutional layer to two convolutional layers both in spatial dimensions and across channels. acc_conv.py provides the function to replace a (N,d,d) conv. layer by two (K,d,1) and (N,1,d) conv. layers.
The idea of rank-selection tool is based on the related work of Zhang et al [1] that we could use the product of PCA energy to determine the rank for each layer.

Reference Paper

[1] Zhang, Xiangyu, et al. “Efficient and accurate approximations of nonlinear convolutional networks.” arXiv preprint arXiv:1411.4229 (2014).

[2] Tai, Cheng, et al. “Convolutional neural networks with low-rank regularization.” arXiv preprint arXiv:1511.06067 (2015).