Accelerate Convolutional Neural Networks

This tool aims to accelerate test-time computation and reduce the number of parameters of deep CNNs.

How to use

Use accnn.py to get a new model by specifying an original model and the speeding-up ratio.

You may provide a JSON file to explicitly control the architecture of the new model; otherwise, the rank-selection algorithm determines it automatically and saves the configuration to config.json.

accnn.py invokes acc_conv.py and acc_fc.py automatically; both scripts can also be used separately.

Example

### Speed up the whole network

  • Speed up a model by 2x and use rank-selection to determine the rank of each layer automatically

    python accnn.py -m MODEL-PREFIX --save-model new-vgg16 --ratio 2
    
  • Use your own configuration file without rank-selection

    python accnn.py -m MODEL-PREFIX --save-model new-model --config YOUR-CONFIG_JSON
    

### Speed up a single layer

  • Decompose a convolutional layer:

    python acc_conv.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-FILTER --save-model new-model
    
  • Decompose a fully connected layer:

    python acc_fc.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-HIDDEN --save-model new-model
    
  • Use --help to see more options
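
As a rough guide for choosing --K, the sketch below estimates the theoretical speed-up of a single decomposed convolutional layer. It assumes the shapes described in the Notes section (a (N,d,d) layer over C input channels replaced by (K,d,1) and (N,1,d) layers), stride 1, and an unchanged output size. The function names are hypothetical and are not part of acc_conv.py.

```python
import math


def conv_speedup(num_filter, channels, d, k):
    """Theoretical speed-up of replacing one d x d convolution
    by a d x 1 convolution followed by a 1 x d convolution.

    Costs are multiply-accumulates per output position (assumption:
    stride 1 and the same output size for every layer).
    """
    original = num_filter * channels * d * d             # (N, d, d) over C channels
    decomposed = k * channels * d + num_filter * k * d   # (K, d, 1) then (N, 1, d)
    return original / decomposed


def largest_k_for_ratio(num_filter, channels, d, ratio):
    """Largest K whose theoretical speed-up is still at least `ratio`."""
    # original / (K * d * (C + N)) >= ratio  <=>  K <= N * C * d / (ratio * (C + N))
    k = math.floor(num_filter * channels * d / (ratio * (channels + num_filter)))
    return max(k, 1)


# A 3x3 layer with 256 input channels and 256 filters:
print(conv_speedup(256, 256, 3, 192))        # 2.0
print(largest_k_for_ratio(256, 256, 3, 2))   # 192
```

The measured speed-up is usually lower than this estimate, as the Results table shows for both CPU and GPU.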

Results

The experiments were carried out on a single machine with four NVIDIA Titan X GPUs. Top-5 accuracy is evaluated on the ImageNet validation dataset.

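| Model  | Top-5 accuracy | Theoretical speed up | CPU speed up | GPU speed up |
|--------|----------------|----------------------|--------------|--------------|
| model0 | 89.6%          | 1x                   | 1x           | 1x           |
| model1 | 88.6%          | 2.4x                 | 2.2x         | 1.1x         |
| model2 | 89.8%          | 2.4x                 | 2.2x         | 1.1x         |
| model3 | 87.5%          | 3x                   | 2.6x         | 1.2x         |
| model4 | 89.6%          | 3x                   | 2.6x         | 1.2x         |
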
  • model0 is the original VGG16 model directly converted from the Caffe Model Zoo
  • model1 is the accelerated model based on config.json
  • model2 is the same as model1 but fine-tuned on the ImageNet training dataset for 5 epochs
  • model3 is the accelerated model based on rank-selection with a 3x speed-up
  • model4 is the same as model3 but fine-tuned on the ImageNet training dataset for 5 epochs
  • The GPU experiments were carried out with cuDNN 4

Notes

  • This tool is verified on the VGG-16 model converted from Caffe by the caffe_converter tool.

  • The accnn.py tool only supports models with a single input and a single output.

  • This tool mainly implements the algorithm of Tai et al. [2], which decomposes a convolutional layer into two convolutional layers, both in the spatial dimensions and across channels. acc_conv.py replaces a (N,d,d) conv. layer with two conv. layers of shapes (K,d,1) and (N,1,d).

  • The rank-selection tool is based on the related work of Zhang et al. [1]: the product of the PCA energy across layers is used to determine the rank of each layer.
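
To illustrate the rank-selection idea, here is a minimal sketch of a greedy strategy in the spirit of Zhang et al. [1]: start from full rank and repeatedly drop the eigenvalue whose removal costs the least relative PCA energy per unit of saved computation, until a cost budget is met. This is not the code in rank_selection.py; the function name, the per-rank cost model, and the eigenvalues below are all made up for illustration.

```python
def select_ranks(eigenvalues, cost_per_rank, target_cost):
    """Greedy rank selection (hypothetical sketch).

    eigenvalues:   dict layer name -> PCA eigenvalues, sorted in decreasing order
    cost_per_rank: dict layer name -> cost contributed by each retained rank
    target_cost:   total cost budget to reach
    """
    ranks = {name: len(vals) for name, vals in eigenvalues.items()}
    energy = {name: sum(vals) for name, vals in eigenvalues.items()}  # retained energy
    total = sum(ranks[name] * cost_per_rank[name] for name in ranks)

    while total > target_cost:
        best, best_score = None, None
        for name, vals in eigenvalues.items():
            if ranks[name] <= 1:
                continue  # keep at least one component per layer
            smallest = vals[ranks[name] - 1]
            # Relative drop of this layer's energy term per unit of saved cost.
            score = (smallest / energy[name]) / cost_per_rank[name]
            if best is None or score < best_score:
                best, best_score = name, score
        if best is None:
            break  # the budget cannot be reached
        energy[best] -= eigenvalues[best][ranks[best] - 1]
        ranks[best] -= 1
        total -= cost_per_rank[best]
    return ranks


# Made-up eigenvalues for two layers with equal per-rank cost.
eigs = {"conv1": [10.0, 5.0, 1.0, 0.5], "conv2": [8.0, 4.0, 2.0, 1.0]}
cost = {"conv1": 1, "conv2": 1}
print(select_ranks(eigs, cost, target_cost=5))  # {'conv1': 2, 'conv2': 3}
```

Because each step removes the component with the smallest relative energy loss, layers whose energy is concentrated in a few components end up with lower ranks.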

Reference Paper

[1] Zhang, Xiangyu, et al. “Efficient and accurate approximations of nonlinear convolutional networks.” arXiv preprint arXiv:1411.4229 (2014).

[2] Tai, Cheng, et al. “Convolutional neural networks with low-rank regularization.” arXiv preprint arXiv:1511.06067 (2015).