
tools/accnn/README.md

This tool aims to accelerate the test-time computation and decrease the number of parameters of deep CNNs.

Use `accnn.py` to get a new model by specifying an original model and the speed-up ratio. You may provide a JSON file to explicitly control the architecture of the new model; otherwise, the rank-selection algorithm is used to determine it automatically, and the resulting configuration is saved to `config.json`.

`acc_conv.py` and `acc_fc.py` are invoked automatically when using `accnn.py`, but can also be used separately.

### Speed up the whole network

Speed up a model by 2 times and use rank-selection to determine the rank of each layer automatically:

```
python accnn.py -m MODEL-PREFIX --save-model new-vgg16 --ratio 2
```

Use your own configuration file without rank-selection:

```
python accnn.py -m MODEL-PREFIX --save-model new-model --config YOUR-CONFIG_JSON
```

### Speed up a single layer

Decompose a convolutional layer:

```
python acc_conv.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-FILTER --save-model new-model
```

Decompose a fully-connected layer:

```
python acc_fc.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-HIDDEN --save-model new-model
```

Use `--help` to see more options.
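As a rough illustration of what the single-layer decomposition computes, here is a minimal NumPy sketch of the underlying low-rank factorization (not the tool's actual implementation; the function name `decompose_conv` and the exact reshaping conventions are hypothetical). An `(N, C, d, d)` kernel is approximated by a vertical `(K, C, d, 1)` kernel followed by a horizontal `(N, K, 1, d)` kernel via a truncated SVD:

```python
import numpy as np

def decompose_conv(W, K):
    """Approximate an (N, C, d, d) conv kernel by a vertical (K, C, d, 1)
    kernel followed by a horizontal (N, K, 1, d) kernel, via truncated SVD."""
    N, C, d, _ = W.shape
    # Arrange W so rows index (channel, row offset) and columns index
    # (output filter, column offset), then take the best rank-K factors.
    M = W.transpose(1, 2, 0, 3).reshape(C * d, N * d)
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    v = U[:, :K] * np.sqrt(S[:K])             # (C*d, K) vertical factors
    h = np.sqrt(S[:K])[:, None] * Vt[:K]      # (K, N*d) horizontal factors
    W_v = v.T.reshape(K, C, d)[:, :, :, None]                   # (K, C, d, 1)
    W_h = h.reshape(K, N, d).transpose(1, 0, 2)[:, :, None, :]  # (N, K, 1, d)
    return W_v, W_h

rng = np.random.RandomState(0)
W = rng.randn(4, 3, 3, 3)            # N=4 filters, C=3 channels, d=3
W_v, W_h = decompose_conv(W, K=9)    # K = C*d: full rank, exact recovery
W_rec = np.einsum('kciz,nkzj->ncij', W_v, W_h)
```

With `K < C*d` the pair of kernels is only an approximation of `W`, which is where the speed-up comes from: the two thin convolutions cost less than the original one.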

The experiments are carried out on a single machine with four NVIDIA Titan X GPUs. Top-5 accuracy is evaluated on the ImageNet validation dataset.

Model | Top-5 accuracy | Theoretical speed up | CPU speed up | GPU speed up |
---|---|---|---|---|
model0 | 89.6% | 1x | 1x | 1x |
model1 | 88.6% | 2.4x | 2.2x | 1.1x |
model2 | 89.8% | 2.4x | 2.2x | 1.1x |
model3 | 87.5% | 3x | 2.6x | 1.2x |
model4 | 89.6% | 3x | 2.6x | 1.2x |

- `model0` is the original VGG16 model, directly converted from the Caffe Model Zoo.
- `model1` is the accelerated model based on `config.json`.
- `model2` is the same as `model1`, but fine-tuned on the ImageNet training dataset for 5 epochs.
- `model3` is the accelerated model based on rank-selection with 3 times speed-up.
- `model4` is the same as `model3`, but fine-tuned on the ImageNet training dataset for 5 epochs.
- The GPU experiments are carried out with cuDNN 4.

- This tool is verified on the VGG-16 model converted from Caffe by the `caffe_converter` tool.
- The `accnn.py` tool only supports a single input and output.
- This tool mainly implements the algorithm of Tai *et al.* [2] to decompose a convolutional layer into two convolutional layers, both in the spatial dimensions and across channels. `acc_conv.py` provides the function to replace an `(N,d,d)` conv. layer with two conv. layers of shapes `(K,d,1)` and `(N,1,d)`.
- The idea of the rank-selection tool is based on the related work of Zhang *et al.* [1]: the product of the per-layer PCA energies can be used to determine the rank for each layer.
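The PCA-energy idea behind rank selection can be sketched as follows (a minimal illustration only, not the tool's actual rank-selection code; the function name `pca_energy_rank` and the threshold parameter are hypothetical). For a layer's responses, the energy retained by a rank-`K` approximation is the sum of the top-`K` eigenvalues of the response covariance divided by the total, and the smallest `K` reaching a target energy is chosen:

```python
import numpy as np

def pca_energy_rank(Y, keep=0.95):
    """Smallest K such that the top-K eigenvalues of cov(Y)
    retain at least `keep` of the total variance."""
    eig = np.sort(np.linalg.eigvalsh(np.cov(Y, rowvar=False)))[::-1]
    cs = np.cumsum(eig)
    energy = cs / cs[-1]                       # cumulative energy ratio
    return int(np.searchsorted(energy, keep) + 1)

# Toy responses: one dominant direction, three near-constant ones.
rng = np.random.RandomState(0)
Y = np.zeros((1000, 4))
Y[:, 0] = 10.0 * rng.randn(1000)
Y[:, 1:] = 0.01 * rng.randn(1000, 3)
```

In the actual tool, such per-layer ranks are additionally traded off against the overall speed-up ratio, since a smaller rank means a cheaper layer.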

[1] Zhang, Xiangyu, et al. “Efficient and accurate approximations of nonlinear convolutional networks.” arXiv preprint arXiv:1411.4229 (2014).

[2] Tai, Cheng, et al. “Convolutional neural networks with low-rank regularization.” arXiv preprint arXiv:1511.06067 (2015).