Model | Training data | Test data | mAP |
---|---|---|---|
VGG16_reduced 300x300 | VOC07+12 trainval | VOC07 test | 71.57 |

Model | GPU | CUDNN | Batch-size | FPS* |
---|---|---|---|---|
VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1 | 16 | 95 |
VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1 | 8 | 95 |
VGG16_reduced 300x300 | TITAN X(Maxwell) | v5.1 | 1 | 64 |
VGG16_reduced 300x300 | TITAN X(Maxwell) | N/A | 8 | 36 |
VGG16_reduced 300x300 | TITAN X(Maxwell) | N/A | 1 | 28 |
Required Python modules: `easydict`, `cv2`, `matplotlib` and `numpy`. You can install them via pip or a package manager such as `apt-get`:

    sudo apt-get install python-opencv python-matplotlib python-numpy
    sudo pip install easydict

Build MXNet with the extra SSD operators enabled in `config.mk`:

    # for Ubuntu/Debian
    cp make/config.mk ./config.mk
    # modify it with vim or whatever editor
    EXTRA_OPERATORS = example/ssd/operator
    # or add a line if you have other EXTRA_OPERATORS directory
    EXTRA_OPERATORS += example/ssd/operator

Remember to enable CUDA if you want to train, since CPU training is prohibitively slow. Using CUDNN is optional, although the benchmark table above shows it more than doubles inference FPS.

Download the pretrained model `ssd_300.zip` and extract it to the `model/` directory. (This model is converted from `VGG_VOC0712_SSD_300x300_iter_60000.caffemodel`, provided by the paper's author.)

Run the demo:

    # cd /path/to/mxnet/example/ssd/
    # grab demo images
    python data/demo/download_demo_images.py
    # run demo.py with defaults
    python demo.py
    # play with examples:
    python demo.py --epoch 0 --images ./data/demo/dog.jpg --thresh 0.5

Check `python demo.py --help` for more options.

This example only covers training on the Pascal VOC dataset. Other datasets can be supported by adding a subclass derived from class `Imdb` in `dataset/imdb.py`. See `dataset/pascal_voc.py` for an example.

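As a rough illustration of the subclassing idea, the sketch below defines a stand-in base class and a toy dataset derived from it. The base class here is a simplified placeholder, not the actual `Imdb` from `dataset/imdb.py`; the method names `image_path_from_index` and `label_from_index`, the `ToyDataset` class, and the label layout are assumptions made for illustration, so check `dataset/pascal_voc.py` for the real interface.

```python
import os


class Imdb(object):
    """Simplified stand-in for the base class in dataset/imdb.py (assumed interface)."""

    def __init__(self, name):
        self.name = name
        self.classes = []
        self.num_images = 0

    def image_path_from_index(self, index):
        raise NotImplementedError

    def label_from_index(self, index):
        raise NotImplementedError


class ToyDataset(Imdb):
    """Hypothetical dataset; each label row is [class_id, xmin, ymin, xmax, ymax]."""

    def __init__(self, root, image_names, labels, classes):
        super(ToyDataset, self).__init__("toy")
        self.root = root
        self.image_names = list(image_names)
        self.labels = list(labels)
        self.classes = list(classes)
        self.num_images = len(self.image_names)

    def image_path_from_index(self, index):
        # Map a dataset index to the image file on disk.
        return os.path.join(self.root, self.image_names[index])

    def label_from_index(self, index):
        # Return the ground-truth boxes for one image.
        return self.labels[index]


toy = ToyDataset(
    root="data/toy",
    image_names=["000001.jpg"],
    labels=[[[0, 0.1, 0.2, 0.5, 0.6]]],
    classes=["dog"],
)
```

The real base class may require additional attributes or methods; the point is only that a new dataset supplies its own index-to-image and index-to-label mapping.
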
Download the `vgg16_reduced` model and unzip the `.params` and `.json` files into the `model/` directory by default.

Download and extract the Pascal VOC data:

    cd /path/to/where_you_store_datasets/
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    # Extract the data.
    tar -xvf VOCtrainval_11-May-2012.tar
    tar -xvf VOCtrainval_06-Nov-2007.tar
    tar -xvf VOCtest_06-Nov-2007.tar

Training uses the `trainval` set of VOC2007/2012, a common strategy. The suggested directory structure is to store the `VOC2007` and `VOC2012` directories in the same `VOCdevkit` folder.

Then link the `VOCdevkit` folder to `data/VOCdevkit`, the default location:

    ln -s /path/to/VOCdevkit /path/to/mxnet/example/ssd/data/VOCdevkit

Using a symbolic link instead of a copy saves disk space.

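If you would rather set up the link from Python, the following is a minimal sketch using `os.symlink`. The helper name `link_devkit` is hypothetical, and the demonstration runs inside a temporary directory with empty `VOC2007` and `VOC2012` folders, so it does not touch a real dataset.

```python
import os
import tempfile


def link_devkit(devkit_dir, link_path):
    """Create a symbolic link at link_path pointing to devkit_dir."""
    parent = os.path.dirname(link_path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    if os.path.islink(link_path):
        os.remove(link_path)  # replace a stale link
    os.symlink(os.path.abspath(devkit_dir), link_path)
    return link_path


# Demonstration inside a temporary directory so nothing real is touched.
workdir = tempfile.mkdtemp()
devkit = os.path.join(workdir, "VOCdevkit")
os.makedirs(os.path.join(devkit, "VOC2007"))
os.makedirs(os.path.join(devkit, "VOC2012"))
link = link_devkit(devkit, os.path.join(workdir, "ssd", "data", "VOCdevkit"))
```
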
Start training:

    # cd /path/to/mxnet/example/ssd
    python train.py

By default, this example uses `batch-size=32` and `learning_rate=0.001`. You might need to adjust these parameters for a different configuration. Check `python train.py --help` for more training options. For example, if you have 4 GPUs, use:

    # note that a perfect training parameter set is yet to be discovered for multi-GPUs
    python train.py --gpus 0,1,2,3 --batch-size 128 --lr 0.0005

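The 4-GPU command can be read as keeping 32 images per device. A minimal sketch of that arithmetic, assuming the global batch is split evenly across devices (the helper name `per_device_batch` is hypothetical and not part of `train.py`):

```python
def per_device_batch(gpus, batch_size):
    """Split a global batch size evenly across a comma-separated GPU list."""
    devices = [int(g) for g in gpus.split(",") if g.strip()]
    if not devices:
        raise ValueError("no GPU ids given")
    if batch_size % len(devices) != 0:
        raise ValueError("batch size must be divisible by the number of GPUs")
    return devices, batch_size // len(devices)


devices, per_gpu = per_device_batch("0,1,2,3", 128)
print(devices, per_gpu)  # [0, 1, 2, 3] 32
```
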
Training the `VGG16_reduced` model with `batch-size` 32 takes around 4684MB of memory without CUDNN.

Again, evaluation is currently supported only on PASCAL VOC. Use:

    # cd /path/to/mxnet/example/ssd
    python evaluate.py --gpus 0,1 --batch-size 128 --epoch 0

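For orientation, the mAP figure in the results table is a mean of per-class average precision values. The sketch below computes an 11-point interpolated average precision, the metric associated with the VOC2007 challenge; whether `evaluate.py` uses exactly this variant is an assumption, and the function name and sample numbers are illustrative only.

```python
def voc07_average_precision(recall, precision):
    """11-point interpolated AP: mean of the best precision at recall >= t, t = 0.0 .. 1.0."""
    ap = 0.0
    for i in range(11):
        t = i / 10.0
        candidates = [p for r, p in zip(recall, precision) if r >= t]
        ap += max(candidates) if candidates else 0.0
    return ap / 11.0


# Illustrative precision/recall points, not measured results.
recall = [0.2, 0.4, 0.6, 0.8]
precision = [1.0, 0.8, 0.6, 0.5]
ap = voc07_average_precision(recall, precision)
```

Averaging this value over all 20 VOC classes would give a mAP in the same sense as the table.
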
Converting a trained model to deploy mode simply removes all loss layers and attaches a layer for merging results and non-maximum suppression. This is useful when loading the Python symbol is not available.

    # cd /path/to/mxnet/example/ssd
    python deploy.py --num-class 20

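To make the non-maximum suppression step concrete, here is a minimal pure-Python sketch of greedy NMS. It is not the operator used by this example; the function names, the `[xmin, ymin, xmax, ymax]` box layout, and the 0.5 overlap threshold are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, overlap_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= overlap_thresh]
    return keep


# Two heavily overlapping boxes and one separate box.
boxes = [[0.0, 0.0, 1.0, 1.0], [0.05, 0.05, 1.0, 1.0], [2.0, 2.0, 3.0, 3.0]]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box is suppressed because it overlaps the higher-scoring first box, while the third box survives since it overlaps nothing.
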