
 FCN-xs EXAMPLES
 ---------------
 This folder contains the examples of image segmentation in MXNet.

 ## Sample results
 ![fcn-xs pascal_voc result](https://github.com/dmlc/web-data/blob/master/mxnet/image/fcnxs-example-result.jpg)

 We trained a simple fcn-xs model with the following hyper-parameters:

 | model   | lr (fixed) | epoch |
 | ------- | ---------: | ----: |
 | fcn-32s |      1e-10 |    31 |
 | fcn-16s |      1e-12 |    27 |
 | fcn-8s  |      1e-14 |    19 |

 Note: with recent versions of MXNet, use a larger learning rate, such as 1e-4, 1e-5, or 1e-6, because `SoftmaxOutput` now normalizes the gradient.

 The training set contains only 2027 images, and the validation set contains 462.
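
To get a feel for why the suggested learning rate changes by several orders of magnitude, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration and not as a statement about MXNet internals, that without normalization the per-pixel gradients are summed over the image, while with normalization they are averaged over the pixels. The image size and the per-pixel gradient value are made up.

```python
# Rough illustration only: compares a summed per-pixel gradient with an
# averaged one. The image size and per-pixel gradient are hypothetical.
height, width = 375, 500                 # hypothetical image size
num_pixels = height * width

per_pixel_grad = 1.0                     # hypothetical contribution of each pixel

summed_grad = per_pixel_grad * num_pixels    # no normalization: contributions add up
mean_grad = summed_grad / num_pixels         # normalized: averaged over the pixels

old_lr = 1e-10                           # fcn-32s learning rate from the table above
# Learning rate that gives the same step size once gradients are averaged.
equivalent_lr = old_lr * summed_grad / mean_grad

print(f"pixels per image: {num_pixels}")
print(f"equivalent normalized learning rate: {equivalent_lr:.3e}")
```

Under these assumptions the pixel count alone accounts for roughly five orders of magnitude. The exact factor depends on how the normalization is implemented, so treat the learning rates above as starting points to tune.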

 ## How to train fcn-xs in mxnet
 #### Getting Started

 - Install the Python package `Pillow` (required by `image_segmentaion.py`).
 ```shell
 [sudo] pip install Pillow
 ```
 - Assume the working directory is `~/train_fcn_xs` and MXNet is built in `~/mxnet`. Copy the example scripts into the working directory.
 ```shell
 cp ~/mxnet/example/fcn-xs/* .
 ```
 #### Step1: Download the vgg16fc model and experiment data
 * vgg16fc model: download ```VGG_FC_ILSVRC_16_layers-symbol.json``` and ```VGG_FC_ILSVRC_16_layers-0074.params``` from [baidu yun](http://pan.baidu.com/s/1bgz4PC) or [dropbox](https://www.dropbox.com/sh/578n5cxej7ofd6m/AACuSeSYGcKQDi1GoB72R5lya?dl=0).
 This is the fully convolutional version of the original
 [VGG_ILSVRC_16_layers.caffemodel](http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel), with the corresponding [VGG_ILSVRC_16_layers_deploy.prototxt](https://gist.github.com/ksimonyan/211839e770f7b538e2d8#file-vgg_ilsvrc_16_layers_deploy-prototxt). The vgg16 model is [licensed](http://creativecommons.org/licenses/by-nc/4.0/) for non-commercial use only.
 * experiment data: download ```VOC2012.rar``` from [robots.ox.ac.uk](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar) and extract it. The extracted directory should contain
 the ```JPEGImages``` folder, the ```SegmentationClass``` folder, ```train.lst```, ```val.lst```, and ```test.lst```.
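
Before starting a long training run, it can help to confirm that the extracted data has the layout listed above. The following is a minimal sketch of such a check. The helper name `missing_entries` is hypothetical, and `./VOC2012` is simply the `root_dir` used in the `FileIter` example in the Tips section.

```python
import os

# Entries this README lists for the extracted dataset directory.
EXPECTED_ENTRIES = [
    "JPEGImages",
    "SegmentationClass",
    "train.lst",
    "val.lst",
    "test.lst",
]


def missing_entries(root_dir):
    """Return the expected files/folders that are absent from root_dir."""
    return [
        name for name in EXPECTED_ENTRIES
        if not os.path.exists(os.path.join(root_dir, name))
    ]


if __name__ == "__main__":
    # "./VOC2012" matches the root_dir used in the Tips section.
    missing = missing_entries("./VOC2012")
    if missing:
        print("Missing from ./VOC2012:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```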

 #### Step2: Train fcn-xs model
 * Configure GPU/CPU for training in `fcn_xs.py`.
 ```python
 # ctx = mx.cpu(0)
 ctx = mx.gpu(0)
 ```
 * To train the fcn-8s model, it is better to train the fcn-32s and fcn-16s models first.
 To train the fcn-32s model, run ```./run_fcnxs.sh``` in a shell. The script contains:
 ```shell
 python -u fcn_xs.py --model=fcn32s --prefix=VGG_FC_ILSVRC_16_layers --epoch=74 --init-type=vgg16
 ```
 * In ```fcn_xs.py```, you may need to change ```root_dir```, ```flist_name```, and ```fcnxs_model_prefix``` for your own data.
 * To train the fcn-16s or fcn-8s model, change ```run_fcnxs.sh``` accordingly. For example, to train fcn-16s, comment out the fcn32s command so that the script looks like this:
 ```shell
 python -u fcn_xs.py --model=fcn16s --prefix=FCN32s_VGG16 --epoch=31 --init-type=fcnxs
 ```
 * The output log may look like this (when training fcn-8s):
 ```
 INFO:root:Start training with gpu(3)
 INFO:root:Epoch[0] Batch [50]   Speed: 1.16 samples/sec Train-accuracy=0.894318
 INFO:root:Epoch[0] Batch [100]  Speed: 1.11 samples/sec Train-accuracy=0.904681
 INFO:root:Epoch[0] Batch [150]  Speed: 1.13 samples/sec Train-accuracy=0.908053
 INFO:root:Epoch[0] Batch [200]  Speed: 1.12 samples/sec Train-accuracy=0.912219
 INFO:root:Epoch[0] Batch [250]  Speed: 1.13 samples/sec Train-accuracy=0.914238
 INFO:root:Epoch[0] Batch [300]  Speed: 1.13 samples/sec Train-accuracy=0.912170
 INFO:root:Epoch[0] Batch [350]  Speed: 1.12 samples/sec Train-accuracy=0.912080
 ```
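
The staged schedule described in this step (fcn-32s first, then fcn-16s, then fcn-8s) can be sketched as a list of commands, each stage initialized from the checkpoint of the previous one. The first two rows are the commands shown above. The fcn8s row is an assumption that follows the same pattern (prefix `FCN16s_VGG16`, with the 27 epochs from the hyper-parameter table); adjust it to whatever prefix your fcn-16s run actually saved. `STAGES` and `build_command` are hypothetical names.

```python
# (model, prefix, epoch, init_type) for each training stage.
STAGES = [
    ("fcn32s", "VGG_FC_ILSVRC_16_layers", 74, "vgg16"),
    ("fcn16s", "FCN32s_VGG16", 31, "fcnxs"),
    ("fcn8s", "FCN16s_VGG16", 27, "fcnxs"),  # ASSUMED, not taken from this README
]


def build_command(model, prefix, epoch, init_type):
    """Build the argument list for one fcn_xs.py training run."""
    return [
        "python", "-u", "fcn_xs.py",
        f"--model={model}",
        f"--prefix={prefix}",
        f"--epoch={epoch}",
        f"--init-type={init_type}",
    ]


for stage in STAGES:
    # Print instead of executing, so each command can be reviewed first.
    print(" ".join(build_command(*stage)))
```

Printing the commands rather than running them keeps the sketch safe to try; each stage still has to finish before the next one can load its checkpoint.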

 ## Using the pre-trained model for image segmentation
 * As before, first download the pre-trained model from [yun.baidu](http://pan.baidu.com/s/1bgz4PC). The symbol and parameter files are ```FCN8s_VGG16-symbol.json``` and ```FCN8s_VGG16-0019.params```.
 * Put the image you want to segment in your working directory, and set ```img = YOUR_IMAGE_NAME``` in ```image_segmentaion.py```.
 * Finally, segment the image by running ```python image_segmentaion.py``` in a shell. The output is a segmentation image like the sample result above.
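
The two file names above follow the MXNet checkpoint naming pattern: `<prefix>-symbol.json` for the network definition and `<prefix>-<epoch, zero-padded to four digits>.params` for the weights. This is also how the `--prefix` and `--epoch` arguments in the training commands map to files on disk. A minimal sketch, where `checkpoint_files` is a hypothetical helper and not part of MXNet:

```python
def checkpoint_files(prefix, epoch):
    """Return the (symbol, params) file names for a checkpoint prefix and epoch."""
    return f"{prefix}-symbol.json", f"{prefix}-{epoch:04d}.params"


# The pre-trained fcn-8s model and the vgg16fc model used in this README.
print(checkpoint_files("FCN8s_VGG16", 19))
print(checkpoint_files("VGG_FC_ILSVRC_16_layers", 74))
```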

 ## Tips
 * This example trains on whole images: images are not resized or cropped to a common size, so the batch_size during training is set to 1.
 * The fcn-xs model is based on the vgg16 model with additional crop, deconv, and element-sum layers, so it is quite large. Because training uses whole images, a large input image (such as 700*500) can consume a lot of memory, so a GPU with 12G of memory is recommended.
 * If you don't have a GPU with 12G of memory, set ```cut_off_size``` to a small value when you construct your FileIter, like this:
 ```python
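 # FileIter is assumed to come from this example's data.py.
 from data import FileIter

 train_dataiter = FileIter(
     root_dir="./VOC2012",
     flist_name="train.lst",
     cut_off_size=400,  # smaller value to reduce memory use
     rgb_mean=(123.68, 116.779, 103.939),
 )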
 ```
 * Contributions that make this example more powerful are welcome. Thanks.
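
For a rough sense of how much `cut_off_size` can help with memory, the sketch below estimates how many input pixels remain after limiting the image size. It assumes that `cut_off_size` caps each side of the image at that value by cropping; check `FileIter` for the exact behaviour. Activation memory in a fully convolutional network grows roughly in proportion to the number of input pixels, so the ratio gives a ballpark for the saving. `capped_shape` and `pixel_ratio` are hypothetical helpers.

```python
def capped_shape(height, width, cut_off_size):
    """Shape after limiting each side to cut_off_size (assumed behaviour)."""
    return min(height, cut_off_size), min(width, cut_off_size)


def pixel_ratio(height, width, cut_off_size):
    """Fraction of the original pixels that remain after capping."""
    new_h, new_w = capped_shape(height, width, cut_off_size)
    return (new_h * new_w) / (height * width)


# The 700*500 image mentioned in the tips, with cut_off_size = 400.
print(capped_shape(500, 700, 400))
print(round(pixel_ratio(500, 700, 400), 3))
```

Images that are already smaller than `cut_off_size` on both sides are unchanged under this assumption, so the saving only applies to the larger images in the dataset.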