| # Classify Real-world Images with Pre-trained Model |
| |
| |
MXNet is a flexible and efficient deep learning framework. One of the cool things a deep learning
algorithm can do is classify real-world images.
| |
In this example we will show how to use a pre-trained Inception-BatchNorm network to predict the class of a
real-world image. The network architecture is described in [1].
| |
| The pre-trained Inception-BatchNorm network can be downloaded from [this link](http://data.mxnet.io/mxnet/data/Inception.zip). |
This model gives state-of-the-art prediction accuracy on the ImageNet dataset.
| |
| ## Package Loading |
| |
| To get started, we load the `mxnet` package first. |
| |
| ```{r} |
| require(mxnet) |
| ``` |
| |
In this example, we also need the `imager` package to load and preprocess the images in R.
| |
| ```{r} |
| require(imager) |
| ``` |
| |
| ## Load the Pretrained Model |
| |
| |
The code below downloads the pre-trained model and unzips it into the current folder. We can then use the model
loading function to load the model into R.
| |
| ```{r} |
| download.file('http://data.dmlc.ml/data/Inception.zip', destfile = 'Inception.zip') |
| unzip("Inception.zip") |
| model <- mx.model.load("Inception/Inception_BN", iteration = 39) |
| ``` |
| |
We also need to load the mean image, which is used for preprocessing, using `mx.nd.load`.
| |
| ```{r} |
| mean.img <- as.array(mx.nd.load("Inception/mean_224.nd")[["mean_img"]]) |
| ``` |
| |
| ## Load and Preprocess the Image |
| |
Now we are ready to classify a real image. In this example, we simply use the parrots image
from the `imager` package, but you can always change it to other images.
| |
| Load and plot the image: |
| |
| ```{r, fig.align='center'} |
| im <- load.image(system.file("extdata/parrots.png", package = "imager")) |
| plot(im) |
| ``` |
| |
Before feeding the image to the deep network, we need some preprocessing
to make it fit the network's input requirements. The preprocessing
includes cropping, resizing, and subtraction of the mean image.
Because mxnet is deeply integrated with R, we can do all the processing in an R function.
| |
| The preprocessing function: |
| |
| ```{r} |
preproc.image <- function(im, mean.image) {
  # crop the image to a square along its short edge
  shape <- dim(im)
  short.edge <- min(shape[1:2])
  xx <- floor((shape[1] - short.edge) / 2)
  yy <- floor((shape[2] - short.edge) / 2)
  cropped <- crop.borders(im, xx, yy)
  # resize to 224 x 224, the input size expected by the model
  resized <- resize(cropped, 224, 224)
  # convert to array (x, y, channel) and scale pixel values to [0, 255]
  arr <- as.array(resized) * 255
  dim(arr) <- c(224, 224, 3)
  # subtract the mean image passed in as an argument
  normed <- arr - mean.image
  # reshape to the format needed by mxnet (width, height, channel, num)
  dim(normed) <- c(224, 224, 3, 1)
  return(normed)
}
| ``` |
| |
We can now use the preprocessing function to get the normalized image.
| |
| ```{r} |
| normed <- preproc.image(im, mean.img) |
| ``` |
| |
| ## Classify the Image |
| |
Now we are ready to classify the image! We can use the `predict` function
to get the probability over all classes.
| |
| ```{r} |
| prob <- predict(model, X = normed) |
| dim(prob) |
| ``` |
| |
As you can see, ```prob``` is a 1000-by-1 array, which gives the predicted probability
for each of the 1000 image classes.
| |
We can use ```max.col``` on the transpose of `prob` to get the class index.
| |
| ```{r} |
| max.idx <- max.col(t(prob)) |
| max.idx |
| ``` |
| |
The index by itself does not mean much, so let us see what it really corresponds to.
We can read the names of the classes from the following file.
| |
| ```{r} |
| synsets <- readLines("Inception/synset.txt") |
| ``` |
| |
Now let us see what the predicted class really is:
| |
| ```{r} |
| print(paste0("Predicted Top-class: ", synsets[[max.idx]])) |
| ``` |
| |
Admittedly, I did not know what the word meant when I first saw it,
so I searched the web to check it out... and indeed it gets the right answer :)
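If we want more than the single best guess, we can also rank the classes by probability. This is a small sketch beyond the original tutorial, assuming `prob` is the 1000-by-1 probability array and `synsets` holds the class names from above:

```{r}
# indices of the five most probable classes, in decreasing order
top5.idx <- order(prob[, 1], decreasing = TRUE)[1:5]
data.frame(class = synsets[top5.idx], probability = prob[top5.idx, 1])
```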
| |
| ## Extract features |
| |
| |
Besides the final classification result, we can also extract internal features from the network.
First we need to get the feature-layer symbol out of the network internals. Here we use `global_pool_output`
as an example.
| |
| ```{r} |
internals <- model$symbol$get.internals()
fea_symbol <- internals[[match("global_pool_output", internals$outputs)]]
| ``` |
| |
Next, we build a new model using the feature symbol:
| |
| ```{r} |
| model2 <- list(symbol = fea_symbol, |
| arg.params = model$arg.params, |
| aux.params = model$aux.params) |
| |
| class(model2) <- "MXFeedForwardModel" |
| ``` |
| |
Then we can call `predict` on the new model to get the internal outputs.
We need to set `allow.extra.params = TRUE` because some parameters of the full model are not used this time.
| |
| ```{r} |
| global_pooling_feature <- predict(model2, X = normed, allow.extra.params = TRUE) |
| dim(global_pooling_feature) |
| ``` |
| |
| |
| ## Reference |
| |
| |
| [1] Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015). |
| |
| |
| <!-- INSERT SOURCE DOWNLOAD BUTTONS --> |