apache / incubator-mxnet / refs/heads/v1.5.x / . / R-package / vignettes / classifyRealImageWithPretrainedModel.Rmd

# Classify Real-world Images with Pre-trained Model | |

MXNet is a flexible and efficient deep learning framework. One of the cool things that a deep learning | |

algorithm can do is to classify real world images. | |

In this example we will show how to use a pretrained Inception-BatchNorm network to predict the content of | |

real world image. The network architecture is described in [1]. | |

The pre-trained Inception-BatchNorm network can be downloaded from [this link](http://data.mxnet.io/mxnet/data/Inception.zip). | |

This model gives the recent state-of-art prediction accuracy on the image net dataset. | |

## Package Loading | |

To get started, we load the `mxnet` package first. | |

```{r} | |

require(mxnet) | |

``` | |

In this example, we also need the imager package to load and preprocess the images in R. | |

```{r} | |

require(imager) | |

``` | |

## Load the Pretrained Model | |

Make sure you unzip the pre-trained model in current folder. And we can use the model | |

loading function to load the model into R. | |

```{r} | |

download.file('http://data.mxnet.io/mxnet/data/Inception.zip', destfile = 'Inception.zip') | |

unzip("Inception.zip") | |

model <- mx.model.load("Inception/Inception_BN", iteration = 39) | |

``` | |

We also need to load in the mean image, which is used for preprocessing using ```mx.nd.load```. | |

```{r} | |

mean.img <- as.array(mx.nd.load("Inception/mean_224.nd")[["mean_img"]]) | |

``` | |

## Load and Preprocess the Image | |

Now we are ready to classify a real image. In this example, we simply take the parrots image | |

from imager package. But you can always change it to other images. | |

Load and plot the image: | |

```{r, fig.align='center'} | |

im <- load.image(system.file("extdata/parrots.png", package = "imager")) | |

plot(im) | |

``` | |

Before feeding the image to the deep net, we need to do some preprocessing | |

to make the image fit the input requirement of deepnet. The preprocessing | |

include cropping, and subtraction of the mean. | |

Because mxnet is deeply integerated with R, we can do all the processing in R function. | |

The preprocessing function: | |

```{r} | |

preproc.image <- function(im, mean.image) { | |

# crop the image | |

shape <- dim(im) | |

short.edge <- min(shape[1:2]) | |

xx <- floor((shape[1] - short.edge) / 2) | |

yy <- floor((shape[2] - short.edge) / 2) | |

croped <- crop.borders(im, xx, yy) | |

# resize to 224 x 224, needed by input of the model. | |

resized <- resize(croped, 224, 224) | |

# convert to array (x, y, channel) | |

arr <- as.array(resized) * 255 | |

dim(arr) <- c(224, 224, 3) | |

# subtract the mean | |

normed <- arr - mean.img | |

# Reshape to format needed by mxnet (width, height, channel, num) | |

dim(normed) <- c(224, 224, 3, 1) | |

return(normed) | |

} | |

``` | |

We use the defined preprocessing function to get the normalized image. | |

```{r} | |

normed <- preproc.image(im, mean.img) | |

``` | |

## Classify the Image | |

Now we are ready to classify the image! We can use the predict function | |

to get the probability over classes. | |

```{r} | |

prob <- predict(model, X = normed) | |

dim(prob) | |

``` | |

As you can see ```prob``` is a 1 times 1000 array, which gives the probability | |

over the 1000 image classes of the input. | |

We can use the ```max.col``` on the transpose of prob. get the class index. | |

```{r} | |

max.idx <- max.col(t(prob)) | |

max.idx | |

``` | |

The index do not make too much sense. So let us see what it really corresponds to. | |

We can read the names of the classes from the following file. | |

```{r} | |

synsets <- readLines("Inception/synset.txt") | |

``` | |

And let us see what it really is | |

```{r} | |

print(paste0("Predicted Top-class: ", synsets[[max.idx]])) | |

``` | |

Actually I do not know what does the word mean when I saw it. | |

So I searched on the web to check it out.. and hmm it does get the right answer :) | |

## Extract features | |

Besides the final classification results, we can also extract the internal features. | |

We need to get feature layer symbol out of internals first. Here we use `global_pool_output` | |

as an example. | |

```{r} | |

internals = model$symbol$get.internals() | |

fea_symbol = internals[[match("global_pool_output", internals$outputs)]] | |

``` | |

Next, we rebuild a new model using the feature symbol | |

```{r} | |

model2 <- list(symbol = fea_symbol, | |

arg.params = model$arg.params, | |

aux.params = model$aux.params) | |

class(model2) <- "MXFeedForwardModel" | |

``` | |

Then we can do the `predict` using the new model to get the internal results. | |

You need to set `allow.extra.params = TRUE` since some parameters are not used this time. | |

```{r} | |

global_pooling_feature <- predict(model2, X = normed, allow.extra.params = TRUE) | |

dim(global_pooling_feature) | |

``` | |

## Reference | |

[1] Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015). | |

<!-- INSERT SOURCE DOWNLOAD BUTTONS --> |