| # Linear Regression |
| |
| In this tutorial we'll walk through how one can implement *linear regression* using MXNet APIs. |
| |
| The function we are trying to learn is: *y = x<sub>1</sub> + 2x<sub>2</sub>*, where *(x<sub>1</sub>,x<sub>2</sub>)* are input features and *y* is the corresponding label. |
| |
| ## Prerequisites |
| |
| To complete this tutorial, we need: |
| |
| - MXNet. See the instructions for your operating system in [Setup and Installation](http://mxnet.io/install/index.html). |
| |
| - [Jupyter Notebook](http://jupyter.org/index.html). |
| |
| ``` |
| $ pip install jupyter |
| ``` |
| |
| To begin, the following code imports the necessary packages we'll need for this exercise. |
| |
| ```python |
| import mxnet as mx |
| import numpy as np |
| |
| # Fix the random seed |
| mx.random.seed(42) |
| |
| import logging |
| logging.getLogger().setLevel(logging.DEBUG) |
| ``` |
| |
| ## Preparing the Data |
| |
| In MXNet, data is input via **Data Iterators**. Here we will illustrate |
| how to encode a dataset into an iterator that MXNet can use. The data used in the example is made up of 2D data points with corresponding integer labels. |
| |
| ```python |
| #Training data |
| train_data = np.random.uniform(0, 1, [100, 2]) |
| train_label = np.array([train_data[i][0] + 2 * train_data[i][1] for i in range(100)]) |
| batch_size = 1 |
| |
| #Evaluation Data |
| eval_data = np.array([[7,2],[6,10],[12,2]]) |
| eval_label = np.array([11,26,16]) |
| ``` |
| |
| Once we have the data ready, we need to put it into an iterator and specify |
| parameters such as `batch_size` and `shuffle`. `batch_size` specifies the number |
| of examples shown to the model each time we update its parameters and `shuffle` |
| tells the iterator to randomize the order in which examples are shown to the model. |
| |
| |
| ```python |
| train_iter = mx.io.NDArrayIter(train_data, train_label, batch_size, shuffle=True, label_name='lin_reg_label') |
| eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False, label_name='lin_reg_label') |
| ``` |
| |
| In the above example, we have made use of `NDArrayIter`, which is useful for iterating |
| over both numpy ndarrays and MXNet NDArrays. In general, there are different types of iterators in |
| MXNet and you can use one based on the type of data you are processing. |
| Documentation for iterators can be found [here](http://mxnet.io/api/python/io/io.html). |
| |
| ## MXNet Classes |
| |
| 1. **IO:** The IO class as we already saw works on the data and carries out |
| operations such as feeding data in batches and shuffling. |
| |
| 2. **Symbol:** The actual MXNet neural network is composed using symbols. MXNet has |
| different types of symbols, including variable placeholders for input data, |
| neural network layers, and operators that manipulate NDArrays. |
| |
| 3. **Module:** The module class in MXNet is used to define the overall computation. |
| It is initialized with the model we want to train, the training inputs (data and labels) |
| and some additional parameters such as learning rate and the optimization |
| algorithm to use. |
| |
| ## Defining the Model |
| |
| MXNet uses **Symbols** for defining a model. Symbols are the building blocks |
| and make up various components of the model. Symbols are used to define: |
| |
| 1. **Variables:** A variable is a placeholder for future data. This symbol is used |
| to define a spot which will be filled with training data/labels in the future |
| when we commence training. |
| 2. **Neural Network Layers:** The layers of a network or any other type of model are |
| also defined by Symbols. Such a symbol takes one or more previous symbols as |
| inputs, performs some transformations on them, and creates one or more outputs. |
| One such example is the `FullyConnected` symbol which specifies a fully connected |
| layer of a neural network. |
| 3. **Outputs:** Output symbols are MXNet's way of defining a loss. They are |
| suffixed with the word "Output" (eg. the `SoftmaxOutput` layer). You can also |
| [create your own loss function](https://github.com/dmlc/mxnet/blob/master/docs/tutorials/r/CustomLossFunction.md#how-to-use-your-own-loss-function). |
| Some examples of existing losses are: `LinearRegressionOutput`, which computes |
| the l2-loss between it's input symbol and the labels provided to it; |
| `SoftmaxOutput`, which computes the categorical cross-entropy. |
| |
| The ones described above and other symbols are chained together with the output of |
| one symbol serving as input to the next to build the network topology. More information |
| about the different types of symbols can be found [here](http://mxnet.io/api/python/symbol/symbol.html). |
| |
| ```python |
| X = mx.sym.Variable('data') |
| Y = mx.symbol.Variable('lin_reg_label') |
| fully_connected_layer = mx.sym.FullyConnected(data=X, name='fc1', num_hidden = 1) |
| lro = mx.sym.LinearRegressionOutput(data=fully_connected_layer, label=Y, name="lro") |
| ``` |
| |
| The above network uses the following layers: |
| |
| 1. `FullyConnected`: The fully connected symbol represents a fully connected layer |
| of a neural network (without any activation being applied), which in essence, |
| is just a linear regression on the input attributes. It takes the following |
| parameters: |
| |
| - `data`: Input to the layer (specifies the symbol whose output should be fed here) |
| - `num_hidden`: Number of hidden neurons in the layer, which is same as the dimensionality |
| of the layer's output |
| |
| 2. `LinearRegressionOutput`: Output layers in MXNet compute training loss, which is |
| the measure of inaccuracy in the model's predictions. The goal of training is to minimize the |
| training loss. In our example, the `LinearRegressionOutput` layer computes the *l2* loss against |
| its input and the labels provided to it. The parameters to this layer are: |
| |
| - `data`: Input to this layer (specifies the symbol whose output should be fed here) |
| - `label`: The training labels against which we will compare the input to the layer for calculation of l2 loss |
| |
| **Note on naming convention:** the label variable's name should be the same as the |
| `label_name` parameter passed to your training data iterator. The default value of |
| this is `softmax_label`, but we have updated it to `lin_reg_label` in this |
| tutorial as you can see in `Y = mx.symbol.Variable('lin_reg_label')` and |
| `train_iter = mx.io.NDArrayIter(..., label_name='lin_reg_label')`. |
| |
| Finally, the network is input to a *Module*, where we specify the symbol |
| whose output needs to be minimized (in our case, `lro` or the `lin_reg_output`), the |
| learning rate to be used while optimization and the number of epochs we want to |
| train our model for. |
| |
| ```python |
| model = mx.mod.Module( |
| symbol = lro , |
| data_names=['data'], |
| label_names = ['lin_reg_label']# network structure |
| ) |
| ``` |
| |
| We can visualize the network we created by plotting it: |
| |
| ```python |
| mx.viz.plot_network(symbol=lro) |
| ``` |
| |
| ## Training the model |
| |
| Once we have defined the model structure, the next step is to train the |
| parameters of the model to fit the training data. This is accomplished using the |
| `fit()` function of the `Module` class. |
| |
| ```python |
| model.fit(train_iter, eval_iter, |
| optimizer_params={'learning_rate':0.01, 'momentum': 0.9}, |
| num_epoch=20, |
| eval_metric='mse', |
| batch_end_callback = mx.callback.Speedometer(batch_size, 2)) |
| ``` |
| |
| ## Using a trained model: (Testing and Inference) |
| |
| Once we have a trained model, we can do a couple of things with it - we can either |
| use it for inference or we can evaluate the trained model on test data. The latter is shown below: |
| |
| ```python |
| model.predict(eval_iter).asnumpy() |
| ``` |
| |
| We can also evaluate our model according to some metric. In this example, we are |
| evaluating our model's mean squared error (MSE) on the evaluation data. |
| |
| ```python |
| metric = mx.metric.MSE() |
| mse = model.score(eval_iter, metric) |
| print("Achieved {0:.6f} validation MSE".format(mse[0][1])) |
| assert model.score(eval_iter, metric)[0][1] < 0.01001, "Achieved MSE (%f) is larger than expected (0.01001)" % model.score(eval_iter, metric)[0][1] |
| ``` |
| |
| Let us try and add some noise to the evaluation data and see how the MSE changes: |
| |
| ```python |
| eval_data = np.array([[7,2],[6,10],[12,2]]) |
| eval_label = np.array([11.1,26.1,16.1]) #Adding 0.1 to each of the values |
| eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False, label_name='lin_reg_label') |
| model.score(eval_iter, metric) |
| ``` |
| |
| We can also create a custom metric and use it to evaluate a model. More |
| information on metrics can be found in the [API documentation](http://mxnet.incubator.apache.org/api/python/metric/metric.html). |
| |
| <!-- INSERT SOURCE DOWNLOAD BUTTONS --> |