blob: f9656844052d45c05f7de219eec9f06d138aee79 [file] [log] [blame] [view]
# Linear Regression
In this tutorial we'll walk through how one can implement *linear regression* using MXNet APIs.
The function we are trying to learn is: *y = x<sub>1</sub> + 2x<sub>2</sub>*, where *(x<sub>1</sub>,x<sub>2</sub>)* are input features and *y* is the corresponding label.
## Prerequisites
To complete this tutorial, we need:
- MXNet. See the instructions for your operating system in [Setup and Installation](http://mxnet.io/install/index.html).
- [Jupyter Notebook](http://jupyter.org/index.html).
```
$ pip install jupyter
```
To begin, the following code imports the necessary packages we'll need for this exercise.
```python
import mxnet as mx
import numpy as np
# Fix the random seed
mx.random.seed(42)
import logging
logging.getLogger().setLevel(logging.DEBUG)
```
## Preparing the Data
In MXNet, data is input via **Data Iterators**. Here we will illustrate
how to encode a dataset into an iterator that MXNet can use. The data used in the example is made up of 2D data points with corresponding integer labels.
```python
#Training data
train_data = np.random.uniform(0, 1, [100, 2])
train_label = np.array([train_data[i][0] + 2 * train_data[i][1] for i in range(100)])
batch_size = 1
#Evaluation Data
eval_data = np.array([[7,2],[6,10],[12,2]])
eval_label = np.array([11,26,16])
```
Once we have the data ready, we need to put it into an iterator and specify
parameters such as `batch_size` and `shuffle`. `batch_size` specifies the number
of examples shown to the model each time we update its parameters and `shuffle`
tells the iterator to randomize the order in which examples are shown to the model.
```python
train_iter = mx.io.NDArrayIter(train_data, train_label, batch_size, shuffle=True, label_name='lin_reg_label')
eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False, label_name='lin_reg_label')
```
In the above example, we have made use of `NDArrayIter`, which is useful for iterating
over both numpy ndarrays and MXNet NDArrays. In general, there are different types of iterators in
MXNet and you can use one based on the type of data you are processing.
Documentation for iterators can be found [here](http://mxnet.io/api/python/io/io.html).
## MXNet Classes
1. **IO:** The IO class as we already saw works on the data and carries out
operations such as feeding data in batches and shuffling.
2. **Symbol:** The actual MXNet neural network is composed using symbols. MXNet has
different types of symbols, including variable placeholders for input data,
neural network layers, and operators that manipulate NDArrays.
3. **Module:** The module class in MXNet is used to define the overall computation.
It is initialized with the model we want to train, the training inputs (data and labels)
and some additional parameters such as learning rate and the optimization
algorithm to use.
## Defining the Model
MXNet uses **Symbols** for defining a model. Symbols are the building blocks
and make up various components of the model. Symbols are used to define:
1. **Variables:** A variable is a placeholder for future data. This symbol is used
to define a spot which will be filled with training data/labels in the future
when we commence training.
2. **Neural Network Layers:** The layers of a network or any other type of model are
also defined by Symbols. Such a symbol takes one or more previous symbols as
inputs, performs some transformations on them, and creates one or more outputs.
One such example is the `FullyConnected` symbol which specifies a fully connected
layer of a neural network.
3. **Outputs:** Output symbols are MXNet's way of defining a loss. They are
suffixed with the word "Output" (eg. the `SoftmaxOutput` layer). You can also
[create your own loss function](https://github.com/dmlc/mxnet/blob/master/docs/tutorials/r/CustomLossFunction.md#how-to-use-your-own-loss-function).
Some examples of existing losses are: `LinearRegressionOutput`, which computes
the l2-loss between it's input symbol and the labels provided to it;
`SoftmaxOutput`, which computes the categorical cross-entropy.
The ones described above and other symbols are chained together with the output of
one symbol serving as input to the next to build the network topology. More information
about the different types of symbols can be found [here](http://mxnet.io/api/python/symbol/symbol.html).
```python
X = mx.sym.Variable('data')
Y = mx.symbol.Variable('lin_reg_label')
fully_connected_layer = mx.sym.FullyConnected(data=X, name='fc1', num_hidden = 1)
lro = mx.sym.LinearRegressionOutput(data=fully_connected_layer, label=Y, name="lro")
```
The above network uses the following layers:
1. `FullyConnected`: The fully connected symbol represents a fully connected layer
of a neural network (without any activation being applied), which in essence,
is just a linear regression on the input attributes. It takes the following
parameters:
- `data`: Input to the layer (specifies the symbol whose output should be fed here)
- `num_hidden`: Number of hidden neurons in the layer, which is same as the dimensionality
of the layer's output
2. `LinearRegressionOutput`: Output layers in MXNet compute training loss, which is
the measure of inaccuracy in the model's predictions. The goal of training is to minimize the
training loss. In our example, the `LinearRegressionOutput` layer computes the *l2* loss against
its input and the labels provided to it. The parameters to this layer are:
- `data`: Input to this layer (specifies the symbol whose output should be fed here)
- `label`: The training labels against which we will compare the input to the layer for calculation of l2 loss
**Note on naming convention:** the label variable's name should be the same as the
`label_name` parameter passed to your training data iterator. The default value of
this is `softmax_label`, but we have updated it to `lin_reg_label` in this
tutorial as you can see in `Y = mx.symbol.Variable('lin_reg_label')` and
`train_iter = mx.io.NDArrayIter(..., label_name='lin_reg_label')`.
Finally, the network is input to a *Module*, where we specify the symbol
whose output needs to be minimized (in our case, `lro` or the `lin_reg_output`), the
learning rate to be used while optimization and the number of epochs we want to
train our model for.
```python
model = mx.mod.Module(
symbol = lro ,
data_names=['data'],
label_names = ['lin_reg_label']# network structure
)
```
We can visualize the network we created by plotting it:
```python
mx.viz.plot_network(symbol=lro)
```
## Training the model
Once we have defined the model structure, the next step is to train the
parameters of the model to fit the training data. This is accomplished using the
`fit()` function of the `Module` class.
```python
model.fit(train_iter, eval_iter,
optimizer_params={'learning_rate':0.01, 'momentum': 0.9},
num_epoch=20,
eval_metric='mse',
batch_end_callback = mx.callback.Speedometer(batch_size, 2))
```
## Using a trained model: (Testing and Inference)
Once we have a trained model, we can do a couple of things with it - we can either
use it for inference or we can evaluate the trained model on test data. The latter is shown below:
```python
model.predict(eval_iter).asnumpy()
```
We can also evaluate our model according to some metric. In this example, we are
evaluating our model's mean squared error (MSE) on the evaluation data.
```python
metric = mx.metric.MSE()
mse = model.score(eval_iter, metric)
print("Achieved {0:.6f} validation MSE".format(mse[0][1]))
assert model.score(eval_iter, metric)[0][1] < 0.01001, "Achieved MSE (%f) is larger than expected (0.01001)" % model.score(eval_iter, metric)[0][1]
```
Let us try and add some noise to the evaluation data and see how the MSE changes:
```python
eval_data = np.array([[7,2],[6,10],[12,2]])
eval_label = np.array([11.1,26.1,16.1]) #Adding 0.1 to each of the values
eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False, label_name='lin_reg_label')
model.score(eval_iter, metric)
```
We can also create a custom metric and use it to evaluate a model. More
information on metrics can be found in the [API documentation](http://mxnet.incubator.apache.org/api/python/metric/metric.html).
<!-- INSERT SOURCE DOWNLOAD BUTTONS -->