| # MXNet Basics - Linear Regression using MXNet |
| |
| ```python |
| import mxnet as mx |
| import numpy as np |
| ``` |
| |
| ## Preparing the Data |
| |
MXNet feeds data to models through Data **Iterators**. The code below illustrates
how to wrap a dataset in an iterator that MXNet can use. The data used in
the example is made up of 2-d data points with corresponding integer labels. The
function we are trying to learn is:
| |
y = x<sub>1</sub> + 2x<sub>2</sub>,

where (x<sub>1</sub>, x<sub>2</sub>) is one training data point and y is the
corresponding label.
| |
| |
| ```python |
| #Training data |
| train_data = np.array([[1,2],[3,4],[5,6],[3,2],[7,1],[6,9]]) |
| train_label = np.array([5,11,17,7,9,24]) |
| batch_size = 1 |
| |
| #Evaluation Data |
| eval_data = np.array([[7,2],[6,10],[12,2]]) |
| eval_label = np.array([11,26,16]) |
| ``` |
| |
Once we have the data ready, we need to put it into an iterator and specify
parameters such as `batch_size`, which determines the number of data points the
iterator feeds per pass, and `shuffle`, which determines whether or not the data
will be shuffled.
| |
| |
| ```python |
train_iter = mx.io.NDArrayIter(train_data, train_label, batch_size, shuffle=True, label_name='lin_reg_label')
| eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False) |
| ``` |
| |
| In the above example, we have made use of NDArrayIter, which is used to iterate |
| over numpy arrays. In general, there are many different types of iterators in |
| MXNet based on the type of data you will be using. Their complete documentation |
| can be found at: http://mxnet.io/api/python/io.html |
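
As a quick sanity check, you can loop over the iterator once to see what it yields. The sketch below assumes the `train_iter` defined above; each batch is a `DataBatch` whose `data` and `label` fields are lists of NDArrays.

```python
# Peek at the batches the iterator produces; with batch_size=1 each
# batch holds a single (data point, label) pair.
for batch in train_iter:
    print(batch.data[0].asnumpy(), batch.label[0].asnumpy())
train_iter.reset()  # rewind the iterator before using it for training
```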
| |
| ## MXNet Classes |
| |
1. Model Class: The model class in MXNet is used to define the overall entity of
   the model. It brings together the symbol we want to minimize and the training
   data and labels; additional parameters, such as the learning rate and the
   optimization algorithm, are also defined at the model level.
| |
2. Symbols: The actual MXNet network is defined using symbols. MXNet provides
   different types of symbols, including data placeholders, neural network
   layers, and loss functions, which we combine according to our requirements.
| |
3. IO: The IO class, as we have already seen, operates on the data and carries
   out operations such as breaking the data into batches and shuffling it.
| |
| ## Defining the Model |
| |
MXNet uses **Symbols** for defining a model. Symbols are the building blocks of
the model and compose its various components. Symbols are used to define:
1. Variables: A variable is a placeholder for future data. This symbol defines a
   spot that will be filled with training data or labels when we train the
   model.
2. Neural Network Layers: The layers of a network, or any other type of model,
   are also defined by symbols. Such a *symbol* takes one of the previous
   symbols as its input, performs some transformation on it, and creates an
   output. One such example is the `FullyConnected` symbol, which specifies a
   fully connected layer of a network.
3. Output Symbols: Output symbols are MXNet's way of defining a loss. They are
   suffixed with the word "Output" (e.g. the `SoftmaxOutput` layer). You can also
   create your
   [own loss function](https://github.com/dmlc/mxnet/blob/master/docs/tutorials/r/CustomLossFunction.md#how-to-use-your-own-loss-function).
   Some examples of existing losses are: `LinearRegressionOutput`, which computes
   the l2 loss between its input symbol and the actual labels provided to it,
   and `SoftmaxOutput`, which computes the categorical cross-entropy.
| |
The symbols described above, and others, are chained one after the other, each
serving as input to the next, to create the network topology. More information
about the different types of symbols can be found
[here](http://mxnet.io/api/python/symbol.html).
| |
| ```python |
X = mx.sym.Variable('data')
Y = mx.sym.Variable('lin_reg_label')
fully_connected_layer = mx.sym.FullyConnected(data=X, name='fc1', num_hidden=1)
lro = mx.sym.LinearRegressionOutput(data=fully_connected_layer, label=Y, name="lro")
| ``` |
| |
| The above network uses the following layers: |
| |
1. FullyConnected: The fully connected symbol represents a fully connected layer
   of a neural network (without any activation applied), which, in essence,
   is just a linear regression on the input attributes. It takes the following
   parameters:

   - data: Input to the layer (specify the symbol whose output should be fed here)
   - num_hidden: The number of hidden units, which determines the size of the layer's output
| |
2. LinearRegressionOutput: Output layers in MXNet implement a loss. In our
   example, the LinearRegressionOutput layer is used, which specifies that an
   l2 loss is applied between its input and the actual labels provided to this
   layer (see the shape check after this list). The parameters to this layer are:

   - data: Input to this layer (specify the symbol whose output should be fed here)
   - label: The training labels against which the layer's input is compared when computing the l2 loss
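
To verify that these layers chain together as expected, we can ask the composed symbol to infer the shapes of all its arguments and outputs from the input shape alone. This is a small sanity-check sketch using the symbol API's `infer_shape` and `list_arguments`:

```python
# Infer every argument/output shape from the input shape (batch_size x 2).
arg_shapes, out_shapes, aux_shapes = lro.infer_shape(data=(batch_size, 2))
print(dict(zip(lro.list_arguments(), arg_shapes)))  # fc1_weight is (1, 2), fc1_bias is (1,)
print(out_shapes)  # the regression output has shape (batch_size, 1)
```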
| |
**A note on naming conventions:** the label variable's name should be the same
as the `label_name` parameter passed to your training data iterator. The default
value of this is `'softmax_label'`, but we have updated it to `lin_reg_label` in
this tutorial, as you can see in `Y = mx.sym.Variable('lin_reg_label')` and
`train_iter = mx.io.NDArrayIter(..., label_name='lin_reg_label')`.
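
You can confirm that the names line up by listing the symbol's arguments and the iterator's label description; both should mention `lin_reg_label` (depending on the MXNet version, `provide_label` reports either `DataDesc` objects or `(name, shape)` pairs):

```python
# 'lin_reg_label' must appear both as an argument of the symbol and as
# the label name of the iterator, otherwise fit() cannot bind the two.
print(lro.list_arguments())      # ['data', 'fc1_weight', 'fc1_bias', 'lin_reg_label']
print(train_iter.provide_label)  # e.g. [DataDesc(name='lin_reg_label', ...)]
```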
| |
Finally, the network is assigned to a *Module*, where you define the symbol
whose value is to be minimised (in our case, `lro`). The learning rate to use
during optimization and the number of epochs to train for are supplied later,
when we call `fit()`.
| |
| ```python |
model = mx.mod.Module(
    symbol=lro,                    # network structure
    data_names=['data'],
    label_names=['lin_reg_label']
)
| ``` |
| |
| We can plot the network we have created in order to visualize it. |
| |
| ```python |
| mx.viz.plot_network(symbol=lro) |
| ``` |
| |
| ## Training the model |
| |
| Once we have defined the model structure, the next step is to train the |
| parameters of the model to fit the training data. This is done by using the |
| **fit()** function of the **Module** class. |
| |
| ```python |
model.fit(train_iter, eval_iter,
          optimizer_params={'learning_rate': 0.01, 'momentum': 0.9},
          num_epoch=1000,
          batch_end_callback=mx.callback.Speedometer(batch_size, 2))
| ``` |
| |
## Using a Trained Model (Testing and Inference)

Once we have a trained model, we can do several things with it: we can use it for inference, and we can evaluate the trained model on test data. Both are shown below.
| |
| ```python |
| model.predict(eval_iter).asnumpy() |
| ``` |
| |
We can also evaluate our model according to some metric. In this example, we are evaluating our model's mean squared error (MSE) on the evaluation data.
| |
| ```python |
| metric = mx.metric.MSE() |
| model.score(eval_iter, metric) |
| ``` |
| |
Let us try adding some noise to the evaluation labels and see how the MSE changes.
| |
| |
| ```python |
| eval_data = np.array([[7,2],[6,10],[12,2]]) |
| eval_label = np.array([11.1,26.1,16.1]) #Adding 0.1 to each of the values |
| eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False) |
| model.score(eval_iter, metric) |
| ``` |
| |
Finally, you can create your own metrics and use them to evaluate your model. More
information on metrics can be found here:
http://mxnet-test.readthedocs.io/en/latest/api/metric.html
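
As a minimal sketch, a custom mean-absolute-error metric could be built with the `mx.metric.CustomMetric` wrapper, which wraps a function of `(label, pred)` numpy arrays:

```python
# A hypothetical custom metric: mean absolute error over the evaluation set.
def mae(label, pred):
    return np.abs(label - pred.reshape(label.shape)).mean()

mae_metric = mx.metric.CustomMetric(feval=mae, name='mae')
model.score(eval_iter, mae_metric)
```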
| |