# Interactive Training using Python
---
`Layer` class ([layer.py](layer.py)) has the following methods for interactive training.
For the basic usage of Python binding features, please refer to [python.md](python.md).
**ComputeFeature(self, \*srclys)**
* This method creates and sets up a singa::Layer, maintains its source layers, and then calls singa::Layer::ComputeFeature(...) for data transformation.
* `*srclys`: (an arbitrary number of) source layers
**ComputeGradient(self)**
* This method calls singa::Layer::ComputeGradient(...) for gradient computation.
**GetParams(self)**
* This method calls singa::Layer::GetParam() to retrieve the parameter values of the layer. Currently, it returns the weight and bias, each as a 2D numpy array.
**SetParams(self, \*params)**
* This method sets the parameter values of the layer (see the sketch below).
* `*params`: (an arbitrary number of) parameters, each a 2D numpy array. Typically, it sets the weight and bias.
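For example, a minimal sketch of reading and writing parameters, assuming `fc` is a `Dense` layer whose parameters have already been allocated (i.e., its `ComputeFeature()` has been called at least once):
```
w, b = fc.GetParams()   # weight and bias, each a 2D numpy array
w *= 0.5                # e.g., rescale the weight matrix
fc.SetParams(w, b)      # write the modified values back to the layer
```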
* * *
`Dummy` class is a subclass of `Layer`, provided to fetch input data and/or label information.
Specifically, it creates a singa::DummyLayer.
**Feed(self, shape, data, aux_data)**
* This method sets the input data and/or auxiliary data such as labels.
* `shape`: the shape (width and height) of the dataset
* `data`: the input dataset
* `aux_data`: the auxiliary dataset (e.g., labels)
In addition, the `Dummy` class has two subclasses, `ImageInput` and `LabelInput`.
* The `ImageInput` class takes three arguments:
**\_\_init__(self, height=None, width=None, nb_channel=1)**
* Both `ImageInput` and `LabelInput` have their own `Feed` method, which calls `Feed` of the `Dummy` class (see the sketch below).
**Feed(self, data)**
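A minimal sketch of feeding one batch, assuming 28x28 grayscale images; the arrays below are placeholders for real data:
```
import numpy as np

input = ImageInput(28, 28)  # height and width; nb_channel defaults to 1
label = LabelInput()

xb = np.random.rand(64, 784).astype(np.float32)  # 64 flattened images
yb = np.zeros((64, 1), dtype=np.float32)         # 64 labels
input.Feed(xb)
label.Feed(yb)
```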
<!--
Users can save or load model parameter (e.g., weight and bias) at anytime during training.
The following methods are provided in `model.py`.
**save_model_parameter(step, fout, neuralnet)**
* This method saves model parameters into the specified checkpoint (fout).
* `step`: the step id of training
* `fout`: the name of checkpoint (output filename)
* `neuralnet`: neural network model, i.e., a list of layers
**load_model_parameter(fin, neuralnet, batchsize=1, data_shape=None)**
* This method loads model parameters from the specified checkpoint (fin).
* `fin`: the name of checkpoint (input filename)
* `neuralnet`: neural network model, i.e., a list of layers
* `batchsize`:
* `data_shape`:
-->
* * *
## Example scripts for interactive training
Two example scripts are provided, [`train_mnist.py`]() and [`train_cifar10.py`](); the first trains an MLP model on the MNIST dataset, and the second trains a CNN model on the CIFAR10 dataset.
* Assume that `nn` is a neural network model, i.e., a list of layers. Currently, these examples consider sequential models only. The example MLP and CNN are shown below.
* `load_dataset()` loads the input data and the corresponding labels, each of which is a 2D numpy array.
For example, loading the MNIST dataset returns x: [60000 x 784] and y: [60000 x 1], while loading the CIFAR10 dataset returns x: [10000 x 3072] and y: [10000 x 1]. A hypothetical loader is sketched after this list.
* `sgd` is an Updater instance. Please see [`python.md`](python.md) and [`model.py`]() for more details.
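For reference, one possible `load_dataset()` for MNIST, assuming the raw IDX files are in the working directory; the file names and parsing below are illustrative, not part of the provided scripts:
```
import numpy as np

def load_dataset():
    # The IDX image file has a 16-byte header; the label file, an 8-byte one.
    raw_x = open('train-images-idx3-ubyte', 'rb').read()
    raw_y = open('train-labels-idx1-ubyte', 'rb').read()
    x = np.frombuffer(raw_x[16:], dtype=np.uint8).reshape(60000, 784)
    y = np.frombuffer(raw_y[8:], dtype=np.uint8).reshape(60000, 1)
    return x.astype(np.float32), y.astype(np.float32)
```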
#### Basic steps for interactive training
* Step 1: Prepare a batch of data and the corresponding labels, then input them using the `Feed()` method.
* Step 2: (a) Transform the data according to the neural network (nn) structure using `ComputeFeature()`. Note that this example considers a sequential model, so a simple loop suffices. (b) Provide the `label` information to the loss layer to compute the loss function. (c) Optionally, print out the training performance, e.g., loss and accuracy.
* Step 3: Compute gradients in the reverse order of the neural network (nn) structure using `ComputeGradient()`.
* Step 4: Update the parameters, e.g., weight and bias, of each layer using `Update()` of the updater.
Here is an example script for interactive training.
```
bsize = 64      # batchsize
disp_freq = 10  # how often to display the training performance

x, y = load_dataset()
for i in range(x.shape[0] / bsize):
    # (Step 1) Input data containing "bsize" samples
    xb, yb = x[i*bsize:(i+1)*bsize, :], y[i*bsize:(i+1)*bsize, :]
    nn[0].Feed(xb)
    label.Feed(yb)

    # (Step 2-a) Transform data according to the neuralnet (nn) structure
    for h in range(1, len(nn)):
        nn[h].ComputeFeature(nn[h-1])

    # (Step 2-b) Provide label to compute loss function
    loss.ComputeFeature(nn[-1], label)

    # (Step 2-c) Print out performance, e.g., loss and accuracy
    if (i+1) % disp_freq == 0:
        print ' Step {:>3}: '.format(i+1),
        loss.display()

    # (Step 3) Compute gradient in a reverse order
    loss.ComputeGradient()
    for h in range(len(nn)-1, 0, -1):
        nn[h].ComputeGradient()
        # (Step 4) Update parameters of the layer
        sgd.Update(i+1, nn[h])
```
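The same forward pass can be reused to evaluate the model on held-out data, since no gradient computation or parameter update is involved. A minimal sketch, assuming `xt` and `yt` hold test data with the same layout as the training set:
```
nn[0].Feed(xt[:bsize, :])
label.Feed(yt[:bsize, :])
for h in range(1, len(nn)):
    nn[h].ComputeFeature(nn[h-1])
loss.ComputeFeature(nn[-1], label)
loss.display()  # prints performance, e.g., loss and accuracy, for this batch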
<a id="model"></a>
### <a href="#model">Example MLP</a>
Here is an example MLP model with 5 fully-connected hidden layers.
Please refer to [`python.md`](python.md) and [`layer.py`](layer.py) for more details about layer definition. `SGD()` is an updater defined in [`model.py`]().
```
input = ImageInput(28, 28) # image height and width
label = LabelInput()
nn = []
nn.append(input)
nn.append(Dense(2500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(2000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(10, init='uniform'))
loss = Loss('softmaxloss')
sgd = SGD(lr=0.001, lr_type='step')
```
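Because `GetParams()` returns plain numpy arrays, the learned parameters can also be checkpointed with numpy directly. A hypothetical sketch (not part of the provided scripts), saving the parameters of every `Dense` layer:
```
import numpy as np

params = {}
for h, lyr in enumerate(nn):
    if isinstance(lyr, Dense):
        w, b = lyr.GetParams()      # weight and bias of layer h
        params['w%d' % h] = w
        params['b%d' % h] = b
np.savez('mlp_checkpoint.npz', **params)
```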
<a id="model2"></a>
### <a href="#model2">Example CNN</a>
Here is an example CNN model with 3 convolution layers and 3 pooling layers.
Please refer to [`python.md`](python.md) and [`layer.py`](layer.py) for more details about layer definition. `SGD()` is an updater defined in [`model.py`]().
```
input = ImageInput(32, 32, 3) # image height, width, channel
label = LabelInput()
nn = []
nn.append(input)
nn.append(Convolution2D(32, 5, 1, 2, w_std=0.0001, b_lr=2))
nn.append(MaxPooling2D(pool_size=(3,3), stride=2))
nn.append(Activation('relu'))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(32, 5, 1, 2, b_lr=2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(64, 5, 1, 2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(Dense(10, w_wd=250, b_lr=2, b_wd=0))
loss = Loss('softmaxloss')
sgd = SGD(decay=0.004, momentum=0.9, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
```