# Interactive Training using Python
---
`Layer` class ([layer.py](layer.py)) has the following methods for interactive training.
For the basic usage of Python binding features, please refer to [python.md](python.md).
**ComputeFeature(self, \*srclys)**
* This method creates and sets up a singa::Layer, maintains its source layers, and then calls singa::Layer::ComputeFeature(...) for data transformation.
* `*srclys`: (an arbitrary number of) source layers
**ComputeGradient(self)**
* This method calls singa::Layer::ComputeGradient(...) for gradient computation.
**GetParams(self)**
* This method calls singa::Layer::GetParam() to retrieve the parameter values of the layer. Currently, it returns the weight and bias, each as a 2D numpy array.
**SetParams(self, \*params)**
* This method sets the parameter values of the layer (see the sketch below).
* `*params`: (an arbitrary number of) parameters, each a 2D numpy array. Typically, it sets the weight and bias.
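For example, a minimal sketch of reading and writing parameters, assuming `fc` is a `Dense` layer whose parameters have already been allocated (i.e., its `ComputeFeature()` has been called at least once):
```
w, b = fc.GetParams()   # weight and bias, each a 2D numpy array
w *= 0.5                # e.g., rescale the weight matrix
fc.SetParams(w, b)      # write the modified values back to the layer
```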
* * *
`Dummy` class is a subclass of `Layer`, provided to fetch input data and/or label information.
Specifically, it creates a singa::DummyLayer.
**Feed(self, shape, data, aux_data)**
* This method sets the input data and/or auxiliary data such as labels.
* `shape`: the shape (width and height) of the dataset
* `data`: the input dataset
* `aux_data`: the auxiliary dataset (e.g., labels)
In addition, the `Dummy` class has two subclasses, `ImageInput` and `LabelInput`.
* The `ImageInput` class takes three arguments:
**\_\_init__(self, height=None, width=None, nb_channel=1)**
* Both `ImageInput` and `LabelInput` have their own `Feed` method, which calls `Feed` of the `Dummy` class (see the sketch below).
**Feed(self, data)**
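A minimal sketch of feeding one batch, assuming 28x28 grayscale images; the arrays below are placeholders for real data:
```
import numpy as np

input = ImageInput(28, 28)  # height and width; nb_channel defaults to 1
label = LabelInput()

xb = np.random.rand(64, 784).astype(np.float32)  # 64 flattened images
yb = np.zeros((64, 1), dtype=np.float32)         # 64 labels
input.Feed(xb)
label.Feed(yb)
```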
<!--
Users can save or load model parameter (e.g., weight and bias) at anytime during training.
The following methods are provided in `model.py`.
**save_model_parameter(step, fout, neuralnet)**
* This method saves model parameters into the specified checkpoint (fout).
* `step`: the step id of training
* `fout`: the name of checkpoint (output filename)
* `neuralnet`: neural network model, i.e., a list of layers
**load_model_parameter(fin, neuralnet, batchsize=1, data_shape=None)**
* This method loads model parameters from the specified checkpoint (fin).
* `fin`: the name of checkpoint (input filename)
* `neuralnet`: neural network model, i.e., a list of layers
* `batchsize`:
* `data_shape`:
-->
* * *
## Example scripts for interactive training
Two example scripts are provided, [`train_mnist.py`]() and [`train_cifar10.py`](); the first trains an MLP model on the MNIST dataset, and the second trains a CNN model on the CIFAR10 dataset.
* Assume that `nn` is a neural network model, i.e., a list of layers. Currently, these examples consider sequential models only. The example MLP and CNN are shown below.
* `load_dataset()` loads the input data and the corresponding labels, each of which is a 2D numpy array.
For example, loading the MNIST dataset returns x: [60000 x 784] and y: [60000 x 1], while loading the CIFAR10 dataset returns x: [10000 x 3072] and y: [10000 x 1]. A hypothetical loader is sketched after this list.
* `sgd` is an Updater instance. Please see [`python.md`](python.md) and [`model.py`]() for more details.
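For reference, one possible `load_dataset()` for MNIST, assuming the raw IDX files are in the working directory; the file names and parsing below are illustrative, not part of the provided scripts:
```
import numpy as np

def load_dataset():
    # The IDX image file has a 16-byte header; the label file, an 8-byte one.
    raw_x = open('train-images-idx3-ubyte', 'rb').read()
    raw_y = open('train-labels-idx1-ubyte', 'rb').read()
    x = np.frombuffer(raw_x[16:], dtype=np.uint8).reshape(60000, 784)
    y = np.frombuffer(raw_y[8:], dtype=np.uint8).reshape(60000, 1)
    return x.astype(np.float32), y.astype(np.float32)
```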
#### Basic steps for interactive training
* Step 1: Prepare a batch of data and the corresponding labels, then input them using the `Feed()` method.
* Step 2: (a) Transform the data according to the neural network (nn) structure using `ComputeFeature()`. Note that this example considers a sequential model, so a simple loop suffices. (b) Provide the `label` information to the loss layer to compute the loss function. (c) Optionally, print out the training performance, e.g., loss and accuracy.
* Step 3: Compute gradients in the reverse order of the neural network (nn) structure using `ComputeGradient()`.
* Step 4: Update the parameters, e.g., weight and bias, of each layer using `Update()` of the updater.
Here is an example script for interactive training.
```
bsize = 64      # batchsize
disp_freq = 10  # how often to display the training performance

x, y = load_dataset()
for i in range(x.shape[0] / bsize):
    # (Step 1) Input data containing "bsize" samples
    xb, yb = x[i*bsize:(i+1)*bsize, :], y[i*bsize:(i+1)*bsize, :]
    nn[0].Feed(xb)
    label.Feed(yb)

    # (Step 2-a) Transform data according to the neuralnet (nn) structure
    for h in range(1, len(nn)):
        nn[h].ComputeFeature(nn[h-1])

    # (Step 2-b) Provide label to compute loss function
    loss.ComputeFeature(nn[-1], label)

    # (Step 2-c) Print out performance, e.g., loss and accuracy
    if (i+1) % disp_freq == 0:
        print ' Step {:>3}: '.format(i+1),
        loss.display()

    # (Step 3) Compute gradient in a reverse order
    loss.ComputeGradient()
    for h in range(len(nn)-1, 0, -1):
        nn[h].ComputeGradient()
        # (Step 4) Update parameters of the layer
        sgd.Update(i+1, nn[h])
```
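The same forward pass can be reused to evaluate the model on held-out data, since no gradient computation or parameter update is involved. A minimal sketch, assuming `xt` and `yt` hold test data with the same layout as the training set:
```
nn[0].Feed(xt[:bsize, :])
label.Feed(yt[:bsize, :])
for h in range(1, len(nn)):
    nn[h].ComputeFeature(nn[h-1])
loss.ComputeFeature(nn[-1], label)
loss.display()  # prints performance, e.g., loss and accuracy, for this batch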
<a id="model"></a>
### <a href="#model">Example MLP</a>
Here is an example MLP model with 5 fully-connected hidden layers.
Please refer to [`python.md`](python.md) and [`layer.py`](layer.py) for more details about layer definition. `SGD()` is an updater defined in [`model.py`]().
```
input = ImageInput(28, 28) # image height and width
label = LabelInput()
nn = []
nn.append(input)
nn.append(Dense(2500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(2000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(1000, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(500, init='uniform'))
nn.append(Activation('stanh'))
nn.append(Dense(10, init='uniform'))
loss = Loss('softmaxloss')
sgd = SGD(lr=0.001, lr_type='step')
```
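Because `GetParams()` returns plain numpy arrays, the learned parameters can also be checkpointed with numpy directly. A hypothetical sketch (not part of the provided scripts), saving the parameters of every `Dense` layer:
```
import numpy as np

params = {}
for h, lyr in enumerate(nn):
    if isinstance(lyr, Dense):
        w, b = lyr.GetParams()      # weight and bias of layer h
        params['w%d' % h] = w
        params['b%d' % h] = b
np.savez('mlp_checkpoint.npz', **params)
```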
<a id="model2"></a>
### <a href="#model2">Example CNN</a>
Here is an example CNN model with 3 convolution layers and 3 pooling layers.
Please refer to [`python.md`](python.md) and [`layer.py`](layer.py) for more details about layer definition. `SGD()` is an updater defined in [`model.py`]().
```
input = ImageInput(32, 32, 3) # image height, width, channel
label = LabelInput()
nn = []
nn.append(input)
nn.append(Convolution2D(32, 5, 1, 2, w_std=0.0001, b_lr=2))
nn.append(MaxPooling2D(pool_size=(3,3), stride=2))
nn.append(Activation('relu'))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(32, 5, 1, 2, b_lr=2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(LRN2D(3, alpha=0.00005, beta=0.75))
nn.append(Convolution2D(64, 5, 1, 2))
nn.append(Activation('relu'))
nn.append(AvgPooling2D(pool_size=(3,3), stride=2))
nn.append(Dense(10, w_wd=250, b_lr=2, b_wd=0))
loss = Loss('softmaxloss')
sgd = SGD(decay=0.004, momentum=0.9, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
```