
 # Python Binding

 ---

 The Python binding provides APIs for configuring a training job in the style of
 [keras](http://keras.io/), including the neural net, the training
 algorithm, etc. It replaces the configuration file (e.g., *job.conf*) in
 protobuf format, which is typically long and error-prone to prepare. Python
 functions for interacting with the layer and neural net objects
 (see [here](python_interactive_training.html)) are planned; they would enable
 users to train and debug their models interactively.

 The Python-related code is laid out as follows:

     SINGAROOT/tool/python
     |-- pb2 (has job_pb2.py)
     |-- singa
         |-- model.py
         |-- layer.py
         |-- parameter.py
         |-- initialization.py
         |-- utils
             |-- utility.py
             |-- message.py
     |-- examples
         |-- cifar10_cnn.py, mnist_mlp.py, mnist_rbm1.py, mnist_ae.py, etc.
         |-- datasets
             |-- cifar10.py
             |-- mnist.py

 ## Compiling and running instructions

 To use the Python APIs, pass the following arguments to `configure` when
 compiling SINGA:

     ./configure --enable-python --with-python=PYTHON_DIR
     make

 where PYTHON_DIR is the directory that contains Python.h.
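
One way to find a candidate for PYTHON_DIR is to ask the interpreter where it expects its C headers. The snippet below is a small sketch using the standard `sysconfig` module. Whether the reported directory is the right one depends on which Python installation SINGA should be built against, and Python.h may be missing if the development headers are not installed.

```python
import os
import sysconfig

# Directory where this interpreter expects its C headers to live.
include_dir = sysconfig.get_paths()["include"]
header = os.path.join(include_dir, "Python.h")

print("candidate PYTHON_DIR:", include_dir)
print("Python.h present:", os.path.exists(header))
```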


 The training program is launched by

     bin/singa-run.sh -exec <user_main.py>

 where user_main.py creates the JobProto object and passes it to Driver::Train to
 start the training.

 For example,

     cd SINGAROOT
     bin/singa-run.sh -exec tool/python/examples/cifar10_cnn.py


 ## Examples


 ### MLP Example

 This example uses the Python APIs to configure and train an MLP model over the MNIST
 dataset. The configuration content is the same as that written in *SINGAROOT/examples/mnist/job.conf*.

 ```
 X_train, X_test, workspace = mnist.load_data()

 m = Sequential('mlp', sys.argv)

 m.add(Dense(2500, init='uniform', activation='stanh'))
 m.add(Dense(2000, init='uniform', activation='stanh'))
 m.add(Dense(1500, init='uniform', activation='stanh'))
 m.add(Dense(1000, init='uniform', activation='stanh'))
 m.add(Dense(500,  init='uniform', activation='stanh'))
 m.add(Dense(10, init='uniform', activation='softmax'))

 sgd = SGD(lr=0.001, lr_type='step')
 topo = Cluster(workspace)
 m.compile(loss='categorical_crossentropy', optimizer=sgd, cluster=topo)
 m.fit(X_train, nb_epoch=1000, with_test=True)
 result = m.evaluate(X_test, batch_size=100, test_steps=10, test_freq=60)
 ```

 ### CNN Example

 This example uses the Python APIs to configure and train a CNN model over the CIFAR-10
 dataset. The configuration content is the same as that written in *SINGAROOT/examples/cifar10/job.conf*.


 ```
 X_train, X_test, workspace = cifar10.load_data()

 m = Sequential('cnn', sys.argv)

 m.add(Convolution2D(32, 5, 1, 2, w_std=0.0001, b_lr=2))
 m.add(MaxPooling2D(pool_size=(3,3), stride=2))
 m.add(Activation('relu'))
 m.add(LRN2D(3, alpha=0.00005, beta=0.75))

 m.add(Convolution2D(32, 5, 1, 2, b_lr=2))
 m.add(Activation('relu'))
 m.add(AvgPooling2D(pool_size=(3,3), stride=2))
 m.add(LRN2D(3, alpha=0.00005, beta=0.75))

 m.add(Convolution2D(64, 5, 1, 2))
 m.add(Activation('relu'))
 m.add(AvgPooling2D(pool_size=(3,3), stride=2))

 m.add(Dense(10, w_wd=250, b_lr=2, b_wd=0, activation='softmax'))

 sgd = SGD(decay=0.004, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
 topo = Cluster(workspace)
 m.compile(optimizer=sgd, cluster=topo)
 m.fit(X_train, nb_epoch=1000, with_test=True)
 result = m.evaluate(X_test, 1000, test_steps=30, test_freq=300)
 ```
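
The `step` and `step_lr` tuples in the `SGD(...)` call above pair step boundaries with learning rates. The sketch below shows one plausible reading of that pairing: a piecewise-constant schedule in which each boundary is the first step at which its rate applies. Both that interpretation and the helper name `manual_lr` are assumptions for illustration; this is not SINGA's updater code.

```python
from bisect import bisect_right

# Values copied from the SGD(...) call in the CNN example.
STEPS = (0, 60000, 65000)
STEP_LRS = (0.001, 0.0001, 0.00001)


def manual_lr(step, steps=STEPS, step_lrs=STEP_LRS):
    """Return the learning rate in effect at `step`.

    Assumption: each entry in `steps` marks the first step at which the
    matching entry in `step_lrs` applies (piecewise-constant schedule).
    """
    if len(steps) != len(step_lrs):
        raise ValueError("steps and step_lrs must have the same length")
    if step < steps[0]:
        raise ValueError("step precedes the first boundary")
    # bisect_right counts the boundaries that are <= step.
    return step_lrs[bisect_right(steps, step) - 1]


for s in (0, 59999, 60000, 70000):
    print(s, manual_lr(s))
```

Under this reading the rate drops by a factor of ten at step 60000 and again at step 65000.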


 ### RBM Example

 This example uses the Python APIs to configure and train an RBM model over the MNIST
 dataset. The configuration content is the same as that written in *SINGAROOT/examples/rbm\*.conf*.

 ```
 rbmid = 3
 X_train, X_test, workspace = mnist.load_data(nb_rbm=rbmid)
 m = Energy('rbm'+str(rbmid), sys.argv)

 out_dim = [1000, 500, 250]
 m.add(RBM(out_dim, w_std=0.1, b_wd=0))

 sgd = SGD(lr=0.1, decay=0.0002, momentum=0.8)
 topo = Cluster(workspace)
 m.compile(optimizer=sgd, cluster=topo)
 m.fit(X_train, alg='cd', nb_epoch=6000)
 ```

 ### AutoEncoder Example
 This example uses the Python APIs to configure and train an autoencoder model over
 the MNIST dataset. The configuration content is the same as that written in
 *SINGAROOT/examples/autoencoder.conf*.


 ```
 rbmid = 4
 X_train, X_test, workspace = mnist.load_data(nb_rbm=rbmid+1)
 m = Sequential('autoencoder', sys.argv)

 hid_dim = [1000, 500, 250, 30]
 m.add(Autoencoder(hid_dim, out_dim=784, activation='sigmoid', param_share=True))

 agd = AdaGrad(lr=0.01)
 topo = Cluster(workspace)
 m.compile(loss='mean_squared_error', optimizer=agd, cluster=topo)
 m.fit(X_train, alg='bp', nb_epoch=12200)
 ```
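
To make the `hid_dim` and `out_dim` arguments concrete, the sketch below lists the layer widths of an unrolled autoencoder. It assumes, without confirmation from the text, that the decoder mirrors the encoder and that the input width equals `out_dim`. `unrolled_dims` is a hypothetical helper, not part of the binding.

```python
def unrolled_dims(hid_dim, out_dim):
    """List layer widths from input to reconstruction.

    Assumptions: the input width equals out_dim, and the decoder
    mirrors the encoder's hidden widths in reverse order.
    """
    encoder = [out_dim] + list(hid_dim)
    decoder = encoder[-2::-1]  # every width before the code layer, reversed
    return encoder + decoder


# Arguments taken from the Autoencoder(...) call in the example.
print(unrolled_dims([1000, 500, 250, 30], out_dim=784))
# prints [784, 1000, 500, 250, 30, 250, 500, 1000, 784]
```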

 ### To run SINGA on GPU

 Users need to pass a list of GPU ids in the `device` field of fit() or evaluate().
 The number of GPUs must equal the number of workers configured in the
 cluster topology.


 ```
 gpu_id = [0]
 m.fit(X_train, nb_epoch=100, with_test=True, device=gpu_id)
 ```
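
The length constraint can be checked in plain Python before the list is handed to fit(). The helper `check_gpu_ids` and its `nb_workers` argument are hypothetical names used for illustration; the binding itself may or may not perform such a check.

```python
def check_gpu_ids(gpu_ids, nb_workers):
    """Return gpu_ids unchanged if exactly one GPU id is given per worker."""
    if len(gpu_ids) != nb_workers:
        raise ValueError(
            "got %d GPU id(s) for %d worker(s)" % (len(gpu_ids), nb_workers))
    return gpu_ids


# One worker, one GPU: the list passes through and can be used as `device`.
gpu_id = check_gpu_ids([0], nb_workers=1)
print(gpu_id)
```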

 ### TIPS

 Hidden layers for MLP can be configured as

 ```
 for n in [2500, 2000, 1500, 1000, 500]:
   m.add(Dense(n, init='uniform', activation='tanh'))
 m.add(Dense(10, init='uniform', activation='softmax'))
 ```

 Activation layer can be specified separately

 ```
 m.add(Dense(2500, init='uniform'))
 m.add(Activation('tanh'))
 ```

 Users can explicitly specify hyper-parameters of weight and bias

 ```
 par = Parameter(init='uniform', scale=0.05)
 m.add(Dense(2500, w_param=par, b_param=par, activation='tanh'))
 m.add(Dense(2000, w_param=par, b_param=par, activation='tanh'))
 m.add(Dense(1500, w_param=par, b_param=par, activation='tanh'))
 m.add(Dense(1000, w_param=par, b_param=par, activation='tanh'))
 m.add(Dense(500, w_param=par, b_param=par, activation='tanh'))
 m.add(Dense(10, w_param=par, b_param=par, activation='softmax'))
 ```


 ```
 # Shared Parameter objects for weights and biases
 parw = Parameter(init='gaussian', std=0.0001)
 parb = Parameter(init='constant', value=0)
 m.add(Convolution2D(32, 5, 1, 2, w_param=parw, b_param=parb, b_lr=2))
 m.add(MaxPooling2D(pool_size=(3,3), stride=2))
 m.add(Activation('relu'))
 m.add(LRN2D(3, alpha=0.00005, beta=0.75))

 # Change the weight std before adding the remaining layers
 parw.update(std=0.01)
 m.add(Convolution2D(32, 5, 1, 2, w_param=parw, b_param=parb))
 m.add(Activation('relu'))
 m.add(AvgPooling2D(pool_size=(3,3), stride=2))
 m.add(LRN2D(3, alpha=0.00005, beta=0.75))

 m.add(Convolution2D(64, 5, 1, 2, w_param=parw, b_param=parb, b_lr=1))
 m.add(Activation('relu'))
 m.add(AvgPooling2D(pool_size=(3,3), stride=2))

 m.add(Dense(10, w_param=parw, w_wd=250, b_param=parb, b_lr=2, b_wd=0, activation='softmax'))
 ```


 Data can be added in this way,

 ```
 X_train, X_test, workspace = mnist.load_data()  # parameter values are set in load_data()
 m.fit(X_train, ...)                             # Data layer for training is added
 m.evaluate(X_test, ...)                         # Data layer for testing is added
 ```
 or this way,

 ```
 X_train, X_test, workspace = mnist.load_data()  # parameter values are set in load_data()
 m.add(X_train)                                  # explicitly add Data layer
 m.add(X_test)                                   # explicitly add Data layer
 ```


 ```
 store = Store(path='train.bin', batch_size=64, ...)        # parameter values are set explicitly
 m.add(Data(load='recordinput', phase='train', conf=store)) # Data layer is added
 store = Store(path='test.bin', batch_size=100, ...)        # parameter values are set explicitly
 m.add(Data(load='recordinput', phase='test', conf=store))  # Data layer is added
 ```


 ### Cases to run SINGA

 (1) Run SINGA for training

 ```
 m.fit(X_train, nb_epoch=1000)
 ```

 (2) Run SINGA for training and validation

 ```
 m.fit(X_train, validate_data=X_valid, nb_epoch=1000)
 ```

 (3) Run SINGA for test while training

 ```
 m.fit(X_train, nb_epoch=1000, with_test=True)
 result = m.evaluate(X_test, batch_size=100, test_steps=100)
 ```

 (4) Run SINGA for test only, assuming a checkpoint exists after training

 ```
 result = m.evaluate(X_test, batch_size=100, checkpoint_path=workspace+'/checkpoint/step100-worker0')
 ```


 ## Implementation Details

 ### Layer class (inherited)

 * Data
 * Dense
 * Activation
 * Convolution2D
 * MaxPooling2D
 * AvgPooling2D
 * LRN2D
 * Dropout
 * RBM
 * Autoencoder

 ### Model class

 The Model class holds `jobconf` (a JobProto) and `layers` (a list of layers).
 It has 2 subclasses: the Sequential model and the Energy model.

 Methods in the Model class

 * add
 	* add a Layer to the Model

 * compile
 	* set Updater (i.e., optimizer) and Cluster (i.e., topology) components

 * fit
 	* set Training data and parameter values for the training
 		* (optional) set Validation data and parameter values
 	* set Train_one_batch component
 	* specify `with_test` field if a user wants to run SINGA with test data simultaneously.
 	* [TODO] receive train/validation results, e.g., accuracy, loss, ppl, etc.

 * evaluate
 	* set Testing data and parameter values for the testing
 	* specify `checkpoint_path` field if a user wants to run SINGA only for testing.
 	* [TODO] receive test results, e.g., accuracy, loss, ppl, etc.
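
The structure described above (a job configuration plus a list of layers, filled in by `add` and `compile`) can be pictured with a toy stand-in. The classes below are hypothetical and deliberately simplified: a plain dict replaces the JobProto message, strings replace Layer objects, and the ordering rule in `ToySequential` is an assumption. This is a sketch of the pattern, not SINGA's code.

```python
class ToyModel(object):
    """Stand-in for the Model base class; not SINGA's implementation."""

    def __init__(self, name):
        # A plain dict stands in for the JobProto message.
        self.jobconf = {"name": name}
        self.layers = []

    def add(self, layer):
        # Append the layer to the model's layer list.
        self.layers.append(layer)

    def compile(self, optimizer=None, cluster=None):
        # Record the updater and cluster components in the job config.
        self.jobconf["updater"] = optimizer
        self.jobconf["cluster"] = cluster


class ToySequential(ToyModel):
    """Assumes layers are connected in the order they were added."""

    def connections(self):
        # Pair each layer with the one added right after it.
        return list(zip(self.layers[:-1], self.layers[1:]))


m = ToySequential("mlp")
for name in ("data", "dense1", "softmax"):
    m.add(name)
m.compile(optimizer="sgd", cluster="topo")
print(m.connections())
```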

 ### Results

 fit() and evaluate() return the train/test results as a dictionary:

 * [key]: step number
 * [value]: a list of dictionaries, each using the keys
 	* 'acc' for accuracy
 	* 'loss' for loss
 	* 'ppl' for perplexity
 	* 'se' for squared error
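
A result shaped this way can be flattened into a per-metric series with a few lines of Python. The numbers below are made-up placeholders that only illustrate the layout, and `metric_by_step` is a hypothetical helper rather than part of the binding.

```python
# Placeholder values, shaped as {step number: [{metric name: value}, ...]}.
result = {
    100: [{"acc": 0.50, "loss": 1.40}],
    200: [{"acc": 0.75, "loss": 0.90}],
}


def metric_by_step(result, name):
    """Return [(step, value), ...] sorted by step for one metric name."""
    rows = []
    for step in sorted(result):
        for record in result[step]:
            # Skip records that do not report this metric.
            if name in record:
                rows.append((step, record[name]))
    return rows


print(metric_by_step(result, "acc"))
```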


 ### Parameter class

 Users set the parameter fields and the initial values, for example:

 * Parameter (fields in Param proto)
 	* lr = (float) // learning rate multiplier, used to scale the learning rate when updating parameters.
 	* wd = (float) // weight decay multiplier, used to scale the weight decay when updating parameters.

 * Parameter initialization (fields in ParamGen proto)
 	* init = (string) // one of the types, 'uniform', 'constant', 'gaussian'
 	* high = (float)  // for 'uniform'
 	* low = (float)   // for 'uniform'
 	* value = (float) // for 'constant'
 	* mean = (float)  // for 'gaussian'
 	* std = (float)   // for 'gaussian'

 * Weight (`w_param`) is 'gaussian' with mean=0, std=0.01 by default

 * Bias (`b_param`) is 'constant' with value=0 by default

 * How to update the parameter fields
 	* for updating Weight, put `w_` in front of field name
 	* for updating Bias, put `b_` in front of field name
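
The prefix convention and the defaults listed above can be sketched as a small keyword-argument splitter. `resolve_param_fields` is a hypothetical helper written for illustration; how the binding actually routes these arguments is not shown in this document.

```python
# Defaults as stated in the list above.
WEIGHT_DEFAULTS = {"init": "gaussian", "mean": 0, "std": 0.01}
BIAS_DEFAULTS = {"init": "constant", "value": 0}


def resolve_param_fields(**kwargs):
    """Split w_*/b_* keyword arguments and merge them over the defaults."""
    weight = dict(WEIGHT_DEFAULTS)
    bias = dict(BIAS_DEFAULTS)
    other = {}
    for key, value in kwargs.items():
        if key.startswith("w_"):
            weight[key[2:]] = value   # e.g. w_std -> weight field 'std'
        elif key.startswith("b_"):
            bias[key[2:]] = value     # e.g. b_wd -> bias field 'wd'
        else:
            other[key] = value        # not a parameter field
    return weight, bias, other


w, b, rest = resolve_param_fields(w_std=0.1, w_lr=2, b_wd=0, activation="softmax")
print(w)
print(b)
print(rest)
```

In this sketch `w_std=0.1` overrides the default std while the default mean is kept, and the unprefixed `activation` argument is passed through untouched.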

 Several ways to set Parameter values

 ```
 parw = Parameter(lr=2, wd=10, init='gaussian', std=0.1)
 parb = Parameter(lr=1, wd=0, init='constant', value=0)
 m.add(Convolution2D(10, w_param=parw, b_param=parb, ...))
 ```

 ```
 m.add(Dense(10, w_mean=1, w_std=0.1, w_lr=2, w_wd=10, ...))
 ```

 ```
 parw = Parameter(init='constant', value=0)
 m.add(Dense(10, w_param=parw, w_lr=1, w_wd=1, b_value=1, ...))
 ```

 ### Other classes

 * Store
 * Algorithm
 * Updater
 * SGD
 * AdaGrad
 * Cluster