SINGA-125 Improve Python Helper
- Update README.md
- Update layer.py and model.py
. deal with non-square values for kernel, stride, pad
. users can add an Accuracy layer by specifying 'show_acc=True'
- Update cifar10 examples
. set momentum to 0.9 to speed up convergence
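The non-square handling for kernel, stride, and pad can be sketched as follows. This is an illustrative sketch only (the helper names here are hypothetical, not the actual SINGA code): an int keeps the single square field, while a tuple is split into separate `_x`/`_y` fields, mirroring the layer.py change below.

```python
def conv_fields(name, value):
    """Return convolution_conf fields for one geometry setting (kernel/stride/pad)."""
    if isinstance(value, int):
        # square case: a single proto field, e.g. {'kernel': 5}
        return {name: value}
    # non-square case: separate fields, e.g. kernel_x / kernel_y
    return {name + '_x': value[0], name + '_y': value[1]}

fields = {}
fields.update(conv_fields('kernel', 5))
fields.update(conv_fields('stride', (1, 2)))
```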
diff --git a/tool/python/README.md b/tool/python/README.md
index 02e7fd1..e383cfb 100644
--- a/tool/python/README.md
+++ b/tool/python/README.md
@@ -1,6 +1,8 @@
-## SINGA-81 Add Python Helper, which enables users to construct a model (JobProto) and run Singa in Python
+# Python Helper
- SINGAROOT/tool/python
+Users can construct a model and run SINGA using Python. Specifically, the Python Helper enables users to generate a JobProto for the model and run Driver::Train or Driver::Test from Python. The tool lives in `SINGA_ROOT/tool/python` and consists of the following directories.
+
+ SINGAROOT/tool/python
|-- pb2 (has job_pb2.py)
|-- singa
|-- model.py
@@ -11,79 +13,83 @@
|-- utility.py
|-- message.py
|-- examples
- |-- cifar10_cnn.py, mnist_mlp.py, , mnist_rbm1.py, mnist_ae.py, etc.
+ |-- cifar10_cnn.py, mnist_mlp.py, mnist_rbm1.py, mnist_ae.py, etc.
|-- datasets
|-- cifar10.py
|-- mnist.py
-### How to Run
+## 1. Basic User Guide
+
+In order to use the Python Helper features, users need to add the following option when building SINGA:
+```
+./configure --enable-python --with-python=PYTHON_DIR
+```
+where `PYTHON_DIR` contains `Python.h`.
+
+### (a) How to Run
```
bin/singa-run.sh -exec user_main.py
```
-The python code, e.g., user_main.py, would create the JobProto object and pass it to Driver::Train.
+The Python code, e.g., `user_main.py`, creates the JobProto object and passes it to Driver::Train or Driver::Test.
-For example,
+To run the CIFAR10 example,
```
cd SINGA_ROOT
bin/singa-run.sh -exec tool/python/examples/cifar10_cnn.py
```
-
-Note that, in order to use the Python Helper feature, users need to add the following option
+To run the MNIST example,
```
-./configure --enable-python --with-python=PYTHON_DIR
+cd SINGA_ROOT
+bin/singa-run.sh -exec tool/python/examples/mnist_mlp.py
```
-where PYTHON_DIR has Python.h
-### Layer class (inherited)
+### (b) Class Description
-* Data
-* Dense
-* Activation
-* Convolution2D
-* MaxPooling2D
-* AvgPooling2D
-* LRN2D
-* Dropout
-* RBM
-* Autoencoder
+#### Layer class
-### Model class
+The following classes configure field values for a particular layer and generate its LayerProto.
-* Model class has `jobconf` (JobProto) and `layers` (layer list)
+* `Data` for a data layer.
+* `Dense` for an innerproduct layer.
+* `Activation` for an activation layer.
+* `Convolution2D` for a convolution layer.
+* `MaxPooling2D` for a max pooling layer.
+* `AvgPooling2D` for an average pooling layer.
+* `LRN2D` for a normalization (or local response normalization) layer.
+* `Dropout` for a dropout layer.
+
+In addition, the following classes generate multiple layers for particular models.
+
+* `RBM` for constructing the layers of an RBM.
+* `Autoencoder` for constructing the layers of an Autoencoder.
+
+
+#### Model class
+
+The Model class has `jobconf` (a JobProto) and `layers` (a layer list).
Methods in Model class
-* add
- * add Layer into Model
- * 2 subclasses: Sequential model and Energy model
+* `add` to add a Layer to the model
+ * 2 subclasses: `Sequential` model and `Energy` model
-* compile
- * set Updater (i.e., optimizer) and Cluster (i.e., topology) components
+* `compile` to configure an optimizer and topology for training.
+ * set `Updater` (i.e., optimizer) and `Cluster` (i.e., topology) components
-* fit
+* `fit` to configure field values for training.
* set Training data and parameter values for the training
   * (optional) set Validation data and parameter values
- * set Train_one_batch component
- * specify `with_test` field if a user wants to run singa with test data simultaneously.
- * [TODO] recieve train/validation results, e.g., accuracy, loss, ppl, etc.
+ * set `Train_one_batch` component
+ * set `with_test` argument `True` if users want to run SINGA with test data simultaneously.
+ * return train/validation results, e.g., accuracy, loss, ppl, etc.
-* evaluate
- * set Testing data and parameter values for the testing
- * specify `checkpoint_path` field if a user want to run singa only for testing.
- * [TODO] recieve test results, e.g., accuracy, loss, ppl, etc.
+* `evaluate` to configure field values for testing.
+ * set Testing data and parameter values for the test
+ * specify the `checkpoint_path` field if users want to run SINGA only for testing.
+ * return test results, e.g., accuracy, loss, ppl, etc.
-#### Results
-fit() and evaluate() return train/test results, a dictionary containing
-
-* [key]: step number
-* [value]: a list of dictionay
- * 'acc' for accuracy
- * 'loss' for loss
- * 'ppl' for ppl
- * 'se' for squred error
-
-#### To run Singa on GPU
+### (c) To Run SINGA on GPU
Users need to set a list of GPU IDs in the `device` field of fit() or evaluate().
@@ -94,60 +100,53 @@
```
-### Parameter class
+### (d) How to Set/Update Parameter Values
-Users need to set parameter and initial values. For example,
+Users may need to set/update parameter field values.
-* Parameter (fields in Param proto)
- * lr = (float) // learning rate multiplier, used to scale the learning rate when updating parameters.
- * wd = (float) // weight decay multiplier, used to scale the weight decay when updating parameters.
+* Parameter fields for both Weight and Bias (i.e., fields of ParamProto)
+ * `lr` = (float) : learning rate multiplier, used to scale the learning rate when updating parameters.
+ * `wd` = (float) : weight decay multiplier, used to scale the weight decay when updating parameters.
-* Parameter initialization (fields in ParamGen proto)
- * init = (string) // one of the types, 'uniform', 'constant', 'gaussian'
- * high = (float) // for 'uniform'
- * low = (float) // for 'uniform'
- * value = (float) // for 'constant'
- * mean = (float) // for 'gaussian'
- * std = (float) // for 'gaussian'
+* Parameter initialization (fields of ParamGenProto)
+ * `init` = (string) : one of the types, 'uniform', 'constant', 'gaussian'
+ * `scale` = (float) : for 'uniform', it is used to set `low`=-scale and `high`=+scale
+ * `high` = (float) : for 'uniform'
+ * `low` = (float) : for 'uniform'
+ * `value` = (float) : for 'constant'
+ * `mean` = (float) : for 'gaussian'
+ * `std` = (float) : for 'gaussian'
-* Weight (`w_param`) is 'gaussian' with mean=0, std=0.01 at default
+* Weight (`w_param`) is set as 'gaussian' with `mean`=0 and `std`=0.01 by default.
-* Bias (`b_param`) is 'constant' with value=0 at default
+* Bias (`b_param`) is set as 'constant' with `value`=0 by default.
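The initialization fields above can be sketched in plain Python. This is an illustrative sketch only (the function `init_value` is hypothetical, not the helper's actual implementation): it shows how the `init` type and its associated fields, including `scale` for 'uniform', map to an initial value.

```python
import random

def init_value(init='gaussian', scale=None, low=-1.0, high=1.0,
               value=0.0, mean=0.0, std=0.01):
    """Draw one initial parameter value according to the init fields."""
    if init == 'uniform':
        if scale is not None:
            # scale sets low=-scale and high=+scale
            low, high = -scale, scale
        return random.uniform(low, high)
    if init == 'constant':
        return value
    if init == 'gaussian':
        return random.gauss(mean, std)
    raise ValueError('unknown init type: ' + init)
```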
-* How to update the parameter fields
- * for updating Weight, put `w_` in front of field name
- * for updating Bias, put `b_` in front of field name
+* In order to set/update the parameter fields of either Weight or Bias:
+ * for Weight, put `w_` in front of the field name
+ * for Bias, put `b_` in front of the field name
-Several ways to set Parameter values
-```
-parw = Parameter(lr=2, wd=10, init='gaussian', std=0.1)
-parb = Parameter(lr=1, wd=0, init='constant', value=0)
-m.add(Convolution2D(10, w_param=parw, b_param=parb, ...)
-```
-```
-m.add(Dense(10, w_mean=1, w_std=0.1, w_lr=2, w_wd=10, ...)
-```
-```
-parw = Parameter(init='constant', mean=0)
-m.add(Dense(10, w_param=parw, w_lr=1, w_wd=1, b_value=1, ...)
-```
+ For example,
+ ```
+ m.add(Dense(10, w_mean=1, w_std=0.1, w_lr=2, w_wd=10, ...)
+ ```
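The `w_`/`b_` prefix convention can be sketched as follows. This is an illustration of the naming convention only (the function `split_param_kwargs` is hypothetical, not the helper's actual code): prefixed keyword arguments are split into weight and bias field dictionaries.

```python
def split_param_kwargs(**kwargs):
    """Split 'w_'/'b_'-prefixed kwargs into weight and bias field dicts."""
    wfields, bfields = {}, {}
    for key, val in kwargs.items():
        if key.startswith('w_'):
            wfields[key[2:]] = val   # e.g. w_lr=2  -> weight lr=2
        elif key.startswith('b_'):
            bfields[key[2:]] = val   # e.g. b_value=1 -> bias value=1
    return wfields, bfields

w, b = split_param_kwargs(w_mean=1, w_std=0.1, w_lr=2, w_wd=10, b_value=1)
```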
+
+
+### (e) Results
+
+fit() and evaluate() return training/test results, i.e., a dictionary containing
+
+* [key]: step number
+* [value]: a list of dictionaries
+ * 'acc' for accuracy
+ * 'loss' for loss
+ * 'ppl' for ppl
+ * 'se' for squared error
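Reading the returned dictionary can be sketched as follows. The numbers below are mocked up for illustration (in practice the dictionary comes from fit() or evaluate()): keys are step numbers, values are lists of per-step metric dictionaries.

```python
# mocked-up result of the shape described above
result = {
    100: [{'acc': 0.52, 'loss': 1.31}],
    200: [{'acc': 0.61, 'loss': 1.05}],
}

# collect (step, accuracy) pairs in step order
history = [(step, rec['acc']) for step in sorted(result) for rec in result[step]]
```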
-#### Other classes
+## 2. Examples
-* Store
-* Algorithm
-* Updater
-* SGD
-* AdaGrad
-* Cluster
-
-
-## MLP Example
-
-An example (to generate job.conf for mnist)
-
+### MLP Example (to generate job.conf for MNIST)
```
X_train, X_test, workspace = mnist.load_data()
@@ -167,10 +166,7 @@
result = m.evaluate(X_test, batch_size=100, test_steps=10, test_freq=60)
```
-## CNN Example
-
-An example (to generate job.conf for cifar10)
-
+### CNN Example (to generate job.conf for CIFAR10)
```
X_train, X_test, workspace = cifar10.load_data()
@@ -199,8 +195,7 @@
result = m.evaluate(X_test, 1000, test_steps=30, test_freq=300)
```
-
-## RBM Example
+### RBM Example
```
rbmid = 3
X_train, X_test, workspace = mnist.load_data(nb_rbm=rbmid)
@@ -215,7 +210,7 @@
m.fit(X_train, alg='cd', nb_epoch=6000)
```
-## AutoEncoder Example
+### AutoEncoder Example
```
rbmid = 4
X_train, X_test, workspace = mnist.load_data(nb_rbm=rbmid+1)
@@ -230,7 +225,47 @@
m.fit(X_train, alg='bp', nb_epoch=12200)
```
-### TIPS
+
+## 3. Advanced User Guide
+
+### Parameter class
+
+Users can explicitly set/update parameters. There are several ways to set Parameter values.
+```
+parw = Parameter(lr=2, wd=10, init='gaussian', std=0.1)
+parb = Parameter(lr=1, wd=0, init='constant', value=0)
+m.add(Convolution2D(10, w_param=parw, b_param=parb, ...)
+```
+```
+m.add(Dense(10, w_mean=1, w_std=0.1, w_lr=2, w_wd=10, ...)
+```
+```
+parw = Parameter(init='constant', mean=0)
+m.add(Dense(10, w_param=parw, w_lr=1, w_wd=1, b_value=1, ...)
+```
+
+### Data layer
+
+There are alternative ways to add a Data layer. In addition, users can write their own `load_data` method, as in `cifar10.py` and `mnist.py` under `examples/datasets`.
+```
+X_train, X_test = mnist.load_data() // parameter values are set in load_data()
+m.fit(X_train, ...) // Data layer for training is added
+m.evaluate(X_test, ...) // Data layer for testing is added
+```
+```
+X_train, X_test = mnist.load_data() // parameter values are set in load_data()
+m.add(X_train) // explicitly add Data layer
+m.add(X_test) // explicitly add Data layer
+```
+```
+store = Store(path='train.bin', batch_size=64, ...) // parameter values are set explicitly
+m.add(Data(load='recordinput', phase='train', conf=store)) // Data layer is added
+store = Store(path='test.bin', batch_size=100, ...) // parameter values are set explicitly
+m.add(Data(load='recordinput', phase='test', conf=store)) // Data layer is added
+```
+
+
+### Other Tips
Hidden layers for MLP can be written as
```
@@ -281,26 +316,9 @@
```
-Alternative ways to add Data layer
-```
-X_train, X_test = mnist.load_data() // parameter values are set in load_data()
-m.fit(X_train, ...) // Data layer for training is added
-m.evaluate(X_test, ...) // Data layer for testing is added
-```
-```
-X_train, X_test = mnist.load_data() // parameter values are set in load_data()
-m.add(X_train) // explicitly add Data layer
-m.add(X_test) // explicitly add Data layer
-```
-```
-store = Store(path='train.bin', batch_size=64, ...) // parameter values are set explicitly
-m.add(Data(load='recordinput', phase='train', conf=store)) // Data layer is added
-store = Store(path='test.bin', batch_size=100, ...) // parameter values are set explicitly
-m.add(Data(load='recordinput', phase='test', conf=store)) // Data layer is added
-```
-### Cases to run singa
+### Different Cases to Run SINGA
(1) Run SINGA for training
```
diff --git a/tool/python/examples/cifar10_cnn.py b/tool/python/examples/cifar10_cnn.py
index f03b611..8d4e778 100755
--- a/tool/python/examples/cifar10_cnn.py
+++ b/tool/python/examples/cifar10_cnn.py
@@ -47,7 +47,7 @@
m.add(Dense(10, w_wd=250, b_lr=2, b_wd=0, activation='softmax'))
-sgd = SGD(decay=0.004, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
+sgd = SGD(decay=0.004, momentum=0.9, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
topo = Cluster(workspace)
m.compile(loss='categorical_crossentropy', optimizer=sgd, cluster=topo)
m.fit(X_train, nb_epoch=1000, with_test=True)
diff --git a/tool/python/examples/cifar10_cnn_cudnn.py b/tool/python/examples/cifar10_cnn_cudnn.py
index e87b5c4..e243834 100755
--- a/tool/python/examples/cifar10_cnn_cudnn.py
+++ b/tool/python/examples/cifar10_cnn_cudnn.py
@@ -47,7 +47,7 @@
m.add(Dense(10, w_wd=250, b_lr=2, b_wd=0, activation='softmax'))
-sgd = SGD(decay=0.004, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
+sgd = SGD(decay=0.004, momentum=0.9, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
topo = Cluster(workspace)
m.compile(loss='categorical_crossentropy', optimizer=sgd, cluster=topo)
diff --git a/tool/python/singa/layer.py b/tool/python/singa/layer.py
index b391d26..491e98b 100644
--- a/tool/python/singa/layer.py
+++ b/tool/python/singa/layer.py
@@ -95,12 +95,12 @@
activation=None, **kwargs):
'''
required
- nb_filter = (int) // the number of filters
- kernel = (int) // the size of filter
+ nb_filter = (int) // the number of filters
+ kernel = (int/tuple) // the size of filter
optional
- stride = (int) // the size of stride
- pad = (int) // the size of padding
- init = (string) // 'unirom', 'gaussian', 'constant'
+ stride = (int/tuple) // the size of stride
+ pad = (int/tuple) // the size of padding
+ init = (string) // 'uniform', 'gaussian', 'constant'
w_param = (Parameter) // Parameter object for weight
b_param = (Parameter) // Parameter object for bias
**kwargs (KEY=VALUE)
@@ -112,13 +112,29 @@
b_wd = (float) // weight decay multiplier for bias
'''
- assert nb_filter > 0 and kernel > 0, 'should be set as positive int'
+ assert nb_filter > 0, 'nb_filter should be set as positive int'
super(Convolution2D, self).__init__(name=generate_name('conv', 1),
type=kCConvolution)
- fields = {'num_filters' : nb_filter,
- 'kernel' : kernel,
- 'stride' : stride,
- 'pad' : pad}
+ fields = {}
+ # for kernel
+ if type(kernel) == int:
+ fields['kernel'] = kernel
+ else:
+ fields['kernel_x'] = kernel[0]
+ fields['kernel_y'] = kernel[1]
+ # for stride
+ if type(stride) == int:
+ fields['stride'] = stride
+ else:
+ fields['stride_x'] = stride[0]
+ fields['stride_y'] = stride[1]
+ # for pad
+ if type(pad) == int:
+ fields['pad'] = pad
+ else:
+ fields['pad_x'] = pad[0]
+ fields['pad_y'] = pad[1]
+
setval(self.layer.convolution_conf, **fields)
# parameter w
@@ -158,7 +174,7 @@
if type(pool_size) == int:
pool_size = (pool_size, pool_size)
assert type(pool_size) == tuple and pool_size[0] == pool_size[1], \
- 'pool size should be square in Singa'
+ 'currently pool size should be square in Singa'
super(MaxPooling2D, self).__init__(name=generate_name('pool'),
type=kCPooling, **kwargs)
fields = {'pool' : PoolingProto().MAX,
@@ -184,7 +200,7 @@
if type(pool_size) == int:
pool_size = (pool_size, pool_size)
assert type(pool_size) == tuple and pool_size[0] == pool_size[1], \
- 'pool size should be square in Singa'
+ 'currently pool size should be square in Singa'
super(AvgPooling2D, self).__init__(name=generate_name('pool'),
type=kCPooling, **kwargs)
self.layer.pooling_conf.pool = PoolingProto().AVG
@@ -242,6 +258,16 @@
type=self.layer_type)
self.layer.dropout_conf.dropout_ratio = ratio
+class Accuracy(Layer):
+
+ def __init__(self):
+ '''
+ '''
+
+ self.name = 'accuracy'
+ self.layer_type = enumLayerType(self.name)
+ super(Accuracy, self).__init__(name=generate_name(self.name),
+ type=self.layer_type)
class RGB(Layer):
@@ -268,7 +294,7 @@
output_dim = (int)
optional
activation = (string)
- init = (string) // 'unirom', 'gaussian', 'constant'
+ init = (string) // 'uniform', 'gaussian', 'constant'
w_param = (Parameter) // Parameter object for weight
b_param = (Parameter) // Parameter object for bias
**kwargs
diff --git a/tool/python/singa/model.py b/tool/python/singa/model.py
index 6ad9422..f652f86 100644
--- a/tool/python/singa/model.py
+++ b/tool/python/singa/model.py
@@ -55,6 +55,7 @@
self.result = None
self.last_checkpoint_path = None
self.cudnn = False
+ self.accuracy = False
def add(self, layer):
'''
@@ -151,6 +152,17 @@
else:
getattr(lastly, 'srclayers').append(self.layers[0].layer.name)
+ if self.accuracy == True:
+ smly = net.layer.add()
+ smly.CopyFrom(Layer(name='softmax', type=kSoftmax).layer)
+ setval(smly, include=kTest)
+ getattr(smly, 'srclayers').append(self.layers[-1].layer.name)
+ aly = net.layer.add()
+ aly.CopyFrom(Accuracy().layer)
+ setval(aly, include=kTest)
+ getattr(aly, 'srclayers').append('softmax')
+ getattr(aly, 'srclayers').append(self.layers[0].layer.name)
+
# use of cudnn
if self.cudnn == True:
self.set_cudnn_layer_type(net)
@@ -230,7 +242,8 @@
pass
def evaluate(self, data=None, alg='bp',
- checkpoint_path=None, execpath='', device=None, **fields):
+ checkpoint_path=None, execpath='',
+ device=None, show_acc=False, **fields):
'''
required
data = (Data) // Data class object for testing data
@@ -239,6 +252,7 @@
checkpoint_path = (list) // checkpoint path
execpaths = (string) // path to user's own executable
device = (int/list) // a list of gpu ids
+ show_acc = (bool) // compute and show the accuracy
**fields (KEY=VALUE)
batch_size = (int) // batch size for testing data
test_freq = (int) // frequency of testing
@@ -276,6 +290,9 @@
setval(self.jobconf, gpu=device)
self.cudnn = True
+ # set True if showing the accuracy
+ self.accuracy = show_acc
+
self.build() # construct Nneuralnet Component
#--- generate job.conf file for debug purpose