SINGAROOT/tool/python |-- pb2 (has job_pb2.py) |-- singa |-- model.py |-- layer.py |-- parameter.py |-- initialization.py |-- utils |-- utility.py |-- message.py |-- examples |-- cifar10_cnn.py, mnist_mlp.py, , mnist_rbm1.py, mnist_ae.py, etc. |-- datasets |-- cifar10.py |-- mnist.py
bin/singa-run.sh -exec user_main.py
The python code, e.g., user_main.py, would create the JobProto object and pass it to Driver::Train.
For example,
cd SINGA_ROOT bin/singa-run.sh -exec tool/python/examples/cifar10_cnn.py
Note that, in order to use the Python Helper feature, users need to add the following option
./configure --enable-python --with-python=PYTHON_DIR
where PYTHON_DIR has Python.h
jobconf
(JobProto) and layers
(layer list)Methods in Model class
add
compile
fit
with_test
field if a user wants to run singa with test data simultaneously.evaluate
checkpoint_path
field if a user want to run singa only for testing.fit() and evaluate() return train/test results, a dictionary containing
Users need to set a list of gpu ids to device
field in fit() or evaluate().
For example,
gpu_id = [0] m.fit(X_train, nb_epoch=100, with_test=True, device=gpu_id)
Users need to set parameter and initial values. For example,
Parameter (fields in Param proto)
Parameter initialization (fields in ParamGen proto)
Weight (w_param
) is ‘gaussian’ with mean=0, std=0.01 at default
Bias (b_param
) is ‘constant’ with value=0 at default
How to update the parameter fields
w_
in front of field nameb_
in front of field nameSeveral ways to set Parameter values
parw = Parameter(lr=2, wd=10, init='gaussian', std=0.1) parb = Parameter(lr=1, wd=0, init='constant', value=0) m.add(Convolution2D(10, w_param=parw, b_param=parb, ...)
m.add(Dense(10, w_mean=1, w_std=0.1, w_lr=2, w_wd=10, ...)
parw = Parameter(init='constant', mean=0) m.add(Dense(10, w_param=parw, w_lr=1, w_wd=1, b_value=1, ...)
An example (to generate job.conf for mnist)
X_train, X_test, workspace = mnist.load_data() m = Sequential('mlp', sys.argv) m.add(Dense(2500, init='uniform', activation='tanh')) m.add(Dense(2000, init='uniform', activation='tanh')) m.add(Dense(1500, init='uniform', activation='tanh')) m.add(Dense(1000, init='uniform', activation='tanh')) m.add(Dense(500, init='uniform', activation='tanh')) m.add(Dense(10, init='uniform', activation='softmax')) sgd = SGD(lr=0.001, lr_type='step') topo = Cluster(workspace) m.compile(loss='categorical_crossentropy', optimizer=sgd, cluster=topo) m.fit(X_train, nb_epoch=1000, with_test=True) result = m.evaluate(X_test, batch_size=100, test_steps=10, test_freq=60)
An example (to generate job.conf for cifar10)
X_train, X_test, workspace = cifar10.load_data() m = Sequential('cnn', sys.argv) m.add(Convolution2D(32, 5, 1, 2, w_std=0.0001, b_lr=2)) m.add(MaxPooling2D(pool_size=(3,3), stride=2)) m.add(Activation('relu')) m.add(LRN2D(3, alpha=0.00005, beta=0.75)) m.add(Convolution2D(32, 5, 1, 2, b_lr=2)) m.add(Activation('relu')) m.add(AvgPooling2D(pool_size=(3,3), stride=2)) m.add(LRN2D(3, alpha=0.00005, beta=0.75)) m.add(Convolution2D(64, 5, 1, 2)) m.add(Activation('relu')) m.add(AvgPooling2D(pool_size=(3,3), stride=2)) m.add(Dense(10, w_wd=250, b_lr=2, b_wd=0, activation='softmax')) sgd = SGD(decay=0.004, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001)) topo = Cluster(workspace) m.compile(updater=sgd, cluster=topo) m.fit(X_train, nb_epoch=1000, with_test=True) result = m.evaluate(X_test, 1000, test_steps=30, test_freq=300)
rbmid = 3 X_train, X_test, workspace = mnist.load_data(nb_rbm=rbmid) m = Energy('rbm'+str(rbmid), sys.argv) out_dim = [1000, 500, 250] m.add(RBM(out_dim, w_std=0.1, b_wd=0)) sgd = SGD(lr=0.1, decay=0.0002, momentum=0.8) topo = Cluster(workspace) m.compile(optimizer=sgd, cluster=topo) m.fit(X_train, alg='cd', nb_epoch=6000)
rbmid = 4 X_train, X_test, workspace = mnist.load_data(nb_rbm=rbmid+1) m = Sequential('autoencoder', sys.argv) hid_dim = [1000, 500, 250, 30] m.add(Autoencoder(hid_dim, out_dim=784, activation='sigmoid', param_share=True)) agd = AdaGrad(lr=0.01) topo = Cluster(workspace) m.compile(loss='mean_squared_error', optimizer=agd, cluster=topo) m.fit(X_train, alg='bp', nb_epoch=12200)
Hidden layers for MLP can be written as
for n in [2500, 2000, 1500, 1000, 500]: m.add(Dense(n, init='uniform', activation='tanh')) m.add(Dense(10, init='uniform', activation='softmax'))
Activation layer can be specified separately
m.add(Dense(2500, init='uniform')) m.add(Activation('tanh'))
Users can explicity specify weight and bias, and their values
for example of MLP
par = Parameter(init='uniform', scale=0.05) m.add(Dense(2500, w_param=par, b_param=par, activation='tanh')) m.add(Dense(2000, w_param=par, b_param=par, activation='tanh')) m.add(Dense(1500, w_param=par, b_param=par, activation='tanh')) m.add(Dense(1000, w_param=par, b_param=par, activation='tanh')) m.add(Dense(500, w_param=par, b_param=par, activation='tanh')) m.add(Dense(10, w_param=par, b_param=par, activation='softmax'))
for example of Cifar10
parw = Parameter(init='gauss', std=0.0001) parb = Parameter(init='const', value=0) m.add(Convolution(32, 5, 1, 2, w_param=parw, b_param=parb, b_lr=2)) m.add(MaxPooling2D(pool_size(3,3), stride=2)) m.add(Activation('relu')) m.add(LRN2D(3, alpha=0.00005, beta=0.75)) parw.update(std=0.01) m.add(Convolution(32, 5, 1, 2, w_param=parw, b_param=parb)) m.add(Activation('relu')) m.add(AvgPooling2D(pool_size(3,3), stride=2)) m.add(LRN2D(3, alpha=0.00005, beta=0.75)) m.add(Convolution(64, 5, 1, 2, w_param=parw, b_param=parb, b_lr=1)) m.add(Activation('relu')) m.add(AvgPooling2D(pool_size(3,3), stride=2)) m.add(Dense(10, w_param=parw, w_wd=250, b_param=parb, b_lr=2, b_wd=0, activation='softmax'))
Alternative ways to add Data layer
X_train, X_test = mnist.load_data() // parameter values are set in load_data() m.fit(X_train, ...) // Data layer for training is added m.evaluate(X_test, ...) // Data layer for testing is added
X_train, X_test = mnist.load_data() // parameter values are set in load_data() m.add(X_train) // explicitly add Data layer m.add(X_test) // explicitly add Data layer
store = Store(path='train.bin', batch_size=64, ...) // parameter values are set explicitly m.add(Data(load='recordinput', phase='train', conf=store)) // Data layer is added store = Store(path='test.bin', batch_size=100, ...) // parameter values are set explicitly m.add(Data(load='recordinput', phase='test', conf=store)) // Data layer is added
(1) Run singa for training
m.fit(X_train, nb_epoch=1000)
(2) Run singa for training and validation
m.fit(X_train, validate_data=X_valid, nb_epoch=1000)
(3) Run singa for test while training
m.fit(X_train, nb_epoch=1000, with_test=True) result = m.evaluate(X_test, batch_size=100, test_steps=100)
(4) Run singa for test only Assume a checkpoint exists after training
result = m.evaluate(X_test, batch_size=100, checkpoint_path=workspace+'/checkpoint/step100-worker0')