In this prototype, we represent a tensor as a matrix stored in row-major format, where the first dimension of the tensor equals the number of rows of the matrix. For example, a tensor (with all zeros) of shape [3, 2, 4, 5] can be instantiated by the following DML statement:
A = matrix(0, rows=3, cols=2*4*5)
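To make this layout concrete, here is a minimal Python sketch (not part of SystemML; the helper name `flat_column` is hypothetical) that computes which matrix column holds a given tensor element. It assumes 0-based indices, whereas DML indexing is 1-based, so the corresponding DML cell would be A[row+1, col+1].

```python
def flat_column(index, shape):
    """Return the 0-based matrix column for a tensor index.

    The first tensor dimension maps directly to the matrix row; the
    remaining dimensions are flattened in row-major order into the column.
    """
    col = 0
    for i, dim in zip(index[1:], shape[1:]):
        if not 0 <= i < dim:
            raise IndexError("index out of range for shape")
        col = col * dim + i
    return col


shape = [3, 2, 4, 5]
num_rows = shape[0]   # rows=3 in the DML statement
num_cols = 2 * 4 * 5  # cols=2*4*5 in the DML statement

# Tensor element [1, 1, 2, 3] is stored at row 1,
# column 1*(4*5) + 2*5 + 3 (0-based).
row = 1
col = flat_column([1, 1, 2, 3], shape)
```

The last tensor dimension varies fastest, so consecutive columns of a row walk through the innermost dimension first.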
The following operators work out of the box when both tensors X and Y have the same shape:
X ^ Y
-X
X %/% Y
X %% Y
X * Y
X / Y
X + Y
X - Y
SystemML does not support implicit broadcasting for the above tensor operations; however, one can write a DML-bodied function to do so. For example, to perform the above operations with broadcasting along the second dimension, one can use the rep(Z, C) function below:
rep = function(matrix[double] Z, int C) return (matrix[double] ret) {
  ret = Z
  for(i in 2:C) {
    ret = cbind(ret, Z)
  }
}
Using the above rep(Z, C) function, we can realize element-wise arithmetic operations with broadcasting. Here are some examples:
X + Y (Note: SystemML does implicit broadcasting in this case because of the way it represents the tensor)
X + rep(Y, C)
rep(X, C) + Y
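The effect of rep can be sketched in plain Python, assuming matrices are lists of rows. Here `rep` mirrors the DML function above, `add` is a hypothetical helper, and the shapes ([2, 3, 1, 2] for X and [2, 1, 1, 2] for Y) are chosen only for illustration.

```python
def rep(Z, C):
    """Replicate matrix Z column-wise C times, like repeated cbind in DML."""
    return [row * C for row in Z]


def add(A, B):
    """Element-wise sum of two matrices with identical dimensions."""
    assert len(A) == len(B) and all(len(a) == len(b) for a, b in zip(A, B))
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


# X: tensor of shape [2, 3, 1, 2], stored as a 2 x 6 matrix.
X = [[1, 2, 3, 4, 5, 6],
     [7, 8, 9, 10, 11, 12]]
# Y: tensor of shape [2, 1, 1, 2], stored as a 2 x 2 matrix.
Y = [[10, 20],
     [30, 40]]

# Corresponds to X + rep(Y, C) with C = 3: Y is copied once per channel.
result = add(X, rep(Y, 3))
```

Because each channel occupies a contiguous block of H*W columns, copying Y's columns C times lines every copy up with one channel of X.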
TODO: Map the NumPy tensor calls to DML expressions.
The images are assumed to be stored in NCHW format, where N = batch size, C = number of channels, H = height of the image and W = width of the image. Hence, the images are internally represented as a matrix with dimension (N, C * H * W).
This prototype also contains an initial implementation of forward/backward functions for 2D convolution and pooling:
conv2d(x, w, ...)
conv2d_backward_filter(x, dout, ...) and conv2d_backward_data(w, dout, ...)
max_pool(x, ...) and max_pool_backward(x, dout, ...)
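The required arguments for all above functions are stride, padding and input_shape, for example stride=[1, 1], padding=[0, 0], input_shape=[1, 1, 3, 3].
The additional required argument for conv2d/conv2d_backward_filter/conv2d_backward_data functions is filter_shape, for example filter_shape=[1, 1, 2, 2].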
The additional required argument for max_pool/avg_pool functions is:
The results of these functions are consistent with NVIDIA's cuDNN library.
To perform valid padding, use padding = (input_shape-filter_shape)*(stride-1)/2. (Hint: for stride length of 1, padding = [0, 0] performs valid padding).
To perform full padding, use padding = ((stride-1)*input_shape + (stride+1)*filter_shape - 2*stride)/2. (Hint: for stride length of 1, padding = [filter_h-1, filter_w-1] performs full padding).
To perform same padding, use padding = (input_shape*(stride-1) + filter_shape - stride)/2. (Hint: for stride length of 1, padding = [(filter_h-1)/2, (filter_w-1)/2] performs same padding).
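The three padding formulas above can be checked with a small Python sketch applied to one spatial dimension at a time. The function names are hypothetical, the use of integer (floor) division is an assumption, and `output_size` uses the usual convolution arithmetic (input + 2*padding - filter) / stride + 1, which the text above does not state explicitly.

```python
def valid_padding(input_size, filter_size, stride):
    return (input_size - filter_size) * (stride - 1) // 2


def full_padding(input_size, filter_size, stride):
    return ((stride - 1) * input_size + (stride + 1) * filter_size - 2 * stride) // 2


def same_padding(input_size, filter_size, stride):
    return (input_size * (stride - 1) + filter_size - stride) // 2


def output_size(input_size, filter_size, stride, padding):
    """Standard convolution output length for one dimension (assumed formula)."""
    return (input_size + 2 * padding - filter_size) // stride + 1


# Stride 1 reproduces the hints: 0, filter-1 and (filter-1)/2.
hints = (valid_padding(28, 5, 1), full_padding(28, 5, 1), same_padding(28, 5, 1))

# With stride 2, input 7 and filter 3, the formulas give these output lengths.
sizes = tuple(
    output_size(7, 3, 2, pad(7, 3, 2))
    for pad in (valid_padding, full_padding, same_padding)
)
```

Under these assumptions, the formulas keep the output length at input-filter+1 (valid), input+filter-1 (full) and input (same) even when the stride is larger than 1, provided the division by 2 is exact.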
Consider a one-channel 3 X 3 image:
x1 | x2 | x3 |
---|---|---|
x4 | x5 | x6 |
x7 | x8 | x9 |
and one 2 X 2 filter:
w1 | w2 |
---|---|
w3 | w4 |
Then, conv2d(x, w, stride=[1, 1], padding=[0, 0], input_shape=[1, 1, 3, 3], filter_shape=[1, 1, 2, 2]) produces the following tensor of shape [1, 1, 2, 2], which is represented as a 1 X 4 matrix in NCHW format:

w1*x1 + w2*x2 + w3*x4 + w4*x5 | w1*x2 + w2*x3 + w3*x5 + w4*x6 | w1*x4 + w2*x5 + w3*x7 + w4*x8 | w1*x5 + w2*x6 + w3*x8 + w4*x9 |
---|---|---|---|
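The forward result above can be reproduced with a minimal pure-Python sketch for the single-image, single-channel, stride-1, zero-padding case. The function name `conv2d_single` and the numeric values are illustrative assumptions; this is not the SystemML implementation.

```python
def conv2d_single(x, w):
    """Slide filter w over image x (stride 1, no padding).

    Returns the output flattened in row-major order, matching the
    1 X 4 matrix representation used in the example.
    """
    H, W = len(x), len(x[0])
    R, S = len(w), len(w[0])
    out = []
    for i in range(H - R + 1):
        for j in range(W - S + 1):
            out.append(sum(w[r][s] * x[i + r][j + s]
                           for r in range(R) for s in range(S)))
    return out


x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]   # x1..x9
w = [[1, 2],
     [3, 4]]      # w1..w4

out = conv2d_single(x, w)
# First entry: w1*x1 + w2*x2 + w3*x4 + w4*x5 = 1 + 4 + 12 + 20
```

Note that the filter is not flipped, so this is a cross-correlation, which is what the expressions in the table describe.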
Let the error propagated from the above layer be:

y1 | y2 | y3 | y4 |
---|---|---|---|

Then conv2d_backward_filter(x, y, stride=[1, 1], padding=[0, 0], input_shape=[1, 1, 3, 3], filter_shape=[1, 1, 2, 2]) produces the following updates for the filter:
y1*x1 + y2*x2 + y3*x4 + y4*x5 | y1*x2 + y2*x3 + y3*x5 + y4*x6 |
---|---|
y1*x4 + y2*x5 + y3*x7 + y4*x8 | y1*x5 + y2*x6 + y3*x8 + y4*x9 |
Note: since the above update is a tensor of shape [1, 1, 2, 2], it will be represented as a matrix of dimension [1, 4].
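As a rough check of the filter update, the following pure-Python sketch computes the same sums for the stride-1, zero-padding, single-channel case. The function name `conv2d_backward_filter_single` and the numeric values are hypothetical, and the error y is passed as a flat list in row-major order.

```python
def conv2d_backward_filter_single(x, dout, filter_h, filter_w):
    """Filter gradient for one image and one filter (stride 1, no padding).

    dout is the flattened (row-major) error from the layer above.
    Returns the filter update flattened in row-major order.
    """
    H, W = len(x), len(x[0])
    P, Q = H - filter_h + 1, W - filter_w + 1   # output height and width
    dw = []
    for r in range(filter_h):
        for s in range(filter_w):
            dw.append(sum(dout[i * Q + j] * x[i + r][j + s]
                          for i in range(P) for j in range(Q)))
    return dw


x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]       # x1..x9
y = [1, 0, -1, 2]     # y1..y4

dw = conv2d_backward_filter_single(x, y, 2, 2)
# First entry: y1*x1 + y2*x2 + y3*x4 + y4*x5 = 1 + 0 - 4 + 10
```

Each filter weight accumulates the products of the error values with the image pixels that the weight touched during the forward pass.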
Similarly, conv2d_backward_data(w, y, stride=[1, 1], padding=[0, 0], input_shape=[1, 1, 3, 3], filter_shape=[1, 1, 2, 2]) produces the following updates for the image:
w1*y1 | w2*y1 + w1*y2 | w2*y2 |
---|---|---|
w3*y1 + w1*y3 | w4*y1 + w3*y2 + w2*y3 + w1*y4 | w4*y2 + w2*y4 |
w3*y3 | w4*y3 + w3*y4 | w4*y4 |
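The image update can be sketched the same way: every error value is scattered back onto the pixels that contributed to it, weighted by the filter. The function name `conv2d_backward_data_single` and the numeric values are hypothetical, and the sketch again assumes stride 1, zero padding and a single channel.

```python
def conv2d_backward_data_single(w, dout, input_h, input_w):
    """Image gradient for one image and one filter (stride 1, no padding).

    dout is the flattened (row-major) error from the layer above.
    Returns the gradient as a list of rows with the image's shape.
    """
    R, S = len(w), len(w[0])
    P, Q = input_h - R + 1, input_w - S + 1   # output height and width
    dx = [[0] * input_w for _ in range(input_h)]
    for i in range(P):
        for j in range(Q):
            for r in range(R):
                for s in range(S):
                    dx[i + r][j + s] += w[r][s] * dout[i * Q + j]
    return dx


w = [[1, 2],
     [3, 4]]          # w1..w4
y = [1, 0, -1, 2]     # y1..y4

dx = conv2d_backward_data_single(w, y, 3, 3)
# Centre entry: w4*y1 + w3*y2 + w2*y3 + w1*y4 = 4 + 0 - 2 + 2
```

The centre pixel receives four terms because it is covered by the filter at all four output positions, while each corner pixel receives only one.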
The below script also demonstrates how to save the trained model.
# Download the MNIST dataset
from mlxtend.data import mnist_data
import numpy as np
from sklearn.utils import shuffle
X, y = mnist_data()
X, y = shuffle(X, y)
num_classes = np.unique(y).shape[0]
img_shape = (1, 28, 28)

# Split the data into training and test
n_samples = len(X)
X_train = X[:int(.9 * n_samples)]
y_train = y[:int(.9 * n_samples)]
X_test = X[int(.9 * n_samples):]
y_test = y[int(.9 * n_samples):]

# Download the Lenet network
import urllib
urllib.urlretrieve('https://raw.githubusercontent.com/niketanpansare/model_zoo/master/caffe/vision/lenet/mnist/lenet.proto', 'lenet.proto')
urllib.urlretrieve('https://raw.githubusercontent.com/niketanpansare/model_zoo/master/caffe/vision/lenet/mnist/lenet_solver.proto', 'lenet_solver.proto')

# Train Lenet On MNIST using scikit-learn like API
from systemml.mllearn import Caffe2DML
lenet = Caffe2DML(sqlCtx, solver='lenet_solver.proto').set(max_iter=500, debug=True).setStatistics(True)
print('Lenet score: %f' % lenet.fit(X_train, y_train).score(X_test, y_test))

# Save the trained model
lenet.save('lenet_model')

# Fine-tune the existing trained model
new_lenet = Caffe2DML(sqlCtx, solver='lenet_solver.proto', weights='lenet_model').set(max_iter=500, debug=True)
new_lenet.fit(X_train, y_train)
new_lenet.save('lenet_model')

# Use the new model for prediction
predict_lenet = Caffe2DML(sqlCtx, solver='lenet_solver.proto', weights='lenet_model')
print('Lenet score: %f' % predict_lenet.score(X_test, y_test))
Similarly, you can perform prediction using the pre-trained ResNet network:
from systemml.mllearn import Caffe2DML
from pyspark.sql import SQLContext
import numpy as np
import urllib, os, scipy.ndimage
from PIL import Image
import systemml as sml

# ImageNet specific parameters
img_shape = (3, 224, 224)

# Download a jpg image, resize it to 224 and return it as a NumPy array in N X CHW format
url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/5/58/MountainLion.jpg/312px-MountainLion.jpg'
outFile = 'test.jpg'
urllib.urlretrieve(url, outFile)
input_image = sml.convertImageToNumPyArr(Image.open(outFile), img_shape=img_shape)

# Download the ResNet network
urllib.urlretrieve('https://raw.githubusercontent.com/niketanpansare/model_zoo/master/caffe/vision/resnet/ilsvrc12/ResNet_50_network.proto', 'ResNet_50_network.proto')
urllib.urlretrieve('https://raw.githubusercontent.com/niketanpansare/model_zoo/master/caffe/vision/resnet/ilsvrc12/ResNet_50_solver.proto', 'ResNet_50_solver.proto')

# Assumes that you have cloned the model_zoo repository
# git clone https://github.com/niketanpansare/model_zoo.git
resnet = Caffe2DML(sqlCtx, solver='ResNet_50_solver.proto', weights='~/model_zoo/caffe/vision/resnet/ilsvrc12/ResNet_50_pretrained_weights').set(input_shape=img_shape)
resnet.predict(input_image)