<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js lt-ie9 lt-ie8 lt-ie7"> <![endif]-->
<!--[if IE 7]> <html class="no-js lt-ie9 lt-ie8"> <![endif]-->
<!--[if IE 8]> <html class="no-js lt-ie9"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js"> <!--<![endif]-->
<head>
<title>Beginner's Guide for Keras2DML users - SystemML 1.2.0</title>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="description" content="Beginner's Guide for Keras2DML users">
<meta name="viewport" content="width=device-width">
<link rel="stylesheet" href="css/bootstrap.min.css">
<link rel="stylesheet" href="css/main.css">
<link rel="stylesheet" href="css/pygments-default.css">
<link rel="shortcut icon" href="img/favicon.png">
</head>
<body>
<!--[if lt IE 7]>
<p class="chromeframe">You are using an outdated browser. <a href="http://browsehappy.com/">Upgrade your browser today</a> or <a href="http://www.google.com/chromeframe/?redirect=true">install Google Chrome Frame</a> to better experience this site.</p>
<![endif]-->
<header class="navbar navbar-default navbar-fixed-top" id="topbar">
<div class="container">
<div class="navbar-header">
<div class="navbar-brand brand projectlogo">
<a href="http://systemml.apache.org/"><img class="logo" src="img/systemml-logo.png" alt="Apache SystemML" title="Apache SystemML"/></a>
</div>
<div class="navbar-brand brand projecttitle">
<a href="http://systemml.apache.org/">Apache SystemML<sup id="trademark"></sup></a><br/>
<span class="version">1.2.0</span>
</div>
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target=".navbar-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<nav class="navbar-collapse collapse">
<ul class="nav navbar-nav navbar-right">
<li><a href="index.html">Overview</a></li>
<li><a href="https://github.com/apache/systemml">GitHub</a></li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Documentation<b class="caret"></b></a>
<ul class="dropdown-menu" role="menu">
<li><b>Running SystemML:</b></li>
<li><a href="https://github.com/apache/systemml">SystemML GitHub README</a></li>
<li><a href="spark-mlcontext-programming-guide.html">Spark MLContext</a></li>
<li><a href="spark-batch-mode.html">Spark Batch Mode</a></li>
<li><a href="hadoop-batch-mode.html">Hadoop Batch Mode</a></li>
<li><a href="standalone-guide.html">Standalone Guide</a></li>
<li><a href="jmlc.html">Java Machine Learning Connector (JMLC)</a></li>
<li class="divider"></li>
<li><b>Language Guides:</b></li>
<li><a href="dml-language-reference.html">DML Language Reference</a></li>
<li><a href="beginners-guide-to-dml-and-pydml.html">Beginner's Guide to DML and PyDML</a></li>
<li><a href="beginners-guide-python.html">Beginner's Guide for Python Users</a></li>
<li><a href="python-reference.html">Reference Guide for Python Users</a></li>
<li class="divider"></li>
<li><b>ML Algorithms:</b></li>
<li><a href="algorithms-reference.html">Algorithms Reference</a></li>
<li class="divider"></li>
<li><b>Tools:</b></li>
<li><a href="debugger-guide.html">Debugger Guide</a></li>
<li><a href="developer-tools-systemml.html">IDE Guide</a></li>
<li class="divider"></li>
<li><b>Other:</b></li>
<li><a href="contributing-to-systemml.html">Contributing to SystemML</a></li>
<li><a href="engine-dev-guide.html">Engine Developer Guide</a></li>
<li><a href="troubleshooting-guide.html">Troubleshooting Guide</a></li>
<li><a href="release-process.html">Release Process</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">API Docs<b class="caret"></b></a>
<ul class="dropdown-menu" role="menu">
<li><a href="./api/java/index.html">Java</a></li>
<li><a href="./api/python/index.html">Python</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Issues<b class="caret"></b></a>
<ul class="dropdown-menu" role="menu">
<li><b>JIRA:</b></li>
<li><a href="https://issues.apache.org/jira/browse/SYSTEMML">SystemML JIRA</a></li>
</ul>
</li>
</ul>
</nav>
</div>
</header>
<div class="container" id="content">
<h1 class="title">Beginner's Guide for Keras2DML users</h1>
<ul id="markdown-toc">
<li><a href="#introduction" id="markdown-toc-introduction">Introduction</a> <ul>
<li><a href="#getting-started" id="markdown-toc-getting-started">Getting Started</a></li>
<li><a href="#model-conversion" id="markdown-toc-model-conversion">Model Conversion</a></li>
</ul>
</li>
<li><a href="#frequently-asked-questions" id="markdown-toc-frequently-asked-questions">Frequently asked questions</a> <ul>
<li><a href="#what-is-the-mapping-between-keras-parameters-and-caffes-solver-specification-" id="markdown-toc-what-is-the-mapping-between-keras-parameters-and-caffes-solver-specification-">What is the mapping between Keras&#8217; parameters and Caffe&#8217;s solver specification?</a></li>
<li><a href="#how-do-i-specify-the-batch-size-and-the-number-of-epochs-" id="markdown-toc-how-do-i-specify-the-batch-size-and-the-number-of-epochs-">How do I specify the batch size and the number of epochs?</a></li>
<li><a href="#what-optimizer-and-loss-does-keras2dml-use-by-default-if-kerasmodel-is-not-compiled-" id="markdown-toc-what-optimizer-and-loss-does-keras2dml-use-by-default-if-kerasmodel-is-not-compiled-">What optimizer and loss does Keras2DML use by default if <code>keras_model</code> is not compiled?</a></li>
<li><a href="#what-is-the-learning-rate-schedule-used-" id="markdown-toc-what-is-the-learning-rate-schedule-used-">What is the learning rate schedule used?</a></li>
<li><a href="#how-to-set-the-size-of-the-validation-dataset-" id="markdown-toc-how-to-set-the-size-of-the-validation-dataset-">How to set the size of the validation dataset?</a></li>
<li><a href="#how-to-monitor-loss-via-command-line-" id="markdown-toc-how-to-monitor-loss-via-command-line-">How to monitor loss via the command line?</a></li>
</ul>
</li>
</ul>
<p><br /></p>
<h2 id="introduction">Introduction</h2>
<p>Keras2DML is an <strong>experimental API</strong> that converts a Keras specification to DML through the intermediate Caffe2DML module.
It is designed to fit well into the mllearn framework and hence supports NumPy arrays, Pandas DataFrames as well as PySpark DataFrames.</p>
<h3 id="getting-started">Getting Started</h3>
<p>To create a Keras2DML object, one first needs to build a Keras model through the <a href="https://keras.io/models/model/">Functional API</a>.
This module utilizes the existing <a href="beginners-guide-caffe2dml">Caffe2DML</a> backend to convert Keras models into DML. Keras models are
parsed and translated into Caffe prototxt and caffemodel files, which are then piped into Caffe2DML. Thus one can follow the Caffe2DML
documentation for further information.</p>
<h3 id="model-conversion">Model Conversion</h3>
<p>Keras models are parsed based on their layer structure and corresponding weights, and translated into the equivalent Caffe layer and weight
configuration. Be aware that this is currently a translation into Caffe, so information that has no Caffe counterpart is lost, such as
initializer settings and layers that do not exist in Caffe.</p>
<p>To create a Keras2DML object, simply pass the Keras model to the Keras2DML constructor. Note that the model
should be compiled so that the loss can be accessed by Caffe2DML.</p>
<pre><code class="language-python">from systemml.mllearn import Keras2DML
import keras
from keras.applications.resnet50 import preprocess_input, decode_predictions, ResNet50

keras_model = ResNet50(weights='imagenet', include_top=True, pooling=None, input_shape=(224,224,3))
keras_model.compile(optimizer='sgd', loss='categorical_crossentropy')

sysml_model = Keras2DML(spark, keras_model, input_shape=(3,224,224))
sysml_model.summary()
</code></pre>
<h1 id="frequently-asked-questions">Frequently asked questions</h1>
<h4 id="what-is-the-mapping-between-keras-parameters-and-caffes-solver-specification-">What is the mapping between Keras&#8217; parameters and Caffe&#8217;s solver specification?</h4>
<table>
<thead>
<tr>
<th>&#160;</th>
<th>Specified via the given parameter in the Keras2DML constructor</th>
<th>From input Keras&#8217; model</th>
<th>Corresponding parameter in the Caffe solver file</th>
</tr>
</thead>
<tbody>
<tr>
<td>Solver type</td>
<td>&#160;</td>
<td><code>type(keras_model.optimizer)</code>. Supported types: <code>keras.optimizers.{SGD, Adagrad, Adam}</code></td>
<td><code>type</code></td>
</tr>
<tr>
<td>Maximum number of iterations</td>
<td><code>max_iter</code></td>
<td>The <code>epochs</code> parameter in the <code>fit</code> method is not supported.</td>
<td><code>max_iter</code></td>
</tr>
<tr>
<td>Validation dataset</td>
<td><code>test_iter</code> (explained in the below section)</td>
<td>The <code>validation_data</code> parameter in the <code>fit</code> method is not supported.</td>
<td><code>test_iter</code></td>
</tr>
<tr>
<td>Monitoring the loss</td>
<td><code>display, test_interval</code> (explained in the below section)</td>
<td>The <code>LossHistory</code> callback in the <code>fit</code> method is not supported.</td>
<td><code>display, test_interval</code></td>
</tr>
<tr>
<td>Learning rate schedule</td>
<td><code>lr_policy</code></td>
<td>The <code>LearningRateScheduler</code> callback in the <code>fit</code> method is not supported.</td>
<td><code>lr_policy</code> (default: step)</td>
</tr>
<tr>
<td>Base learning rate</td>
<td>&#160;</td>
<td><code>keras_model.optimizer.lr</code></td>
<td><code>base_lr</code></td>
</tr>
<tr>
<td>Learning rate decay over each update</td>
<td>&#160;</td>
<td><code>keras_model.optimizer.decay</code></td>
<td><code>gamma</code></td>
</tr>
<tr>
<td>Global regularizer to use for all layers</td>
<td><code>regularization_type, weight_decay</code></td>
<td>The current version of Keras2DML does not support custom regularizers per layer.</td>
<td><code>regularization_type, weight_decay</code></td>
</tr>
<tr>
<td>If type of the optimizer is <code>keras.optimizers.SGD</code></td>
<td>&#160;</td>
<td><code>momentum, nesterov</code></td>
<td><code>momentum, type</code></td>
</tr>
<tr>
<td>If type of the optimizer is <code>keras.optimizers.Adam</code></td>
<td>&#160;</td>
<td><code>beta_1, beta_2, epsilon</code>. The parameter <code>amsgrad</code> is not supported.</td>
<td><code>momentum, momentum2, delta</code></td>
</tr>
<tr>
<td>If type of the optimizer is <code>keras.optimizers.Adagrad</code></td>
<td>&#160;</td>
<td><code>epsilon</code></td>
<td><code>delta</code></td>
</tr>
</tbody>
</table>
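<p>As a rough illustration of the mapping in the table above, the following sketch reproduces a few of its rows in plain Python. Note that <code>to_caffe_solver</code> is a hypothetical helper invented for this example, not part of the SystemML API; Keras2DML performs an equivalent translation internally via Caffe2DML.</p>

```python
# Hypothetical sketch of the optimizer-to-solver mapping described above.
# This helper is for illustration only; it is not part of the SystemML API.
def to_caffe_solver(optimizer_type, lr=0.01, decay=0.0, **params):
    """Map a Keras optimizer description to Caffe solver fields."""
    solver = {"base_lr": lr, "gamma": decay}
    if optimizer_type == "SGD":
        # Nesterov momentum selects a different Caffe solver type.
        solver["type"] = "Nesterov" if params.get("nesterov") else "SGD"
        solver["momentum"] = params.get("momentum", 0.0)
    elif optimizer_type == "Adam":
        solver["type"] = "Adam"
        solver["momentum"] = params.get("beta_1", 0.9)    # beta_1 -> momentum
        solver["momentum2"] = params.get("beta_2", 0.999)  # beta_2 -> momentum2
        solver["delta"] = params.get("epsilon", 1e-8)      # epsilon -> delta
    elif optimizer_type == "Adagrad":
        solver["type"] = "AdaGrad"
        solver["delta"] = params.get("epsilon", 1e-8)
    else:
        raise ValueError("Unsupported optimizer: " + optimizer_type)
    return solver

print(to_caffe_solver("SGD", lr=0.01, momentum=0.95, nesterov=True))
```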
<h4 id="how-do-i-specify-the-batch-size-and-the-number-of-epochs-">How do I specify the batch size and the number of epochs?</h4>
<p>Since Keras2DML is an mllearn API, it does not accept the batch size and number of epochs as parameters in the <code>fit</code> method.
Instead, these values are passed via the <code>batch_size</code> and <code>max_iter</code> parameters of the Keras2DML constructor.
For example, the equivalent Python code for <code>keras_model.fit(features, labels, epochs=10, batch_size=64)</code> is as follows:</p>
<pre><code class="language-python">import math
from systemml.mllearn import Keras2DML
epochs = 10
batch_size = 64
num_samples = features.shape[0]
max_iter = int(epochs * math.ceil(num_samples / batch_size))
sysml_model = Keras2DML(spark, keras_model, batch_size=batch_size, max_iter=max_iter, ...)
sysml_model.fit(features, labels)
</code></pre>
<h4 id="what-optimizer-and-loss-does-keras2dml-use-by-default-if-kerasmodel-is-not-compiled-">What optimizer and loss does Keras2DML use by default if <code>keras_model</code> is not compiled?</h4>
<p>If the user does not <code>compile</code> the Keras model, then Keras2DML uses cross-entropy loss and an SGD optimizer with Nesterov momentum:</p>
<pre><code class="language-python">keras_model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.SGD(lr=0.01, momentum=0.95, decay=5e-4, nesterov=True))
</code></pre>
<h4 id="what-is-the-learning-rate-schedule-used-">What is the learning rate schedule used?</h4>
<p>Keras2DML does not support the <code>LearningRateScheduler</code> callback.
Instead, one can select one of the following learning rate schedules via the <code>lr_policy</code> parameter of the constructor:</p>
<ul>
<li><code>step</code>: return <code>base_lr * gamma ^ floor(iter / step)</code> (default schedule)</li>
<li><code>fixed</code>: always return <code>base_lr</code></li>
<li><code>exp</code>: return <code>base_lr * gamma ^ iter</code></li>
<li><code>inv</code>: return <code>base_lr * (1 + gamma * iter) ^ (-power)</code></li>
<li><code>poly</code>: the effective learning rate follows a polynomial decay, reaching zero at <code>max_iter</code>: return <code>base_lr * (1 - iter/max_iter) ^ power</code></li>
<li><code>sigmoid</code>: the effective learning rate follows a sigmoid decay: return <code>base_lr * (1 / (1 + exp(-gamma * (iter - stepsize))))</code></li>
</ul>
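<p>As an illustrative sketch, these schedules can be written out in plain Python. The computation actually happens inside the generated DML, and the helper name <code>learning_rate</code> and the default hyper-parameter values below are assumptions chosen for illustration:</p>

```python
import math

# Illustrative sketch of the lr_policy schedules listed above.
# Default values for gamma, step, power, max_iter, and stepsize are
# arbitrary example values, not SystemML defaults.
def learning_rate(policy, base_lr, it, gamma=0.95, step=100,
                  power=1.0, max_iter=2000, stepsize=100):
    if policy == "fixed":
        return base_lr
    if policy == "step":
        return base_lr * gamma ** math.floor(it / step)
    if policy == "exp":
        return base_lr * gamma ** it
    if policy == "inv":
        return base_lr * (1 + gamma * it) ** (-power)
    if policy == "poly":
        return base_lr * (1 - it / max_iter) ** power
    if policy == "sigmoid":
        return base_lr * (1 / (1 + math.exp(-gamma * (it - stepsize))))
    raise ValueError("Unknown lr_policy: " + policy)

print(learning_rate("fixed", 0.01, 500))  # 0.01
print(learning_rate("step", 0.01, 200))   # base_lr * gamma^2
```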
<h4 id="how-to-set-the-size-of-the-validation-dataset-">How to set the size of the validation dataset?</h4>
<p>The size of the validation dataset is determined by the <code>test_iter</code> parameter and the batch size. For example, if the batch size is 64 and
<code>test_iter</code> is set to 10 in the <code>Keras2DML</code> constructor, then the validation set contains 640 samples. This setting generates the following DML code internally:</p>
<pre><code>num_images = nrow(y_full)
BATCH_SIZE = 64
num_validation = 10 * BATCH_SIZE
X = X_full[(num_validation+1):num_images,]; y = y_full[(num_validation+1):num_images,]
X_val = X_full[1:num_validation,]; y_val = y_full[1:num_validation,]
num_images = nrow(y)
</code></pre>
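<p>As a plain-Python sketch, the split performed by the generated DML amounts to holding out the first <code>test_iter * batch_size</code> rows as the validation set (note that DML is 1-indexed while Python is 0-indexed; the variable names below are illustrative):</p>

```python
# Plain-Python sketch of the validation split the generated DML performs:
# the first test_iter * batch_size rows are held out as the validation set.
batch_size, test_iter = 64, 10
num_validation = test_iter * batch_size   # 640 rows held out

rows = list(range(1000))                  # stand-in for a 1000-sample dataset
val_rows = rows[:num_validation]          # validation set
train_rows = rows[num_validation:]        # remaining training set

print(len(val_rows), len(train_rows))     # 640 360
```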
<h4 id="how-to-monitor-loss-via-command-line-">How to monitor loss via the command line?</h4>
<p>To monitor loss, set the parameters <code>display</code>, <code>test_iter</code> and <code>test_interval</code> in the <code>Keras2DML</code> constructor.
For example, with <code>Keras2DML(..., display=100, test_iter=10, test_interval=500)</code>, SystemML will:</p>
<ul>
<li>display the training loss and accuracy every 100 iterations, and</li>
<li>carry out validation every 500 training iterations and display the validation loss and accuracy.</li>
</ul>
</div> <!-- /container -->
<script src="js/vendor/jquery-1.12.0.min.js"></script>
<script src="js/vendor/bootstrap.min.js"></script>
<script src="js/vendor/anchor.min.js"></script>
<script src="js/main.js"></script>
<!-- Analytics -->
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-71553733-1', 'auto');
ga('send', 'pageview');
</script>
<!-- MathJax Section -->
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<script>
// Note that we load MathJax this way to work with local file (file://), HTTP and HTTPS.
// We could use "//cdn.mathjax...", but that won't support "file://".
(function(d, script) {
script = d.createElement('script');
script.type = 'text/javascript';
script.async = true;
script.onload = function(){
MathJax.Hub.Config({
tex2jax: {
inlineMath: [ ["$", "$"], ["\\\\(","\\\\)"] ],
displayMath: [ ["$$","$$"], ["\\[", "\\]"] ],
processEscapes: true,
skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
}
});
};
script.src = ('https:' == document.location.protocol ? 'https://' : 'http://') +
'cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
d.getElementsByTagName('head')[0].appendChild(script);
}(document));
</script>
</body>
</html>