| <!-- HTML header for doxygen 1.8.4--> |
| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
| <html xmlns="http://www.w3.org/1999/xhtml"> |
| <head> |
| <meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/> |
| <meta http-equiv="X-UA-Compatible" content="IE=9"/> |
| <meta name="generator" content="Doxygen 1.8.13"/> |
| <meta name="keywords" content="madlib,postgres,greenplum,machine learning,data mining,deep learning,ensemble methods,data science,market basket analysis,affinity analysis,pca,lda,regression,elastic net,huber white,proportional hazards,k-means,latent dirichlet allocation,bayes,support vector machines,svm"/> |
| <title>MADlib: Neural Network</title> |
| <link href="tabs.css" rel="stylesheet" type="text/css"/> |
| <script type="text/javascript" src="jquery.js"></script> |
| <script type="text/javascript" src="dynsections.js"></script> |
| <link href="navtree.css" rel="stylesheet" type="text/css"/> |
| <script type="text/javascript" src="resize.js"></script> |
| <script type="text/javascript" src="navtreedata.js"></script> |
| <script type="text/javascript" src="navtree.js"></script> |
| <script type="text/javascript"> |
| $(document).ready(initResizable); |
| </script> |
| <link href="search/search.css" rel="stylesheet" type="text/css"/> |
| <script type="text/javascript" src="search/searchdata.js"></script> |
| <script type="text/javascript" src="search/search.js"></script> |
| <script type="text/javascript"> |
| $(document).ready(function() { init_search(); }); |
| </script> |
| <script type="text/x-mathjax-config"> |
| MathJax.Hub.Config({ |
| extensions: ["tex2jax.js", "TeX/AMSmath.js", "TeX/AMSsymbols.js"], |
| jax: ["input/TeX","output/HTML-CSS"], |
| }); |
| </script><script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js"></script> |
| <!-- hack in the navigation tree --> |
| <script type="text/javascript" src="eigen_navtree_hacks.js"></script> |
| <link href="doxygen.css" rel="stylesheet" type="text/css" /> |
| <link href="madlib_extra.css" rel="stylesheet" type="text/css"/> |
| </head> |
| <body> |
| <div id="top"><!-- do not remove this div, it is closed by doxygen! --> |
| <div id="titlearea"> |
| <table cellspacing="0" cellpadding="0"> |
| <tbody> |
| <tr style="height: 56px;"> |
| <td id="projectlogo"><a href="http://madlib.apache.org"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td> |
| <td style="padding-left: 0.5em;"> |
| <div id="projectname"> |
| <span id="projectnumber">1.12</span> |
| </div> |
| <div id="projectbrief">User Documentation for MADlib</div> |
| </td> |
| <td> <div id="MSearchBox" class="MSearchBoxInactive"> |
| <span class="left"> |
| <img id="MSearchSelect" src="search/mag_sel.png" |
| onmouseover="return searchBox.OnSearchSelectShow()" |
| onmouseout="return searchBox.OnSearchSelectHide()" |
| alt=""/> |
| <input type="text" id="MSearchField" value="Search" accesskey="S" |
| onfocus="searchBox.OnSearchFieldFocus(true)" |
| onblur="searchBox.OnSearchFieldFocus(false)" |
| onkeyup="searchBox.OnSearchFieldChange(event)"/> |
| </span><span class="right"> |
| <a id="MSearchClose" href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" border="0" src="search/close.png" alt=""/></a> |
| </span> |
| </div> |
| </td> |
| </tr> |
| </tbody> |
| </table> |
| </div> |
| <!-- end header part --> |
| <!-- Generated by Doxygen 1.8.13 --> |
| <script type="text/javascript"> |
| var searchBox = new SearchBox("searchBox", "search",false,'Search'); |
| </script> |
| </div><!-- top --> |
| <div id="side-nav" class="ui-resizable side-nav-resizable"> |
| <div id="nav-tree"> |
| <div id="nav-tree-contents"> |
| <div id="nav-sync" class="sync"></div> |
| </div> |
| </div> |
| <div id="splitbar" style="-moz-user-select:none;" |
| class="ui-resizable-handle"> |
| </div> |
| </div> |
| <script type="text/javascript"> |
| $(document).ready(function(){initNavTree('group__grp__nn.html','');}); |
| </script> |
| <div id="doc-content"> |
| <!-- window showing the filter options --> |
| <div id="MSearchSelectWindow" |
| onmouseover="return searchBox.OnSearchSelectShow()" |
| onmouseout="return searchBox.OnSearchSelectHide()" |
| onkeydown="return searchBox.OnSearchSelectKey(event)"> |
| </div> |
| |
| <!-- iframe showing the search results (closed by default) --> |
| <div id="MSearchResultsWindow"> |
| <iframe src="javascript:void(0)" frameborder="0" |
| name="MSearchResults" id="MSearchResults"> |
| </iframe> |
| </div> |
| |
| <div class="header"> |
| <div class="headertitle"> |
| <div class="title">Neural Network<div class="ingroups"><a class="el" href="group__grp__super.html">Supervised Learning</a></div></div> </div> |
| </div><!--header--> |
| <div class="contents"> |
| <div class="toc"><b>Contents</b><ul> |
| <li class="level1"> |
| <a href="#mlp_classification">Classification</a> </li> |
| <li class="level1"> |
| <a href="#mlp_regression">Regression</a> </li> |
| <li class="level1"> |
| <a href="#optimizer_params">Optimizer Parameters</a> </li> |
| <li class="level1"> |
| <a href="#predict">Prediction Functions</a> </li> |
| <li class="level1"> |
| <a href="#example">Examples</a> </li> |
| <li class="level1"> |
| <a href="#background">Technical Background</a> </li> |
| <li class="level1"> |
| <a href="#literature">Literature</a> </li> |
| <li class="level1"> |
| <a href="#related">Related Topics</a> </li> |
| </ul> |
| </div><p>Multilayer Perceptron (MLP) is a type of neural network that can be used for regression and classification.</p> |
| <p>Also called "vanilla neural networks", MLPs consist of several fully connected hidden layers with non-linear activation functions. In the case of classification, the final layer of the neural net has as many nodes as classes, and the output of the neural net can be interpreted as the probability that a given input feature belongs to a specific class.</p> |
| <p><a class="anchor" id="mlp_classification"></a></p><dl class="section user"><dt>Classification Training Function</dt><dd>The MLP classification training function has the following format:</dd></dl> |
| <pre class="syntax"> |
| mlp_classification( |
| source_table, |
| output_table, |
| independent_varname, |
| dependent_varname, |
| hidden_layer_sizes, |
| optimizer_params, |
| activation, |
| weights, |
| warm_start, |
| verbose |
| ) |
| </pre><p><b>Arguments</b> </p><dl class="arglist"> |
| <dt>source_table </dt> |
| <dd><p class="startdd">TEXT. Name of the table containing the training data.</p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>output_table </dt> |
| <dd><p class="startdd">TEXT. Name of the output table containing the model. Details of the output table are shown below. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>independent_varname </dt> |
| <dd><p class="startdd">TEXT. Expression list to evaluate for the independent variables.</p> |
| <dl class="section note"><dt>Note</dt><dd>Please note that an intercept variable should not be included as part of this expression - this is different from other MADlib modules. Also please note that <b>independent variables should be encoded properly.</b> All values are cast to DOUBLE PRECISION, so categorical variables should be one-hot or dummy encoded as appropriate. See <a href="group__grp__encode__categorical.html">Encoding Categorical Variables</a> for more details on how to do this. </dd></dl> |
| </dd> |
| <dt>dependent_varname </dt> |
| <dd><p class="startdd">TEXT. Name of the dependent variable column. For classification, supported types are: text, varchar, character varying, char, character integer, smallint, bigint, and boolean. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>hidden_layer_sizes (optional) </dt> |
| <dd><p class="startdd">INTEGER[], default: ARRAY[100]. The number of neurons in each hidden layer. The length of this array will determine the number of hidden layers. For example, ARRAY[5,10] means 2 hidden layers, one with 5 neurons and the other with 10 neurons. Use ARRAY[]::INTEGER[] for no hidden layers. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>optimizer_params (optional) </dt> |
| <dd><p class="startdd">TEXT, default: NULL. Parameters for optimization in a comma-separated string of key-value pairs. See the description below for details. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>activation (optional) </dt> |
| <dd><p class="startdd">TEXT, default: 'sigmoid'. Activation function. Currently three functions are supported: 'sigmoid' (default), 'relu', and 'tanh'. The text can be any prefix of the three strings; for e.g., specifying 's' will use sigmoid activation. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>weights (optional) </dt> |
| <dd><p class="startdd">TEXT, default: 1. Weights for input rows. Column name which specifies the weight for each input row. This weight will be incorporated into the update during stochastic gradient descent (SGD), but will not be used for loss calculations. If not specified, weight for each row will default to 1 (equal weights). Column should be a numeric type. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>warm_start (optional) </dt> |
| <dd><p class="startdd">BOOLEAN, default: FALSE. Initalize weights with the coefficients from the last call of the training function. If set to true, weights will be initialized from the output_table generated by the previous run. Note that all parameters other than optimizer_params and verbose must remain constant between calls when warm_start is used.</p> |
| <dl class="section note"><dt>Note</dt><dd>The warm start feature works based on the name of the output_table. When using warm start, do not drop the output table or the output table summary before calling the training function, since these are needed to obtain the weights from the previous run. If you are not using warm start, the output table and the output table summary must be dropped in the usual way before calling the training function. </dd></dl> |
| </dd> |
| <dt>verbose (optional) </dt> |
| <dd>BOOLEAN, default: FALSE. Provides verbose output of the results of training, including the value of loss at each iteration. </dd> |
| </dl> |
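<p>A minimal warm-start workflow, as a sketch (parameter values here are placeholders, and iris_data is the table created in the Examples section below): </p><pre class="example">
-- First run: train from randomly initialized weights.
DROP TABLE IF EXISTS mlp_model, mlp_model_summary;
SELECT madlib.mlp_classification(
    'iris_data',        -- Source table
    'mlp_model',        -- Destination table
    'attributes',       -- Input features
    'class_text',       -- Label
    ARRAY[5],           -- Hidden layer sizes
    'n_iterations=100', -- Optimizer params
    'tanh',             -- Activation function
    NULL,               -- Default weight (1)
    FALSE,              -- No warm start on the first call
    FALSE               -- Not verbose
);
-- Second run: do NOT drop mlp_model or mlp_model_summary.
-- With warm_start=TRUE, weights are read from mlp_model, so
-- training continues where the previous call left off.
SELECT madlib.mlp_classification(
    'iris_data', 'mlp_model', 'attributes', 'class_text',
    ARRAY[5],           -- Must match the previous call
    'n_iterations=100', -- Optimizer params may be changed
    'tanh', NULL,
    TRUE,               -- Warm start
    FALSE
);
</pre>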
| <p><b>Output tables</b> <br /> |
| The model table produced by MLP contains the following columns: </p><table class="output"> |
| <tr> |
<th>coeff </th><td>FLOAT8[]. Flat array containing the weights of the neural net. </td></tr>
| <tr> |
<th>num_iterations </th><td>INTEGER. Number of iterations completed by the stochastic gradient descent algorithm. The algorithm either converged in this number of iterations or hit the maximum number specified in the optimization parameters. </td></tr>
| <tr> |
| <th>loss </th><td>FLOAT8. The cross entropy over the training data. See Technical Background section below for more details. </td></tr> |
| </table> |
| <p>A summary table named <output_table>_summary is also created, which has the following columns: </p><table class="output"> |
| <tr> |
| <th>source_table </th><td>The source table. </td></tr> |
| <tr> |
| <th>independent_varname </th><td>The independent variables. </td></tr> |
| <tr> |
| <th>dependent_varname </th><td>The dependent variable. </td></tr> |
| <tr> |
| <th>tolerance </th><td>The tolerance as given in optimizer_params. </td></tr> |
| <tr> |
| <th>learning_rate_init </th><td>The initial learning rate as given in optimizer_params. </td></tr> |
| <tr> |
| <th>learning_rate_policy </th><td>The learning rate policy as given in optimizer_params. </td></tr> |
| <tr> |
| <th>n_iterations </th><td>The number of iterations run. </td></tr> |
| <tr> |
| <th>n_tries </th><td>The number of tries as given in optimizer_params. </td></tr> |
| <tr> |
| <th>layer_sizes </th><td>The number of units in each layer including the input and output layers. </td></tr> |
| <tr> |
| <th>activation </th><td>The activation function. </td></tr> |
| <tr> |
| <th>is_classification </th><td>True if the model was trained for classification, False if it was trained for regression. </td></tr> |
| <tr> |
| <th>classes </th><td>The classes which were trained against (empty for regression). </td></tr> |
| <tr> |
| <th>weights </th><td>The weight column used during training. </td></tr> |
| <tr> |
| <th>x_means </th><td>The mean for all input features (used for normalization). </td></tr> |
| <tr> |
| <th>x_stds </th><td><p class="starttd">The standard deviation for all input features (used for normalization). </p> |
| <p class="endtd"></p> |
| </td></tr> |
| </table> |
| <p><a class="anchor" id="mlp_regression"></a></p><dl class="section user"><dt>Regression Training Function</dt><dd>The MLP regression training function has the following format: <pre class="syntax"> |
| mlp_regression( |
| source_table, |
| output_table, |
| independent_varname, |
| dependent_varname, |
| hidden_layer_sizes, |
| optimizer_params, |
| activation, |
| weights, |
| warm_start, |
| verbose |
| ) |
| </pre></dd></dl> |
| <p><b>Arguments</b> </p> |
| <p>Parameters for regression are largely the same as for classification. In the model table, the loss refers to mean square error instead of cross entropy. In the summary table, there is no classes column. The following arguments have specifications which differ from mlp_classification: </p><dl class="arglist"> |
| <dt>dependent_varname </dt> |
<dd>TEXT. Name of the dependent variable column. For regression, supported types are any numeric type, or an array of numeric types for multiple regression (see the sketch after this list). </dd>
| </dl> |
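<p>As an illustration, a sketch of a multiple-regression call (the table mlp_multi_data and its columns are hypothetical, used only to show an array-valued dependent variable): </p><pre class="example">
-- Hypothetical table mlp_multi_data, where 'features' is FLOAT8[]
-- and 'targets' is a FLOAT8[] holding two outputs to predict jointly.
DROP TABLE IF EXISTS mlp_multi_model, mlp_multi_model_summary;
SELECT madlib.mlp_regression(
    'mlp_multi_data',   -- Source table (hypothetical)
    'mlp_multi_model',  -- Destination table
    'features',         -- Input features
    'targets',          -- Array-valued dependent variable
    ARRAY[10],          -- One hidden layer of 10 units
    'n_iterations=200', -- Optimizer params
    'sigmoid',          -- Activation function
    NULL, FALSE, FALSE  -- Default weights, no warm start, not verbose
);
</pre>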
| <p><a class="anchor" id="optimizer_params"></a></p><dl class="section user"><dt>Optimizer Parameters</dt><dd>Parameters in this section are supplied in the <em>optimizer_params</em> argument as a string containing a comma-delimited list of name-value pairs. All of these named parameters are optional and their order does not matter. You must use the format "<param_name> = <value>" to specify the value of a parameter, otherwise the parameter is ignored.</dd></dl> |
| <pre class="syntax"> |
| 'learning_rate_init = <value>, |
| learning_rate_policy = <value>, |
| gamma = <value>, |
| power = <value>, |
| iterations_per_step = <value>, |
| n_iterations = <value>, |
| n_tries = <value>, |
| lambda = <value>, |
| tolerance = <value>' |
</pre><p> <b>Optimizer Parameters</b> </p><dl class="arglist">
| <dt>learning_rate_init </dt> |
| <dd><p class="startdd">Default: 0.001. Also known as the learning rate. A small value is usually desirable to ensure convergence, while a large value provides more room for progress during training. Since the best value depends on the condition number of the data, in practice one often tunes this parameter. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>learning_rate_policy </dt> |
| <dd><p class="startdd">Default: constant. One of 'constant', 'exp', 'inv' or 'step' or any prefix of these (e.g., 's' means 'step'). These are defined below, where 'iter' is the current iteration of SGD:</p><ul> |
| <li>'constant': learning_rate = learning_rate_init</li> |
| <li>'exp': learning_rate = learning_rate_init * gamma^(iter)</li> |
| <li>'inv': learning_rate = learning_rate_init * (iter+1)^(-power)</li> |
| <li>'step': learning_rate = learning_rate_init * gamma^(floor(iter/iterations_per_step)) </li> |
| </ul> |
| <p class="enddd"></p> |
| </dd> |
| <dt>gamma </dt> |
| <dd><p class="startdd">Default: 0.1. Decay rate for learning rate when learning_rate_policy is 'exp' or 'step'. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>power </dt> |
| <dd><p class="startdd">Default: 0.5. Exponent for learning_rate_policy = 'inv'. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>iterations_per_step </dt> |
| <dd><p class="startdd">Default: 100. Number of iterations to run before decreasing the learning rate by a factor of gamma. Valid for learning rate policy = 'step'. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>n_iterations </dt> |
| <dd><p class="startdd">Default: 100. The maximum number of iterations allowed. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>n_tries </dt> |
| <dd><p class="startdd">Default: 1. Number of times to retrain the network with randomly initialized weights. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>lambda </dt> |
| <dd><p class="startdd">Default: 0. The regularization coefficient for L2 regularization. </p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>tolerance </dt> |
| <dd><p class="startdd">Default: 0.001. The criterion to end iterations. The training stops whenever the difference between the training models of two consecutive iterations is smaller than <em>tolerance</em> or the iteration number is larger than <em>n_iterations</em>. If you want to run the full number of iterations specified in <em>n_interations</em>, set tolerance=0.0 </p> |
| <p class="enddd"></p> |
| </dd> |
| </dl> |
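<p>For instance, a sketch using the 'step' policy (values are illustrative, and iris_data is the table created in the Examples section below). With learning_rate_init=0.01, gamma=0.1 and iterations_per_step=100, the learning rate is 0.01 for iterations 0-99, 0.001 for iterations 100-199, and 0.0001 for iterations 200-299: </p><pre class="example">
DROP TABLE IF EXISTS mlp_model, mlp_model_summary;
SELECT madlib.mlp_classification(
    'iris_data', 'mlp_model', 'attributes', 'class_text',
    ARRAY[5],
    'learning_rate_init=0.01,
     learning_rate_policy=step,
     gamma=0.1,
     iterations_per_step=100,
     n_iterations=300,
     tolerance=0',      -- Run all 300 iterations
    'tanh', NULL, FALSE, FALSE
);
</pre>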
| <p><a class="anchor" id="predict"></a></p><dl class="section user"><dt>Prediction Function</dt><dd>Used to generate predictions on novel data given a previously trained model. The same syntax is used for classification and regression. <pre class="syntax"> |
| mlp_predict( |
| model_table, |
| data_table, |
| id_col_name, |
| output_table, |
| pred_type |
| ) |
| </pre></dd></dl> |
| <p><b>Arguments</b> </p><dl class="arglist"> |
| <dt>model_table </dt> |
| <dd><p class="startdd">TEXT. Model table produced by the training function.</p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>data_table </dt> |
| <dd><p class="startdd">TEXT. Name of the table containing the data for prediction. This table is expected to contain the same input features that were used during training. The table should also contain id_col_name used for identifying each row.</p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>id_col_name </dt> |
| <dd><p class="startdd">TEXT. The name of the id column in data_table.</p> |
| <p class="enddd"></p> |
| </dd> |
| <dt>output_table </dt> |
| <dd>TEXT. Name of the table where output predictions are written. If this table name is already in use, an error is returned. Table contains: <table class="output"> |
| <tr> |
| <th>id </th><td>Gives the 'id' for each prediction, corresponding to each row from the data_table. </td></tr> |
| <tr> |
| <th>estimated_COL_NAME </th><td>(For pred_type='response') The estimated class for classification or value for regression, where COL_NAME is the name of the column to be predicted from training data. </td></tr> |
| <tr> |
| <th>prob_CLASS </th><td><p class="starttd">(For pred_type='prob' for classification) The probability of a given class CLASS as given by softmax. There will be one column for each class in the training data. </p> |
| <p class="endtd"></p> |
| </td></tr> |
| </table> |
| </dd> |
| <dt>pred_type </dt> |
<dd>TEXT. The type of output requested: 'response' gives the actual prediction, 'prob' gives the probability of each class (a 'prob' example appears in the Examples section). For regression, only pred_type='response' is defined. </dd>
| </dl> |
| <p><a class="anchor" id="example"></a></p><dl class="section user"><dt>Examples</dt><dd><ol type="1"> |
| <li>Create an input data set. <pre class="example"> |
| DROP TABLE IF EXISTS iris_data; |
| CREATE TABLE iris_data( |
| id integer, |
| attributes numeric[], |
| class_text varchar, |
| class integer |
| ); |
| INSERT INTO iris_data VALUES |
| (1,ARRAY[5.1,3.5,1.4,0.2],'Iris-setosa',1), |
| (2,ARRAY[4.9,3.0,1.4,0.2],'Iris-setosa',1), |
| (3,ARRAY[4.7,3.2,1.3,0.2],'Iris-setosa',1), |
| (4,ARRAY[4.6,3.1,1.5,0.2],'Iris-setosa',1), |
| (5,ARRAY[5.0,3.6,1.4,0.2],'Iris-setosa',1), |
| (6,ARRAY[5.4,3.9,1.7,0.4],'Iris-setosa',1), |
| (7,ARRAY[4.6,3.4,1.4,0.3],'Iris-setosa',1), |
| (8,ARRAY[5.0,3.4,1.5,0.2],'Iris-setosa',1), |
| (9,ARRAY[4.4,2.9,1.4,0.2],'Iris-setosa',1), |
| (10,ARRAY[4.9,3.1,1.5,0.1],'Iris-setosa',1), |
| (11,ARRAY[7.0,3.2,4.7,1.4],'Iris-versicolor',2), |
| (12,ARRAY[6.4,3.2,4.5,1.5],'Iris-versicolor',2), |
| (13,ARRAY[6.9,3.1,4.9,1.5],'Iris-versicolor',2), |
| (14,ARRAY[5.5,2.3,4.0,1.3],'Iris-versicolor',2), |
| (15,ARRAY[6.5,2.8,4.6,1.5],'Iris-versicolor',2), |
| (16,ARRAY[5.7,2.8,4.5,1.3],'Iris-versicolor',2), |
| (17,ARRAY[6.3,3.3,4.7,1.6],'Iris-versicolor',2), |
| (18,ARRAY[4.9,2.4,3.3,1.0],'Iris-versicolor',2), |
| (19,ARRAY[6.6,2.9,4.6,1.3],'Iris-versicolor',2), |
| (20,ARRAY[5.2,2.7,3.9,1.4],'Iris-versicolor',2); |
| </pre></li> |
<li>Train a multilayer perceptron with a single hidden layer of 5 units. Use the attributes column as the independent variables, and the class_text column as the label. Set the tolerance to 0 so that all 500 iterations will be run. Use a hyperbolic tangent activation function. The model will be written to mlp_model. <pre class="example">
| DROP TABLE IF EXISTS mlp_model, mlp_model_summary; |
| -- Set seed so results are reproducible |
| SELECT setseed(0); |
| SELECT madlib.mlp_classification( |
| 'iris_data', -- Source table |
| 'mlp_model', -- Destination table |
| 'attributes', -- Input features |
| 'class_text', -- Label |
| ARRAY[5], -- Number of units per layer |
| 'learning_rate_init=0.003, |
| n_iterations=500, |
| tolerance=0', -- Optimizer params |
| 'tanh', -- Activation function |
| NULL, -- Default weight (1) |
| FALSE, -- No warm start |
| TRUE -- Verbose |
| ); |
| </pre></li> |
| <li>View the classification model. <pre class="example"> |
| -- Set extended display on for easier reading of output |
| \x ON |
| -- Results may vary depending on platform |
| SELECT * FROM mlp_model; |
| </pre> Result: <pre class="result"> |
-[ RECORD 1 ]--+---------------------------------------------------------------------------------------
| coeff | {-0.172392477419,-0.0836446652758,-0.0162194484142,-0.647268294231,-0.504884325538... |
| loss | 0.0136695756314 |
| num_iterations | 500 |
| </pre></li> |
<li>Next we will train a regression model. Create a dataset containing housing prices. <pre class="example">
| DROP TABLE IF EXISTS lin_housing; |
| CREATE TABLE lin_housing (id serial, |
| x float8[], |
| grp_by_col int, |
| y float8); |
| INSERT INTO lin_housing VALUES |
| (1,ARRAY[0.00632,18.00,2.310,0,0.5380,6.5750,65.20,4.0900,1,296.0,15.30,396.90,4.98],1,24.00), |
| (2,ARRAY[0.02731,0.00,7.070,0,0.4690,6.4210,78.90,4.9671,2,242.0,17.80,396.90,9.14],1,21.60), |
| (3,ARRAY[0.02729,0.00,7.070,0,0.4690,7.1850,61.10,4.9671,2,242.0,17.80,392.83,4.03],1,34.70), |
| (4,ARRAY[0.03237,0.00,2.180,0,0.4580,6.9980,45.80,6.0622,3,222.0,18.70,394.63,2.94],1,33.40), |
| (5,ARRAY[0.06905,0.00,2.180,0,0.4580,7.1470,54.20,6.0622,3,222.0,18.70,396.90,5.33],1,36.20), |
| (6,ARRAY[0.02985,0.00,2.180,0,0.4580,6.4300,58.70,6.0622,3,222.0,18.70,394.12,5.21],1,28.70), |
| (7,ARRAY[0.08829,12.50,7.870,0,0.5240,6.0120,66.60,5.5605,5,311.0,15.20,395.60,12.43],1,22.90), |
| (8,ARRAY[0.14455,12.50,7.870,0,0.5240,6.1720,96.10,5.9505,5,311.0,15.20,396.90,19.15],1,27.10), |
| (9,ARRAY[0.21124,12.50,7.870,0,0.5240,5.6310,100.00,6.0821,5,311.0,15.20,386.63,29.93],1,16.50), |
| (10,ARRAY[0.17004,12.50,7.870,0,0.5240,6.0040,85.90,6.5921,5,311.0,15.20,386.71,17.10],1,18.90), |
| (11,ARRAY[0.22489,12.50,7.870,0,0.5240,6.3770,94.30,6.3467,5,311.0,15.20,392.52,20.45],1,15.00), |
| (12,ARRAY[0.11747,12.50,7.870,0,0.5240,6.0090,82.90,6.2267,5,311.0,15.20,396.90,13.27],1,18.90), |
| (13,ARRAY[0.09378,12.50,7.870,0,0.5240,5.8890,39.00,5.4509,5,311.0,15.20,390.50,15.71],1,21.70), |
| (14,ARRAY[0.62976,0.00,8.140,0,0.5380,5.9490,61.80,4.7075,4,307.0,21.00,396.90,8.26],1,20.40), |
| (15,ARRAY[0.63796,0.00,8.140,0,0.5380,6.0960,84.50,4.4619,4,307.0,21.00,380.02,10.26],1,18.20), |
| (16,ARRAY[0.62739,0.00,8.140,0,0.5380,5.8340,56.50,4.4986,4,307.0,21.00,395.62,8.47],1,19.90), |
| (17,ARRAY[1.05393,0.00,8.140,0,0.5380,5.9350,29.30,4.4986,4,307.0,21.00,386.85,6.58],1, 23.10), |
| (18,ARRAY[0.78420,0.00,8.140,0,0.5380,5.9900,81.70,4.2579,4,307.0,21.00,386.75,14.67],1,17.50), |
| (19,ARRAY[0.80271,0.00,8.140,0,0.5380,5.4560,36.60,3.7965,4,307.0,21.00,288.99,11.69],1,20.20), |
| (20,ARRAY[0.72580,0.00,8.140,0,0.5380,5.7270,69.50,3.7965,4,307.0,21.00,390.95,11.28],1,18.20); |
| </pre></li> |
| <li>Now train a regression model using a multilayer perceptron with 2 hidden layers of 25 nodes each. <pre class="example"> |
| DROP TABLE IF EXISTS mlp_regress, mlp_regress_summary; |
| SELECT setseed(0); |
| SELECT madlib.mlp_regression( |
| 'lin_housing', -- Source table |
    'mlp_regress',           -- Destination table
| 'x', -- Input features |
| 'y', -- Dependent variable |
| ARRAY[25,25], -- Number of units per layer |
| 'learning_rate_init=0.001, |
| n_iterations=500, |
| lambda=0.001, |
| tolerance=0', |
| 'relu', |
| NULL, -- Default weight (1) |
| FALSE, -- No warm start |
| TRUE -- Verbose |
| ); |
| </pre></li> |
| <li>View the regression model. <pre class="example"> |
| -- Set extended display on for easier reading of output. |
| \x ON |
| -- Results may vary depending on platform. |
| SELECT * FROM mlp_regress; |
| </pre> Result: <pre class="result"> |
-[ RECORD 1 ]--+-----------------------------------------------------------------------------------
| coeff | {-0.135647108464,0.0315402969485,-0.117580589352,-0.23084537701,-0.10868726702... |
| loss | 0.114125125042 |
| num_iterations | 500 |
| </pre></li> |
<li>Now let's look at the prediction functions. In the following examples we will use the training data set for prediction as well; this is not usual, but serves to show the syntax. First we will test the classification example. The prediction is in the estimated_class_text column, with the actual value in the class_text column. <pre class="example">
| DROP TABLE IF EXISTS mlp_prediction; |
| SELECT madlib.mlp_predict( |
| 'mlp_model', -- Model table |
| 'iris_data', -- Test data table |
| 'id', -- Id column in test table |
| 'mlp_prediction', -- Output table for predictions |
| 'response' -- Output classes, not probabilities |
| ); |
| SELECT * FROM mlp_prediction JOIN iris_data USING (id) ORDER BY id; |
| </pre> Result for the classification model: <pre class="result"> |
| id | estimated_class_text | attributes | class_text | class |
| ----+----------------------+-------------------+-----------------+------- |
| 1 | Iris-setosa | {5.1,3.5,1.4,0.2} | Iris-setosa | 1 |
| 2 | Iris-setosa | {4.9,3.0,1.4,0.2} | Iris-setosa | 1 |
| 3 | Iris-setosa | {4.7,3.2,1.3,0.2} | Iris-setosa | 1 |
| 4 | Iris-setosa | {4.6,3.1,1.5,0.2} | Iris-setosa | 1 |
| 5 | Iris-setosa | {5.0,3.6,1.4,0.2} | Iris-setosa | 1 |
| 6 | Iris-setosa | {5.4,3.9,1.7,0.4} | Iris-setosa | 1 |
| 7 | Iris-setosa | {4.6,3.4,1.4,0.3} | Iris-setosa | 1 |
| 8 | Iris-setosa | {5.0,3.4,1.5,0.2} | Iris-setosa | 1 |
| 9 | Iris-setosa | {4.4,2.9,1.4,0.2} | Iris-setosa | 1 |
| 10 | Iris-setosa | {4.9,3.1,1.5,0.1} | Iris-setosa | 1 |
| 11 | Iris-versicolor | {7.0,3.2,4.7,1.4} | Iris-versicolor | 2 |
| 12 | Iris-versicolor | {6.4,3.2,4.5,1.5} | Iris-versicolor | 2 |
| 13 | Iris-versicolor | {6.9,3.1,4.9,1.5} | Iris-versicolor | 2 |
| 14 | Iris-versicolor | {5.5,2.3,4.0,1.3} | Iris-versicolor | 2 |
| 15 | Iris-versicolor | {6.5,2.8,4.6,1.5} | Iris-versicolor | 2 |
| 16 | Iris-versicolor | {5.7,2.8,4.5,1.3} | Iris-versicolor | 2 |
| 17 | Iris-versicolor | {6.3,3.3,4.7,1.6} | Iris-versicolor | 2 |
| 18 | Iris-versicolor | {4.9,2.4,3.3,1.0} | Iris-versicolor | 2 |
| 19 | Iris-versicolor | {6.6,2.9,4.6,1.3} | Iris-versicolor | 2 |
| 20 | Iris-versicolor | {5.2,2.7,3.9,1.4} | Iris-versicolor | 2 |
</pre> Count the misclassifications: <pre class="example">
| SELECT COUNT(*) FROM mlp_prediction JOIN iris_data USING (id) |
| WHERE mlp_prediction.estimated_class_text != iris_data.class_text; |
| </pre> <pre class="result"> |
| count |
| -------+ |
| 0 |
| </pre></li> |
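<li>We can also ask the classification model for class probabilities instead of labels by setting pred_type to 'prob'. This is a sketch: the output table has one prob_CLASS column per class label seen in training, so the exact column names depend on those labels. <pre class="example">
DROP TABLE IF EXISTS mlp_prob_prediction;
SELECT madlib.mlp_predict(
         'mlp_model',           -- Model table
         'iris_data',           -- Test data table
         'id',                  -- Id column in test table
         'mlp_prob_prediction', -- Output table for predictions
         'prob'                 -- Output probabilities, not classes
);
SELECT * FROM mlp_prob_prediction ORDER BY id;
</pre></li>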
| <li>Prediction using the regression model: <pre class="example"> |
| DROP TABLE IF EXISTS mlp_regress_prediction; |
| SELECT madlib.mlp_predict( |
| 'mlp_regress', -- Model table |
| 'lin_housing', -- Test data table |
| 'id', -- Id column in test table |
| 'mlp_regress_prediction', -- Output table for predictions |
| 'response' -- Output values, not probabilities |
| ); |
| SELECT *, ABS(y-estimated_y) as abs_diff FROM lin_housing |
| JOIN mlp_regress_prediction USING (id) ORDER BY id; |
| </pre> Result for the regression model: <pre class="result"> |
| id | x | grp_by_col | y | estimated_y | abs_diff |
| ----+-----------------------------------------------------------------------+------------+------+------------------+--------------------- |
| 1 | {0.00632,18,2.31,0,0.538,6.575,65.2,4.09,1,296,15.3,396.9,4.98} | 1 | 24 | 23.9976935779896 | 0.00230642201042741 |
| 2 | {0.02731,0,7.07,0,0.469,6.421,78.9,4.9671,2,242,17.8,396.9,9.14} | 1 | 21.6 | 22.0225551503712 | 0.422555150371196 |
| 3 | {0.02729,0,7.07,0,0.469,7.185,61.1,4.9671,2,242,17.8,392.83,4.03} | 1 | 34.7 | 34.3269436787012 | 0.373056321298805 |
| 4 | {0.03237,0,2.18,0,0.458,6.998,45.8,6.0622,3,222,18.7,394.63,2.94} | 1 | 33.4 | 34.7421700032985 | 1.34217000329847 |
| 5 | {0.06905,0,2.18,0,0.458,7.147,54.2,6.0622,3,222,18.7,396.9,5.33} | 1 | 36.2 | 35.1914922401243 | 1.00850775987566 |
| 6 | {0.02985,0,2.18,0,0.458,6.43,58.7,6.0622,3,222,18.7,394.12,5.21} | 1 | 28.7 | 29.5286073543722 | 0.828607354372203 |
| 7 | {0.08829,12.5,7.87,0,0.524,6.012,66.6,5.5605,5,311,15.2,395.6,12.43} | 1 | 22.9 | 23.2022360304219 | 0.302236030421945 |
| 8 | {0.14455,12.5,7.87,0,0.524,6.172,96.1,5.9505,5,311,15.2,396.9,19.15} | 1 | 27.1 | 23.3649065290002 | 3.73509347099978 |
| 9 | {0.21124,12.5,7.87,0,0.524,5.631,100,6.0821,5,311,15.2,386.63,29.93} | 1 | 16.5 | 17.7779926866502 | 1.27799268665021 |
| 10 | {0.17004,12.5,7.87,0,0.524,6.004,85.9,6.5921,5,311,15.2,386.71,17.1} | 1 | 18.9 | 13.9266690257803 | 4.97333097421974 |
| 11 | {0.22489,12.5,7.87,0,0.524,6.377,94.3,6.3467,5,311,15.2,392.52,20.45} | 1 | 15 | 18.5049155838719 | 3.50491558387192 |
| 12 | {0.11747,12.5,7.87,0,0.524,6.009,82.9,6.2267,5,311,15.2,396.9,13.27} | 1 | 18.9 | 18.4287114359317 | 0.47128856406826 |
| 13 | {0.09378,12.5,7.87,0,0.524,5.889,39,5.4509,5,311,15.2,390.5,15.71} | 1 | 21.7 | 22.6228336114696 | 0.922833611469631 |
| 14 | {0.62976,0,8.14,0,0.538,5.949,61.8,4.7075,4,307,21,396.9,8.26} | 1 | 20.4 | 20.1083536059151 | 0.291646394084896 |
| 15 | {0.63796,0,8.14,0,0.538,6.096,84.5,4.4619,4,307,21,380.02,10.26} | 1 | 18.2 | 18.8935467873061 | 0.693546787306062 |
| 16 | {0.62739,0,8.14,0,0.538,5.834,56.5,4.4986,4,307,21,395.62,8.47} | 1 | 19.9 | 19.8383202293121 | 0.0616797706878742 |
| 17 | {1.05393,0,8.14,0,0.538,5.935,29.3,4.4986,4,307,21,386.85,6.58} | 1 | 23.1 | 23.160463540176 | 0.0604635401760412 |
| 18 | {0.7842,0,8.14,0,0.538,5.99,81.7,4.2579,4,307,21,386.75,14.67} | 1 | 17.5 | 16.8540384345856 | 0.64596156541436 |
| 19 | {0.80271,0,8.14,0,0.538,5.456,36.6,3.7965,4,307,21,288.99,11.69} | 1 | 20.2 | 20.3628760580577 | 0.162876058057684 |
| 20 | {0.7258,0,8.14,0,0.538,5.727,69.5,3.7965,4,307,21,390.95,11.28} | 1 | 18.2 | 18.1198369917265 | 0.0801630082734555 |
| (20 rows) |
</pre> RMS error: <pre class="example">
SELECT SQRT(SUM((y-estimated_y)*(y-estimated_y))/COUNT(y)) as rms_error FROM lin_housing
JOIN mlp_regress_prediction USING (id);
</pre> <pre class="result">
    rms_error
------------------+
 1.71193
| </pre> Note that the results you get for all examples may vary with the platform you are using.</li> |
| </ol> |
| </dd></dl> |
| <p><a class="anchor" id="background"></a></p><dl class="section user"><dt>Technical Background</dt><dd></dd></dl> |
<p>To train a neural net, the loss function is minimized using stochastic gradient descent. In the case of classification, the loss function is cross entropy. For regression, mean square error is used. Weights in the neural net are updated via backpropagation, which uses dynamic programming to compute the partial derivative of the loss with respect to each weight. This partial derivative incorporates the activation function used, which requires that the activation function be differentiable.</p>
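<p>In generic notation (a sketch, not tied to the internals of this implementation): for a classification example with one-hot label vector \(y\) over \(K\) classes and softmax output probabilities \(p\), the cross entropy is \( -\sum_{k=1}^{K} y_k \log p_k \); for regression over \(n\) rows with targets \(y_i\) and predictions \(\hat{y}_i\), the mean square error is \( \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \).</p>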
| <p>For an overview of multilayer perceptrons, see [1].</p> |
<p>For details on backpropagation, see [2].</p>
| <p><a class="anchor" id="literature"></a></p><dl class="section user"><dt>Literature</dt><dd></dd></dl> |
| <p><a class="anchor" id="mlp-lit-1"></a>[1] "Multilayer Perceptron." Wikipedia. Wikimedia Foundation, 12 July 2017. Web. 12 July 2017.</p> |
| <p>[2] Yu Hen Hu. "Lecture 11. MLP (III): Back-Propagation." University of Wisconsin Madison: Computer-Aided Engineering. Web. 12 July 2017, <a href="http://homepages.cae.wisc.edu/~ece539/videocourse/notes/pdf/lec%2011%20MLP%20(3)%20BP.pdf">http://homepages.cae.wisc.edu/~ece539/videocourse/notes/pdf/lec%2011%20MLP%20(3)%20BP.pdf</a></p> |
| <p><a class="anchor" id="related"></a></p><dl class="section user"><dt>Related Topics</dt><dd></dd></dl> |
| <p>File <a class="el" href="mlp_8sql__in.html" title="SQL functions for multilayer perceptron. ">mlp.sql_in</a> documenting the training function </p> |
| </div><!-- contents --> |
| </div><!-- doc-content --> |
| <!-- start footer part --> |
| <div id="nav-path" class="navpath"><!-- id is needed for treeview function! --> |
| <ul> |
| <li class="footer">Generated on Tue Aug 29 2017 09:14:35 for MADlib by |
| <a href="http://www.doxygen.org/index.html"> |
| <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.13 </li> |
| </ul> |
| </div> |
| </body> |
| </html> |