src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in - madlib - Git at Google

 /* ----------------------------------------------------------------------- *//**
  *
  * Licensed to the Apache Software Foundation (ASF) under one
  * or more contributor license agreements.  See the NOTICE file
  * distributed with this work for additional information
  * regarding copyright ownership.  The ASF licenses this file
  * to you under the Apache License, Version 2.0 (the
  * "License"); you may not use this file except in compliance
  * with the License.  You may obtain a copy of the License at
  *
  *   http://www.apache.org/licenses/LICENSE-2.0
  *
  * Unless required by applicable law or agreed to in writing,
  * software distributed under the License is distributed on an
  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  * KIND, either express or implied.  See the License for the
  * specific language governing permissions and limitations
  * under the License.
  *
  *
  * @file madlib_keras_model_selection.sql_in
  *
  * @brief SQL functions for model hopper distributed training
  * @date August 2019
  *
  *
  *//* ----------------------------------------------------------------------- */

 m4_include(`SQLCommon.m4')

 /**
 @addtogroup grp_keras_run_model_selection

 @brief Explore network architectures and hyperparameters by training many models a time.

 <div class="toc"><b>Contents</b><ul>
 <li class="level1"><a href="#keras_fit">Fit</a></li>
 <li class="level1"><a href="#keras_evaluate">Evaluate</a></li>
 <li class="level1"><a href="#keras_predict">Predict</a></li>
 <li class="level1"><a href="#example">Examples</a></li>
 <li class="level1"><a href="#notes">Notes</a></li>
 <li class="level1"><a href="#background">Technical Background</a></li>
 <li class="level1"><a href="#literature">Literature</a></li>
 <li class="level1"><a href="#related">Related Topics</a></li>
 </ul></div>

 \warning <em> This MADlib method is still in early stage development.
 Interface and implementation are subject to change. </em>

 This module allows you to explore network architectures and
 hyperparameters by training many models a time across the
 database cluster.  The aim is to support efficient empirical comparison of multiple
 training configurations. This process is called model selection,
 and the implementation here is based on a parallel execution strategy
 called model hopper parallelism (MOP) [1,2].

 Models are designed in Keras [3], which is a high-level neural
 network API written in Python.  It can run
 on top of different backends and the one that is currently
 supported by MADlib is TensorFlow [4].

 The main use case is image classification
 using sequential models, which are made up of a
 linear stack of layers.  This includes multilayer perceptrons (MLPs)
 and convolutional neural networks (CNNs).  Regression is not
 currently supported.

 Before doing model selection in MADlib you will need to run
 the mini-batch preprocessor, and create a table with the various models
 and hyperparameters to try.

 You can mini-batch the training and evaluation datasets by using the
 <a href="group__grp__input__preprocessor__dl.html">Preprocessor
 for Images</a> which is a utility that prepares image data for
 use by models that support mini-batch as an optimization option.
 This is a one-time operation and you would only
 need to re-run the preprocessor if your input data has changed.
 The advantage of using mini-batching is that it
 can perform better than stochastic gradient descent
 because it uses more than one training example at a time,
 typically resulting faster and smoother convergence [5].
 The input preprocessor also sets the distribution rules
 for the training data.  For example, you may only want
 to train models on segments that reside on hosts that are GPU enabled.

 You can set up the models and hyperparameters to try with the
 <a href="group__grp__keras__setup__model__selection.html">Setup
 Model Selection</a> utility to define the unique combinations
 of model architectures, compile and fit parameters.

 @note 1. If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and some versions
 of GPDB 6, the database will
 keep adding to the disk space (in proportion to model size) and will only
 release the disk space once the fit multiple query has completed execution.
 This is not the case for GPDB 6.5.0+ where disk space is released during the
 fit multiple query.

 @note 2. CUDA GPU memory cannot be released until the process holding it is terminated.
 When a MADlib deep learning function is called with GPUs, Greenplum internally
 creates a process (called a slice) which calls TensorFlow to do the computation.
 This process holds the GPU memory until one of the following two things happen:
 query finishes and user logs out of the Postgres client/session; or,
 query finishes and user waits for the timeout set by gp_vmem_idle_resource_timeout.
 The default value for this timeout is 18 sec [8].  So the recommendation is:
 log out/reconnect to the session after every GPU query; or
 wait for gp_vmem_idle_resource_timeout before you run another GPU query (you can
 also set it to a lower value).

 @anchor keras_fit
 @par Fit
 The fit (training) function has the following format:

 <pre class="syntax">
 madlib_keras_fit_multiple_model(
     source_table,
     model_output_table,
     model_selection_table,
     num_iterations,
     use_gpus,
     validation_table,
     metrics_compute_frequency,
     warm_start,
     name,
     description,
     use_caching
     )
 </pre>

 \b Arguments
 <dl class="arglist">
   <dt>source_table</dt>
   <dd>TEXT. Name of the table containing the training data.
   This is the name of the output
   table from the image preprocessor.  Independent
   and dependent variables are specified in the preprocessor
   step which is why you do not need to explictly state
   them here as part of the fit function.</dd>

   <dt>model_output_table</dt>
   <dd>TEXT. Name of the output table containing the
   multiple models created.
   @note pg_temp is not allowed as an output table schema for fit multiple.
   Details of output tables are shown below.
   </dd>

   <dt>model_selection_table</dt>
   <dd>TEXT. Name of the table containing model selection parameters to be tried.
   Here we mean both hyper-parameter tuning and model architecture search.
   </dd>

   <DT>num_iterations</DT>
   <DD>INTEGER.  Number of iterations to train.

     @note
     This parameter is different than the number of passes over the dataset,
     which is commonly referred to as the number of epochs.  Since MADlib operates
     in a distributed system, the number of
     epochs is actually equal to this parameter 'num_iterations' X 'epochs' as
     specified in the Keras fit parameter.
   </DD>

   <DT>use_gpus (optional)</DT>
   <DD>BOOLEAN, default: FALSE (i.e., CPU). Determines whether GPUs
   are to be used for training the neural network.  Set to TRUE to use GPUs.

     @note
     This parameter must not conflict with how the distribution rules are set in
     the preprocessor function.  For example, if you set a distribution rule to use
     certain segments on hosts that do not have GPUs attached, you will get an error
     if you set ‘use_gpus’ to TRUE.  Also, we have seen some memory related issues
     when segments share GPU resources.
     For example, if you have 1 GPU per segment host and your cluster has 4
     segments per segment host, it means that all 4
     segments will share the same
     GPU on each host. The current recommended
     configuration is 1 GPU per segment.
   </DD>

   <dt>validation_table (optional)</dt>
   <dd>TEXT, default: none. Name of the table containing
   the validation dataset.
   Note that the validation dataset must be preprocessed
   in the same way as the training dataset, so this
   is the name of the output
   table from running the image preprocessor on the validation dataset.
   Using a validation dataset can mean a
   longer training time, depending on its size.
   This can be controlled using the 'metrics_compute_frequency'
   parameter described below.</dd>

   <DT>metrics_compute_frequency (optional)</DT>
   <DD>INTEGER, default: once at the end of training
   after 'num_iterations'.  Frequency to compute per-iteration
   metrics for the training dataset and validation dataset
   (if specified).  There can be considerable cost to
   computing metrics every iteration, especially if the
   training dataset is large.  This parameter is a way of
   controlling the frequency of those computations.
   For example, if you specify 5, then metrics will be computed
   every 5 iterations as well as at the end of training
   after 'num_iterations'.  If you use the default,
   metrics will be computed only
   once after 'num_iterations' have completed.
   </DD>

   <DT>warm_start (optional)</DT>
   <DD>BOOLEAN, default: FALSE.
     Initalize weights with the coefficients
     from the last call to the fit
     function. If set to TRUE, weights will be
     initialized from the model table
     generated by the previous training run.

     @note
     The warm start feature works based on the name of the
     model output table from a previous training run.
     When using warm start, do not drop the model output table
     or the model output summary table
     before calling the fit function, since these are needed to obtain the
     weights from the previous run.
     If you are not using warm start, the model output table
     and the model output table summary must be dropped in
     the usual way before calling the training function.
   </DD>

   <DT>name (optional)</DT>
   <DD>TEXT, default: NULL.
     Free text string to identify a name, if desired.
   </DD>

   <DT>description (optional)</DT>
   <DD>TEXT, default: NULL.
     Free text string to provide a description, if desired.
   </DD>

   <DT>use_caching (optional)</DT>
   <DD>BOOLEAN, default: FALSE. Use caching of images in memory on the
   segment in order to speed up processing.

   @note
   When set to TRUE, image byte arrays on each segment are maintained
   in cache (SD). This can speed up training significantly, however the
   memory usage per segment increases.  In effect, it
   requires enough available memory on a segment so that all images
   residing on that segment can be read into memory.
 </dl>

 <b>Output tables</b>
 <br>
     The model output table produced by fit contains the following columns.
     There is one row per model as per the rows in the 'model_selection_table':
     <table class="output">
       <tr>
         <th>mst_key</th>
         <td>INTEGER. ID that defines a unique tuple for model architecture-compile parameters-fit parameters,
         as defined in the 'model_selection_table'.</td>
       </tr>
       <tr>
         <th>model_weights</th>
         <td>BYTEA8. Byte array containing the weights of the neural net.</td>
       </tr>
       <tr>
         <th>model_arch</th>
         <td>TEXT. A JSON representation of the model architecture
         used in training.</td>
       </tr>
     </table>

     An info table named \<model_output_table\>_info is also created, which has the following columns.
     There is one row per model as per the rows in the 'model_selection_table':
     <table class="output">
       <tr>
         <th>mst_key</th>
         <td>INTEGER. ID that defines a unique tuple for model architecture-compile parameters-fit parameters,
         as defined in the 'model_selection_table'.</td>
       </tr>
       <tr>
         <th>model_id</th>
         <td>INTEGER. ID that defines model in the 'model_arch_table'.</td>
       </tr>
       <tr>
         <th>compile_params</th>
         <td>Compile parameters passed to Keras.</td>
     </tr>
     <tr>
         <th>fit_params</th>
         <td>Fit parameters passed to Keras.</td>
     </tr>
     <tr>
         <th>model_type</th>
         <td>General identifier for type of model trained.
         Currently says 'madlib_keras'.</td>
     </tr>
     <tr>
         <th>model_size</th>
         <td>Size of the model in KB.  Models are stored in
         'bytea' data format which is used for binary strings
         in PostgreSQL type databases.</td>
     </tr>
     <tr>
         <th>metrics_elapsed_time</th>
         <td> Array of elapsed time for metric computations as
         per the 'metrics_compute_frequency' parameter.
         Useful for drawing a curve showing loss, accuracy or
         other metrics as a function of time.
         For example, if 'metrics_compute_frequency=5'
         this would be an array of elapsed time for every 5th
         iteration, plus the last iteration.</td>
     </tr>
     <tr>
         <th>metrics_type</th>
         <td>Metric specified in the 'compile_params'.</td>
     </tr>
     <tr>
         <th>training_metrics_final</th>
         <td>Final value of the training
         metric after all iterations have completed.
         The metric reported is the one
         specified in the 'metrics_type' parameter.</td>
     </tr>
     <tr>
         <th>training_loss_final</th>
         <td>Final value of the training loss after all
         iterations have completed.</td>
     </tr>
     <tr>
         <th>training_metrics</th>
         <td>Array of training metrics as
         per the 'metrics_compute_frequency' parameter.
         For example, if 'metrics_compute_frequency=5'
         this would be an array of metrics for every 5th
         iteration, plus the last iteration.</td>
     </tr>
     <tr>
         <th>training_loss</th>
         <td>Array of training losses as
         per the 'metrics_compute_frequency' parameter.
         For example, if 'metrics_compute_frequency=5'
         this would be an array of losses for every 5th
         iteration, plus the last iteration.</td>
     </tr>
     <tr>
         <th>validation_metrics_final</th>
         <td>Final value of the validation
         metric after all iterations have completed.
         The metric reported is the one
         specified in the 'metrics_type' parameter.</td>
     </tr>
     <tr>
         <th>validation_loss_final</th>
         <td>Final value of the validation loss after all
         iterations have completed.</td>
     </tr>
     <tr>
         <th>validation_metrics</th>
         <td>Array of validation metrics as
         per the 'metrics_compute_frequency' parameter.
         For example, if 'metrics_compute_frequency=5'
         this would be an array of metrics for every 5th
         iteration, plus the last iteration.</td>
     </tr>
     <tr>
         <th>validation_loss</th>
         <td>Array of validation losses as
         per the 'metrics_compute_frequency' parameter.
         For example, if 'metrics_compute_frequency=5'
         this would be an array of losses for every 5th
         iteration, plus the last iteration.</td>
     </tr>

     </table>

     A summary table named \<model\>_summary is also created, which has the following columns:
     <table class="output">
     <tr>
         <th>source_table</th>
         <td>Source table used for training.</td>
     </tr>
     <tr>
         <th>validation_table</th>
         <td>Name of the table containing
         the validation dataset (if specified).</td>
     </tr>
     <tr>
         <th>model</th>
         <td>Name of the output table containing
         the model for each model selection tuple.</td>
     </tr>
     <tr>
         <th>model_info</th>
         <td>Name of the output table containing
         the model performance and other info for
         each model selection tuple.</td>
     </tr>
     <tr>
         <th>dependent_varname</th>
         <td>Dependent variable column from the original
         source table in the image preprocessing step.</td>
     </tr>
     <tr>
         <th>independent_varname</th>
         <td>Independent variables column from the original
         source table in the image preprocessing step.</td>
     </tr>
     <tr>
         <th>model_arch_table</th>
         <td>Name of the table containing
         the model architecture and (optionally) the
         initial model weights.</td>
     </tr>
     <tr>
         <th>num_iterations</th>
         <td>Number of iterations of training completed.</td>
     </tr>
     <tr>
         <th>metrics_compute_frequency</th>
         <td>Frequency that per-iteration metrics are computed
         for the training dataset and validation
         datasets.</td>
     </tr>
     <tr>
         <th>warm_start</th>
         <td>Indicates whether warm start used or not.</td>
     </tr>
     <tr>
         <th>name</th>
         <td>Name of the training run (free text).</td>
     </tr>
     <tr>
         <th>description</th>
         <td>Description of the training run (free text).</td>
     </tr>
     <tr>
         <th>start_training_time</th>
         <td>Timestamp for start of training.</td>
     </tr>
     <tr>
         <th>end_training_time</th>
         <td>Timestamp for end of training.</td>
     </tr>
     <tr>
         <th>madlib_version</th>
         <td>Version of MADlib used.</td>
     </tr>
     <tr>
         <th>num_classes</th>
         <td>Count of distinct classes values used.</td>
     </tr>
     <tr>
         <th>class_values</th>
         <td>Array of actual class values used.</td>
     </tr>
     <tr>
         <th>dependent_vartype</th>
         <td>Data type of the dependent variable.</td>
     </tr>
     <tr>
         <th>normalizing_constant</th>
         <td>Normalizing constant used from the
         image preprocessing step.</td>
     </tr>
     <tr>
         <th>metrics_iters</th>
         <td>Array indicating the iterations for which
         metrics are calculated, as derived from the
         parameters 'num_iterations' and 'metrics_compute_frequency'.
         For example, if 'num_iterations=5'
         and 'metrics_compute_frequency=2', then 'metrics_iters' value
         would be {2,4,5} indicating that metrics were computed
         at iterations 2, 4 and 5 (at the end).
         If 'num_iterations=5'
         and 'metrics_compute_frequency=1', then 'metrics_iters' value
         would be {1,2,3,4,5} indicating that metrics were computed
         at every iteration.</td>
     </tr>
    </table>

 @anchor keras_evaluate
 @par Evaluate
 The evaluation function has the following format:

 <pre class="syntax">
 madlib_keras_evaluate(
     model_table,
     test_table,
     output_table,
     use_gpus
     )
 </pre>

 \b Arguments
 <dl class="arglist">

 <DT>model_table</DT>
   <DD>TEXT. Name of the table containing the model
   to use for validation.
   </DD>

   <DT>test_table</DT>
   <dd>TEXT. Name of the table containing the evaluation dataset.
   Note that test/validation data must be preprocessed in the same
   way as the training dataset, so
   this is the name of the output
   table from the image preprocessor.  Independent
   and dependent variables are specified in the preprocessor
   step which is why you do not need to explictly state
   them here as part of the fit function.</dd>

   <DT>output_table</DT>
   <DD>TEXT. Name of table that validation output will be
   written to. Table contains:</DD>
     <table class="output">
       <tr>
         <th>loss</th>
         <td>Loss value on evaluation dataset.</td>
       </tr>
       <tr>
         <th>metric</th>
         <td>Metric value on evaluation dataset, where 'metrics_type'
         below identifies the type of metric.</td>
       </tr>
       <tr>
         <th>metrics_type</th>
         <td>Type of metric used that was used in the training step.</td>
       </tr>

   <DT>use_gpus (optional)</DT>
   <DD>BOOLEAN, default: FALSE (i.e., CPU). Determines whether GPUs
   are to be used for training the neural network.  Set to TRUE to use GPUs.

     @note
     This parameter must not conflict with how the distribution rules are set in
     the preprocessor function.  For example, if you set a distribution rule to use
     certain segments on hosts that do not have GPUs attached, you will get an error
     if you set ‘use_gpus’ to TRUE.  Also, we have seen some memory related issues
     when segments share GPU resources.
     For example, if you have 1 GPU per segment host and your cluster has 4
     segments per segment host, it means that all 4
     segments will share the same
     GPU on each host. The current recommended
     configuration is 1 GPU per segment.
   </DD>
 </DL>

 @anchor keras_predict
 @par Predict
 The prediction function has the following format:
 <pre class="syntax">
 madlib_keras_predict(
     model_table,
     test_table,
     id_col,
     independent_varname,
     output_table,
     pred_type,
     use_gpus
     )
 </pre>

 \b Arguments
 <dl class="arglist">

 <DT>model_table</DT>
   <DD>TEXT. Name of the table containing the model
   to use for prediction.
   </DD>

   <DT>test_table</DT>
   <DD>TEXT. Name of the table containing the dataset to
   predict on.  Note that test data is not preprocessed (unlike
   fit and evaluate) so put one test image per row for prediction.
   Also see the comment below for the 'independent_varname' parameter
   regarding normalization.

   </DD>

   <DT>id_col</DT>
   <DD>TEXT. Name of the id column in the test data table.
   </DD>

   <DT>independent_varname</DT>
   <DD>TEXT. Column with independent variables in the test table.
   If a 'normalizing_const' is specified when preprocessing the
   training dataset, this same normalization will be applied to
   the independent variables used in predict.
   </DD>

   <DT>output_table</DT>
   <DD>TEXT. Name of the table that prediction output will be
   written to. Table contains:</DD>
     <table class="output">
       <tr>
         <th>id</th>
         <td>Gives the 'id' for each prediction, corresponding to each row from the test_table.</td>
       </tr>
       <tr>
         <th>estimated_COL_NAME</th>
         <td>
         (For pred_type='response') The estimated class
          for classification, where
          COL_NAME is the name of the column to be
          predicted from test data.
         </td>
       </tr>
       <tr>
         <th>prob_CLASS</th>
         <td>
         (For pred_type='prob' for classification) The
         probability of a given class.
         There will be one column for each class
         in the training data.
         </td>
       </tr>

   <DT>pred_type (optional)</DT>
   <DD>TEXT, default: 'response'. The type of output
   desired, where 'response' gives the actual prediction
   and 'prob' gives the probability value for each class.
   </DD>

   <DT>use_gpus (optional)</DT>
   <DD>BOOLEAN, default: FALSE (i.e., CPU). Determines
   whether GPUs are to be used for prediction/inference.
   Set to TRUE to use GPUs.

     @note
     The prediction function uses the whole cluster.  If you are using GPUs, it
     requires that GPUs are attached to all hosts, and that there are the same number
     of GPUs on each host (homogeneous cluster).  This is different from the fit()
     and evaluate() functions that support GPUs on only some of the hosts (heterogeneous cluster).
     Therefore, if you have GPUs only on some of the hosts, or an uneven numbers of GPUs per host, then
     set this parameter to FALSE to use CPUs.
   </DD>
 </DL>

 @anchor example
 @par Examples

 @note
 Deep learning works best on very large datasets,
 but that is not convenient for a quick introduction
 to the syntax. So in this example we use an MLP on the well
 known iris data set from https://archive.ics.uci.edu/ml/datasets/iris.
 For more realistic examples with images please refer
 to the deep learning notebooks
 at https://github.com/apache/madlib-site/tree/asf-site/community-artifacts.

 <h4>Classification</h4>

 -#  Create an input data set.
 <pre class="example">
 DROP TABLE IF EXISTS iris_data;
 CREATE TABLE iris_data(
     id serial,
     attributes numeric[],
     class_text varchar
 );
 INSERT INTO iris_data(id, attributes, class_text) VALUES
 (1,ARRAY[5.1,3.5,1.4,0.2],'Iris-setosa'),
 (2,ARRAY[4.9,3.0,1.4,0.2],'Iris-setosa'),
 (3,ARRAY[4.7,3.2,1.3,0.2],'Iris-setosa'),
 (4,ARRAY[4.6,3.1,1.5,0.2],'Iris-setosa'),
 (5,ARRAY[5.0,3.6,1.4,0.2],'Iris-setosa'),
 (6,ARRAY[5.4,3.9,1.7,0.4],'Iris-setosa'),
 (7,ARRAY[4.6,3.4,1.4,0.3],'Iris-setosa'),
 (8,ARRAY[5.0,3.4,1.5,0.2],'Iris-setosa'),
 (9,ARRAY[4.4,2.9,1.4,0.2],'Iris-setosa'),
 (10,ARRAY[4.9,3.1,1.5,0.1],'Iris-setosa'),
 (11,ARRAY[5.4,3.7,1.5,0.2],'Iris-setosa'),
 (12,ARRAY[4.8,3.4,1.6,0.2],'Iris-setosa'),
 (13,ARRAY[4.8,3.0,1.4,0.1],'Iris-setosa'),
 (14,ARRAY[4.3,3.0,1.1,0.1],'Iris-setosa'),
 (15,ARRAY[5.8,4.0,1.2,0.2],'Iris-setosa'),
 (16,ARRAY[5.7,4.4,1.5,0.4],'Iris-setosa'),
 (17,ARRAY[5.4,3.9,1.3,0.4],'Iris-setosa'),
 (18,ARRAY[5.1,3.5,1.4,0.3],'Iris-setosa'),
 (19,ARRAY[5.7,3.8,1.7,0.3],'Iris-setosa'),
 (20,ARRAY[5.1,3.8,1.5,0.3],'Iris-setosa'),
 (21,ARRAY[5.4,3.4,1.7,0.2],'Iris-setosa'),
 (22,ARRAY[5.1,3.7,1.5,0.4],'Iris-setosa'),
 (23,ARRAY[4.6,3.6,1.0,0.2],'Iris-setosa'),
 (24,ARRAY[5.1,3.3,1.7,0.5],'Iris-setosa'),
 (25,ARRAY[4.8,3.4,1.9,0.2],'Iris-setosa'),
 (26,ARRAY[5.0,3.0,1.6,0.2],'Iris-setosa'),
 (27,ARRAY[5.0,3.4,1.6,0.4],'Iris-setosa'),
 (28,ARRAY[5.2,3.5,1.5,0.2],'Iris-setosa'),
 (29,ARRAY[5.2,3.4,1.4,0.2],'Iris-setosa'),
 (30,ARRAY[4.7,3.2,1.6,0.2],'Iris-setosa'),
 (31,ARRAY[4.8,3.1,1.6,0.2],'Iris-setosa'),
 (32,ARRAY[5.4,3.4,1.5,0.4],'Iris-setosa'),
 (33,ARRAY[5.2,4.1,1.5,0.1],'Iris-setosa'),
 (34,ARRAY[5.5,4.2,1.4,0.2],'Iris-setosa'),
 (35,ARRAY[4.9,3.1,1.5,0.1],'Iris-setosa'),
 (36,ARRAY[5.0,3.2,1.2,0.2],'Iris-setosa'),
 (37,ARRAY[5.5,3.5,1.3,0.2],'Iris-setosa'),
 (38,ARRAY[4.9,3.1,1.5,0.1],'Iris-setosa'),
 (39,ARRAY[4.4,3.0,1.3,0.2],'Iris-setosa'),
 (40,ARRAY[5.1,3.4,1.5,0.2],'Iris-setosa'),
 (41,ARRAY[5.0,3.5,1.3,0.3],'Iris-setosa'),
 (42,ARRAY[4.5,2.3,1.3,0.3],'Iris-setosa'),
 (43,ARRAY[4.4,3.2,1.3,0.2],'Iris-setosa'),
 (44,ARRAY[5.0,3.5,1.6,0.6],'Iris-setosa'),
 (45,ARRAY[5.1,3.8,1.9,0.4],'Iris-setosa'),
 (46,ARRAY[4.8,3.0,1.4,0.3],'Iris-setosa'),
 (47,ARRAY[5.1,3.8,1.6,0.2],'Iris-setosa'),
 (48,ARRAY[4.6,3.2,1.4,0.2],'Iris-setosa'),
 (49,ARRAY[5.3,3.7,1.5,0.2],'Iris-setosa'),
 (50,ARRAY[5.0,3.3,1.4,0.2],'Iris-setosa'),
 (51,ARRAY[7.0,3.2,4.7,1.4],'Iris-versicolor'),
 (52,ARRAY[6.4,3.2,4.5,1.5],'Iris-versicolor'),
 (53,ARRAY[6.9,3.1,4.9,1.5],'Iris-versicolor'),
 (54,ARRAY[5.5,2.3,4.0,1.3],'Iris-versicolor'),
 (55,ARRAY[6.5,2.8,4.6,1.5],'Iris-versicolor'),
 (56,ARRAY[5.7,2.8,4.5,1.3],'Iris-versicolor'),
 (57,ARRAY[6.3,3.3,4.7,1.6],'Iris-versicolor'),
 (58,ARRAY[4.9,2.4,3.3,1.0],'Iris-versicolor'),
 (59,ARRAY[6.6,2.9,4.6,1.3],'Iris-versicolor'),
 (60,ARRAY[5.2,2.7,3.9,1.4],'Iris-versicolor'),
 (61,ARRAY[5.0,2.0,3.5,1.0],'Iris-versicolor'),
 (62,ARRAY[5.9,3.0,4.2,1.5],'Iris-versicolor'),
 (63,ARRAY[6.0,2.2,4.0,1.0],'Iris-versicolor'),
 (64,ARRAY[6.1,2.9,4.7,1.4],'Iris-versicolor'),
 (65,ARRAY[5.6,2.9,3.6,1.3],'Iris-versicolor'),
 (66,ARRAY[6.7,3.1,4.4,1.4],'Iris-versicolor'),
 (67,ARRAY[5.6,3.0,4.5,1.5],'Iris-versicolor'),
 (68,ARRAY[5.8,2.7,4.1,1.0],'Iris-versicolor'),
 (69,ARRAY[6.2,2.2,4.5,1.5],'Iris-versicolor'),
 (70,ARRAY[5.6,2.5,3.9,1.1],'Iris-versicolor'),
 (71,ARRAY[5.9,3.2,4.8,1.8],'Iris-versicolor'),
 (72,ARRAY[6.1,2.8,4.0,1.3],'Iris-versicolor'),
 (73,ARRAY[6.3,2.5,4.9,1.5],'Iris-versicolor'),
 (74,ARRAY[6.1,2.8,4.7,1.2],'Iris-versicolor'),
 (75,ARRAY[6.4,2.9,4.3,1.3],'Iris-versicolor'),
 (76,ARRAY[6.6,3.0,4.4,1.4],'Iris-versicolor'),
 (77,ARRAY[6.8,2.8,4.8,1.4],'Iris-versicolor'),
 (78,ARRAY[6.7,3.0,5.0,1.7],'Iris-versicolor'),
 (79,ARRAY[6.0,2.9,4.5,1.5],'Iris-versicolor'),
 (80,ARRAY[5.7,2.6,3.5,1.0],'Iris-versicolor'),
 (81,ARRAY[5.5,2.4,3.8,1.1],'Iris-versicolor'),
 (82,ARRAY[5.5,2.4,3.7,1.0],'Iris-versicolor'),
 (83,ARRAY[5.8,2.7,3.9,1.2],'Iris-versicolor'),
 (84,ARRAY[6.0,2.7,5.1,1.6],'Iris-versicolor'),
 (85,ARRAY[5.4,3.0,4.5,1.5],'Iris-versicolor'),
 (86,ARRAY[6.0,3.4,4.5,1.6],'Iris-versicolor'),
 (87,ARRAY[6.7,3.1,4.7,1.5],'Iris-versicolor'),
 (88,ARRAY[6.3,2.3,4.4,1.3],'Iris-versicolor'),
 (89,ARRAY[5.6,3.0,4.1,1.3],'Iris-versicolor'),
 (90,ARRAY[5.5,2.5,4.0,1.3],'Iris-versicolor'),
 (91,ARRAY[5.5,2.6,4.4,1.2],'Iris-versicolor'),
 (92,ARRAY[6.1,3.0,4.6,1.4],'Iris-versicolor'),
 (93,ARRAY[5.8,2.6,4.0,1.2],'Iris-versicolor'),
 (94,ARRAY[5.0,2.3,3.3,1.0],'Iris-versicolor'),
 (95,ARRAY[5.6,2.7,4.2,1.3],'Iris-versicolor'),
 (96,ARRAY[5.7,3.0,4.2,1.2],'Iris-versicolor'),
 (97,ARRAY[5.7,2.9,4.2,1.3],'Iris-versicolor'),
 (98,ARRAY[6.2,2.9,4.3,1.3],'Iris-versicolor'),
 (99,ARRAY[5.1,2.5,3.0,1.1],'Iris-versicolor'),
 (100,ARRAY[5.7,2.8,4.1,1.3],'Iris-versicolor'),
 (101,ARRAY[6.3,3.3,6.0,2.5],'Iris-virginica'),
 (102,ARRAY[5.8,2.7,5.1,1.9],'Iris-virginica'),
 (103,ARRAY[7.1,3.0,5.9,2.1],'Iris-virginica'),
 (104,ARRAY[6.3,2.9,5.6,1.8],'Iris-virginica'),
 (105,ARRAY[6.5,3.0,5.8,2.2],'Iris-virginica'),
 (106,ARRAY[7.6,3.0,6.6,2.1],'Iris-virginica'),
 (107,ARRAY[4.9,2.5,4.5,1.7],'Iris-virginica'),
 (108,ARRAY[7.3,2.9,6.3,1.8],'Iris-virginica'),
 (109,ARRAY[6.7,2.5,5.8,1.8],'Iris-virginica'),
 (110,ARRAY[7.2,3.6,6.1,2.5],'Iris-virginica'),
 (111,ARRAY[6.5,3.2,5.1,2.0],'Iris-virginica'),
 (112,ARRAY[6.4,2.7,5.3,1.9],'Iris-virginica'),
 (113,ARRAY[6.8,3.0,5.5,2.1],'Iris-virginica'),
 (114,ARRAY[5.7,2.5,5.0,2.0],'Iris-virginica'),
 (115,ARRAY[5.8,2.8,5.1,2.4],'Iris-virginica'),
 (116,ARRAY[6.4,3.2,5.3,2.3],'Iris-virginica'),
 (117,ARRAY[6.5,3.0,5.5,1.8],'Iris-virginica'),
 (118,ARRAY[7.7,3.8,6.7,2.2],'Iris-virginica'),
 (119,ARRAY[7.7,2.6,6.9,2.3],'Iris-virginica'),
 (120,ARRAY[6.0,2.2,5.0,1.5],'Iris-virginica'),
 (121,ARRAY[6.9,3.2,5.7,2.3],'Iris-virginica'),
 (122,ARRAY[5.6,2.8,4.9,2.0],'Iris-virginica'),
 (123,ARRAY[7.7,2.8,6.7,2.0],'Iris-virginica'),
 (124,ARRAY[6.3,2.7,4.9,1.8],'Iris-virginica'),
 (125,ARRAY[6.7,3.3,5.7,2.1],'Iris-virginica'),
 (126,ARRAY[7.2,3.2,6.0,1.8],'Iris-virginica'),
 (127,ARRAY[6.2,2.8,4.8,1.8],'Iris-virginica'),
 (128,ARRAY[6.1,3.0,4.9,1.8],'Iris-virginica'),
 (129,ARRAY[6.4,2.8,5.6,2.1],'Iris-virginica'),
 (130,ARRAY[7.2,3.0,5.8,1.6],'Iris-virginica'),
 (131,ARRAY[7.4,2.8,6.1,1.9],'Iris-virginica'),
 (132,ARRAY[7.9,3.8,6.4,2.0],'Iris-virginica'),
 (133,ARRAY[6.4,2.8,5.6,2.2],'Iris-virginica'),
 (134,ARRAY[6.3,2.8,5.1,1.5],'Iris-virginica'),
 (135,ARRAY[6.1,2.6,5.6,1.4],'Iris-virginica'),
 (136,ARRAY[7.7,3.0,6.1,2.3],'Iris-virginica'),
 (137,ARRAY[6.3,3.4,5.6,2.4],'Iris-virginica'),
 (138,ARRAY[6.4,3.1,5.5,1.8],'Iris-virginica'),
 (139,ARRAY[6.0,3.0,4.8,1.8],'Iris-virginica'),
 (140,ARRAY[6.9,3.1,5.4,2.1],'Iris-virginica'),
 (141,ARRAY[6.7,3.1,5.6,2.4],'Iris-virginica'),
 (142,ARRAY[6.9,3.1,5.1,2.3],'Iris-virginica'),
 (143,ARRAY[5.8,2.7,5.1,1.9],'Iris-virginica'),
 (144,ARRAY[6.8,3.2,5.9,2.3],'Iris-virginica'),
 (145,ARRAY[6.7,3.3,5.7,2.5],'Iris-virginica'),
 (146,ARRAY[6.7,3.0,5.2,2.3],'Iris-virginica'),
 (147,ARRAY[6.3,2.5,5.0,1.9],'Iris-virginica'),
 (148,ARRAY[6.5,3.0,5.2,2.0],'Iris-virginica'),
 (149,ARRAY[6.2,3.4,5.4,2.3],'Iris-virginica'),
 (150,ARRAY[5.9,3.0,5.1,1.8],'Iris-virginica');
 </pre>
 Create a test/validation dataset from the training data:
 <pre class="example">
 DROP TABLE IF EXISTS iris_train, iris_test;
 -- Set seed so results are reproducible
 SELECT setseed(0);
 SELECT madlib.train_test_split('iris_data',     -- Source table
                                'iris',          -- Output table root name
                                 0.8,            -- Train proportion
                                 NULL,           -- Test proportion (0.2)
                                 NULL,           -- Strata definition
                                 NULL,           -- Output all columns
                                 NULL,           -- Sample without replacement
                                 TRUE            -- Separate output tables
                               );
 SELECT COUNT(*) FROM iris_train;
 </pre>
 <pre class="result">
  count
 ------+
    120
 </pre>

 -# Call the preprocessor for deep learning.  For the training dataset:
 <pre class="example">
 \\x on
 DROP TABLE IF EXISTS iris_train_packed, iris_train_packed_summary;
 SELECT madlib.training_preprocessor_dl('iris_train',         -- Source table
                                        'iris_train_packed',  -- Output table
                                        'class_text',         -- Dependent variable
                                        'attributes'          -- Independent variable
                                         );
 SELECT * FROM iris_train_packed_summary;
 </pre>
 <pre class="result">
 -[ RECORD 1 ]-------+---------------------------------------------
 source_table        | iris_train
 output_table        | iris_train_packed
 dependent_varname   | class_text
 independent_varname | attributes
 dependent_vartype   | character varying
 class_values        | {Iris-setosa,Iris-versicolor,Iris-virginica}
 buffer_size         | 60
 normalizing_const   | 1.0
 num_classes         | 3
 </pre>
 For the validation dataset:
 <pre class="example">
 DROP TABLE IF EXISTS iris_test_packed, iris_test_packed_summary;
 SELECT madlib.validation_preprocessor_dl('iris_test',          -- Source table
                                          'iris_test_packed',   -- Output table
                                          'class_text',         -- Dependent variable
                                          'attributes',         -- Independent variable
                                          'iris_train_packed'   -- From training preprocessor step
                                           );
 SELECT * FROM iris_test_packed_summary;
 </pre>
 <pre class="result">
 -[ RECORD 1 ]-------+---------------------------------------------
 source_table        | iris_test
 output_table        | iris_test_packed
 dependent_varname   | class_text
 independent_varname | attributes
 dependent_vartype   | character varying
 class_values        | {Iris-setosa,Iris-versicolor,Iris-virginica}
 buffer_size         | 15
 normalizing_const   | 1.0
 num_classes         | 3
 </pre>

 -# Define and load model architecture.  Use Keras to define
 the model architecture with 1 hidden layer:
 <pre class="example">
 import keras
 from keras.models import Sequential
 from keras.layers import Dense
 model1 = Sequential()
 model1.add(Dense(10, activation='relu', input_shape=(4,)))
 model1.add(Dense(10, activation='relu'))
 model1.add(Dense(3, activation='softmax'))
 model1.summary()
 \verbatim

 _________________________________________________________________
 Layer (type)                 Output Shape              Param #
 =================================================================
 dense_1 (Dense)              (None, 10)                50
 _________________________________________________________________
 dense_2 (Dense)              (None, 10)                110
 _________________________________________________________________
 dense_3 (Dense)              (None, 3)                 33
 =================================================================
 Total params: 193
 Trainable params: 193
 Non-trainable params: 0
 \endverbatim
 </pre>
 Export the model to JSON:
 <pre class="example">
 model1.to_json()
 </pre>
 <pre class="result">
 '{"class_name": "Sequential", "keras_version": "2.1.6", "config": [{"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_1", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "dtype": "float32", "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "batch_input_shape": [null, 4], "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_2", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_3", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "softmax", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 3, "use_bias": true, "activity_regularizer": null}}], "backend": "tensorflow"}'
 </pre>
 Define model architecture with 2 hidden layers:
 <pre class="example">
 model2 = Sequential()
 model2.add(Dense(10, activation='relu', input_shape=(4,)))
 model2.add(Dense(10, activation='relu'))
 model2.add(Dense(10, activation='relu'))
 model2.add(Dense(3, activation='softmax'))
 model2.summary()
 \verbatim

 Layer (type)                 Output Shape              Param #
 =================================================================
 dense_4 (Dense)              (None, 10)                50
 _________________________________________________________________
 dense_5 (Dense)              (None, 10)                110
 _________________________________________________________________
 dense_6 (Dense)              (None, 10)                110
 _________________________________________________________________
 dense_7 (Dense)              (None, 3)                 33
 =================================================================
 Total params: 303
 Trainable params: 303
 Non-trainable params: 0
 \endverbatim
 </pre>
 Export the model to JSON:
 <pre class="example">
 model2.to_json()
 </pre>
 <pre class="result">
 '{"class_name": "Sequential", "keras_version": "2.1.6", "config": [{"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_4", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "dtype": "float32", "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "batch_input_shape": [null, 4], "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_5", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_6", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_7", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "softmax", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 3, "use_bias": true, "activity_regularizer": null}}], "backend": "tensorflow"}'
 </pre>
 Load into model architecture table:
 <pre class="example">
 DROP TABLE IF EXISTS model_arch_library;
 SELECT madlib.load_keras_model('model_arch_library',  -- Output table,
 $$
 {"class_name": "Sequential", "keras_version": "2.1.6", "config": [{"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_1", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "dtype": "float32", "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "batch_input_shape": [null, 4], "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_2", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_3", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "softmax", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 3, "use_bias": true, "activity_regularizer": null}}], "backend": "tensorflow"}
 $$
 ::json,         -- JSON blob
                                NULL,                  -- Weights
                                'Sophie',              -- Name
                                'MLP with 1 hidden layer'       -- Descr
 );
 SELECT madlib.load_keras_model('model_arch_library',  -- Output table,
 $$
 {"class_name": "Sequential", "keras_version": "2.1.6", "config": [{"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_4", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "dtype": "float32", "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "batch_input_shape": [null, 4], "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_5", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_6", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "relu", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 10, "use_bias": true, "activity_regularizer": null}}, {"class_name": "Dense", "config": {"kernel_initializer": {"class_name": "VarianceScaling", "config": {"distribution": "uniform", "scale": 1.0, "seed": null, "mode": "fan_avg"}}, "name": "dense_7", "kernel_constraint": null, "bias_regularizer": null, "bias_constraint": null, "activation": "softmax", "trainable": true, "kernel_regularizer": null, "bias_initializer": {"class_name": "Zeros", "config": {}}, "units": 3, "use_bias": true, "activity_regularizer": null}}], "backend": "tensorflow"}
 $$
 ::json,         -- JSON blob
                                NULL,                  -- Weights
                                'Maria',               -- Name
                                'MLP with 2 hidden layers'       -- Descr
 );
 </pre>
 -# Define model selection tuples and load.  Select the model(s) from the model architecture
 table that you want to run, along with the compile and fit parameters. Combinations will be
 created for the set of model selection parameters will be loaded:
 <pre class="example">
 DROP TABLE IF EXISTS mst_table, mst_table_summary;
 SELECT madlib.load_model_selection_table('model_arch_library', -- model architecture table
                                          'mst_table',          -- model selection table output
                                           ARRAY[1,2],              -- model ids from model architecture table
                                           ARRAY[                   -- compile params
                                               $$loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']$$,
                                               $$loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy']$$,
                                               $$loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy']$$
                                           ],
                                           ARRAY[                    -- fit params
                                               $$batch_size=4,epochs=1$$,
                                               $$batch_size=8,epochs=1$$
                                           ]
                                          );
 SELECT * FROM mst_table ORDER BY mst_key;
 </pre>
 <pre class="result">
  mst_key | model_id |                                 compile_params                                  |      fit_params
 ---------+----------+---------------------------------------------------------------------------------+-----------------------
        1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1
        2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1
        3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1
        4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1
        5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1
        6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1
        7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1
        8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1
        9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1
       10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1
       11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1
       12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1
 (12 rows)
 </pre>
 This is the name of the model architecture table that corresponds to the model selection table:
 <pre class="example">
 SELECT * FROM mst_table_summary;
 </pre>
 <pre class="result">
   model_arch_table
 --------------------+
  model_arch_library
 </pre>

 -# Train multiple models.
 <pre class="example">
 DROP TABLE IF EXISTS iris_multi_model, iris_multi_model_summary, iris_multi_model_info;
 SELECT madlib.madlib_keras_fit_multiple_model('iris_train_packed',    -- source_table
                                               'iris_multi_model',     -- model_output_table
                                               'mst_table',            -- model_selection_table
                                               10,                     -- num_iterations
                                               FALSE                   -- use gpus
                                              );
 </pre>
 View the model summary:
 <pre class="example">
 SELECT * FROM iris_multi_model_summary;
 </pre>
 <pre class="result">
 source_table              | iris_train_packed
 validation_table          |
 model                     | iris_multi_model
 model_info                | iris_multi_model_info
 dependent_varname         | class_text
 independent_varname       | attributes
 model_arch_table          | model_arch_library
 num_iterations            | 10
 metrics_compute_frequency | 10
 warm_start                | f
 name                      |
 description               |
 start_training_time       | 2019-12-16 18:54:33.826414
 end_training_time         | 2019-12-16 18:56:19.106321
 madlib_version            | 1.17.0
 num_classes               | 3
 class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
 dependent_vartype         | character varying
 normalizing_const         | 1
 metrics_iters             | {10}
 </pre>
 View results for each model:
 <pre class="example">
 SELECT * FROM iris_multi_model_info ORDER BY training_metrics_final DESC, training_loss_final;
 </pre>
 <pre class="result">
  mst_key | model_id |                                 compile_params                                  |      fit_params       |  model_type  |  model_size  | metrics_elapsed_time | metrics_type | training_metrics_final | training_loss_final |  training_metrics   |    training_loss    | validation_metrics_final | validation_loss_final | validation_metrics | validation_loss
 ---------+----------+---------------------------------------------------------------------------------+-----------------------+--------------+--------------+----------------------+--------------+------------------------+---------------------+---------------------+---------------------+--------------------------+-----------------------+--------------------+-----------------
        9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {119.42963886261}    | {accuracy}   |         0.983333349228 |       0.07286978513 | {0.983333349227905} | {0.072869785130024} |                          |                       |                    |
       10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {118.485460996628}   | {accuracy}   |         0.975000023842 |     0.0798489004374 | {0.975000023841858} | {0.079848900437355} |                          |                       |                    |
        4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {118.707404851913}   | {accuracy}   |         0.975000023842 |      0.143356323242 | {0.975000023841858} | {0.143356323242188} |                          |                       |                    |
       11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {118.224883794785}   | {accuracy}   |         0.958333313465 |      0.636615753174 | {0.958333313465118} | {0.636615753173828} |                          |                       |                    |
        2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {117.732690811157}   | {accuracy}   |         0.925000011921 |      0.161811202765 | {0.925000011920929} | {0.161811202764511} |                          |                       |                    |
        5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {120.357484817505}   | {accuracy}   |         0.833333313465 |        0.5542948246 | {0.833333313465118} | {0.55429482460022}  |                          |                       |                    |
        3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {118.928852796555}   | {accuracy}   |         0.824999988079 |      0.301002770662 | {0.824999988079071} | {0.301002770662308} |                          |                       |                    |
        6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {120.566634893417}   | {accuracy}   |         0.816666662693 |      0.875298440456 | {0.816666662693024} | {0.87529844045639}  |                          |                       |                    |
       12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {119.182703018188}   | {accuracy}   |         0.774999976158 |      0.785651266575 | {0.774999976158142} | {0.78565126657486}  |                          |                       |                    |
        1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {119.643137931824}   | {accuracy}   |         0.508333325386 |      0.762569189072 | {0.508333325386047} | {0.762569189071655} |                          |                       |                    |
        7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {120.15305685997}    | {accuracy}   |         0.333333343267 |       1.09794270992 | {0.333333343267441} | {1.09794270992279}  |                          |                       |                    |
        8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {119.911739826202}   | {accuracy}   |         0.333333343267 |       1.10344016552 | {0.333333343267441} | {1.10344016551971}  |                          |                       |                    |
 (12 rows)
 </pre>

 -# Evaluate.  Now run evaluate using models we built above:
 <pre class="example">
 DROP TABLE IF EXISTS iris_validate;
 SELECT madlib.madlib_keras_evaluate('iris_multi_model',  -- model
                                     'iris_test_packed',  -- test table
                                     'iris_validate',     -- output table
                                      NULL,               -- use gpus
                                      3                   -- mst_key to use
                                    );
 SELECT * FROM iris_validate;
 </pre>
 <pre class="result">
        loss        |      metric       | metrics_type
 -------------------+-------------------+--------------
  0.103803977370262 | 0.966666638851166 | {accuracy}
 </pre>

 -# Predict.  Now predict using one of the models we built. We will use the validation data set
 for prediction as well, which is not usual but serves to show the syntax.
 The prediction is in the estimated_class_text column:
 <pre class="example">
 DROP TABLE IF EXISTS iris_predict;
 SELECT madlib.madlib_keras_predict('iris_multi_model', -- model
                                    'iris_test',        -- test_table
                                    'id',               -- id column
                                    'attributes',       -- independent var
                                    'iris_predict',     -- output table
                                     'response',        -- prediction type
                                     FALSE,             -- use gpus
                                     3                  -- mst_key to use
                                    );
 SELECT * FROM iris_predict ORDER BY id;
 </pre>
 <pre class="result">
  id  | estimated_class_text
 -----+----------------------
    9 | Iris-setosa
   18 | Iris-setosa
   22 | Iris-setosa
   26 | Iris-setosa
   35 | Iris-setosa
   38 | Iris-setosa
   42 | Iris-setosa
   43 | Iris-setosa
   45 | Iris-setosa
   46 | Iris-setosa
   50 | Iris-setosa
   53 | Iris-versicolor
   60 | Iris-versicolor
   68 | Iris-versicolor
   77 | Iris-versicolor
   78 | Iris-versicolor
   79 | Iris-versicolor
   81 | Iris-versicolor
   82 | Iris-versicolor
   85 | Iris-virginica
   95 | Iris-versicolor
   97 | Iris-versicolor
   98 | Iris-versicolor
  113 | Iris-virginica
  117 | Iris-virginica
  118 | Iris-virginica
  127 | Iris-virginica
  136 | Iris-virginica
  143 | Iris-virginica
  145 | Iris-virginica
 (30 rows)
 </pre>
 Count missclassifications:
 <pre class="example">
 SELECT COUNT(*) FROM iris_predict JOIN iris_test USING (id)
 WHERE iris_predict.estimated_class_text != iris_test.class_text;
 </pre>
 <pre class="result">
  count
 -------+
      1
 </pre>
 Percent missclassifications:
 <pre class="example">
 SELECT round(count(*)*100/(150*0.2),2) as test_accuracy_percent from
     (select iris_test.class_text as actual, iris_predict.estimated_class_text as estimated
      from iris_predict inner join iris_test
      on iris_test.id=iris_predict.id) q
 WHERE q.actual=q.estimated;
 </pre>
 <pre class="result">
  test_accuracy_percent
 -----------------------+
                  96.67
 </pre>

 <h4>Classification with Other Parameters</h4>

 -# Validation dataset.  Now use a validation dataset
 and compute metrics every 3rd iteration using
 the 'metrics_compute_frequency' parameter. This can
 help reduce run time if you do not need metrics
 computed at every iteration.  Also turn on image caching.
 <pre class="example">
 DROP TABLE IF EXISTS iris_multi_model, iris_multi_model_summary, iris_multi_model_info;
 SELECT madlib.madlib_keras_fit_multiple_model('iris_train_packed',    -- source_table
                                               'iris_multi_model',     -- model_output_table
                                               'mst_table',            -- model_selection_table
                                                10,                     -- num_iterations
                                                FALSE,                 -- use gpus
                                               'iris_test_packed',     -- validation dataset
                                                3,                     -- metrics compute frequency
                                                FALSE,                 -- warm start
                                               'Sophie L.',            -- name
                                               'Model selection for iris dataset',  -- description
                                                TRUE                   -- use caching
                                              );
 </pre>
 View the model summary:
 <pre class="example">
 SELECT * FROM iris_multi_model_summary;
 </pre>
 <pre class="result">
 source_table              | iris_train_packed
 validation_table          | iris_test_packed
 model                     | iris_multi_model
 model_info                | iris_multi_model_info
 dependent_varname         | class_text
 independent_varname       | attributes
 model_arch_table          | model_arch_library
 num_iterations            | 10
 metrics_compute_frequency | 3
 warm_start                | f
 name                      | Sophie L.
 description               | Model selection for iris dataset
 start_training_time       | 2019-12-16 19:28:16.219137
 end_training_time         | 2019-12-16 19:30:19.238692
 madlib_version            | 1.17.0
 num_classes               | 3
 class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
 dependent_vartype         | character varying
 normalizing_const         | 1
 metrics_iters             | {3,6,9,10}
 </pre>
 View results for each model:
 <pre class="example">
 SELECT * FROM iris_multi_model_info ORDER BY training_metrics_final DESC, training_loss_final;
 </pre>
 <pre class="result">
  mst_key | model_id |                                 compile_params                                  |      fit_params       |  model_type  |  model_size  |                         metrics_elapsed_time                          | metrics_type | training_metrics_final | training_loss_final |                             training_metrics                              |                               training_loss                               | validation_metrics_final | validation_loss_final |                            validation_metrics                             |                              validation_loss
 ---------+----------+---------------------------------------------------------------------------------+-----------------------+--------------+--------------+-----------------------------------------------------------------------+--------------+------------------------+---------------------+---------------------------------------------------------------------------+---------------------------------------------------------------------------+--------------------------+-----------------------+---------------------------------------------------------------------------+---------------------------------------------------------------------------
        4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {37.0420558452606,78.2046208381653,116.242669820786,134.287139892578} | {accuracy}   |         0.975000023842 |      0.165132269263 | {0.75,0.958333313465118,0.958333313465118,0.975000023841858}              | {0.618549585342407,0.319452553987503,0.223872095346451,0.165132269263268} |           0.966666638851 |        0.213689729571 | {0.733333349227905,0.933333337306976,0.933333337306976,0.966666638851166} | {0.683791160583496,0.370491921901703,0.255890935659409,0.213689729571342}
        2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {36.3931469917297,77.5780539512634,115.430645942688,133.599857807159} | {accuracy}   |         0.966666638851 |      0.277698725462 | {0.591666638851166,0.966666638851166,0.666666686534882,0.966666638851166} | {0.634598553180695,0.334936827421188,0.615665555000305,0.27769872546196}  |           0.966666638851 |         0.34405490756 | {0.5,0.966666638851166,0.566666662693024,0.966666638851166}               | {0.643225967884064,0.41021603345871,0.805291295051575,0.344054907560349}
       10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {36.8482949733734,78.0155048370361,115.83317399025,134.079672813416}  | {accuracy}   |         0.958333313465 |      0.122385449708 | {0.883333325386047,0.941666662693024,0.858333349227905,0.958333313465118} | {0.291894346475601,0.146935686469078,0.270052850246429,0.122385449707508} |           0.933333337307 |        0.181496843696 | {0.766666650772095,0.866666674613953,0.899999976158142,0.933333337306976} | {0.395013928413391,0.245234906673431,0.301119148731232,0.181496843695641}
        3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {37.2318170070648,78.3925468921661,116.45490694046,134.491376876831}  | {accuracy}   |         0.941666662693 |      0.193545326591 | {0.966666638851166,0.941666662693024,0.941666662693024,0.941666662693024} | {0.39665362238884,0.213271111249924,0.190151125192642,0.193545326590538}  |           0.933333337307 |        0.151459023356 | {1,0.966666638851166,0.933333337306976,0.933333337306976}                 | {0.464315593242645,0.198051139712334,0.138570576906204,0.151459023356438}
        9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {37.6678929328918,78.820240020752,116.939878940582,134.959810972214}  | {accuracy}   |         0.925000011921 |      0.192344605923 | {0.824999988079071,0.774999976158142,0.966666638851166,0.925000011920929} | {0.434513121843338,0.326292037963867,0.131333693861961,0.192344605922699} |           0.899999976158 |        0.209528595209 | {0.800000011920929,0.766666650772095,0.966666638851166,0.899999976158142} | {0.52033931016922,0.344535797834396,0.170280396938324,0.209528595209122}
        8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {38.0689258575439,79.4995639324188,117.36315202713,135.380483865738}  | {accuracy}   |         0.866666674614 |      0.390509605408 | {0.691666662693024,0.691666662693024,0.633333325386047,0.866666674613953} | {0.490214675664902,0.444783747196198,0.627961099147797,0.390509605407715} |           0.933333337307 |        0.376114845276 | {0.566666662693024,0.566666662693024,0.533333361148834,0.933333337306976} | {0.575542628765106,0.54660427570343,0.785183191299438,0.376114845275879}
        5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {38.474328994751,79.9709329605103,117.766183853149,135.803887844086}  | {accuracy}   |         0.841666638851 |      0.576696753502 | {0.616666674613953,0.699999988079071,0.758333325386047,0.841666638851166} | {0.90448260307312,0.750164151191711,0.616493880748749,0.576696753501892}  |           0.899999976158 |        0.631914675236 | {0.666666686534882,0.699999988079071,0.733333349227905,0.899999976158142} | {0.871200919151306,0.780709445476532,0.665971457958221,0.631914675235748}
       11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {36.6214678287506,77.7987759113312,115.631717920303,133.83836388588}  | {accuracy}   |         0.758333325386 |      0.881635427475 | {0.308333337306976,0.316666662693024,0.75,0.758333325386047}              | {1.12997460365295,1.02749967575073,0.923768699169159,0.881635427474976}   |           0.766666650772 |        0.878168046474 | {0.433333337306976,0.433333337306976,0.766666650772095,0.766666650772095} | {1.07487094402313,0.974115014076233,0.916269063949585,0.878168046474457}
        7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {38.2849600315094,79.7524738311768,117.580325841904,135.606695890427} | {accuracy}   |         0.691666662693 |      0.444524824619 | {0.908333361148834,0.391666680574417,0.691666662693024,0.691666662693024} | {0.335082054138184,2.02327847480774,0.444351017475128,0.444524824619293}  |           0.566666662693 |        0.539750337601 | {0.800000011920929,0.266666680574417,0.566666662693024,0.566666662693024} | {0.433189332485199,2.3276960849762,0.534160375595093,0.539750337600708}
        6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {38.6593668460846,80.1789360046387,117.957875013351,135.995815992355} | {accuracy}   |         0.683333337307 |      0.841839790344 | {0.316666662693024,0.366666674613953,0.666666686534882,0.683333337306976} | {1.07646071910858,0.963329672813416,0.87216705083847,0.841839790344238}   |           0.666666686535 |        0.840192914009 | {0.433333337306976,0.533333361148834,0.666666686534882,0.666666686534882} | {1.02845978736877,0.941896677017212,0.861787617206573,0.840192914009094}
        1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {37.8553328514099,79.2480089664459,117.139881849289,135.155915975571} | {accuracy}   |         0.358333319426 |       1.11013436317 | {0.358333319425583,0.333333343267441,0.333333343267441,0.358333319425583} | {1.10554325580597,1.11694586277008,1.09756696224213,1.11013436317444}     |           0.233333334327 |         1.17629003525 | {0.233333334326744,0.333333343267441,0.333333343267441,0.233333334326744} | {1.16081762313843,1.14324629306793,1.11625325679779,1.1762900352478}
       12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {37.4500079154968,78.6058378219604,116.700626850128,134.72905087471}  | {accuracy}   |         0.308333337307 |       1.06241953373 | {0.150000005960464,0.333333343267441,0.333333343267441,0.308333337306976} | {1.13338851928711,1.09694564342499,1.07030868530273,1.06241953372955}     |           0.433333337307 |         1.03659796715 | {0.16666667163372,0.333333343267441,0.433333337306976,0.433333337306976}  | {1.06262242794037,1.07252764701843,1.05843663215637,1.03659796714783}
 (12 rows)
 </pre>

 -# Predict probabilities for each class:
 <pre class="example">
 DROP TABLE IF EXISTS iris_predict;
 SELECT madlib.madlib_keras_predict('iris_multi_model', -- model
                                    'iris_test',        -- test_table
                                    'id',               -- id column
                                    'attributes',       -- independent var
                                    'iris_predict',     -- output table
                                     'prob',            -- prediction type
                                     FALSE,             -- use gpus
                                     3                  -- mst_key to use
                                    );
 SELECT * FROM iris_predict ORDER BY id;
 </pre>
 <pre class="result">
  id  | prob_Iris-setosa | prob_Iris-versicolor | prob_Iris-virginica
 -----+------------------+----------------------+---------------------
    9 |       0.99931216 |        0.00068789057 |       6.2587335e-10
   18 |       0.99984336 |        0.00015656587 |        7.969957e-12
   22 |        0.9998497 |        0.00015029701 |       6.4133347e-12
   26 |        0.9995004 |        0.00049964694 |       2.2795305e-10
   35 |       0.99964666 |        0.00035332117 |       9.4490485e-11
   38 |       0.99964666 |        0.00035332117 |       9.4490485e-11
   42 |        0.9985154 |         0.0014845316 |        5.293262e-09
   43 |       0.99964476 |         0.0003552362 |        9.701174e-11
   45 |        0.9997311 |        0.00026883607 |        3.076166e-11
   46 |        0.9995486 |        0.00045140853 |       1.6814435e-10
   50 |        0.9997856 |        0.00021441824 |       2.1316622e-11
   53 |     9.837335e-06 |           0.97109175 |         0.028898403
   60 |    0.00014028326 |           0.96552837 |         0.034331344
   68 |    0.00087942625 |            0.9883348 |         0.010785843
   77 |      6.08114e-06 |           0.94356424 |         0.056429718
   78 |     7.116364e-07 |            0.8596206 |          0.14037873
   79 |    1.3918722e-05 |           0.94052655 |          0.05945957
   81 |    0.00045687397 |            0.9794796 |         0.020063542
   82 |     0.0015463434 |           0.98768973 |         0.010763981
   85 |    1.0929693e-05 |           0.87866926 |         0.121319845
   95 |    6.3600986e-05 |           0.95264935 |         0.047287125
   97 |    0.00020298029 |             0.981617 |         0.018180028
   98 |    0.00019721613 |           0.98902065 |          0.01078211
  113 |    1.0388683e-09 |           0.23626474 |           0.7637353
  117 |     4.598902e-09 |           0.25669694 |           0.7433031
  118 |    3.7139156e-11 |           0.13193987 |           0.8680601
  127 |    2.1297862e-07 |             0.670349 |          0.32965073
  136 |    7.1760774e-12 |           0.07074605 |            0.929254
  143 |    1.2568385e-09 |          0.113820426 |           0.8861796
  145 |      6.17019e-11 |          0.117578305 |          0.88242173
 (30 rows)
 </pre>

 -# Warm start.  Next, use the warm_start parameter
 to continue learning, using the coefficients from
 the run above. Note that we don't drop the
 model table or model summary table:
 <pre class="example">
 SELECT madlib.madlib_keras_fit_multiple_model('iris_train_packed',    -- source_table
                                               'iris_multi_model',     -- model_output_table
                                               'mst_table',            -- model_selection_table
                                                3,                     -- num_iterations
                                                FALSE,                 -- use gpus
                                               'iris_test_packed',     -- validation dataset
                                                1,                     -- metrics compute frequency
                                                TRUE,                  -- warm start
                                               'Sophie L.',            -- name
                                               'Simple MLP for iris dataset',  -- description
                                                TRUE                   -- use caching
                                              );
 SELECT * FROM iris_multi_model_summary;
 </pre>
 <pre class="result">
 source_table              | iris_train_packed
 validation_table          | iris_test_packed
 model                     | iris_multi_model
 model_info                | iris_multi_model_info
 dependent_varname         | class_text
 independent_varname       | attributes
 model_arch_table          | model_arch_library
 num_iterations            | 3
 metrics_compute_frequency | 1
 warm_start                | t
 name                      | Sophie L.
 description               | Simple MLP for iris dataset
 start_training_time       | 2019-12-16 20:07:41.488587
 end_training_time         | 2019-12-16 20:08:27.20651
 madlib_version            | 1.17.0
 num_classes               | 3
 class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
 dependent_vartype         | character varying
 normalizing_const         | 1
 metrics_iters             | {1,2,3}
 </pre>
 View results for each model:
 <pre class="example">
 SELECT * FROM iris_multi_model_info ORDER BY training_metrics_final DESC, training_loss_final;
 </pre>
 <pre class="result">
  mst_key | model_id |                                 compile_params                                  |      fit_params       |  model_type  |  model_size  |                 metrics_elapsed_time                 | metrics_type | training_metrics_final | training_loss_final |                    training_metrics                     |                      training_loss                       | validation_metrics_final | validation_loss_final |                   validation_metrics                    |                     validation_loss
 ---------+----------+---------------------------------------------------------------------------------+-----------------------+--------------+--------------+------------------------------------------------------+--------------+------------------------+---------------------+---------------------------------------------------------+----------------------------------------------------------+--------------------------+-----------------------+---------------------------------------------------------+---------------------------------------------------------
        5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {19.451014995575,37.2563629150391,54.7182998657227}  | {accuracy}   |         0.975000023842 |      0.490673750639 | {0.958333313465118,0.691666662693024,0.975000023841858} | {0.541427075862885,0.517450392246246,0.490673750638962}  |           0.933333337307 |        0.557333409786 | {0.933333337306976,0.666666686534882,0.933333337306976} | {0.60710871219635,0.570206344127655,0.557333409786224}
        9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {18.2973220348358,36.3793680667877,54.0178129673004} | {accuracy}   |         0.966666638851 |     0.0894369781017 | {0.966666638851166,0.966666638851166,0.966666638851166} | {0.133233144879341,0.111788973212242,0.0894369781017303} |           0.899999976158 |        0.195293620229 | {0.933333337306976,0.966666638851166,0.899999976158142} | {0.156044512987137,0.132803827524185,0.195293620228767}
        4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {17.6080539226532,35.6788699626923,53.3836889266968} | {accuracy}   |         0.966666638851 |      0.147051945329 | {0.908333361148834,0.958333313465118,0.966666638851166} | {0.225205257534981,0.168186634778976,0.147051945328712}  |           0.866666674614 |        0.250319689512 | {0.899999976158142,0.933333337306976,0.866666674613953} | {0.23467344045639,0.182851999998093,0.250319689512253}
        8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {18.7529940605164,36.8255958557129,54.3704080581665} | {accuracy}   |         0.966666638851 |      0.244641214609 | {0.691666662693024,0.891666650772095,0.966666638851166} | {0.939713299274445,0.462556451559067,0.244641214609146}  |           0.966666638851 |        0.298279434443 | {0.566666662693024,0.966666638851166,0.966666638851166} | {1.30671143531799,0.412235885858536,0.29827943444252}
       10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {17.4004180431366,35.4556438922882,53.1877279281616} | {accuracy}   |         0.958333313465 |      0.123381219804 | {0.949999988079071,0.766666650772095,0.958333313465118} | {0.0919980704784393,0.576169073581696,0.123381219804287} |           0.933333337307 |        0.203262642026 | {0.866666674613953,0.766666650772095,0.933333337306976} | {0.199721112847328,0.959742486476898,0.203262642025948}
        3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {17.81547498703,35.8978669643402,53.5737180709839}   | {accuracy}   |         0.933333337307 |      0.150664463639 | {0.941666662693024,0.925000011920929,0.933333337306976} | {0.117781177163124,0.163000836968422,0.150664463639259}  |           0.833333313465 |        0.365329563618 | {0.866666674613953,0.833333313465118,0.833333313465118} | {0.249404579401016,0.375173389911652,0.365329563617706}
        6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {19.686233997345,37.4543249607086,54.8708770275116}  | {accuracy}   |         0.858333349228 |      0.743227303028 | {0.675000011920929,0.708333313465118,0.858333349227905} | {0.808507084846497,0.774080872535706,0.743227303028107}  |           0.966666638851 |        0.770158529282 | {0.666666686534882,0.666666686534882,0.966666638851166} | {0.808504283428192,0.791898012161255,0.770158529281616}
       11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {17.1583528518677,35.0312390327454,52.96133685112}   | {accuracy}   |         0.816666662693 |      0.739802956581 | {0.774999976158142,0.816666662693024,0.816666662693024} | {0.83727890253067,0.792884111404419,0.739802956581116}   |           0.800000011921 |        0.758302807808 | {0.766666650772095,0.800000011920929,0.800000011920929} | {0.837629973888397,0.801746726036072,0.758302807807922}
        2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {16.9146749973297,34.794900894165,52.7328250408173}  | {accuracy}   |         0.808333337307 |      0.303489625454 | {0.683333337306976,0.966666638851166,0.808333337306976} | {1.05107569694519,0.189959138631821,0.303489625453949}   |           0.866666674614 |        0.285375326872 | {0.666666686534882,0.966666638851166,0.866666674613953} | {1.01942157745361,0.238933652639389,0.285375326871872}
       12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {18.0590150356293,36.1394078731537,53.7930529117584} | {accuracy}   |         0.699999988079 |       1.02253305912 | {0.550000011920929,0.691666662693024,0.699999988079071} | {1.0493084192276,1.03803598880768,1.02253305912018}      |           0.666666686535 |         1.02013540268 | {0.633333325386047,0.600000023841858,0.666666686534882} | {1.03952574729919,1.03439521789551,1.02013540267944}
        7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {19.2141709327698,37.0566499233246,54.5629329681396} | {accuracy}   |         0.691666662693 |      0.448221176863 | {0.691666662693024,0.691666662693024,0.691666662693024} | {0.447027385234833,0.444605946540833,0.448221176862717}  |           0.566666662693 |        0.555035352707 | {0.566666662693024,0.566666662693024,0.566666662693024} | {0.551217257976532,0.540408432483673,0.555035352706909}
        1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {18.501914024353,36.5938439369202,54.194118976593}   | {accuracy}   |         0.358333319426 |       1.09730923176 | {0.333333343267441,0.333333343267441,0.358333319425583} | {1.09999334812164,1.10405397415161,1.09730923175812}     |           0.233333334327 |         1.12532019615 | {0.333333343267441,0.333333343267441,0.233333334326744} | {1.12446486949921,1.13782525062561,1.12532019615173}
 (12 rows)
 </pre>
 Note that the loss and accuracy values pick up from where the previous run left off.

 @anchor notes
 @par Notes

 1. Refer to the deep learning section of the Apache MADlib
 wiki [6] for important information including supported libraries
 and versions.

 2. Classification is currently supported, not regression.

 3. Reminder about the distinction between warm start and transfer learning.  Warm start uses model
 state (weights) from the model output table from a previous training run -
 set the 'warm_start' parameter to TRUE in the fit function.
 Transfer learning uses initial model state (weights) stored in the 'model_arch_table' - in this case set the
 'warm_start' parameter to FALSE in the fit function.

 4. Here are some more details on how warm start works.  These details are mostly applicable when implementing autoML algorithms on top of MADlib's model selection.  In short, the 'model_selection_table' dictates which models get trained and output to the 'model_output_table' and associated summary and info tables.  When 'warm_start' is TRUE, models are built for each 'mst_key' in the 'model_selection_table'.  If there are prior runs for an 'mst_key' then the weights from that run will be used.  If there are no prior runs for an 'mst_key' then random initialization will be used.  For example, let's say we start with 'mst_keys' of 1, 2, 3, and 4 in the 'model_selection_table'. We run fit once to get model and info tables for 1, 2, 3, and 4.  Then we modify the 'model_selection_table' as part of an autoML scheme, in which we remove the 'mst_key' for 1 and add a new 'mst_key' for 5.  Next we run fit with warm start.  The result will be models created for 'mst_keys' of 2, 3, 4, and 5.  Warm start will be used for 2, 3, and 4 (using prior run) and random initialization will be used for 5 (no prior run).  The 'mst_key' of 1 will be dropped.

 5. The 'num_iterations' parameter and the Keras fit parameter 'epochs' can substantially affect accuracy and run-time.
 In general, increasing the number of 'epochs' for a fixed 'num_iterations' will speed up training, but may result
 in lower accuracy.  It's best to do some experimentation to find out what works for your models and dataset.

 @anchor background
 @par Technical Background

 For an introduction to deep learning foundations, including MLP and CNN,
 refer to [7].

 This module trains many models a time across the database cluster in order
 to explore network architectures and hyperparameters.  It uses model hopper
 parallelism (MOP) and has high convergence efficiency since it does not do
 model averaging [2].

 On the effect of database cluster size: as the database cluster size increases,
 it will be faster to train a set of models, as long as you have at
 least as many model selection tuples as segments. This is because model state is "hopped" from
 segment to segment and training takes place in parallel [1,2]. If you have fewer model
 selection tuples to train than segments, then some
 segments may not be busy 100% of the time so speedup will not necessarily increase
 on a larger cluster. Inference (predict) is an embarrassingly parallel operation so
 inference runtimes will be proportionally faster as the number of segments increases.

 @anchor literature
 @literature

 @anchor mlp-lit-1
 [1] "Cerebro: Efficient and Reproducible Model Selection on Deep Learning Systems,"
 Supun Nakandala, Yuhao Zhang, and Arun Kumar, ACM SIGMOD 2019 DEEM Workshop,
 https://adalabucsd.github.io/papers/2019_Cerebro_DEEM.pdf

 [2] "Cerebro: A Data System for Optimized Deep Learning Model Selection,"
 Supun Nakandala, Yuhao Zhang, and Arun Kumar, Proceedings of the VLDB Endowment (2020), Vol. 13, No. 11
 https://adalabucsd.github.io/papers/2020_Cerebro_VLDB.pdf

 [3] https://keras.io/

 [4] https://www.tensorflow.org/

 [5] "Neural Networks for Machine Learning", Lectures 6a and 6b on mini-batch gradient descent,
 Geoffrey Hinton with Nitish Srivastava and Kevin Swersky,
 http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf

 [6] Deep learning section of Apache MADlib wiki https://cwiki.apache.org/confluence/display/MADLIB/Deep+Learning

 [7] Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT Press, 2016.

 [8] Greenplum Database server configuration parameters https://gpdb.docs.pivotal.io/latest/ref_guide/config_params/guc-list.html

 @anchor related
 @par Related Topics

 File madlib_keras_fit_multiple_model.sql_in documents training, evaluate and predict functions.

 */

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.madlib_keras_fit_multiple_model(
     source_table            VARCHAR,
     model_output_table      VARCHAR,
     model_selection_table   VARCHAR,
     num_iterations          INTEGER,
     use_gpus                BOOLEAN,
     validation_table        VARCHAR,
     metrics_compute_frequency  INTEGER,
     warm_start              BOOLEAN,
     name                    VARCHAR,
     description             VARCHAR,
     use_caching             BOOLEAN DEFAULT FALSE
 ) RETURNS VOID AS $$
     PythonFunctionBodyOnly(`deep_learning', `madlib_keras_fit_multiple_model')
     from utilities.control import SetGUC
     with AOControl(False):
         with SetGUC("plan_cache_mode", "force_generic_plan"):
             fit_obj = madlib_keras_fit_multiple_model.FitMultipleModel(**globals())
 $$ LANGUAGE plpythonu VOLATILE
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.madlib_keras_fit_multiple_model(
     source_table            VARCHAR,
     model_output_table      VARCHAR,
     model_selection_table   VARCHAR,
     num_iterations          INTEGER,
     use_gpus                BOOLEAN,
     validation_table        VARCHAR,
     metrics_compute_frequency  INTEGER,
     warm_start              BOOLEAN,
     name                    VARCHAR
 ) RETURNS VOID AS $$
     SELECT MADLIB_SCHEMA.madlib_keras_fit_multiple_model($1, $2, $3, $4, $5, $6, $7, $8, $9, NULL);
 $$ LANGUAGE sql VOLATILE
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA');

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.madlib_keras_fit_multiple_model(
     source_table            VARCHAR,
     model_output_table      VARCHAR,
     model_selection_table   VARCHAR,
     num_iterations          INTEGER,
     use_gpus                BOOLEAN,
     validation_table        VARCHAR,
     metrics_compute_frequency  INTEGER,
     warm_start              BOOLEAN
 ) RETURNS VOID AS $$
     SELECT MADLIB_SCHEMA.madlib_keras_fit_multiple_model($1, $2, $3, $4, $5, $6, $7, $8, NULL, NULL);
 $$ LANGUAGE sql VOLATILE
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA');

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.madlib_keras_fit_multiple_model(
     source_table            VARCHAR,
     model_output_table      VARCHAR,
     model_selection_table   VARCHAR,
     num_iterations          INTEGER,
     use_gpus                BOOLEAN,
     validation_table        VARCHAR,
     metrics_compute_frequency  INTEGER
 ) RETURNS VOID AS $$
     SELECT MADLIB_SCHEMA.madlib_keras_fit_multiple_model($1, $2, $3, $4, $5, $6, $7, FALSE, NULL, NULL);
 $$ LANGUAGE sql VOLATILE
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA');

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.madlib_keras_fit_multiple_model(
     source_table            VARCHAR,
     model_output_table      VARCHAR,
     model_selection_table   VARCHAR,
     num_iterations          INTEGER,
     use_gpus                BOOLEAN,
     validation_table        VARCHAR
 ) RETURNS VOID AS $$
     SELECT MADLIB_SCHEMA.madlib_keras_fit_multiple_model($1, $2, $3, $4, $5, $6, NULL, FALSE, NULL, NULL);
 $$ LANGUAGE sql VOLATILE
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA');

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.madlib_keras_fit_multiple_model(
     source_table            VARCHAR,
     model_output_table      VARCHAR,
     model_selection_table   VARCHAR,
     num_iterations          INTEGER,
     use_gpus                BOOLEAN
 ) RETURNS VOID AS $$
     SELECT MADLIB_SCHEMA.madlib_keras_fit_multiple_model($1, $2, $3, $4, $5, NULL, NULL, FALSE, NULL, NULL);
 $$ LANGUAGE sql VOLATILE
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA');

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.fit_transition_multiple_model(
     state                      BYTEA,
     dependent_var              BYTEA,
     independent_var            BYTEA,
     dependent_var_shape        INTEGER[],
     independent_var_shape      INTEGER[],
     model_architecture         TEXT,
     compile_params             TEXT,
     fit_params                 TEXT,
     dist_key                   INTEGER,
     dist_key_mapping           INTEGER[],
     current_seg_id             INTEGER,
     segments_per_host          INTEGER,
     images_per_seg             INTEGER[],
     use_gpus                   BOOLEAN,
     accessible_gpus_for_seg    INTEGER[],
     prev_serialized_weights    BYTEA,
     is_final_training_call     BOOLEAN,
     use_caching                BOOLEAN,
     custom_function_map        BYTEA
 ) RETURNS BYTEA AS $$
 PythonFunctionBodyOnlyNoSchema(`deep_learning', `madlib_keras')
     if use_caching:
         return madlib_keras.fit_multiple_transition_caching(**globals())
     else:
         return madlib_keras.fit_transition(is_final_iteration = True, is_multiple_model = True, **globals())
 $$ LANGUAGE plpythonu
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `NO SQL', `');

 DROP AGGREGATE IF EXISTS MADLIB_SCHEMA.fit_step_multiple_model(
     BYTEA,
     BYTEA,
     INTEGER[],
     INTEGER[],
     TEXT,
     TEXT,
     TEXT,
     INTEGER,
     INTEGER[],
     INTEGER,
     INTEGER,
     INTEGER[],
     BOOLEAN,
     INTEGER[],
     BYTEA,
     BOOLEAN,
     BOOLEAN,
     BYTEA);
 CREATE AGGREGATE MADLIB_SCHEMA.fit_step_multiple_model(
     /* dependent_var */              BYTEA,
     /* independent_var */            BYTEA,
     /* dependent_var_shape */        INTEGER[],
     /* independent_var_shape */      INTEGER[],
     /* model_architecture */         TEXT,
     /* compile_params */             TEXT,
     /* fit_params */                 TEXT,
     /* dist_key */                   INTEGER,
     /* dist_key_mapping */           INTEGER[],
     /* current_seg_id */             INTEGER,
     /* segments_per_host */          INTEGER,
     /* images_per_seg */             INTEGER[],
     /* use_gpus */                   BOOLEAN,
     /* accessible_gpus_for_seg */    INTEGER[],
     /* prev_serialized_weights */    BYTEA,
     /* is_final_training_call */     BOOLEAN,
     /* use_caching */                BOOLEAN,
     /* custom_function_obj_map */    BYTEA
 )(
     STYPE=BYTEA,
     SFUNC=MADLIB_SCHEMA.fit_transition_multiple_model
 );