Docs: Various documentation changes

Added missing DL docs on new features
Rearranged the DL tab on the docs
Added DBSCAN clarification

Co-authored-by: Frank McQuillan <fmcquillan@pivotal.io>
diff --git a/doc/mainpage.dox.in b/doc/mainpage.dox.in
index f7f6889..ba60162 100644
--- a/doc/mainpage.dox.in
+++ b/doc/mainpage.dox.in
@@ -17,8 +17,7 @@
     <a href="../v1.16/index.html">v1.16</a>,
     <a href="../v1.15.1/index.html">v1.15.1</a>,
     <a href="../v1.15/index.html">v1.15</a>,
-    <a href="../v1.14/index.html">v1.14</a>,
-    <a href="../v1.13/index.html">v1.13</a>
+    <a href="../v1.14/index.html">v1.14</a>
 </li>
 </ul>
 
@@ -113,6 +112,79 @@
         @defgroup grp_stemmer Stemming
         @ingroup grp_datatrans
 
+@defgroup grp_dl Deep Learning
+@brief A collection of modules for deep learning.
+@details
+There are two main steps to running deep learning workloads with MADlib:
+1. <b>Preparation</b>, which includes data preprocessing and model definition.
+Data preprocessing is required to format training data for use by frameworks
+like Keras and TensorFlow that support mini-batching as an optimization option.
+Model definition involves describing model architectures (and optionally
+custom functions) and loading them into tables.
+2. <b>Model training</b>, either one model at a time or multiple models in parallel.
+In the latter case, you will need to define the configurations for the multiple
+models that you want to train; this can be done manually or in an automated way
+using autoML methods. The trained models can then be used for evaluation and inference.
+
+This flowchart shows the workflow in more detail:
+
+\dot
+    digraph {
+    node [fontname=Helvetica, fontsize=10, color=green];
+    subgraph cluster_0 {
+        label="1. Model Preparation" fontname=Helvetica color=blue;
+        b [ label="Preprocess data"];
+        c [ label="Define model architectures"];
+        d [ label="Define custom functions (optional)"];
+        b -> c -> d;
+    }
+
+    subgraph cluster_1 {
+        label="2a. Train Single Model" fontname=Helvetica color=blue;
+        e [ label="Fit"];
+        f [ label="Inference"];
+        d -> e -> f;
+    }
+
+    subgraph cluster_2 {
+        label="2b. Train Multiple Models" fontname=Helvetica color=blue;
+        g [ label="Define model configurations"];
+        h [ label="Fit Multiple"];
+        i [ label="Inference"];
+        j [ label="AutoML"];
+        d -> g -> h -> i;
+        d -> j -> i;
+    } 
+}
+ \enddot
+
+@{
+
+    @defgroup grp_model_prep Model Preparation
+    @brief Prepare models and data for deep learning.
+    @details Prepare models and data for deep learning.
+    @{
+        @defgroup grp_input_preprocessor_dl Preprocess Data
+        @defgroup grp_keras_model_arch Define Model Architectures
+        @defgroup grp_custom_function Define Custom Functions
+    @}
+    @defgroup grp_keras Train Single Model
+    @defgroup grp_model_selection Train Multiple Models
+    @brief Train multiple deep learning models at the same time for model architecture search and hyperparameter selection.
+    @details Train multiple deep learning models at the same time for model architecture search and hyperparameter selection.
+    @{
+        @defgroup grp_keras_setup_model_selection Define Model Configurations
+        @defgroup grp_keras_run_model_selection Train Model Configurations
+        @defgroup grp_automl AutoML
+    @}
+    @defgroup grp_dl_utilities Utilities for Deep Learning
+    @brief Utilities specific to deep learning workflows.
+    @details Utilities specific to deep learning workflows.
+    @{
+        @defgroup grp_gpu_configuration Show GPU Configuration
+    @}
+@}
+
 @defgroup grp_graph Graph
 @brief Graph algorithms and measures associated with graphs.
 @details Graph algorithms and measures associated with graphs.
@@ -241,7 +313,6 @@
     @brief Methods for clustering data.
     @details Methods for clustering data.
     @{
-        @defgroup grp_dbscan DBSCAN
         @defgroup grp_kmeans k-Means Clustering
     @}
 
@@ -287,24 +358,7 @@
 Interface and implementation are subject to change.
 @{
     @defgroup grp_cg Conjugate Gradient
-    @defgroup grp_dl Deep Learning
-    @brief A collection of modules for deep learning.
-    @details A collection of modules for deep learning.
-    @{
-        @defgroup grp_gpu_configuration GPU Configuration
-        @defgroup grp_keras Keras
-        @defgroup grp_custom_function Load Custom Functions
-        @defgroup grp_keras_model_arch Load Models
-        @defgroup grp_model_selection Model Selection for DL
-        @brief Train multiple deep learning models at the same time for model architecture search and hyperparameter selection.
-        @details Train multiple deep learning models at the same time for model architecture search and hyperparameter selection.
-        @{
-            @defgroup grp_automl AutoML
-            @defgroup grp_keras_setup_model_selection Generate Model Configurations
-            @defgroup grp_keras_run_model_selection Run Model Selection
-        @}
-        @defgroup grp_input_preprocessor_dl Preprocessor for Images
-    @}
+    @defgroup grp_dbscan DBSCAN
     @defgroup grp_bayes Naive Bayes Classification
     @defgroup grp_sample Random Sampling
 @}
diff --git a/src/ports/postgres/modules/dbscan/dbscan.sql_in b/src/ports/postgres/modules/dbscan/dbscan.sql_in
index 01d0a55..f4539b2 100644
--- a/src/ports/postgres/modules/dbscan/dbscan.sql_in
+++ b/src/ports/postgres/modules/dbscan/dbscan.sql_in
@@ -34,16 +34,18 @@
 <li class="level1"><a href="#assignment">Cluster Assignment</a></li>
 <li class="level1"><a href="#examples">Examples</a></li>
 <li class="level1"><a href="#literature">Literature</a></li>
-<li class="level1"><a href="#related">Related Topics</a></li>
 </ul>
 </div>
 
 @brief Partitions a set of observations into clusters of arbitrary
 shape based on the density of nearby neighbors.
 
+\warning <em> This MADlib method is still in early stage development.
+Interface and implementation are subject to change. </em>
+
 Density-based spatial clustering of applications with noise (DBSCAN)
 is a data clustering algorithm designed to discover clusters of arbitrary
-shape [1]. It places minimum requirements on domain knowledge to determine
+shape [1,2]. It places minimum requirements on domain knowledge to determine
 input parameters and has good efficiency on large databases.
 
 Given a set of points, DBSCAN groups together points that are closely packed with many
@@ -53,6 +55,10 @@
 This method tends to be good for data which contains clusters
 of similar density.
 
+Currently only a brute force approach is implemented, which is
+suitable for small datasets. Other approaches for larger datasets
+will be implemented in a future release.
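
The brute force approach described above can be sketched in pure Python as follows. This is a simplified illustration of the algorithm from [1], not MADlib's actual implementation; the function name and the use of Euclidean distance here are assumptions for the sketch:

```python
import math

def dbscan(points, eps, min_samples):
    """Label each point with a cluster id, or -1 for noise.

    Brute force: for every point, scan all other points to find its
    eps-neighborhood, so the cost is O(n^2) distance computations.
    """
    n = len(points)
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # Neighborhoods include the point itself, as in the original paper.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n          # None = unvisited, -1 = noise
    cluster_id = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1       # not a core point; may be claimed later
            continue
        cluster_id += 1          # start a new cluster from core point i
        labels[i] = cluster_id
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster_id   # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            if len(neighbors[j]) >= min_samples:
                frontier.extend(neighbors[j])  # expand through core points
    return labels

pts = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (25, 25)]
print(dbscan(pts, eps=2.0, min_samples=3))
# The three points near (1,1) and the three near (8,8) form two clusters;
# the isolated point at (25,25) is labeled -1 (noise).
```

Because every point is compared to every other point, runtime grows quadratically with the dataset size, which is why this approach is recommended for small datasets only.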
+
 @anchor cluster
 @par Clustering Function
 
@@ -64,8 +70,7 @@
         eps,
         min_samples,
         metric,
-        algorithm,
-        algorithm_params
+        algorithm
       )
 </pre>
 
@@ -125,26 +130,15 @@
 <li><b>\ref dist_tanimoto</b>: tanimoto (element-wise mean of normalized points)</li>
 </dd>
 
-<dt>algorithm TBD??? (optional)</dt>
+<dt>algorithm (optional)</dt>
 <dd>TEXT, default: 'brute_force'. The name of the algorithm
-used to compute clusters. The following options are supported:
+used to compute clusters. Currently only brute force is supported:
 <ul>
 <li><b>\ref brute_force</b>: Produces an exact result by searching
 all points in the search space.  Brute force can be slow and is intended
 to be used for small datasets only.  You can also use a short
 form "b" or "brute" etc. to select brute force.</li>
-<li><b>\ref kd_tree</b>: Uses a tree structure to reduce the amount
-of search required to form clusters
-The depth of the kd-tree to search is specified
-by the "algorithm_params" below.
-You can also use a short
-form "k" or "kd" etc. to select kd-tree.</li></ul></dd>
-
-<dt>algorithm_params (optional)</dt>
-<dd>TEXT, default: 'depth=3'. This parameters apply to the
-kd-tree algorithm only.  Increasing the depth of the tree will
-decrease the run-time but reduce the accuracy. TBD???</dd>
-
+</ul></dd>
 </dl>
 
 <b>Output</b>
@@ -152,7 +146,7 @@
 The output table for DBSCAN module has the following columns:
 <table class="output">
     <tr>
-        <th>id_in</th>
+        <th>id_column</th>
         <td>INTEGER. Test data point id.</td>
     </tr>
     <tr>
@@ -167,7 +161,7 @@
         table.</td>
     </tr>
     <tr>
-        <th>points</th>
+        <th>__points__</th>
         <td>TEXT. Column or expression for the data point.</td>
     </tr>
 </table>
@@ -196,7 +190,6 @@
 <dd>TEXT. Name of the table containing the input data points.
 </dd>
 
-
 <dt>id_column</dt>
 <dd>TEXT. Name of the column containing a unique integer id for each training point.
 </dd>
@@ -209,8 +202,9 @@
 <dt>output_table</dt>
 <dd>TEXT. Name of the table containing the clustering results.
 </dd>
+</dl>
 
-<b>Output TBD</b>
+<b>Output</b>
 <br>
 The output is a table with the following columns:
 <table class="output">
@@ -228,6 +222,7 @@
     </tr>
 </table>
 
+
 @anchor examples
 @par Examples
 
@@ -331,10 +326,8 @@
 -----------+------+------------
  pid       | 1.75 | dist_norm2
 </pre>
--#  Find the cluster assignment.  In this example we use the same source
-points for demonstration purposes:
+-#  Find the cluster assignment for the test data points:
 <pre class="example">
-
 SELECT madlib.dbscan_predict(
                         'dbscan_result',        -- from DBSCAN run
                         'dbscan_test_data',     -- test dataset
@@ -342,36 +335,34 @@
                         'points',               -- data point
                         'dbscan_predict_out'    -- output table
                         );
-
+SELECT * FROM dbscan_predict_out ORDER BY pid;
 </pre>
 <pre class="result">
-TBD???
-</pre>
--#  Now let's run DBSCAN using the kd-tree method with a Euclidean
-distance function:
-<pre class="example">
-DROP TABLE IF EXISTS dbscan_result_kd, dbscan_result_kd_summary;
-SELECT madlib.dbscan(
-                'dbscan_train_data',    -- source table
-                'dbscan_result_kd',     -- output table
-                'pid',                  -- point id column
-                'points',               -- data point
-                 1.75,                  -- epsilon
-                 4,                     -- min samples
-                'dist_norm2',           -- metric
-                'kd_tree',              -- algorithm
-                'depth=3');             -- depth of kd-tree
-SELECT * FROM dbscan_result_kd ORDER BY pid;
-</pre>
-<pre class="result">
-TBD???
+ pid | cluster_id | distance 
+-----+------------+----------
+   1 |          0 |        0
+   2 |          0 |        0
+   3 |          0 |        1
+   4 |          0 |        0
+  10 |          1 |        1
+  13 |          2 |        0
+  14 |          2 |        0
+  15 |          2 |        0
+(8 rows)
 </pre>
 
 @anchor literature
 @literature
 
-@anchor related
-@par Related Topics
+[1] Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu,
+"A Density-Based Algorithm for Discovering Clusters
+in Large Spatial Databases with Noise", KDD-96 Proceedings, 
+https://www.aaai.org/Papers/KDD/1996/KDD96-037.pdf
+
+[2] Erich Schubert, Jörg Sander, Martin Ester, Hans-Peter Kriegel, Xiaowei Xu,
+"DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN",
+ACM Transactions on Database Systems, July 2017, Article No. 19,
+https://dl.acm.org/doi/10.1145/3068335
 
 */
 
diff --git a/src/ports/postgres/modules/deep_learning/input_data_preprocessor.sql_in b/src/ports/postgres/modules/deep_learning/input_data_preprocessor.sql_in
index 28464ef..447ee21 100644
--- a/src/ports/postgres/modules/deep_learning/input_data_preprocessor.sql_in
+++ b/src/ports/postgres/modules/deep_learning/input_data_preprocessor.sql_in
@@ -18,7 +18,7 @@
  * under the License.
  *
  * @file input_preprocessor_dl.sql_in
- * @brief Utilities to prepare input image data for use by deep learning modules.
+ * @brief Prepare training data for use by deep learning modules.
  * @date December 2018
  *
  */
@@ -29,29 +29,27 @@
 /**
 @addtogroup grp_input_preprocessor_dl
 
-@brief Utilities that prepare input image data for use by deep learning
-modules.
-
-\warning <em> This MADlib method is still in early stage development.
-Interface and implementation are subject to change. </em>
+@brief Prepare training data for use by deep learning modules.
 
 <div class="toc"><b>Contents</b><ul>
-<li class="level1"><a href="#training_preprocessor_dl">Preprocessor for Training Image Data</a></li>
-<li class="level1"><a href="#validation_preprocessor_dl">Preprocessor for Validation Image Data</a></li>
+<li class="level1"><a href="#training_preprocessor_dl">Preprocessor for Training Data</a></li>
+<li class="level1"><a href="#validation_preprocessor_dl">Preprocessor for Validation Data</a></li>
 <li class="level1"><a href="#output">Output Tables</a></li>
 <li class="level1"><a href="#example">Examples</a></li>
 <li class="level1"><a href="#references">References</a></li>
 <li class="level1"><a href="#related">Related Topics</a></li>
 </ul></div>
 
-This preprocessor is a utility that prepares image data for use
-by frameworks like Keras and TensorFlow that support mini-batching
+This preprocessor prepares training data for deep learning.
+
+It packs multiple training examples into the same row for
+frameworks like Keras and TensorFlow that support mini-batching
 as an optimization option.  The advantage of using mini-batching is that
 it can perform better than stochastic gradient descent
 because it uses more than one training example at a time, typically
 resulting in faster and smoother convergence [1].
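
As a rough sketch of what the preprocessor does, consider the toy Python below. This is illustrative only and not MADlib's implementation (which packs rows into PostgreSQL bytea buffers and distributes them across segments); the function name and in-memory list representation are assumptions:

```python
def preprocess(rows, buffer_size, normalizing_const):
    """Pack (features, label) rows into mini-batch buffers.

    Features are normalized by dividing each element by
    normalizing_const; labels are one-hot encoded against the sorted
    set of distinct class values.  Illustrative sketch only.
    """
    class_values = sorted({label for _, label in rows})
    one_hot = {c: [1 if c == v else 0 for v in class_values]
               for c in class_values}
    buffers = []
    for start in range(0, len(rows), buffer_size):
        chunk = rows[start:start + buffer_size]
        buffers.append({
            "independent_var": [[x / normalizing_const for x in feats]
                                for feats, _ in chunk],
            "dependent_var": [one_hot[label] for _, label in chunk],
            "buffer_id": len(buffers),
        })
    return class_values, buffers

rows = [([255, 0, 0], "cat"), ([0, 255, 0], "dog"),
        ([0, 0, 255], "bird"), ([255, 255, 0], "cat")]
class_values, buffers = preprocess(rows, buffer_size=2,
                                   normalizing_const=255.0)
print(class_values)   # ['bird', 'cat', 'dog']
print(len(buffers))   # 2 buffers of 2 examples each
```

Each buffer then corresponds to one row of the packed output table, so a fit function can stream one mini-batch per row.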
 
-Images can be
+In the case of image processing, images can be
 represented as an array of numbers
 where each element represents grayscale,
 RGB or other channel values for each
@@ -61,14 +59,18 @@
 so it can be set depending on
 the format of image data used.
 
+This preprocessor also sets the distribution rules
+for the training data.  For example, you may want to
+train models only on segments that reside on GPU-enabled hosts.
+
 There are two versions of the preprocessor:
-training_preprocessor_dl() preprocesses input image data to be
+training_preprocessor_dl() preprocesses input data to be
 used for training a deep learning model, while
 validation_preprocessor_dl() preprocesses validation
-image data used for model evaluation.
+data used for model evaluation.
 
 @anchor training_preprocessor_dl
-@par Preprocessor for Training Image Data
+@par Preprocessor for Training Data
 
 <pre class="syntax">
 training_preprocessor_dl(source_table,
@@ -101,12 +103,15 @@
   In the case a validation data set is used (see
   later on this page), this output table is also used
   as an input to the validation preprocessor
-  so that the validation and training image data are
+  so that the validation and training data are
   both preprocessed in an identical manner.
   </dd>
 
   <dt>dependent_varname</dt>
   <dd>TEXT. Name of the dependent variable column.
+  If there are multiple dependent variable columns,
+  representing a multi-output neural network,
+  provide the columns as a comma-separated list, e.g., 'dep_var1, dep_var2, dep_var3'.
   @note The mini-batch preprocessor automatically 1-hot encodes
   dependent variables of all types.  The exception is numeric array types
   (integer and float), where we assume these are already 1-hot encoded,
@@ -115,7 +120,9 @@
 
   <dt>independent_varname</dt>
   <dd>TEXT. Name of the independent variable column. The column must be
-  a numeric array type.
+  a numeric array type. If there are multiple independent variable columns,
+  representing a multi-input neural network,
+  provide the columns as a comma-separated list, e.g., 'indep_var1, indep_var2, indep_var3'.
   </dd>
 
   <dt>buffer_size (optional)</dt>
@@ -142,8 +149,8 @@
   </dd>
 
   <dt>num_classes (optional)</dt>
-  <dd>INTEGER, default: NULL. Number of class labels for 1-hot
-  encoding. If NULL, the 1-hot encoded array
+  <dd>INTEGER[], default: NULL. Number of class labels of each dependent
+  variable for 1-hot encoding. If NULL, the 1-hot encoded array
   length will be equal to the number
   of distinct class values found in the input table.
   </dd>
@@ -169,7 +176,7 @@
 </dl>
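
The padding behavior of num_classes above can be made concrete with a minimal sketch. This is illustrative Python, not MADlib code, and the function name is made up; it only mirrors the 1-hot encoding and NULL padding described for the summary table:

```python
def one_hot_padded(labels, num_classes=None):
    """One-hot encode labels, padding each vector to num_classes.

    When num_classes exceeds the number of distinct values found,
    the extra positions are unused (reported as NULL in the summary
    table's class values).  Illustrative sketch only.
    """
    class_values = sorted(set(labels))
    width = num_classes if num_classes is not None else len(class_values)
    if width < len(class_values):
        raise ValueError("num_classes is smaller than the number of "
                         "distinct class values")
    index = {c: i for i, c in enumerate(class_values)}
    encoded = [[1 if index[lab] == i else 0 for i in range(width)]
               for lab in labels]
    padded_values = class_values + [None] * (width - len(class_values))
    return padded_values, encoded

values, enc = one_hot_padded(["bird", "cat", "dog"], num_classes=5)
print(values)   # ['bird', 'cat', 'dog', None, None]
print(enc[1])   # [0, 1, 0, 0, 0]
```

Padding like this is useful when you expect class values at inference time that do not appear in the training data.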
 
 @anchor validation_preprocessor_dl
-@par Preprocessor for Validation Image Data
+@par Preprocessor for Validation Data
 <pre class="syntax">
 validation_preprocessor_dl(source_table,
                            output_table,
@@ -217,9 +224,11 @@
   <dt>training_preprocessor_table</dt>
   <dd>TEXT. The output table obtained by
   running training_preprocessor_dl().
-  Validation data is preprocessed in the same way as
-  training data, i.e., same normalizing constant and dependent
-  variable class values.
+  Validation data is preprocessed in the same way as training data, i.e., with
+  the same normalizing constant and dependent variable class values. Note that
+  even if the validation dataset is missing some of the class values entirely,
+  this parameter ensures that the ordering and labels still match those of the
+  training dataset.
   </dd>
 
  <dt>buffer_size (optional)</dt>
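
The point about reusing the training run's class values can be sketched as follows (illustrative Python, with an assumed function name, not MADlib's implementation):

```python
def encode_validation(labels, training_class_values):
    """One-hot encode validation labels using the training mapping.

    Using the training run's class values, rather than re-deriving
    them from the validation set, keeps the label order and vector
    width identical even if the validation set is missing some
    classes entirely.  Illustrative sketch only.
    """
    index = {c: i for i, c in enumerate(training_class_values)}
    width = len(training_class_values)
    return [[1 if index[lab] == i else 0 for i in range(width)]
            for lab in labels]

# 'dog' never appears in the validation set, but the encoding width
# and column order still match the training data.
enc = encode_validation(["cat", "bird"], ["bird", "cat", "dog"])
print(enc)   # [[0, 1, 0], [1, 0, 0]]
```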
@@ -267,7 +276,7 @@
     validation_preprocessor_dl() contain the following columns:
     <table class="output">
       <tr>
-        <th>independent_var</th>
+        <th><independent_varname></th>
         <td>BYTEA. Packed array of independent variables in PostgreSQL bytea format.
         Arrays of independent variables packed into the output table are
         normalized by dividing each element in the independent variable array by the
@@ -276,7 +285,7 @@
         </td>
       </tr>
       <tr>
-        <th>dependent_var</th>
+        <th><dependent_varname></th>
         <td>BYTEA. Packed array of dependent variables in PostgreSQL bytea format.
         The dependent variable is always one-hot encoded as an
         integer array. For now, we are assuming that
@@ -288,14 +297,14 @@
         </td>
       </tr>
       <tr>
-        <th>independent_var_shape</th>
+        <th><independent_varname>_shape</th>
         <td>INTEGER[]. Shape of the independent variable array after preprocessing.
         The first element is the number of images packed per row, and subsequent
         elements will depend on how the image is described (e.g., channels first or last).
         </td>
       </tr>
       <tr>
-        <th>dependent_var_shape</th>
+        <th><dependent_varname>_shape</th>
         <td>INTEGER[]. Shape of the dependent variable array after preprocessing.
         The first element is the number of images packed per row, and the second
         element is the number of class values.
@@ -333,8 +342,9 @@
         <td>Type of the dependent variable from the source table.</td>
     </tr>
     <tr>
-        <th>class_values</th>
-        <td>The dependent level values that one-hot encoding maps to.</td>
+        <th><dependent_varname>_class_values</th>
+        <td>The dependent level values that one-hot encoding maps to
+        for the dependent variable.</td>
     </tr>
     <tr>
         <th>buffer_size</th>
@@ -472,14 +482,15 @@
 number of segments.
 Here is the packed output table of training data for our simple example:
 <pre class="example">
-SELECT independent_var_shape, dependent_var_shape, buffer_id FROM image_data_packed ORDER BY buffer_id;
+SELECT rgb_shape, species_shape, buffer_id FROM image_data_packed ORDER BY buffer_id;
 </pre>
 <pre class="result">
- independent_var_shape | dependent_var_shape | buffer_id
------------------------+---------------------+-----------
- {26,2,2,3}            | {26,3}              |         0
- {26,2,2,3}            | {26,3}              |         1
-(2 rows)
+ rgb_shape  | species_shape | buffer_id
+------------+---------------+-----------
+ {18,2,2,3} | {18,3}        |         0
+ {18,2,2,3} | {18,3}        |         1
+ {16,2,2,3} | {16,3}        |         2
+(3 rows)
 </pre>
 Review the output summary table:
 <pre class="example">
@@ -490,13 +501,13 @@
 -[ RECORD 1 ]-----------+------------------
 source_table            | image_data
 output_table            | image_data_packed
-dependent_varname       | species
-independent_varname     | rgb
-dependent_vartype       | text
-class_values            | {bird,cat,dog}
-buffer_size             | 26
+dependent_varname       | {species}
+independent_varname     | {rgb}
+dependent_vartype       | {text}
+species_class_values    | {bird,cat,dog}
+buffer_size             | 18
 normalizing_const       | 255
-num_classes             | 3
+num_classes             | {3}
 distribution_rules      | all_segments
 __internal_gpu_config__ | all_segments
 </pre>
@@ -523,14 +534,15 @@
 automatically inferred using the image_data_packed param that is passed.
 Here is the packed output table of validation data for our simple example:
 <pre class="example">
-SELECT independent_var_shape, dependent_var_shape, buffer_id FROM val_image_data_packed ORDER BY buffer_id;
+SELECT rgb_shape, species_shape, buffer_id FROM val_image_data_packed ORDER BY buffer_id;
 </pre>
 <pre class="result">
- independent_var_shape | dependent_var_shape | buffer_id
------------------------+---------------------+-----------
- {26,2,2,3}            | {26,3}              |         0
- {26,2,2,3}            | {26,3}              |         1
-(2 rows)
+ rgb_shape  | species_shape | buffer_id
+------------+---------------+-----------
+ {18,2,2,3} | {18,3}        |         0
+ {18,2,2,3} | {18,3}        |         1
+ {16,2,2,3} | {16,3}        |         2
+(3 rows)
 </pre>
 Review the output summary table:
 <pre class="example">
@@ -541,13 +553,13 @@
 -[ RECORD 1 ]-----------+----------------------
 source_table            | image_data
 output_table            | val_image_data_packed
-dependent_varname       | species
-independent_varname     | rgb
-dependent_vartype       | text
-class_values            | {bird,cat,dog}
-buffer_size             | 26
+dependent_varname       | {species}
+independent_varname     | {rgb}
+dependent_vartype       | {text}
+species_class_values    | {bird,cat,dog}
+buffer_size             | 18
 normalizing_const       | 255
-num_classes             | 3
+num_classes             | {3}
 distribution_rules      | all_segments
 __internal_gpu_config__ | all_segments
 </pre>
@@ -646,14 +658,15 @@
 </pre>
 Here is a sample of the packed output table:
 <pre class="example">
-SELECT independent_var_shape, dependent_var_shape, buffer_id FROM image_data_packed ORDER BY buffer_id;
+SELECT rgb_shape, species_shape, buffer_id FROM image_data_packed ORDER BY buffer_id;
 </pre>
 <pre class="result">
- independent_var_shape | dependent_var_shape | buffer_id
------------------------+---------------------+-----------
- {26,12}               | {26,3}              |         0
- {26,12}               | {26,3}              |         1
-(2 rows)
+ rgb_shape | species_shape | buffer_id
+-----------+---------------+-----------
+ {18,12}   | {18,3}        |         0
+ {18,12}   | {18,3}        |         1
+ {16,12}   | {16,3}        |         2
+(3 rows)
 </pre>
 
 -#  Run the preprocessor for the validation dataset.
@@ -673,14 +686,15 @@
 </pre>
 Here is a sample of the packed output summary table:
 <pre class="example">
-SELECT independent_var_shape, dependent_var_shape, buffer_id FROM val_image_data_packed ORDER BY buffer_id;
+SELECT rgb_shape, species_shape, buffer_id FROM val_image_data_packed ORDER BY buffer_id;
 </pre>
 <pre class="result">
- independent_var_shape | dependent_var_shape | buffer_id
------------------------+---------------------+-----------
- {26,12}               | {26,3}              |         0
- {26,12}               | {26,3}              |         1
-(2 rows)
+ rgb_shape | species_shape | buffer_id
+-----------+---------------+-----------
+ {18,12}   | {18,3}        |         0
+ {18,12}   | {18,3}        |         1
+ {16,12}   | {16,3}        |         2
+(3 rows)
 </pre>
 
 -# Generally the default buffer size will work well,
@@ -694,17 +708,17 @@
                                         10,                   -- Buffer size
                                         255                   -- Normalizing constant
                                         );
-SELECT independent_var_shape, dependent_var_shape, buffer_id FROM image_data_packed ORDER BY buffer_id;
+SELECT rgb_shape, species_shape, buffer_id FROM image_data_packed ORDER BY buffer_id;
 </pre>
 <pre class="result">
- independent_var_shape | dependent_var_shape | buffer_id
------------------------+---------------------+-----------
- {8,12}                | {8,3}               |         0
- {9,12}                | {9,3}               |         1
- {9,12}                | {9,3}               |         2
- {9,12}                | {9,3}               |         3
- {9,12}                | {9,3}               |         4
- {8,12}                | {8,3}               |         5
+ rgb_shape | species_shape | buffer_id
+-----------+---------------+-----------
+ {9,12}    | {9,3}         |         0
+ {9,12}    | {9,3}         |         1
+ {9,12}    | {9,3}         |         2
+ {9,12}    | {9,3}         |         3
+ {9,12}    | {9,3}         |         4
+ {7,12}    | {7,3}         |         5
 (6 rows)
 </pre>
 Review the output summary table:
@@ -716,13 +730,13 @@
 -[ RECORD 1 ]-----------+------------------
 source_table            | image_data
 output_table            | image_data_packed
-dependent_varname       | species
-independent_varname     | rgb
-dependent_vartype       | text
-class_values            | {bird,cat,dog}
-buffer_size             | 10
+dependent_varname       | {species}
+independent_varname     | {rgb}
+dependent_vartype       | {text}
+species_class_values    | {bird,cat,dog}
+buffer_size             | 9
 normalizing_const       | 255
-num_classes             | 3
+num_classes             | {3}
 distribution_rules      | all_segments
 __internal_gpu_config__ | all_segments
 </pre>
@@ -736,19 +750,20 @@
                                         'rgb',                -- Independent variable
                                         NULL,                 -- Buffer size
                                         255,                  -- Normalizing constant
-                                        5                     -- Number of desired class values
+                                        ARRAY[5]              -- Number of desired class values
                                         );
 </pre>
 Here is a sample of the packed output table with the padded 1-hot vector:
 <pre class="example">
-SELECT independent_var_shape, dependent_var_shape, buffer_id FROM image_data_packed ORDER BY buffer_id;
+SELECT rgb_shape, species_shape, buffer_id FROM image_data_packed ORDER BY buffer_id;
 </pre>
 <pre class="result">
- independent_var_shape | dependent_var_shape | buffer_id
------------------------+---------------------+-----------
- {26,12}               | {26,5}              |         0
- {26,12}               | {26,5}              |         1
-(2 rows)
+ rgb_shape | species_shape | buffer_id
+-----------+---------------+-----------
+ {18,12}   | {18,5}        |         0
+ {18,12}   | {18,5}        |         1
+ {16,12}   | {16,5}        |         2
+(3 rows)
 </pre>
 Review the output summary table:
 <pre class="example">
@@ -759,13 +774,13 @@
 -[ RECORD 1 ]-----------+-------------------------
 source_table            | image_data
 output_table            | image_data_packed
-dependent_varname       | species
-independent_varname     | rgb
-dependent_vartype       | text
-class_values            | {bird,cat,dog,NULL,NULL}
-buffer_size             | 26
+dependent_varname       | {species}
+independent_varname     | {rgb}
+dependent_vartype       | {text}
+species_class_values    | {bird,cat,dog,NULL,NULL}
+buffer_size             | 18
 normalizing_const       | 255
-num_classes             | 5
+num_classes             | {5}
 distribution_rules      | all_segments
 __internal_gpu_config__ | all_segments
 </pre>
@@ -792,13 +807,13 @@
 -[ RECORD 1 ]-----------+------------------
 source_table            | image_data
 output_table            | image_data_packed
-dependent_varname       | species
-independent_varname     | rgb
-dependent_vartype       | text
-class_values            | {bird,cat,dog}
+dependent_varname       | {species}
+independent_varname     | {rgb}
+dependent_vartype       | {text}
+species_class_values    | {bird,cat,dog}
 buffer_size             | 26
 normalizing_const       | 255
-num_classes             | 3
+num_classes             | {3}
 distribution_rules      | {2,3,4,5}
 __internal_gpu_config__ | {0,1,2,3}
 </pre>
@@ -831,13 +846,13 @@
 -[ RECORD 1 ]-----------+------------------
 source_table            | image_data
 output_table            | image_data_packed
-dependent_varname       | species
-independent_varname     | rgb
-dependent_vartype       | text
-class_values            | {bird,cat,dog}
+dependent_varname       | {species}
+independent_varname     | {rgb}
+dependent_vartype       | {text}
+species_class_values    | {bird,cat,dog}
 buffer_size             | 26
 normalizing_const       | 255
-num_classes             | 3
+num_classes             | {3}
 distribution_rules      | {2,3}
 __internal_gpu_config__ | {0,1}
 </pre>
diff --git a/src/ports/postgres/modules/deep_learning/keras_model_arch_table.sql_in b/src/ports/postgres/modules/deep_learning/keras_model_arch_table.sql_in
index cc915bb..ee30f94 100644
--- a/src/ports/postgres/modules/deep_learning/keras_model_arch_table.sql_in
+++ b/src/ports/postgres/modules/deep_learning/keras_model_arch_table.sql_in
@@ -20,8 +20,8 @@
  *
  * @file model_arch_table.sql_in
  *
- * @brief SQL functions for multilayer perceptron
- * @date June 2012
+ * @brief Function to load model architectures and weights into a table.
+ * @date Feb 2021
  *
  *
  *//* ----------------------------------------------------------------------- */
@@ -30,10 +30,7 @@
 /**
 @addtogroup grp_keras_model_arch
 
-@brief Utility function to load model architectures and weights into a table.
-
-\warning <em> This MADlib method is still in early stage development.
-Interface and implementation are subject to change. </em>
+@brief Function to load model architectures and weights into a table.
 
 <div class="toc"><b>Contents</b><ul>
 <li class="level1"><a href="#load_keras_model">Load Model</a></li>
@@ -42,16 +39,20 @@
 <li class="level1"><a href="#related">Related Topics</a></li>
 </ul></div>
 
-This utility function loads model architectures and
+This function loads model architectures and
 weights into a table for use by deep learning algorithms.
+
 Model architecture is in JSON form
 and model weights are in the form of PostgreSQL binary data types (bytea).
 If the output table already exists, a new row is inserted
 into the table so it can act as a repository for multiple model
 architectures and weights.
 
-There is also a utility function to delete a model
-from the table.
+There is also a function to delete a model from the table.
+
+MADlib's deep learning methods are designed to use the TensorFlow package and its built-in Keras
+functions.  To ensure consistency, please use tensorflow.keras objects (models, layers, etc.)
+instead of importing Keras and using its objects.
 
 @anchor load_keras_model
 @par Load Model
@@ -73,6 +74,11 @@
 
   <dt>model_arch</dt>
   <dd>JSON. JSON of the model architecture to load.
+  @note Every input layer must have the 'input_shape' stated explicitly
+  in the model architecture. MADlib requires this because, in some cases,
+  the JSON representation does not include the input shape by default, and
+  the fit() type functions need to read it from the JSON.
+
   </dd>
 
   <dt>model_weights (optional)</dt>
@@ -153,12 +159,12 @@
 
 @anchor example
 @par Examples
--# Define model architecture.  Use Keras to define
+-# Define model architecture.  Use tensorflow.keras to define
 the model architecture:
 <pre class="example">
-import keras
-from keras.models import Sequential
-from keras.layers import Dense
+from tensorflow import keras
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense
 model_simple = Sequential()
 model_simple.add(Dense(10, activation='relu', input_shape=(4,)))
 model_simple.add(Dense(10, activation='relu'))
@@ -241,8 +247,8 @@
 <pre class="example">
 CREATE OR REPLACE FUNCTION load_weights() RETURNS VOID AS
 $$
-from keras.layers import *
-from keras import Sequential
+from tensorflow.keras.layers import *
+from tensorflow.keras import Sequential
 import numpy as np
 import plpy
 \#
@@ -284,8 +290,8 @@
 import psycopg2 as p2
 conn = p2.connect('postgresql://gpadmin@35.239.240.26:5432/madlib')
 cur = conn.cursor()
-from keras.layers import *
-from keras import Sequential
+from tensorflow.keras.layers import *
+from tensorflow.keras import Sequential
 import numpy as np
 \#
 \# create model
@@ -324,7 +330,7 @@
 <pre class="result">
  count
 -------+
-     3
+     2
 </pre>
 
 @anchor related
diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras.py_in b/src/ports/postgres/modules/deep_learning/madlib_keras.py_in
index 50aa651..cd2d075 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras.py_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras.py_in
@@ -340,6 +340,7 @@
     is_metrics_specified = True if metrics_list else False
     metrics_type = 'ARRAY{0}'.format(metrics_list) if is_metrics_specified else 'NULL'
     metrics_iters = metrics_iters if metrics_iters else 'NULL'
+    loss_type = get_loss_from_compile_param(compile_params)
 
     # We always compute the training loss and metrics, at least once.
     training_metrics_final, training_metrics = get_metrics_sql_string(
@@ -394,6 +395,7 @@
             ARRAY{dep_vartype}::TEXT[] AS {dependent_vartype_colname},
             {norm_const}::{FLOAT32_SQL_TYPE} AS {normalizing_const_colname},
             {metrics_type}::TEXT[] AS metrics_type,
+            '{loss_type}'::TEXT AS loss_type,
             {training_metrics_final}::DOUBLE PRECISION AS training_metrics_final,
             {training_loss_final}::DOUBLE PRECISION AS training_loss_final,
             {training_metrics}::DOUBLE PRECISION[] AS training_metrics,
@@ -1264,6 +1266,7 @@
 metric:         Metric value on evaluation dataset, where 'metrics_type'
                 below identifies the type of metric.
 metrics_type:   Type of metric that was used in the training step.
+loss_type:      Type of loss that was used in the training step.
 """
     else:
         help_string = "No such option. Use {schema_madlib}.madlib_keras_evaluate()"
diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras.sql_in
index 2620706..429c0f0 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras.sql_in
@@ -20,7 +20,7 @@
  *
  * @file madlib_keras.sql_in
  *
- * @brief SQL functions for distributed deep learning with keras
+ * @brief Fit, evaluate and predict for one model.
  * @date June 2019
  *
  *
@@ -31,7 +31,7 @@
 /**
 @addtogroup grp_keras
 
-@brief Fit, evaluate and predict using the Keras API.
+@brief Fit, evaluate and predict for one model.
 
 <div class="toc"><b>Contents</b><ul>
 <li class="level1"><a href="#keras_fit">Fit</a></li>
@@ -45,32 +45,29 @@
 <li class="level1"><a href="#related">Related Topics</a></li>
 </ul></div>
 
-\warning <em> This MADlib method is still in early stage development.
-Interface and implementation are subject to change. </em>
-
 This module allows you to use SQL to call deep learning
 models designed in Keras [1], which is a high-level neural
 network API written in Python.
-Keras was developed for fast experimentation.  It can run
-on top of different backends and the one that is currently
+
+Keras can run on top of different backends and the one that is currently
 supported by MADlib is TensorFlow [2].  The implementation
-in MADlib is distributed and designed to train
+in MADlib is designed to train
 a single model across multiple segments (workers)
 in Greenplum database.  (PostgreSQL is also supported.)
 Alternatively, to train multiple models at the same time for model
 architecture search or hyperparameter tuning, you can
-use <a href="group__grp__keras__run__model__selection.html">Model Selection</a>.
+use the methods in <a href="group__grp__model__selection.html">Train Multiple Models</a>.
 
-The main use case is image classification
+The main use case supported is classification
 using sequential models, which are made up of a
 linear stack of layers.  This includes multilayer perceptrons (MLPs)
 and convolutional neural networks (CNNs).  Regression is not
 currently supported.
 
-Before using Keras in MADlib you will need to mini-batch
-your training and evaluation datasets by calling the
-<a href="group__grp__input__preprocessor__dl.html">Preprocessor
-for Images</a> which is a utility that prepares image data for
+Before using Keras in MADlib you will need to preprocess
+your training and evaluation datasets using the
+<a href="group__grp__input__preprocessor__dl.html">Preprocess
+Data</a> method, which prepares data for
 use by models that support mini-batch as an optimization option.
 This is a one-time operation and you would only
 need to re-run the preprocessor if your input data has changed.
@@ -80,12 +77,15 @@
 typically resulting in faster and smoother convergence [3].
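The mini-batch idea above amounts to grouping rows into fixed-size batches and updating the model once per batch rather than once per row. A toy sketch of the grouping only (not MADlib's actual packing code):

```python
def minibatches(rows, batch_size):
    # Yield consecutive fixed-size groups of rows; the final batch may
    # be smaller, just as the preprocessor's last buffer can be.
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]

print([len(b) for b in minibatches(list(range(10)), 4)])  # [4, 4, 2]
```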
 
 You can also do inference on models that have not been trained in MADlib,
-but rather imported from an external source.  This is in the section
+but rather imported from an external source.  This is described in the section
 called "Predict BYOM" below, where "BYOM" stands for "Bring Your Own Model."
 
-Note that the following MADlib functions are targeting a specific Keras
-version (2.2.4) with a specific TensorFlow kernel version (1.14).
-Using a newer or older version may or may not work as intended.
+Note that the following MADlib functions are targeting a specific TensorFlow
+kernel version (1.14). Using a newer or older version may or may not work as intended.
+
+MADlib's deep learning methods are designed to use the TensorFlow package and its built-in Keras
+functions.  To ensure consistency, please use tensorflow.keras objects (models, layers, etc.)
+instead of importing Keras and using its objects.
 
 @note CUDA GPU memory cannot be released until the process holding it is terminated.
 When a MADlib deep learning function is called with GPUs, Greenplum internally
@@ -116,7 +116,8 @@
     metrics_compute_frequency,
     warm_start,
     name,
-    description
+    description,
+    object_table
     )
 </pre>
 
@@ -125,7 +126,7 @@
   <dt>source_table</dt>
   <dd>TEXT. Name of the table containing the training data.
   This is the name of the output
-  table from the image preprocessor.  Independent
+  table from the data preprocessor.  Independent
   and dependent variables are specified in the preprocessor
 step which is why you do not need to explicitly state
   them here as part of the fit function.</dd>
@@ -162,10 +163,19 @@
 or <em>optimizer='keras.optimizers.adam'</em>.
 
     @note
-    The following loss function is
+    - Custom loss functions and custom metrics can be used as defined in
+    <a href="group__grp__custom__function.html">Define Custom Functions.</a>
+    List the custom function name and, using the parameter 'object_table' below,
+    provide the name of the table where the serialized Python objects reside.
+    - The following loss function is
     not supported: <em>sparse_categorical_crossentropy</em>.
     The following metrics are not
-    supported: <em>sparse_categorical_accuracy, top_k_categorical_accuracy, sparse_top_k_categorical_accuracy</em> and custom metrics.
+    supported: <em>sparse_categorical_accuracy, sparse_top_k_categorical_accuracy</em>.
+    - The Keras accuracy parameter <em>top_k_categorical_accuracy</em> returns top 5 accuracy by 
+    default.  If you want a different top k value, use the helper function
+    <a href="group__grp__custom__function.html#top_k_function">Top k Accuracy Function</a> 
+    to create a custom
+    Python function to compute the top k accuracy that you want.
 
   </DD>
 
@@ -178,6 +188,11 @@
     There are no mandatory parameters so
     if you specify NULL, it will use all default
     values as per Keras.
+
+    @note
+    Callbacks are not currently supported, except for TensorBoard,
+    which you can specify in the usual way,
+    e.g., callbacks=[TensorBoard(log_dir="/tmp/logs/fit")]
   </DD>
 
   <DT>num_iterations</DT>
@@ -207,7 +222,7 @@
   Note that the validation dataset must be preprocessed
   in the same way as the training dataset, so this
   is the name of the output
-  table from running the image preprocessor on the validation dataset.
+  table from running the data preprocessor on the validation dataset.
   Using a validation dataset can mean a
   longer training time, depending on its size.
   This can be controlled using the 'metrics_compute_frequency'
@@ -257,6 +272,14 @@
   <DD>TEXT, default: NULL.
     Free text string to provide a description, if desired.
   </DD>
+
+  <DT>object_table (optional)</DT>
+  <DD>TEXT, default: NULL.
+    Name of the table that contains the custom functions. Note that this table
+    must be created using the <a href="group__grp__custom__function.html">Define Custom Functions</a> method. Do not
+    qualify with a schema name, since the schema will be automatically pulled from
+    the function definition.
+  </DD>
 </dl>
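As noted for 'compile_params' above, <em>top_k_categorical_accuracy</em> defaults to top 5 and a custom function is needed for a different k. In plain Python, the metric it computes looks like the sketch below (a sketch of the metric's definition, not MADlib's generated helper):

```python
def top_k_accuracy(y_true, y_prob, k=5):
    # Fraction of rows whose true class index is among the k classes
    # with the highest predicted probabilities.
    hits = 0
    for true_idx, probs in zip(y_true, y_prob):
        top_k = sorted(range(len(probs)), key=lambda i: -probs[i])[:k]
        hits += true_idx in top_k
    return hits / len(y_true)

print(top_k_accuracy([0, 2], [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]], k=2))  # 1.0
```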
 
 <b>Output tables</b>
@@ -285,14 +308,14 @@
         <td>Model output table produced by training.</td>
     </tr>
     <tr>
-        <th>independent_varname</th>
-        <td>Independent variables column from the original
-        source table in the image preprocessing step.</td>
-    </tr>
-    <tr>
         <th>dependent_varname</th>
         <td>Dependent variable column from the original
-        source table in the image preprocessing step.</td>
+        source table in the data preprocessing step.</td>
+    </tr>
+    <tr>
+        <th>independent_varname</th>
+        <td>Independent variables column from the original
+        source table in the data preprocessing step.</td>
     </tr>
     <tr>
         <th>model_arch_table</th>
@@ -363,7 +386,9 @@
         other metrics as a function of time.
         For example, if 'metrics_compute_frequency=5'
         this would be an array of elapsed time for every 5th
-        iteration, plus the last iteration.</td>
+        iteration, plus the last iteration.
+        Note that this field reports the combined time for training and
+        validation, if a validation table is provided.</td>
     </tr>
     <tr>
         <th>madlib_version</th>
@@ -371,26 +396,26 @@
     </tr>
     <tr>
         <th>num_classes</th>
-        <td>Count of distinct classes values used.</td>
-    </tr>
-    <tr>
-        <th>class_values</th>
-        <td>Array of actual class values used.</td>
+        <td>Count of distinct class values used for each dependent variable.</td>
     </tr>
     <tr>
         <th>dependent_vartype</th>
-        <td>Data type of the dependent variable.</td>
+        <td>Data type for each dependent variable.</td>
     </tr>
     <tr>
         <th>normalizing_constant</th>
         <td>Normalizing constant used from the
-        image preprocessing step.</td>
+        data preprocessing step.</td>
     </tr>
     <tr>
         <th>metrics_type</th>
         <td>Metric specified in the 'compile_params'.</td>
     </tr>
     <tr>
+        <th>loss_type</th>
+        <td>Loss specified in the 'compile_params'.</td>
+    </tr>
+    <tr>
         <th>training_metrics_final</th>
         <td>Final value of the training
         metric after all iterations have completed.
@@ -460,6 +485,11 @@
         would be {1,2,3,4,5} indicating that metrics were computed
         at every iteration.</td>
     </tr>
+    <tr>
+        <th><dependent_varname>_class_values</th>
+        <td>Array of class values used for a particular dependent
+        variable. A column will be generated for each dependent variable.</td>
+    </tr>
    </table>
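The 'metrics_iters' column described above can be reproduced from 'num_iterations' and 'metrics_compute_frequency'; a small sketch (the function name is ours, not a MADlib API):

```python
def metrics_iterations(num_iterations, frequency):
    # Metrics are computed at every 'frequency'-th iteration, and always
    # at the last iteration, per the 'metrics_iters' description above.
    iters = list(range(frequency, num_iterations + 1, frequency))
    if not iters or iters[-1] != num_iterations:
        iters.append(num_iterations)
    return iters

print(metrics_iterations(12, 5))  # [5, 10, 12]
```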
 
 @anchor keras_evaluate
@@ -471,8 +501,7 @@
     model_table,
     test_table,
     output_table,
-    use_gpus,
-    mst_key
+    use_gpus
     )
 </pre>
 
@@ -489,7 +518,7 @@
   Note that test/validation data must be preprocessed in the same
   way as the training dataset, so
   this is the name of the output
-  table from the image preprocessor.  Independent
+  table from the data preprocessor.  Independent
   and dependent variables are specified in the preprocessor
 step which is why you do not need to explicitly state
 them here as part of the evaluate function.</dd>
@@ -500,7 +529,8 @@
     <table class="output">
       <tr>
         <th>loss</th>
-        <td>Loss value on evaluation dataset.</td>
+        <td>Loss value on evaluation dataset, where 'loss_type'
+        below identifies the type of loss.</td>
       </tr>
       <tr>
         <th>metric</th>
@@ -509,7 +539,15 @@
       </tr>
       <tr>
         <th>metrics_type</th>
-        <td>Type of metric used that was used in the training step.</td>
+        <td>Type of metric function that was used in the training step.
+        (That is, you cannot use a different metric in evaluate than was
+        used in training.)</td>
+      </tr>
+      <tr>
+        <th>loss_type</th>
+        <td>Type of loss function that was used in the training step.
+        (That is, you cannot use a different loss in evaluate than was
+        used in training.)</td>
       </tr>
 
   <DT>use_gpus (optional)</DT>
@@ -529,13 +567,6 @@
     configuration is 1 GPU per segment.
   </DD>
 
-  <DT>mst_key (optional)</DT>
-  <DD>INTEGER, default: NULL. ID that defines a unique tuple for
-  model architecture-compile parameters-fit parameters in a model
-  selection table.  Do not use this if training one model at a time using madlib_keras_fit().
-  See the <a href="group__grp__keras__run__model__selection.html">Model Selection</a> section
-  for more details on model selection by training multiple models at a time.
-  </DD>
 </DL>
 
 @anchor keras_predict
@@ -549,8 +580,7 @@
     independent_varname,
     output_table,
     pred_type,
-    use_gpus,
-    mst_key
+    use_gpus
     )
 </pre>
 
@@ -579,7 +609,12 @@
   <DD>TEXT. Column with independent variables in the test table.
   If a 'normalizing_const' is specified when preprocessing the
   training dataset, this same normalization will be applied to
-  the independent variables used in predict.
+  the independent variables used in predict. If there are multiple
+  independent variables, representing a multi-input neural network,
+  provide the columns as a comma-separated list,
+  e.g., 'indep_var1, indep_var2, indep_var3', in the same
+  way as was done in the preprocessor step for the training data.
   </DD>
 
   <DT>output_table</DT>
@@ -588,31 +623,41 @@
     <table class="output">
       <tr>
         <th>id</th>
-        <td>Gives the 'id' for each prediction, corresponding to each row from the test_table.</td>
+        <td>Gives the 'id' for each prediction, corresponding to each row from
+        the test_table.</td>
       </tr>
       <tr>
-        <th>estimated_COL_NAME</th>
+        <th>class_name</th>
         <td>
-        (For pred_type='response') The estimated class
-         for classification, where
-         COL_NAME is the name of the column to be
-         predicted from test data.
+        Name of variable being predicted.
         </td>
       </tr>
       <tr>
-        <th>prob_CLASS</th>
+        <th>class_value</th>
         <td>
-        (For pred_type='prob' for classification) The
-        probability of a given class.
-        There will be one column for each class
-        in the training data.
+        Estimated class value.
+        </td>
+      </tr>
+      <tr>
+        <th>prob</th>
+        <td>
+        Probability of a given class value.
+        </td>
+      </tr>
+      <tr>
+        <th>rank</th>
+        <td>
+        The rank of a given class based on the ordering of probabilities.
         </td>
       </tr>
 
   <DT>pred_type (optional)</DT>
-  <DD>TEXT, default: 'response'. The type of output
-  desired, where 'response' gives the actual prediction
-  and 'prob' gives the probability value for each class.
+  <DD>TEXT or INTEGER or DOUBLE PRECISION, default: 'prob'.
+  The type and range of output desired. This parameter allows the following options.
+  - 'response': the actual prediction
+  - 'prob': the probability value for each class
+  - 0<value<1: the lower limit for the probability (double precision)
+  - 1<=value: the lower limit for the rank of the prediction (integer)
   </DD>
 
   <DT>use_gpus (optional)</DT>
@@ -631,14 +676,6 @@
     GPU. The current recommended
     configuration is 1 GPU per segment.
   </DD>
-
-  <DT>mst_key (optional)</DT>
-  <DD>INTEGER, default: NULL. ID that defines a unique tuple for
-  model architecture-compile parameters-fit parameters in a model
-  selection table.  Do not use this if training one model at a time using madlib_keras_fit().
-  See the <a href="group__grp__keras__run__model__selection.html">Model Selection</a> section
-  for more details on model selection by training multiple models at a time.
-  </DD>
 </DL>
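The four 'pred_type' behaviors above can be sketched over a single row's class probabilities. The helper below is illustrative only, not MADlib's implementation, and it assumes the probability limit is inclusive:

```python
def rank_predictions(probs, pred_type='prob'):
    # probs: {class_value: probability} for one row of the test table.
    ranked = sorted(probs.items(), key=lambda kv: -kv[1])
    rows = [(cls, p, rank) for rank, (cls, p) in enumerate(ranked, start=1)]
    if pred_type == 'response':        # only the top prediction
        return rows[:1]
    if pred_type == 'prob':            # all classes with probabilities
        return rows
    if isinstance(pred_type, float):   # 0 < value < 1: probability floor
        return [r for r in rows if r[1] >= pred_type]
    if isinstance(pred_type, int):     # 1 <= value: keep the top ranks
        return [r for r in rows if r[2] <= pred_type]
    raise ValueError("unsupported pred_type")

probs = {'Iris-setosa': 0.77, 'Iris-virginica': 0.16, 'Iris-versicolor': 0.07}
print(rank_predictions(probs, pred_type=2))
# [('Iris-setosa', 0.77, 1), ('Iris-virginica', 0.16, 2)]
```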
 
 
@@ -670,7 +707,7 @@
   <DD>TEXT. Name of the architecture table containing the model
   to use for prediction. The model weights and architecture can be loaded to
   this table by using the
-  <a href="group__grp__keras__model__arch.html">load_keras_model</a> function.
+  <a href="group__grp__keras__model__arch.html">Define Model Architectures</a> function.
   </DD>
 
   <DT>model_id</DT>
@@ -700,26 +737,31 @@
     <table class="output">
       <tr>
         <th>id</th>
-        <td>Gives the 'id' for each prediction, corresponding to each row from the 'test_table'.</td>
+        <td>Gives the 'id' for each prediction, corresponding to each row from
+        the test_table.</td>
       </tr>
       <tr>
-        <th>estimated_dependent_var</th>
+        <th>class_name</th>
         <td>
-        (For pred_type='response') Estimated class for classification. If
-        the 'class_values' parameter is passed in as NULL, then we assume that the class
-        labels are [0,1,2...,n-1] where n-1 is the number of classes in the model
-        architecture.
+        Name of the variable being predicted.
         </td>
       </tr>
       <tr>
-        <th>prob_CLASS</th>
+        <th>class_value</th>
         <td>
-         (For pred_type='prob' for classification)
-         Probability of a given class.
-         If 'class_values' is passed in as NULL, we create one column called
-         'prob' which is an array of probabilities for each class.
-         If 'class_values' is not NULL, then there will be one
-         column for each class.
+        The estimated class for classification.
+        </td>
+      </tr>
+      <tr>
+        <th>prob</th>
+        <td>
+        Probability of a given class.
+        </td>
+      </tr>
+      <tr>
+        <th>rank</th>
+        <td>
+        The rank of a given class based on the ordering of probabilities.
         </td>
       </tr>
 
@@ -747,13 +789,14 @@
 
   <DT>class_values (optional)</DT>
   <DD>TEXT[], default: NULL.
-    List of class labels that were used while training the model. See the 'output_table'
-    column above for more details.
+    Two-dimensional list of class labels that were used while training the model,
+    one list per dependent variable.
 
     @note
     If you specify the class values parameter,
-    it must reflect how the dependent variable was 1-hot encoded for training. If you accidently
-    pick another order that does not match the 1-hot encoding, the predictions would be wrong.
+    it must reflect how the dependent variable was 1-hot encoded for training.
+    If you accidentally pick another order that does not match the 1-hot encoding,
+    the predictions would be wrong.
   </DD>
 
   <DT>normalizing_const (optional)</DT>
@@ -963,26 +1006,28 @@
 
 -# Call the preprocessor for deep learning.  For the training dataset:
 <pre class="example">
-\\x off
 DROP TABLE IF EXISTS iris_train_packed, iris_train_packed_summary;
 SELECT madlib.training_preprocessor_dl('iris_train',         -- Source table
                                        'iris_train_packed',  -- Output table
                                        'class_text',         -- Dependent variable
                                        'attributes'          -- Independent variable
                                         );
+\\x on
 SELECT * FROM iris_train_packed_summary;
 </pre>
 <pre class="result">
--[ RECORD 1 ]-------+---------------------------------------------
-source_table        | iris_train
-output_table        | iris_train_packed
-dependent_varname   | class_text
-independent_varname | attributes
-dependent_vartype   | character varying
-class_values        | {Iris-setosa,Iris-versicolor,Iris-virginica}
-buffer_size         | 60
-normalizing_const   | 1.0
-num_classes         | 3
+-[ RECORD 1 ]-----------+---------------------------------------------
+source_table            | iris_train
+output_table            | iris_train_packed
+dependent_varname       | {class_text}
+independent_varname     | {attributes}
+dependent_vartype       | {"character varying"}
+class_text_class_values | {Iris-setosa,Iris-versicolor,Iris-virginica}
+buffer_size             | 40
+normalizing_const       | 1
+num_classes             | {3}
+distribution_rules      | all_segments
+__internal_gpu_config__ | all_segments
 </pre>
 For the validation dataset:
 <pre class="example">
@@ -996,24 +1041,26 @@
 SELECT * FROM iris_test_packed_summary;
 </pre>
 <pre class="result">
--[ RECORD 1 ]-------+---------------------------------------------
-source_table        | iris_test
-output_table        | iris_test_packed
-dependent_varname   | class_text
-independent_varname | attributes
-dependent_vartype   | character varying
-class_values        | {Iris-setosa,Iris-versicolor,Iris-virginica}
-buffer_size         | 15
-normalizing_const   | 1.0
-num_classes         | 3
+-[ RECORD 1 ]-----------+---------------------------------------------
+source_table            | iris_test
+output_table            | iris_test_packed
+dependent_varname       | {class_text}
+independent_varname     | {attributes}
+dependent_vartype       | {"character varying"}
+class_text_class_values | {Iris-setosa,Iris-versicolor,Iris-virginica}
+buffer_size             | 10
+normalizing_const       | 1
+num_classes             | {3}
+distribution_rules      | all_segments
+__internal_gpu_config__ | all_segments
 </pre>
 
 -# Define and load model architecture.  Use Keras to define
 the model architecture:
 <pre class="example">
-import keras
-from keras.models import Sequential
-from keras.layers import Dense
+from tensorflow import keras
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense
 model_simple = Sequential()
 model_simple.add(Dense(10, activation='relu', input_shape=(4,)))
 model_simple.add(Dense(10, activation='relu'))
@@ -1073,41 +1120,44 @@
 -[ RECORD 1 ]-------------+--------------------------------------------------------------------------
 source_table              | iris_train_packed
 model                     | iris_model
-dependent_varname         | class_text
-independent_varname       | attributes
+dependent_varname         | {class_text}
+independent_varname       | {attributes}
 model_arch_table          | model_arch_library
 model_id                  | 1
 compile_params            |  loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']
 fit_params                |  batch_size=5, epochs=3
 num_iterations            | 10
 validation_table          |
+object_table              |
 metrics_compute_frequency | 10
 name                      |
 description               |
 model_type                | madlib_keras
 model_size                | 0.7900390625
-start_training_time       | 2019-06-05 20:55:15.785034
-end_training_time         | 2019-06-05 20:55:25.373035
-metrics_elapsed_time      | {9.58799290657043}
-madlib_version            | 1.17.0
-num_classes               | 3
-class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
-dependent_vartype         | character varying
+start_training_time       | 2021-02-01 15:58:43.760568
+end_training_time         | 2021-02-01 15:58:44.470054
+metrics_elapsed_time      | {0.709463119506836}
+madlib_version            | 1.18.0-dev
+num_classes               | {3}
+dependent_vartype         | {"character varying"}
 normalizing_const         | 1
 metrics_type              | {accuracy}
-training_metrics_final    | 0.766666650772
-training_loss_final       | 0.721103310585
-training_metrics          | {0.766666650772095}
-training_loss             | {0.721103310585022}
+loss_type                 | categorical_crossentropy
+training_metrics_final    | 0.800000011920929
+training_loss_final       | 0.519745767116547
+training_metrics          | {0.800000011920929}
+training_loss             | {0.519745767116547}
 validation_metrics_final  |
 validation_loss_final     |
 validation_metrics        |
 validation_loss           |
 metrics_iters             | {10}
+class_text_class_values   | {Iris-setosa,Iris-versicolor,Iris-virginica}
 </pre>
 
 -# Use the test dataset to evaluate the model we built above:
 <pre class="example">
+\\x off
 DROP TABLE IF EXISTS iris_validate;
 SELECT madlib.madlib_keras_evaluate('iris_model',       -- model
                                    'iris_test_packed',  -- test table
@@ -1116,9 +1166,9 @@
 SELECT * FROM iris_validate;
 </pre>
 <pre class="result">
-       loss        |      metric       | metrics_type
--------------------+-------------------+--------------
- 0.719491899013519 | 0.800000011920929 | {accuracy}
+       loss        |      metric       | metrics_type |        loss_type
+-------------------+-------------------+--------------+--------------------------
+ 0.566911578178406 | 0.699999988079071 | {accuracy}   | categorical_crossentropy
 (1 row)
 </pre>
 
@@ -1134,73 +1184,133 @@
                                    'attributes', -- independent var
                                    'iris_predict'  -- output table
                                    );
-SELECT * FROM iris_predict ORDER BY id;
+SELECT * FROM iris_predict ORDER BY id, rank;
 </pre>
 <pre class="result">
- id  | estimated_class_text
------+----------------------
-   4 | Iris-setosa
-   6 | Iris-setosa
-   8 | Iris-setosa
-  12 | Iris-setosa
-  13 | Iris-setosa
-  15 | Iris-setosa
-  24 | Iris-setosa
-  30 | Iris-setosa
-  38 | Iris-setosa
-  49 | Iris-setosa
-  60 | Iris-virginica
-  68 | Iris-versicolor
-  69 | Iris-versicolor
-  76 | Iris-versicolor
-  78 | Iris-versicolor
-  81 | Iris-versicolor
-  85 | Iris-virginica
-  90 | Iris-versicolor
-  91 | Iris-versicolor
-  94 | Iris-virginica
- 104 | Iris-virginica
- 106 | Iris-versicolor
- 107 | Iris-virginica
- 110 | Iris-virginica
- 119 | Iris-versicolor
- 127 | Iris-virginica
- 129 | Iris-virginica
- 134 | Iris-versicolor
- 139 | Iris-virginica
- 144 | Iris-virginica
-(30 rows)
+ id  | class_name |   class_value   |    prob     | rank
+-----+------------+-----------------+-------------+------
+   4 | class_text | Iris-setosa     |   0.7689959 |    1
+   4 | class_text | Iris-virginica  |  0.15600422 |    2
+   4 | class_text | Iris-versicolor |  0.07499986 |    3
+   9 | class_text | Iris-setosa     |   0.7642913 |    1
+   9 | class_text | Iris-virginica  |  0.15844841 |    2
+   9 | class_text | Iris-versicolor | 0.077260315 |    3
+  13 | class_text | Iris-setosa     |   0.8160971 |    1
+  13 | class_text | Iris-virginica  |  0.13053566 |    2
+  13 | class_text | Iris-versicolor |  0.05336728 |    3
+  14 | class_text | Iris-setosa     |    0.804419 |    1
+  14 | class_text | Iris-virginica  |  0.13591427 |    2
+  14 | class_text | Iris-versicolor |  0.05966685 |    3
+  15 | class_text | Iris-setosa     |  0.88610095 |    1
+  15 | class_text | Iris-virginica  |  0.08893245 |    2
+  15 | class_text | Iris-versicolor | 0.024966586 |    3
+  25 | class_text | Iris-setosa     |  0.68195176 |    1
+  25 | class_text | Iris-virginica  |  0.20334557 |    2
+  25 | class_text | Iris-versicolor | 0.114702694 |    3
+  34 | class_text | Iris-setosa     |   0.8619849 |    1
+  34 | class_text | Iris-virginica  |   0.1032386 |    2
+  34 | class_text | Iris-versicolor | 0.034776475 |    3
+  36 | class_text | Iris-setosa     |  0.84423053 |    1
+  36 | class_text | Iris-virginica  | 0.114072084 |    2
+  36 | class_text | Iris-versicolor |  0.04169741 |    3
+  39 | class_text | Iris-setosa     |  0.79559565 |    1
+  39 | class_text | Iris-virginica  |  0.13950573 |    2
+  39 | class_text | Iris-versicolor | 0.064898595 |    3
+  48 | class_text | Iris-setosa     |   0.8010248 |    1
+  48 | class_text | Iris-virginica  |  0.13615999 |    2
+  48 | class_text | Iris-versicolor |  0.06281526 |    3
+  56 | class_text | Iris-versicolor |  0.47732472 |    1
+  56 | class_text | Iris-virginica  |  0.46635315 |    2
+  56 | class_text | Iris-setosa     | 0.056322116 |    3
+  63 | class_text | Iris-virginica  |   0.5329179 |    1
+  63 | class_text | Iris-versicolor |  0.38090497 |    2
+  63 | class_text | Iris-setosa     | 0.086177126 |    3
+  65 | class_text | Iris-virginica  |   0.4516514 |    1
+  65 | class_text | Iris-versicolor |   0.4330772 |    2
+  65 | class_text | Iris-setosa     |  0.11527142 |    3
+  69 | class_text | Iris-virginica  |  0.57348573 |    1
+  69 | class_text | Iris-versicolor |  0.36967018 |    2
+  69 | class_text | Iris-setosa     |  0.05684407 |    3
+  72 | class_text | Iris-virginica  |   0.4918356 |    1
+  72 | class_text | Iris-versicolor |  0.42640963 |    2
+  72 | class_text | Iris-setosa     |  0.08175478 |    3
+  73 | class_text | Iris-virginica  |   0.5534297 |    1
+  73 | class_text | Iris-versicolor |  0.39819974 |    2
+  73 | class_text | Iris-setosa     |  0.04837051 |    3
+  75 | class_text | Iris-virginica  |   0.4986787 |    1
+  75 | class_text | Iris-versicolor |  0.43546444 |    2
+  75 | class_text | Iris-setosa     |  0.06585683 |    3
+  82 | class_text | Iris-virginica  |  0.47533202 |    1
+  82 | class_text | Iris-versicolor |  0.43122545 |    2
+  82 | class_text | Iris-setosa     |  0.09344252 |    3
+  90 | class_text | Iris-virginica  |  0.47962278 |    1
+  90 | class_text | Iris-versicolor |  0.45068985 |    2
+  90 | class_text | Iris-setosa     |  0.06968742 |    3
+  91 | class_text | Iris-virginica  |  0.47005868 |    1
+  91 | class_text | Iris-versicolor |   0.4696341 |    2
+  91 | class_text | Iris-setosa     | 0.060307216 |    3
+  97 | class_text | Iris-versicolor |  0.49070656 |    1
+  97 | class_text | Iris-virginica  |  0.44852367 |    2
+  97 | class_text | Iris-setosa     | 0.060769808 |    3
+ 100 | class_text | Iris-versicolor |  0.47884703 |    1
+ 100 | class_text | Iris-virginica  |   0.4577389 |    2
+ 100 | class_text | Iris-setosa     |  0.06341412 |    3
+ 102 | class_text | Iris-virginica  |   0.5396443 |    1
+ 102 | class_text | Iris-versicolor |  0.40945858 |    2
+ 102 | class_text | Iris-setosa     | 0.050897114 |    3
+ 109 | class_text | Iris-virginica  |  0.61228466 |    1
+ 109 | class_text | Iris-versicolor |   0.3522025 |    2
+ 109 | class_text | Iris-setosa     |  0.03551281 |    3
+ 114 | class_text | Iris-virginica  |    0.562418 |    1
+ 114 | class_text | Iris-versicolor |  0.38269255 |    2
+ 114 | class_text | Iris-setosa     |  0.05488944 |    3
+ 128 | class_text | Iris-virginica  |  0.50814027 |    1
+ 128 | class_text | Iris-versicolor |  0.44240898 |    2
+ 128 | class_text | Iris-setosa     |  0.04945076 |    3
+ 138 | class_text | Iris-virginica  |  0.52319044 |    1
+ 138 | class_text | Iris-versicolor |  0.43786547 |    2
+ 138 | class_text | Iris-setosa     |  0.03894412 |    3
+ 140 | class_text | Iris-virginica  |   0.5677875 |    1
+ 140 | class_text | Iris-versicolor |   0.3936515 |    2
+ 140 | class_text | Iris-setosa     | 0.038560882 |    3
+ 141 | class_text | Iris-virginica  |  0.58414406 |    1
+ 141 | class_text | Iris-versicolor |   0.3770253 |    2
+ 141 | class_text | Iris-setosa     |  0.03883058 |    3
+ 150 | class_text | Iris-virginica  |   0.5025033 |    1
+ 150 | class_text | Iris-versicolor |   0.4495215 |    2
+ 150 | class_text | Iris-setosa     | 0.047975186 |    3
+(90 rows)
 </pre>
-Count missclassifications:
+Count misclassifications:
 <pre class="example">
 SELECT COUNT(*) FROM iris_predict JOIN iris_test USING (id)
-WHERE iris_predict.estimated_class_text != iris_test.class_text;
+WHERE iris_predict.class_value != iris_test.class_text AND iris_predict.rank = 1;
 </pre>
 <pre class="result">
  count
 -------+
-     6
+     9
 (1 row)
 </pre>
 Accuracy:
 <pre class="example">
 SELECT round(count(*)*100/(150*0.2),2) as test_accuracy_percent from
-    (select iris_test.class_text as actual, iris_predict.estimated_class_text as estimated
+    (select iris_test.class_text as actual, iris_predict.class_value as estimated
      from iris_predict inner join iris_test
-     on iris_test.id=iris_predict.id) q
+     on iris_test.id=iris_predict.id where iris_predict.rank = 1) q
 WHERE q.actual=q.estimated;
 </pre>
 <pre class="result">
  test_accuracy_percent
 -----------------------+
-                 80.00
+                 70.00
 (1 row)
 </pre>
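The accuracy arithmetic above is straightforward: 150 &times; 0.2 = 30 held-out test rows, and with 9 misclassified top-1 predictions, 21/30 = 70%. A quick sanity check of those numbers in Python:

```python
# Sanity check of the accuracy query above: 20% of the 150 iris rows
# are held out for test, and 9 of them are misclassified at rank 1.
test_rows = int(150 * 0.2)   # 30
misclassified = 9
accuracy = round((test_rows - misclassified) * 100 / test_rows, 2)
print(accuracy)  # 70.0
```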
 
 -# Predict BYOM.
 We will use the validation dataset for prediction
 as well, which is not usual but serves to show the
-syntax. See <a href="group__grp__keras__model__arch.html">load_keras_model</a>
+syntax. See <a href="group__grp__keras__model__arch.html">Define Model Architectures</a>
 for details on how to load the model architecture and weights.
 In this example we will use weights we already have:
 <pre class="example">
@@ -1226,62 +1336,62 @@
                                         'iris_predict_byom',   -- output table
                                         'response',            -- prediction type
                                          FALSE,                -- use GPUs
-                                         ARRAY['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], -- class values
+                                         ARRAY[ARRAY['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']], -- class values
                                          1.0                   -- normalizing const
                                    );
 SELECT * FROM iris_predict_byom ORDER BY id;
 </pre>
-The prediction is in the 'estimated_dependent_var' column:
+The predicted class is in the 'class_value' column:
 <pre class="result">
- id  | estimated_dependent_var
------+----------------------
-   4 | Iris-setosa
-   6 | Iris-setosa
-   8 | Iris-setosa
-  12 | Iris-setosa
-  13 | Iris-setosa
-  15 | Iris-setosa
-  24 | Iris-setosa
-  30 | Iris-setosa
-  38 | Iris-setosa
-  49 | Iris-setosa
-  60 | Iris-virginica
-  68 | Iris-versicolor
-  69 | Iris-versicolor
-  76 | Iris-versicolor
-  78 | Iris-versicolor
-  81 | Iris-versicolor
-  85 | Iris-virginica
-  90 | Iris-versicolor
-  91 | Iris-versicolor
-  94 | Iris-virginica
- 104 | Iris-virginica
- 106 | Iris-versicolor
- 107 | Iris-virginica
- 110 | Iris-virginica
- 119 | Iris-versicolor
- 127 | Iris-virginica
- 129 | Iris-virginica
- 134 | Iris-versicolor
- 139 | Iris-virginica
- 144 | Iris-virginica
+ id  |  class_name   |   class_value   |    prob
+-----+---------------+-----------------+------------
+   4 | dependent_var | Iris-setosa     |  0.7689959
+   9 | dependent_var | Iris-setosa     |  0.7642913
+  13 | dependent_var | Iris-setosa     |  0.8160971
+  14 | dependent_var | Iris-setosa     |   0.804419
+  15 | dependent_var | Iris-setosa     | 0.88610095
+  25 | dependent_var | Iris-setosa     | 0.68195176
+  34 | dependent_var | Iris-setosa     |  0.8619849
+  36 | dependent_var | Iris-setosa     | 0.84423053
+  39 | dependent_var | Iris-setosa     | 0.79559565
+  48 | dependent_var | Iris-setosa     |  0.8010248
+  56 | dependent_var | Iris-versicolor | 0.47732472
+  63 | dependent_var | Iris-virginica  |  0.5329179
+  65 | dependent_var | Iris-virginica  |  0.4516514
+  69 | dependent_var | Iris-virginica  | 0.57348573
+  72 | dependent_var | Iris-virginica  |  0.4918356
+  73 | dependent_var | Iris-virginica  |  0.5534297
+  75 | dependent_var | Iris-virginica  |  0.4986787
+  82 | dependent_var | Iris-virginica  | 0.47533202
+  90 | dependent_var | Iris-virginica  | 0.47962278
+  91 | dependent_var | Iris-virginica  | 0.47005868
+  97 | dependent_var | Iris-versicolor | 0.49070656
+ 100 | dependent_var | Iris-versicolor | 0.47884703
+ 102 | dependent_var | Iris-virginica  |  0.5396443
+ 109 | dependent_var | Iris-virginica  | 0.61228466
+ 114 | dependent_var | Iris-virginica  |   0.562418
+ 128 | dependent_var | Iris-virginica  | 0.50814027
+ 138 | dependent_var | Iris-virginica  | 0.52319044
+ 140 | dependent_var | Iris-virginica  |  0.5677875
+ 141 | dependent_var | Iris-virginica  | 0.58414406
+ 150 | dependent_var | Iris-virginica  |  0.5025033
 (30 rows)
  </pre>
-Count missclassifications:
+Count misclassifications:
 <pre class="example">
 SELECT COUNT(*) FROM iris_predict_byom JOIN iris_test USING (id)
-WHERE iris_predict_byom.estimated_dependent_var != iris_test.class_text;
+WHERE iris_predict_byom.class_value != iris_test.class_text;
 </pre>
 <pre class="result">
  count
 -------+
-     6
+     9
 (1 row)
 </pre>
 Accuracy:
 <pre class="example">
 SELECT round(count(*)*100/(150*0.2),2) as test_accuracy_percent from
-    (select iris_test.class_text as actual, iris_predict_byom.estimated_dependent_var as estimated
+    (select iris_test.class_text as actual, iris_predict_byom.class_value as estimated
      from iris_predict_byom inner join iris_test
      on iris_test.id=iris_predict_byom.id) q
 WHERE q.actual=q.estimated;
@@ -1289,7 +1399,7 @@
 <pre class="result">
  test_accuracy_percent
 -----------------------+
-                 80.00
+                 70.00
 (1 row)
 </pre>
 
@@ -1317,43 +1427,46 @@
                                'Sophie L.',           -- name
                                'Simple MLP for iris dataset'  -- description
                               );
+\\x on
 SELECT * FROM iris_model_summary;
 </pre>
 <pre class="result">
 -[ RECORD 1 ]-------------+--------------------------------------------------------------------------
 source_table              | iris_train_packed
 model                     | iris_model
-dependent_varname         | class_text
-independent_varname       | attributes
+dependent_varname         | {class_text}
+independent_varname       | {attributes}
 model_arch_table          | model_arch_library
 model_id                  | 1
 compile_params            |  loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']
 fit_params                |  batch_size=5, epochs=3
 num_iterations            | 10
 validation_table          | iris_test_packed
+object_table              |
 metrics_compute_frequency | 3
 name                      | Sophie L.
 description               | Simple MLP for iris dataset
 model_type                | madlib_keras
 model_size                | 0.7900390625
-start_training_time       | 2019-06-05 20:58:23.224629
-end_training_time         | 2019-06-05 20:58:35.477499
-metrics_elapsed_time      | {4.69859290122986,8.2062520980835,10.8104848861694,12.2528700828552}
-madlib_version            | 1.17.0
-num_classes               | 3
-class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
-dependent_vartype         | character varying
+start_training_time       | 2021-01-29 14:41:16.943861
+end_training_time         | 2021-01-29 14:41:19.478149
+metrics_elapsed_time      | {2.3377411365509,2.42358803749084,2.49885511398315,2.53427410125732}
+madlib_version            | 1.18.0-dev
+num_classes               | {3}
+dependent_vartype         | {"character varying"}
 normalizing_const         | 1
 metrics_type              | {accuracy}
-training_metrics_final    | 0.941666662693
-training_loss_final       | 0.40586027503
-training_metrics          | {0.699999988079071,0.800000011920929,0.899999976158142,0.941666662693024}
-training_loss             | {0.825238645076752,0.534248650074005,0.427499741315842,0.405860275030136}
-validation_metrics_final  | 0.866666674614
-validation_loss_final     | 0.409001916647
-validation_metrics        | {0.733333349227905,0.733333349227905,0.866666674613953,0.866666674613953}
-validation_loss           | {0.827081918716431,0.536275088787079,0.431326270103455,0.409001916646957}
+loss_type                 | categorical_crossentropy
+training_metrics_final    | 0.883333325386047
+training_loss_final       | 0.584357917308807
+training_metrics          | {0.733333349227905,0.774999976158142,0.883333325386047,0.883333325386047}
+training_loss             | {0.765825688838959,0.664925456047058,0.605871021747589,0.584357917308807}
+validation_metrics_final  | 0.899999976158142
+validation_loss_final     | 0.590348184108734
+validation_metrics        | {0.699999988079071,0.866666674613953,0.899999976158142,0.899999976158142}
+validation_loss           | {0.81381630897522,0.691304981708527,0.616305589675903,0.590348184108734}
 metrics_iters             | {3,6,9,10}
+class_text_class_values   | {Iris-setosa,Iris-versicolor,Iris-virginica}
 </pre>
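In the summary above, 'metrics_iters' is {3,6,9,10} because metrics are computed every 'metrics_compute_frequency'-th iteration and always at the final iteration. A small Python sketch of that bookkeeping (illustrative only, not a MADlib API):

```python
def metrics_iters(num_iterations, frequency):
    # Metrics are computed at every `frequency`-th iteration,
    # and always at the final iteration.
    iters = [i for i in range(1, num_iterations + 1) if i % frequency == 0]
    if not iters or iters[-1] != num_iterations:
        iters.append(num_iterations)
    return iters

print(metrics_iters(10, 3))  # [3, 6, 9, 10]
print(metrics_iters(5, 1))   # [1, 2, 3, 4, 5]
```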
 
 -# Predict probabilities for each class:
@@ -1366,41 +1479,102 @@
                                    'iris_predict',    -- output table
                                    'prob'             -- response type
                                    );
+\\x off
 SELECT * FROM iris_predict ORDER BY id;
 </pre>
 <pre class="result">
- id  | prob_Iris-setosa | prob_Iris-versicolor | prob_Iris-virginica
------+------------------+----------------------+---------------------
-   4 |        0.9241953 |          0.059390426 |          0.01641435
-   6 |        0.9657151 |           0.02809224 |        0.0061926916
-   8 |        0.9543316 |           0.03670931 |         0.008959154
-  12 |       0.93851465 |          0.048681837 |         0.012803554
-  13 |       0.93832576 |           0.04893658 |         0.012737647
-  15 |       0.98717564 |           0.01091238 |        0.0019119986
-  24 |        0.9240628 |          0.060805064 |         0.015132156
-  30 |       0.92063266 |          0.062279057 |         0.017088294
-  38 |        0.9353765 |          0.051353406 |         0.013270103
-  49 |        0.9709265 |          0.023811856 |         0.005261566
-  60 |      0.034395564 |            0.5260507 |          0.43955377
-  68 |      0.031360663 |           0.53689945 |          0.43173987
-  69 |     0.0098787155 |           0.46121457 |          0.52890676
-  76 |      0.031186827 |            0.5644549 |          0.40435827
-  78 |       0.00982633 |           0.48929632 |           0.5008774
-  81 |       0.03658528 |           0.53248984 |           0.4309249
-  85 |      0.015423619 |           0.48452598 |           0.5000504
-  90 |      0.026857043 |            0.5155698 |          0.45757324
-  91 |      0.013675574 |           0.47155368 |           0.5147708
-  94 |      0.073440716 |            0.5418821 |           0.3846772
- 104 |     0.0021637122 |            0.3680499 |          0.62978643
- 106 |    0.00052832486 |           0.30891812 |           0.6905536
- 107 |      0.007315576 |           0.40949163 |           0.5831927
- 110 |     0.0022259138 |            0.4058138 |          0.59196025
- 119 |    0.00018505375 |           0.24510723 |           0.7547077
- 127 |      0.009542585 |           0.46958733 |          0.52087003
- 129 |     0.0019719477 |           0.36288205 |            0.635146
- 134 |     0.0056418083 |           0.43401477 |          0.56034344
- 139 |       0.01067015 |            0.4755573 |          0.51377255
- 144 |     0.0018909549 |           0.37689638 |           0.6212126
+ id  | class_name |   class_value   |    prob     | rank
+-----+------------+-----------------+-------------+------
+   4 | class_text | Iris-versicolor |  0.34548566 |    2
+   4 | class_text | Iris-setosa     |  0.57626975 |    1
+   4 | class_text | Iris-virginica  |  0.07824467 |    3
+   7 | class_text | Iris-versicolor |  0.34442508 |    2
+   7 | class_text | Iris-setosa     |  0.57735515 |    1
+   7 | class_text | Iris-virginica  | 0.078219764 |    3
+   9 | class_text | Iris-versicolor |   0.3453845 |    2
+   9 | class_text | Iris-virginica  |  0.08293749 |    3
+   9 | class_text | Iris-setosa     |  0.57167804 |    1
+  12 | class_text | Iris-versicolor |  0.34616387 |    2
+  12 | class_text | Iris-setosa     |   0.5793855 |    1
+  12 | class_text | Iris-virginica  | 0.074450605 |    3
+  18 | class_text | Iris-versicolor |  0.34597218 |    2
+  18 | class_text | Iris-virginica  |  0.07100027 |    3
+  18 | class_text | Iris-setosa     |  0.58302754 |    1
+  20 | class_text | Iris-versicolor |  0.34480608 |    2
+  20 | class_text | Iris-setosa     |   0.5856424 |    1
+  20 | class_text | Iris-virginica  |  0.06955151 |    3
+  24 | class_text | Iris-versicolor |  0.38339624 |    2
+  24 | class_text | Iris-setosa     |   0.5330486 |    1
+  24 | class_text | Iris-virginica  | 0.083555184 |    3
+  30 | class_text | Iris-versicolor |  0.35101113 |    2
+  30 | class_text | Iris-setosa     |  0.56958234 |    1
+  30 | class_text | Iris-virginica  |  0.07940655 |    3
+  31 | class_text | Iris-versicolor |   0.3503181 |    2
+  31 | class_text | Iris-setosa     |   0.5733414 |    1
+  31 | class_text | Iris-virginica  |  0.07634052 |    3
+  33 | class_text | Iris-versicolor |  0.34489658 |    2
+  33 | class_text | Iris-setosa     |   0.5847962 |    1
+  33 | class_text | Iris-virginica  |  0.07030724 |    3
+  35 | class_text | Iris-versicolor |  0.34719768 |    2
+  35 | class_text | Iris-setosa     |    0.577414 |    1
+  35 | class_text | Iris-virginica  |  0.07538838 |    3
+  40 | class_text | Iris-versicolor |   0.3464746 |    2
+  40 | class_text | Iris-setosa     |  0.58250487 |    1
+  40 | class_text | Iris-virginica  | 0.071020484 |    3
+  41 | class_text | Iris-versicolor |  0.34581655 |    2
+  41 | class_text | Iris-setosa     |   0.5805128 |    1
+  41 | class_text | Iris-virginica  |  0.07367061 |    3
+  45 | class_text | Iris-versicolor |  0.38146245 |    2
+  45 | class_text | Iris-setosa     |  0.52559936 |    1
+  45 | class_text | Iris-virginica  |  0.09293811 |    3
+  51 | class_text | Iris-virginica  |  0.41811863 |    2
+  51 | class_text | Iris-setosa     |  0.07617204 |    3
+  51 | class_text | Iris-versicolor |   0.5057093 |    1
+  53 | class_text | Iris-virginica  |  0.47048044 |    2
+  53 | class_text | Iris-versicolor |  0.47150916 |    1
+  53 | class_text | Iris-setosa     | 0.058010455 |    3
+  57 | class_text | Iris-versicolor |   0.4443615 |    2
+  57 | class_text | Iris-setosa     | 0.055230834 |    3
+  57 | class_text | Iris-virginica  |   0.5004077 |    1
+  58 | class_text | Iris-virginica  |  0.35905617 |    2
+  58 | class_text | Iris-setosa     |  0.15329117 |    3
+  58 | class_text | Iris-versicolor |   0.4876526 |    1
+  69 | class_text | Iris-versicolor |   0.4485282 |    2
+  69 | class_text | Iris-virginica  |   0.4913048 |    1
+  69 | class_text | Iris-setosa     | 0.060167026 |    3
+  72 | class_text | Iris-virginica  |  0.38764492 |    2
+  72 | class_text | Iris-versicolor |   0.5052213 |    1
+  72 | class_text | Iris-setosa     |  0.10713379 |    3
+  74 | class_text | Iris-versicolor |  0.44894043 |    2
+  74 | class_text | Iris-setosa     |  0.06307102 |    3
+  74 | class_text | Iris-virginica  |  0.48798853 |    1
+  93 | class_text | Iris-virginica  |  0.40836224 |    2
+  93 | class_text | Iris-setosa     | 0.102442265 |    3
+  93 | class_text | Iris-versicolor |  0.48919544 |    1
+  94 | class_text | Iris-virginica  |  0.35238466 |    2
+  94 | class_text | Iris-versicolor |  0.49192256 |    1
+  94 | class_text | Iris-setosa     |  0.15569273 |    3
+ 110 | class_text | Iris-versicolor |   0.2917817 |    2
+ 110 | class_text | Iris-virginica  |   0.6972358 |    1
+ 110 | class_text | Iris-setosa     | 0.010982483 |    3
+ 113 | class_text | Iris-versicolor |  0.35037678 |    2
+ 113 | class_text | Iris-setosa     | 0.021288367 |    3
+ 113 | class_text | Iris-virginica  |   0.6283349 |    1
+ 117 | class_text | Iris-versicolor |  0.35009244 |    2
+ 117 | class_text | Iris-virginica  |   0.6264066 |    1
+ 117 | class_text | Iris-setosa     | 0.023500985 |    3
+ 123 | class_text | Iris-versicolor |   0.2849912 |    2
+ 123 | class_text | Iris-virginica  |  0.70571697 |    1
+ 123 | class_text | Iris-setosa     | 0.009291774 |    3
+ 127 | class_text | Iris-versicolor |   0.4041788 |    2
+ 127 | class_text | Iris-virginica  |  0.55537915 |    1
+ 127 | class_text | Iris-setosa     | 0.040441982 |    3
+ 130 | class_text | Iris-versicolor |  0.38396156 |    2
+ 130 | class_text | Iris-virginica  |  0.59018326 |    1
+ 130 | class_text | Iris-setosa     | 0.025855187 |    3
+ 143 | class_text | Iris-versicolor |  0.33123586 |    2
+ 143 | class_text | Iris-virginica  |   0.6445185 |    1
+ 143 | class_text | Iris-setosa     | 0.024245638 |    3
 (30 rows)
 </pre>
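The 'rank' column in the output above is simply the descending order of 'prob' within each id. Conceptually (a Python sketch, not how MADlib computes it internally):

```python
def rank_classes(probs):
    # Rank classes by probability: 1 = most probable.
    ordered = sorted(probs, key=probs.get, reverse=True)
    return {cls: rank for rank, cls in enumerate(ordered, start=1)}

# Probabilities for id 4 from the output above:
ranks = rank_classes({'Iris-setosa': 0.57626975,
                      'Iris-versicolor': 0.34548566,
                      'Iris-virginica': 0.07824467})
print(ranks)  # {'Iris-setosa': 1, 'Iris-versicolor': 2, 'Iris-virginica': 3}
```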
 
@@ -1423,43 +1597,46 @@
                                'Sophie L.',           -- name
                                'Simple MLP for iris dataset'  -- description
                               );
+\\x on
 SELECT * FROM iris_model_summary;
 </pre>
 <pre class="result">
 -[ RECORD 1 ]-------------+--------------------------------------------------------------------------------------------
 source_table              | iris_train_packed
 model                     | iris_model
-dependent_varname         | class_text
-independent_varname       | attributes
+dependent_varname         | {class_text}
+independent_varname       | {attributes}
 model_arch_table          | model_arch_library
 model_id                  | 1
 compile_params            |  loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']
 fit_params                |  batch_size=5, epochs=3
 num_iterations            | 5
 validation_table          | iris_test_packed
+object_table              |
 metrics_compute_frequency | 1
 name                      | Sophie L.
 description               | Simple MLP for iris dataset
 model_type                | madlib_keras
 model_size                | 0.7900390625
-start_training_time       | 2019-06-05 20:59:43.971792
-end_training_time         | 2019-06-05 20:59:51.654586
-metrics_elapsed_time      | {2.89326310157776,4.14273309707642,5.24781513214111,6.34498596191406,7.68279695510864}
-madlib_version            | 1.17.0
-num_classes               | 3
-class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
-dependent_vartype         | character varying
+start_training_time       | 2021-01-29 14:42:28.780276
+end_training_time         | 2021-01-29 14:42:31.177561
+metrics_elapsed_time      | {2.24628114700317,2.28473520278931,2.32178020477295,2.35844302177429,2.39726710319519}
+madlib_version            | 1.18.0-dev
+num_classes               | {3}
+dependent_vartype         | {"character varying"}
 normalizing_const         | 1
 metrics_type              | {accuracy}
-training_metrics_final    | 0.933333337307
-training_loss_final       | 0.334455043077
-training_metrics          | {0.933333337306976,0.933333337306976,0.975000023841858,0.975000023841858,0.933333337306976}
-training_loss             | {0.386842548847198,0.370587915182114,0.357161343097687,0.344598710536957,0.334455043077469}
-validation_metrics_final  | 0.866666674614
-validation_loss_final     | 0.34414178133
-validation_metrics        | {0.866666674613953,0.866666674613953,0.933333337306976,0.866666674613953,0.866666674613953}
-validation_loss           | {0.391442179679871,0.376414686441422,0.362262904644012,0.351912915706635,0.344141781330109}
+loss_type                 | categorical_crossentropy
+training_metrics_final    | 0.916666686534882
+training_loss_final       | 0.456518471240997
+training_metrics          | {0.883333325386047,0.891666650772095,0.908333361148834,0.916666686534882,0.916666686534882}
+training_loss             | {0.559914350509644,0.537041485309601,0.513083755970001,0.47985765337944,0.456518471240997}
+validation_metrics_final  | 0.966666638851166
+validation_loss_final     | 0.432968735694885
+validation_metrics        | {0.899999976158142,0.899999976158142,0.933333337306976,0.966666638851166,0.966666638851166}
+validation_loss           | {0.558336615562439,0.529355347156525,0.496939331293106,0.462678134441376,0.432968735694885}
 metrics_iters             | {1,2,3,4,5}
+class_text_class_values   | {Iris-setosa,Iris-versicolor,Iris-virginica}
 </pre>
 Note that the loss and accuracy values pick up from where the previous run left off.
 
@@ -1535,43 +1712,46 @@
                                 $$ batch_size=5, epochs=3 $$,  -- fit_params
                                 10                    -- num_iterations
                               );
+\\x on
 SELECT * FROM iris_model_summary;
 </pre>
 <pre class="result">
 -[ RECORD 1 ]-------------+--------------------------------------------------------------------------
 source_table              | iris_train_packed
 model                     | iris_model
-dependent_varname         | class_text
-independent_varname       | attributes
+dependent_varname         | {class_text}
+independent_varname       | {attributes}
 model_arch_table          | model_arch_library
 model_id                  | 2
 compile_params            |  loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']
 fit_params                |  batch_size=5, epochs=3
 num_iterations            | 10
 validation_table          |
+object_table              |
 metrics_compute_frequency | 10
 name                      |
 description               |
 model_type                | madlib_keras
 model_size                | 0.7900390625
-start_training_time       | 2019-06-05 21:01:03.998422
-end_training_time         | 2019-06-05 21:01:13.525838
-metrics_elapsed_time      | {9.52741599082947}
-madlib_version            | 1.17.0
-num_classes               | 3
-class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
-dependent_vartype         | character varying
+start_training_time       | 2021-01-29 14:44:51.176983
+end_training_time         | 2021-01-29 14:44:53.666457
+metrics_elapsed_time      | {2.48945999145508}
+madlib_version            | 1.18.0-dev
+num_classes               | {3}
+dependent_vartype         | {"character varying"}
 normalizing_const         | 1
 metrics_type              | {accuracy}
-training_metrics_final    | 0.975000023842
-training_loss_final       | 0.245171800256
-training_metrics          | {0.975000023841858}
-training_loss             | {0.245171800255775}
+loss_type                 | categorical_crossentropy
+training_metrics_final    | 0.949999988079071
+training_loss_final       | 0.340020209550858
+training_metrics          | {0.949999988079071}
+training_loss             | {0.340020209550858}
 validation_metrics_final  |
 validation_loss_final     |
 validation_metrics        |
 validation_loss           |
 metrics_iters             | {10}
+class_text_class_values   | {Iris-setosa,Iris-versicolor,Iris-virginica}
 </pre>
 
 @anchor notes
@@ -1606,7 +1786,7 @@
 
 Alternatively, to train multiple models at the same time for model
 architecture search or hyperparameter tuning, you can
-use <a href="group__grp__keras__run__model__selection.html">Model Selection</a>,
+use the methods in <a href="group__grp__model__selection.html">Train Multiple Models</a>,
-which does not do model averaging and hence may have better covergence efficiency.
+which do not do model averaging and hence may have better convergence efficiency.
 
 @anchor literature
diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras_automl.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras_automl.sql_in
index 8a41328..05bf8c9 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras_automl.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras_automl.sql_in
@@ -20,7 +20,7 @@
  *
  * @file madlib_keras_automl.sql_in
  *
- * @brief SQL functions for training with autoML methods
+ * @brief Functions to run automated machine learning (autoML) methods for model architecture search and hyperparameter tuning.
  * @date August 2020
  *
  *
@@ -36,9 +36,6 @@
 @brief Functions to run automated machine learning (autoML) methods for
 model architecture search and hyperparameter tuning.
 
-\warning <em> This MADlib method is still in early stage development.
-Interface and implementation are subject to change. </em>
-
 <div class="toc"><b>Contents</b><ul>
 <li class="level1"><a href="#madlib_keras_automl">AutoML Function</a></li>
 <li class="level1"><a href="#hyperband_schedule">Print Hyperband Schedule</a></li>
@@ -49,12 +46,14 @@
 </ul></div>
 
 This module contains automated machine learning (autoML) methods for
-model architecture search and hyperparameter tuning.  The goal of autoML when
+model architecture search and hyperparameter tuning.  
+
+The goal of autoML when
 training deep nets is to reduce the amount of hand-tuning by data scientists
 to produce a model of acceptable accuracy, compared to manual
 methods like grid or random search.  The two autoML methods implemented
 here are Hyperband and Hyperopt.  If you want to use grid or random search,
-please refer to <a href="group__grp__keras__setup__model__selection.html">Generate
+please refer to <a href="group__grp__keras__setup__model__selection.html">Define
 Model Configurations</a>.
 
 Hyperband is an effective model selection algorithm that utilizes the idea
@@ -72,6 +71,8 @@
 estimated parameters.  Within Hyperopt we support random search and Tree
 of Parzen Estimators (TPE) approach.
 
+@note AutoML methods do not currently support multi-input or multi-output neural networks.
+
 @anchor madlib_keras_automl
 @par AutoML
 
@@ -118,7 +119,7 @@
   <dt>model_arch_table</dt>
   <dd>VARCHAR. Table containing model architectures and weights.
   For more information on this table
-  refer to <a href="group__grp__keras__model__arch.html">Load Models</a>.
+  refer to <a href="group__grp__keras__model__arch.html">Define Model Architectures</a>.
   </dd>
 
   <dt>model_selection_table</dt>
@@ -160,10 +161,20 @@
   than regular log-based sampling. However, 'log_near_one' is only supported
   for Hyperband, not for Hyperopt.
 
-  For custom loss functions, metrics or top k categorical accuracy,
-  list the custom function name and provide the name of the
-  table where the serialized Python objects reside using the
-  parameter 'object_table' below.
+  @note
+  - Custom loss functions and custom metrics can be used as described in
+  <a href="group__grp__custom__function.html">Define Custom Functions</a>.
+  List the custom function name and provide the name of the table where the
+  serialized Python objects reside using the parameter 'object_table' below.
+  - The loss function <em>sparse_categorical_crossentropy</em> is not supported.
+  The metrics <em>sparse_categorical_accuracy</em> and
+  <em>sparse_top_k_categorical_accuracy</em> are not supported.
+  - The Keras accuracy parameter <em>top_k_categorical_accuracy</em> returns top-5
+  accuracy by default.  If you want a different value of k, use the helper function
+  <a href="group__grp__custom__function.html#top_k_function">Top k Accuracy Function</a>
+  to create a custom Python function that computes the top-k accuracy you want.
   </dd>
 
   <dt>fit_params_grid</dt>
@@ -180,6 +191,11 @@
     }
   $$
   </pre>
+
+  @note
+  Callbacks are not currently supported, except for TensorBoard,
+  which you can specify in the usual way,
+  e.g., 'callbacks': ['[TensorBoard(log_dir="/tmp/logs/fit")]']
   </dd>
 
   <dt>automl_method (optional)</dt>
@@ -471,7 +487,13 @@
         If 'num_iterations=5'
         and 'metrics_compute_frequency=1', then 'metrics_iters' value
         would be {1,2,3,4,5} indicating that metrics were computed
-        at every iteration.</td>
+        at every iteration.
+
+        Note that 'metrics_iters' values refer to the overall iteration count.
+        For some models, metrics collection may start at a later iteration,
+        depending on the schedule. This representation makes it straightforward
+        to plot metrics against iterations.
+        </td>
     </tr>
     <tr>
         <th>s</th>
@@ -571,8 +593,9 @@
-        <td>Count of distinct classes values used.</td>
+        <td>Count of distinct class values used.</td>
     </tr>
     <tr>
-        <th>class_values</th>
-        <td>Array of actual class values used.</td>
+        <th><dependent_varname>_class_values</th>
+        <td>Array of actual class values used for a particular dependent
+        variable. A column will be generated for each dependent variable.</td>
     </tr>
     <tr>
         <th>dependent_vartype</th>
@@ -900,9 +923,9 @@
 -# Define and load model architecture.  Use Keras to define
 the model architecture with 1 hidden layer:
 <pre class="example">
-import keras
-from keras.models import Sequential
-from keras.layers import Dense
+from tensorflow import keras
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense
 model1 = Sequential()
 model1.add(Dense(10, activation='relu', input_shape=(4,)))
 model1.add(Dense(10, activation='relu'))
@@ -991,14 +1014,14 @@
--# Print Hyperband schedule for example input parameters 'R=9' and 'eta=3':
+-# Print Hyperband schedule for example input parameters 'R=81' and 'eta=3':
 <pre class="example">
 DROP TABLE IF EXISTS hb_schedule;
-SELECT madlib.hyperband_schedule ('hb_schedule', 
+SELECT madlib.hyperband_schedule ('hb_schedule',
                                    81,
                                    3,
                                    0);
 SELECT * FROM hb_schedule ORDER BY s DESC, i;
 </pre>
 <pre class="result">
- s | i | n_i | r_i 
+ s | i | n_i | r_i
 ---+---+-----+-----
  4 | 0 |  81 |   1
  4 | 1 |  27 |   3
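The rows above follow the standard Hyperband bracket arithmetic. A minimal Python sketch (assuming the usual formulation from the Hyperband paper; MADlib's exact n_i values for the smaller brackets may differ by rounding):

```python
def hyperband_schedule(R, eta):
    # s_max = floor(log_eta(R)), computed with integer arithmetic
    s_max, power = 0, eta
    while power <= R:
        s_max += 1
        power *= eta
    rows = []
    for s in range(s_max, -1, -1):
        # initial number of configurations in bracket s (ceiling division)
        n = -(-((s_max + 1) * eta ** s) // (s + 1))
        for i in range(s + 1):
            n_i = n // eta ** i          # configs surviving round i
            r_i = R // eta ** (s - i)    # resources per config in round i
            rows.append((s, i, n_i, r_i))
    return rows

# Bracket s=4 for R=81, eta=3 matches the schedule above:
print([row for row in hyperband_schedule(81, 3) if row[0] == 4])
# [(4, 0, 81, 1), (4, 1, 27, 3), (4, 2, 9, 9), (4, 3, 3, 27), (4, 4, 1, 81)]
```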
@@ -1026,8 +1049,8 @@
                                   'automl_mst_table',                 -- model selection output table
                                   ARRAY[1,2],                         -- model IDs
                                   $${
-                                      'loss': ['categorical_crossentropy'], 
-                                      'optimizer_params_list': [ 
+                                      'loss': ['categorical_crossentropy'],
+                                      'optimizer_params_list': [
                                           {'optimizer': ['Adam'],'lr': [0.001, 0.1, 'log']},
                                           {'optimizer': ['RMSprop'],'lr': [0.001, 0.1, 'log']}
                                       ],
@@ -1060,12 +1083,12 @@
 model_selection_table     | automl_mst_table
 automl_method             | hyperband
 automl_params             | R=9, eta=3, skip_last=0
-random_state              | 
-object_table              | 
+random_state              |
+object_table              |
 use_gpus                  | f
 metrics_compute_frequency | 1
-name                      | 
-description               | 
+name                      |
+description               |
 start_training_time       | 2021-01-16 01:20:17
 end_training_time         | 2021-01-16 01:21:47
 madlib_version            | 1.18.0-dev
@@ -1155,8 +1178,8 @@
                                   'automl_mst_table',                 -- model selection output table
                                   ARRAY[1,2],                         -- model IDs
                                   $${
-                                      'loss': ['categorical_crossentropy'], 
-                                      'optimizer_params_list': [ 
+                                      'loss': ['categorical_crossentropy'],
+                                      'optimizer_params_list': [
                                           {'optimizer': ['Adam'],'lr': [0.001, 0.1, 'log']},
                                           {'optimizer': ['RMSprop'],'lr': [0.001, 0.1, 'log']}
                                       ],
@@ -1189,12 +1212,12 @@
 model_selection_table     | automl_mst_table
 automl_method             | hyperopt
 automl_params             | num_configs=20, num_iterations=10, algorithm=tpe
-random_state              | 
-object_table              | 
+random_state              |
+object_table              |
 use_gpus                  | f
 metrics_compute_frequency | 1
-name                      | 
-description               | 
+name                      |
+description               |
 start_training_time       | 2020-10-23 00:24:43
 end_training_time         | 2020-10-23 00:28:41
 madlib_version            | 1.18.0-dev
@@ -1282,7 +1305,7 @@
 SELECT * FROM iris_predict ORDER BY id;
 </pre>
 <pre class="result">
- id  |   class_text    |    prob    
+ id  |   class_text    |    prob
 -----+-----------------+------------
    5 | Iris-setosa     |  0.9998704
    7 | Iris-setosa     | 0.99953365
diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras_custom_function.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras_custom_function.sql_in
index 979332b..3046891 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras_custom_function.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras_custom_function.sql_in
@@ -20,7 +20,7 @@
  *
  * @file madlib_keras_custom_function.sql_in
  *
- * @brief Utility function to load serialized Python objects into a table
+ * @brief Function to load serialized Python objects into a table
  * @date May 2020
  *
  *
@@ -30,10 +30,7 @@
 /**
 @addtogroup grp_custom_function
 
-@brief Utility function to load serialized Python objects into a table.
-
-\warning <em> This MADlib method is still in early stage development.
-Interface and implementation are subject to change. </em>
+@brief Function to load serialized Python objects into a table.
 
 <div class="toc"><b>Contents</b><ul>
 <li class="level1"><a href="#load_function">Load Function</a></li>
@@ -44,8 +41,9 @@
 <li class="level1"><a href="#related">Related Topics</a></li>
 </ul></div>
 
-This utility function loads custom Python functions
+This function loads custom Python functions
 into a table for use by deep learning algorithms.
+
 Custom functions can be useful if, for example, you need loss functions
 or metrics that are not built into the standard libraries.
 The functions to be loaded must be in the form of serialized Python objects
@@ -55,7 +53,7 @@
 Custom functions are also used to return top k categorical accuracy rate
 in the case that you want a different k value than the default from Keras.
 This module includes a helper function to create the custom function
-automatically for a specified k. 
+automatically for a specified k.
 
 There is also a utility function to delete a function
 from the table.
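As a quick illustration of the serialize-then-load pattern described above, the sketch below round-trips a custom loss through a byte string. It uses the standard-library pickle module in place of dill, and plain Python arithmetic in place of the Keras backend, so treat it as a shape-of-the-workflow sketch rather than MADlib's implementation:

```python
import pickle

# A custom loss written in the Keras style: it takes (y_true, y_pred)
# and returns the error measure.
def squared_error(y_true, y_pred):
    return (y_pred - y_true) ** 2

# Serialize the function object to a byte string, the form that the
# load function stores in the table (MADlib uses dill, whose
# dumps/loads API matches pickle's).
pb_squared_error = pickle.dumps(squared_error)

# Later, the byte string can be deserialized and called like the original.
restored = pickle.loads(pb_squared_error)
```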
@@ -229,12 +227,12 @@
 import dill
 \# custom loss
 def squared_error(y_true, y_pred):
-    import keras.backend as K
+    import tensorflow.keras.backend as K
     return K.square(y_pred - y_true)
 pb_squared_error=dill.dumps(squared_error)
 \# custom metric
 def rmse(y_true, y_pred):
-    import keras.backend as K
+    import tensorflow.keras.backend as K
     return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))
 pb_rmse=dill.dumps(rmse)
 \# call load function
@@ -260,7 +258,7 @@
 $$
 import dill
 def squared_error(y_true, y_pred):
-    import keras.backend as K
+    import tensorflow.keras.backend as K
     return K.square(y_pred - y_true)
 pb_squared_error=dill.dumps(squared_error)
 return pb_squared_error
@@ -270,7 +268,7 @@
 $$
 import dill
 def rmse(y_true, y_pred):
-    import keras.backend as K
+    import tensorflow.keras.backend as K
     return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))
 pb_rmse=dill.dumps(rmse)
 return pb_rmse
@@ -314,7 +312,7 @@
 SELECT id, name, description FROM custom_function_table ORDER BY id;
 </pre>
 <pre class="result">
- id |      name       |       description       
+ id |      name       |       description
 ----+-----------------+-------------------------
   1 | top_3_accuracy  | returns top_3_accuracy
   2 | top_10_accuracy | returns top_10_accuracy
diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in
index 6d10da4..67ee2c7 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in
@@ -20,7 +20,7 @@
  *
  * @file madlib_keras_model_selection.sql_in
  *
- * @brief SQL functions for model hopper distributed training
+ * @brief Explore network architectures and hyperparameters by training many models at a time.
  * @date August 2019
  *
  *
@@ -44,12 +44,11 @@
 <li class="level1"><a href="#related">Related Topics</a></li>
 </ul></div>
 
-\warning <em> This MADlib method is still in early stage development.
-Interface and implementation are subject to change. </em>
-
 This module allows you to explore network architectures and
hyperparameters by training many models at a time across the
-database cluster.  The aim is to support efficient empirical comparison of multiple
+database cluster.
+
+The aim is to support efficient empirical comparison of multiple
 training configurations. This process is called model selection,
 and the implementation here is based on a parallel execution strategy
 called model hopper parallelism (MOP) [1,2].
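The core idea of model hopper parallelism is that models, not data, move between segments: training is split into sub-epochs, and after each sub-epoch every model "hops" to a segment it has not yet visited, so each model sees all the data without any data movement. A toy scheduling sketch of that rotation (illustrative only, with one model per segment; not MADlib's implementation):

```python
def mop_schedule(n_segments):
    """Toy round-robin MOP schedule: entry [t][s] is the model trained on
    segment s during sub-epoch t. Each model visits every segment exactly
    once per epoch, so every model sees all the data while the data itself
    never moves between segments."""
    return [[(s + t) % n_segments for s in range(n_segments)]
            for t in range(n_segments)]

schedule = mop_schedule(3)
# schedule == [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
```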
@@ -59,19 +58,15 @@
 on top of different backends and the one that is currently
 supported by MADlib is TensorFlow [4].
 
-The main use case is image classification
+The main use case is classification
 using sequential models, which are made up of a
 linear stack of layers.  This includes multilayer perceptrons (MLPs)
 and convolutional neural networks (CNNs).  Regression is not
 currently supported.
 
-Before doing model selection in MADlib you will need to run
-the mini-batch preprocessor, and create a table with the various models
-and hyperparameters to try.
-
-You can mini-batch the training and evaluation datasets by using the
-<a href="group__grp__input__preprocessor__dl.html">Preprocessor
-for Images</a> which is a utility that prepares image data for
+Before using this model selection method, you will need to run the
+<a href="group__grp__input__preprocessor__dl.html">Preprocess Data</a>
+step, which prepares data for
 use by models that support mini-batch as an optimization option.
 This is a one-time operation and you would only
 need to re-run the preprocessor if your input data has changed.
@@ -79,13 +74,10 @@
 can perform better than stochastic gradient descent
 because it uses more than one training example at a time,
typically resulting in faster and smoother convergence [5].
-The input preprocessor also sets the distribution rules
-for the training data.  For example, you may only want
-to train models on segments that reside on hosts that are GPU enabled.
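Conceptually, the preprocessor packs rows into fixed-size batches so that each gradient update averages over several training examples rather than one. A minimal sketch of the batching step (illustrative only; not the preprocessor's actual packed storage format):

```python
def minibatches(rows, batch_size):
    """Yield successive fixed-size chunks of rows; the final
    chunk may be smaller than batch_size."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

batches = list(minibatches(list(range(10)), batch_size=4))
# → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```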
 
 You can set up the models and hyperparameters to try with the
-<a href="group__grp__keras__setup__model__selection.html">Setup
-Model Selection</a> utility to define the unique combinations
+<a href="group__grp__keras__setup__model__selection.html">Define Model
+Configurations</a> function to define the unique combinations
 of model architectures, compile and fit parameters.
 
 @note 1. If 'madlib_keras_fit_multiple_model()' is running on GPDB 5 and some versions
@@ -106,6 +98,9 @@
 wait for gp_vmem_idle_resource_timeout before you run another GPU query (you can
 also set it to a lower value).
 
+@note 3. This method does not currently support multi-input or multi-output neural networks.
+
+
 @anchor keras_fit
 @par Fit
 The fit (training) function has the following format:
@@ -131,7 +126,7 @@
   <dt>source_table</dt>
   <dd>TEXT. Name of the table containing the training data.
   This is the name of the output
-  table from the image preprocessor.  Independent
+  table from the data preprocessor.  Independent
   and dependent variables are specified in the preprocessor
step which is why you do not need to explicitly state
   them here as part of the fit function.</dd>
@@ -182,7 +177,7 @@
   Note that the validation dataset must be preprocessed
   in the same way as the training dataset, so this
   is the name of the output
-  table from running the image preprocessor on the validation dataset.
+  table from running the data preprocessor on the validation dataset.
   Using a validation dataset can mean a
   longer training time, depending on its size.
   This can be controlled using the 'metrics_compute_frequency'
@@ -234,14 +229,14 @@
   </DD>
 
   <DT>use_caching (optional)</DT>
-  <DD>BOOLEAN, default: FALSE. Use caching of images in memory on the
+  <DD>BOOLEAN, default: FALSE. Use caching of data in memory on the
   segment in order to speed up processing.
 
   @note
-  When set to TRUE, image byte arrays on each segment are maintained
+  When set to TRUE, byte arrays on each segment are maintained
   in cache (SD). This can speed up training significantly, however the
   memory usage per segment increases.  In effect, it
-  requires enough available memory on a segment so that all images
+  requires enough available memory on a segment so that all training data
   residing on that segment can be read into memory.
 </dl>
 
@@ -305,13 +300,19 @@
         other metrics as a function of time.
         For example, if 'metrics_compute_frequency=5'
         this would be an array of elapsed time for every 5th
-        iteration, plus the last iteration.</td>
+        iteration, plus the last iteration.
+        Note that this field reports the time for training +
+        validation if there is a validation table provided.</td>
     </tr>
     <tr>
         <th>metrics_type</th>
         <td>Metric specified in the 'compile_params'.</td>
     </tr>
     <tr>
+        <th>loss_type</th>
+        <td>Loss specified in the 'compile_params'.</td>
+    </tr>
+    <tr>
         <th>training_metrics_final</th>
         <td>Final value of the training
         metric after all iterations have completed.
@@ -395,12 +396,12 @@
     <tr>
         <th>dependent_varname</th>
         <td>Dependent variable column from the original
-        source table in the image preprocessing step.</td>
+        source table in the data preprocessing step.</td>
     </tr>
     <tr>
         <th>independent_varname</th>
         <td>Independent variables column from the original
-        source table in the image preprocessing step.</td>
+        source table in the data preprocessing step.</td>
     </tr>
     <tr>
         <th>model_arch_table</th>
@@ -447,8 +448,9 @@
         <td>Count of distinct classes values used.</td>
     </tr>
     <tr>
-        <th>class_values</th>
-        <td>Array of actual class values used.</td>
+        <th>&lt;dependent_varname&gt;_class_values</th>
+        <td>Array of actual class values used for a particular dependent
+        variable. A column will be generated for each dependent variable.</td>
     </tr>
     <tr>
         <th>dependent_vartype</th>
@@ -457,7 +459,7 @@
     <tr>
         <th>normalizing_constant</th>
         <td>Normalizing constant used from the
-        image preprocessing step.</td>
+        data preprocessing step.</td>
     </tr>
     <tr>
         <th>metrics_iters</th>
@@ -484,7 +486,8 @@
     model_table,
     test_table,
     output_table,
-    use_gpus
+    use_gpus,
+    mst_key
     )
 </pre>
 
@@ -501,7 +504,7 @@
   Note that test/validation data must be preprocessed in the same
   way as the training dataset, so
   this is the name of the output
-  table from the image preprocessor.  Independent
+  table from the data preprocessor.  Independent
   and dependent variables are specified in the preprocessor
step which is why you do not need to explicitly state
   them here as part of the fit function.</dd>
@@ -512,7 +515,8 @@
     <table class="output">
       <tr>
         <th>loss</th>
-        <td>Loss value on evaluation dataset.</td>
+        <td>Loss value on evaluation dataset, where 'loss_type'
+        below identifies the type of loss.</td>
       </tr>
       <tr>
         <th>metric</th>
@@ -521,7 +525,15 @@
       </tr>
       <tr>
         <th>metrics_type</th>
-        <td>Type of metric used that was used in the training step.</td>
+        <td>Type of metric that was used in the training step.
+        (This means you cannot use a different metric in evaluate than
+        was used in training.)</td>
+      </tr>
+      <tr>
+        <th>loss_type</th>
+        <td>Type of loss function that was used in the training step.
+        (This means you cannot use a different loss in evaluate than
+        was used in training.)</td>
       </tr>
 
   <DT>use_gpus (optional)</DT>
@@ -540,6 +552,12 @@
     GPU on each host. The current recommended
     configuration is 1 GPU per segment.
   </DD>
+
+  <DT>mst_key (optional)</DT>
+  <DD>INTEGER, default: NULL. ID that identifies a unique combination of
+  model architecture, compile parameters, and fit parameters, as defined
+  in the model selection table.
+  </DD>
 </DL>
 
 @anchor keras_predict
@@ -553,7 +571,8 @@
     independent_varname,
     output_table,
     pred_type,
-    use_gpus
+    use_gpus,
+    mst_key
     )
 </pre>
 
@@ -591,31 +610,41 @@
     <table class="output">
       <tr>
         <th>id</th>
-        <td>Gives the 'id' for each prediction, corresponding to each row from the test_table.</td>
+        <td>Gives the 'id' for each prediction, corresponding to each row from
+        the test_table.</td>
       </tr>
       <tr>
-        <th>estimated_COL_NAME</th>
+        <th>class_name</th>
         <td>
-        (For pred_type='response') The estimated class
-         for classification, where
-         COL_NAME is the name of the column to be
-         predicted from test data.
+        The name of the dependent variable column being predicted.
         </td>
       </tr>
       <tr>
-        <th>prob_CLASS</th>
+        <th>class_value</th>
         <td>
-        (For pred_type='prob' for classification) The
-        probability of a given class.
-        There will be one column for each class
-        in the training data.
+        The estimated class for classification.
+        </td>
+      </tr>
+      <tr>
+        <th>prob</th>
+        <td>
+        Probability of a given class.
+        </td>
+      </tr>
+      <tr>
+        <th>rank</th>
+        <td>
+        The rank of a given class based on the ordering of probabilities.
         </td>
       </tr>
 
   <DT>pred_type (optional)</DT>
-  <DD>TEXT, default: 'response'. The type of output
-  desired, where 'response' gives the actual prediction
-  and 'prob' gives the probability value for each class.
+  <DD>TEXT, INTEGER, or DOUBLE PRECISION, default: 'prob'.
+  The type and range of output desired. This parameter accepts the following options:
+  - 'response': the actual (top-1) prediction
+  - 'prob': the probability value for each class
+  - 0<value<1: report only classes whose probability is at least this value (double precision)
+  - 1<=value: report only the top 'value' ranked predictions (integer)
   </DD>
 
   <DT>use_gpus (optional)</DT>
@@ -631,6 +660,12 @@
     Therefore, if you have GPUs only on some of the hosts, or an uneven numbers of GPUs per host, then
     set this parameter to FALSE to use CPUs.
   </DD>
+
+  <DT>mst_key (optional)</DT>
+  <DD>INTEGER, default: NULL. ID that identifies a unique combination of
+  model architecture, compile parameters, and fit parameters, as defined
+  in the model selection table.
+  </DD>
 </DL>
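To make the pred_type options above concrete, the sketch below mimics the filtering on one test row's class probabilities. It is plain Python, not the MADlib implementation, and the helper name `filter_predictions` is hypothetical:

```python
def filter_predictions(probs, pred_type='prob'):
    """probs: dict mapping class value -> probability for one test row.
    Returns (class_value, prob, rank) tuples filtered per pred_type.
    (Hypothetical helper illustrating pred_type semantics.)"""
    ranked = sorted(probs.items(), key=lambda kv: -kv[1])
    rows = [(cls, p, rank + 1) for rank, (cls, p) in enumerate(ranked)]
    if pred_type == 'response':            # top-1 prediction only
        return rows[:1]
    if pred_type == 'prob':                # every class with its probability
        return rows
    if isinstance(pred_type, int):         # integer n: top-n ranked predictions
        return [r for r in rows if r[2] <= pred_type]
    return [r for r in rows if r[1] >= pred_type]   # probability floor

row = {'Iris-setosa': 0.91, 'Iris-versicolor': 0.07, 'Iris-virginica': 0.02}
# filter_predictions(row, 'response') → [('Iris-setosa', 0.91, 1)]
```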
 
 @anchor example
@@ -879,9 +914,9 @@
 -# Define and load model architecture.  Use Keras to define
 the model architecture with 1 hidden layer:
 <pre class="example">
-import keras
-from keras.models import Sequential
-from keras.layers import Dense
+from tensorflow import keras
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense
 model1 = Sequential()
 model1.add(Dense(10, activation='relu', input_shape=(4,)))
 model1.add(Dense(10, activation='relu'))
@@ -1027,23 +1062,25 @@
 </pre>
 <pre class="result">
 source_table              | iris_train_packed
-validation_table          |
+validation_table          | 
 model                     | iris_multi_model
 model_info                | iris_multi_model_info
-dependent_varname         | class_text
-independent_varname       | attributes
+dependent_varname         | {class_text}
+independent_varname       | {attributes}
 model_arch_table          | model_arch_library
+model_selection_table     | mst_table
+object_table              | 
 num_iterations            | 10
 metrics_compute_frequency | 10
 warm_start                | f
-name                      |
-description               |
-start_training_time       | 2019-12-16 18:54:33.826414
-end_training_time         | 2019-12-16 18:56:19.106321
-madlib_version            | 1.17.0
-num_classes               | 3
-class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
-dependent_vartype         | character varying
+name                      | 
+description               | 
+start_training_time       | 2021-02-05 00:40:42.695613
+end_training_time         | 2021-02-05 00:42:20.796712
+madlib_version            | 1.18.0-dev
+num_classes               | {1}
+class_text_class_values   | {Iris-setosa,Iris-versicolor,Iris-virginica}
+dependent_vartype         | {"character varying"}
 normalizing_const         | 1
 metrics_iters             | {10}
 </pre>
@@ -1052,20 +1089,20 @@
 SELECT * FROM iris_multi_model_info ORDER BY training_metrics_final DESC, training_loss_final;
 </pre>
 <pre class="result">
- mst_key | model_id |                                 compile_params                                  |      fit_params       |  model_type  |  model_size  | metrics_elapsed_time | metrics_type | training_metrics_final | training_loss_final |  training_metrics   |    training_loss    | validation_metrics_final | validation_loss_final | validation_metrics | validation_loss
----------+----------+---------------------------------------------------------------------------------+-----------------------+--------------+--------------+----------------------+--------------+------------------------+---------------------+---------------------+---------------------+--------------------------+-----------------------+--------------------+-----------------
-       9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {119.42963886261}    | {accuracy}   |         0.983333349228 |       0.07286978513 | {0.983333349227905} | {0.072869785130024} |                          |                       |                    |
-      10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {118.485460996628}   | {accuracy}   |         0.975000023842 |     0.0798489004374 | {0.975000023841858} | {0.079848900437355} |                          |                       |                    |
-       4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {118.707404851913}   | {accuracy}   |         0.975000023842 |      0.143356323242 | {0.975000023841858} | {0.143356323242188} |                          |                       |                    |
-      11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {118.224883794785}   | {accuracy}   |         0.958333313465 |      0.636615753174 | {0.958333313465118} | {0.636615753173828} |                          |                       |                    |
-       2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {117.732690811157}   | {accuracy}   |         0.925000011921 |      0.161811202765 | {0.925000011920929} | {0.161811202764511} |                          |                       |                    |
-       5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {120.357484817505}   | {accuracy}   |         0.833333313465 |        0.5542948246 | {0.833333313465118} | {0.55429482460022}  |                          |                       |                    |
-       3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {118.928852796555}   | {accuracy}   |         0.824999988079 |      0.301002770662 | {0.824999988079071} | {0.301002770662308} |                          |                       |                    |
-       6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {120.566634893417}   | {accuracy}   |         0.816666662693 |      0.875298440456 | {0.816666662693024} | {0.87529844045639}  |                          |                       |                    |
-      12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {119.182703018188}   | {accuracy}   |         0.774999976158 |      0.785651266575 | {0.774999976158142} | {0.78565126657486}  |                          |                       |                    |
-       1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {119.643137931824}   | {accuracy}   |         0.508333325386 |      0.762569189072 | {0.508333325386047} | {0.762569189071655} |                          |                       |                    |
-       7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {120.15305685997}    | {accuracy}   |         0.333333343267 |       1.09794270992 | {0.333333343267441} | {1.09794270992279}  |                          |                       |                    |
-       8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {119.911739826202}   | {accuracy}   |         0.333333343267 |       1.10344016552 | {0.333333343267441} | {1.10344016551971}  |                          |                       |                    |
+  mst_key | model_id |                                 compile_params                                  |      fit_params       |  model_type  | model_size | metrics_elapsed_time | metrics_type |        loss_type         | training_metrics_final | training_loss_final |  training_metrics   |    training_loss     | validation_metrics_final | validation_loss_final | validation_metrics | validation_loss 
+---------+----------+---------------------------------------------------------------------------------+-----------------------+--------------+------------+----------------------+--------------+--------------------------+------------------------+---------------------+---------------------+----------------------+--------------------------+-----------------------+--------------------+-----------------
+       2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 0.75390625 | {96.2744579315186}   | {accuracy}   | categorical_crossentropy |      0.966666638851166 |  0.0771341994404793 | {0.966666638851166} | {0.0771341994404793} |                          |                       |                    | 
+       3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.75390625 | {95.0798950195312}   | {accuracy}   | categorical_crossentropy |      0.958333313465118 |    0.14112713932991 | {0.958333313465118} | {0.14112713932991}   |                          |                       |                    | 
+      10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.18359375 | {95.7743279933929}   | {accuracy}   | categorical_crossentropy |      0.949999988079071 |   0.126085489988327 | {0.949999988079071} | {0.126085489988327}  |                          |                       |                    | 
+       5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.75390625 | {97.8191220760345}   | {accuracy}   | categorical_crossentropy |      0.866666674613953 |   0.459462374448776 | {0.866666674613953} | {0.459462374448776}  |                          |                       |                    | 
+       4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.75390625 | {96.7445840835571}   | {accuracy}   | categorical_crossentropy |      0.858333349227905 |   0.279698997735977 | {0.858333349227905} | {0.279698997735977}  |                          |                       |                    | 
+       9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.18359375 | {95.3640351295471}   | {accuracy}   | categorical_crossentropy |      0.824999988079071 |   0.325970768928528 | {0.824999988079071} | {0.325970768928528}  |                          |                       |                    | 
+       7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.18359375 | {94.8350050449371}   | {accuracy}   | categorical_crossentropy |      0.800000011920929 |   0.458843886852264 | {0.800000011920929} | {0.458843886852264}  |                          |                       |                    | 
+      12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.18359375 | {96.0411529541016}   | {accuracy}   | categorical_crossentropy |      0.783333361148834 |   0.766786217689514 | {0.783333361148834} | {0.766786217689514}  |                          |                       |                    | 
+      11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.18359375 | {98.098680973053}    | {accuracy}   | categorical_crossentropy |      0.683333337306976 |   0.607033967971802 | {0.683333337306976} | {0.607033967971802}  |                          |                       |                    | 
+       6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.75390625 | {96.5071561336517}   | {accuracy}   | categorical_crossentropy |      0.683333337306976 |   0.704851150512695 | {0.683333337306976} | {0.704851150512695}  |                          |                       |                    | 
+       8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.18359375 | {97.4749901294708}   | {accuracy}   | categorical_crossentropy |      0.641666650772095 |   0.473412454128265 | {0.641666650772095} | {0.473412454128265}  |                          |                       |                    | 
+       1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 0.75390625 | {96.9749960899353}   | {accuracy}   | categorical_crossentropy |      0.358333319425583 |    1.09744954109192 | {0.358333319425583} | {1.09744954109192}   |                          |                       |                    | 
 (12 rows)
 </pre>
 
@@ -1081,9 +1118,9 @@
 SELECT * FROM iris_validate;
 </pre>
 <pre class="result">
-       loss        |      metric       | metrics_type
--------------------+-------------------+--------------
- 0.103803977370262 | 0.966666638851166 | {accuracy}
+        loss        | metric | metrics_type |        loss_type         
+--------------------+--------+--------------+--------------------------
+ 0.0789926201105118 |      1 | {accuracy}   | categorical_crossentropy
 </pre>
 
 -# Predict.  Now predict using one of the models we built. We will use the validation data set
@@ -1103,62 +1140,60 @@
 SELECT * FROM iris_predict ORDER BY id;
 </pre>
 <pre class="result">
- id  | estimated_class_text
------+----------------------
-   9 | Iris-setosa
-  18 | Iris-setosa
-  22 | Iris-setosa
-  26 | Iris-setosa
-  35 | Iris-setosa
-  38 | Iris-setosa
-  42 | Iris-setosa
-  43 | Iris-setosa
-  45 | Iris-setosa
-  46 | Iris-setosa
-  50 | Iris-setosa
-  53 | Iris-versicolor
-  60 | Iris-versicolor
-  68 | Iris-versicolor
-  77 | Iris-versicolor
-  78 | Iris-versicolor
-  79 | Iris-versicolor
-  81 | Iris-versicolor
-  82 | Iris-versicolor
-  85 | Iris-virginica
-  95 | Iris-versicolor
-  97 | Iris-versicolor
-  98 | Iris-versicolor
- 113 | Iris-virginica
- 117 | Iris-virginica
- 118 | Iris-virginica
- 127 | Iris-virginica
- 136 | Iris-virginica
- 143 | Iris-virginica
- 145 | Iris-virginica
+ id  | class_name |   class_value   |    prob    
+-----+------------+-----------------+------------
+   8 | class_text | Iris-setosa     |  0.9979984
+  15 | class_text | Iris-setosa     | 0.99929357
+  20 | class_text | Iris-setosa     |  0.9985701
+  21 | class_text | Iris-setosa     |  0.9984106
+  22 | class_text | Iris-setosa     |  0.9983991
+  27 | class_text | Iris-setosa     | 0.99763095
+  28 | class_text | Iris-setosa     |   0.998376
+  33 | class_text | Iris-setosa     | 0.99901116
+  37 | class_text | Iris-setosa     | 0.99871385
+  38 | class_text | Iris-setosa     | 0.99740535
+  46 | class_text | Iris-setosa     |  0.9968368
+  51 | class_text | Iris-versicolor | 0.93798196
+  57 | class_text | Iris-versicolor | 0.73391247
+  58 | class_text | Iris-versicolor |  0.9449931
+  59 | class_text | Iris-versicolor |  0.8938894
+  64 | class_text | Iris-versicolor |  0.6563309
+  65 | class_text | Iris-versicolor | 0.95828146
+  72 | class_text | Iris-versicolor |    0.94206
+  75 | class_text | Iris-versicolor | 0.93305194
+  87 | class_text | Iris-versicolor |  0.8458596
+  95 | class_text | Iris-versicolor | 0.76850986
+  97 | class_text | Iris-versicolor |  0.8467575
+  98 | class_text | Iris-versicolor |  0.9081358
+ 120 | class_text | Iris-virginica  | 0.86913925
+ 121 | class_text | Iris-virginica  | 0.96328884
+ 122 | class_text | Iris-virginica  |  0.9474733
+ 123 | class_text | Iris-virginica  | 0.97997576
+ 140 | class_text | Iris-virginica  |  0.8721462
+ 144 | class_text | Iris-virginica  |  0.9745266
+ 147 | class_text | Iris-virginica  |  0.8978669
 (30 rows)
 </pre>
Count misclassifications:
 <pre class="example">
 SELECT COUNT(*) FROM iris_predict JOIN iris_test USING (id)
-WHERE iris_predict.estimated_class_text != iris_test.class_text;
+WHERE iris_predict.class_value != iris_test.class_text;
 </pre>
 <pre class="result">
- count
+ count 
 -------+
-     1
+     0
 </pre>
-Percent missclassifications:
+Accuracy:
 <pre class="example">
 SELECT round(count(*)*100/(150*0.2),2) as test_accuracy_percent from
-    (select iris_test.class_text as actual, iris_predict.estimated_class_text as estimated
+    (select iris_test.class_text as actual, iris_predict.class_value as estimated
      from iris_predict inner join iris_test
      on iris_test.id=iris_predict.id) q
 WHERE q.actual=q.estimated;
 </pre>
 <pre class="result">
- test_accuracy_percent
+ test_accuracy_percent 
 -----------------------+
-                 96.67
+                100.00
 </pre>
 
 <h4>Classification with Other Parameters</h4>
@@ -1167,7 +1202,7 @@
 and compute metrics every 3rd iteration using
 the 'metrics_compute_frequency' parameter. This can
 help reduce run time if you do not need metrics
-computed at every iteration.  Also turn on image caching.
+computed at every iteration.  Also turn on caching.
 <pre class="example">
 DROP TABLE IF EXISTS iris_multi_model, iris_multi_model_summary, iris_multi_model_info;
 SELECT madlib.madlib_keras_fit_multiple_model('iris_train_packed',    -- source_table
@@ -1188,24 +1223,27 @@
 SELECT * FROM iris_multi_model_summary;
 </pre>
 <pre class="result">
+-[ RECORD 1 ]-------------+---------------------------------------------
 source_table              | iris_train_packed
 validation_table          | iris_test_packed
 model                     | iris_multi_model
 model_info                | iris_multi_model_info
-dependent_varname         | class_text
-independent_varname       | attributes
+dependent_varname         | {class_text}
+independent_varname       | {attributes}
 model_arch_table          | model_arch_library
+model_selection_table     | mst_table
+object_table              | 
 num_iterations            | 10
 metrics_compute_frequency | 3
 warm_start                | f
 name                      | Sophie L.
 description               | Model selection for iris dataset
-start_training_time       | 2019-12-16 19:28:16.219137
-end_training_time         | 2019-12-16 19:30:19.238692
-madlib_version            | 1.17.0
-num_classes               | 3
-class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
-dependent_vartype         | character varying
+start_training_time       | 2021-02-05 01:03:11.337798
+end_training_time         | 2021-02-05 01:05:14.988912
+madlib_version            | 1.18.0-dev
+num_classes               | {1}
+class_text_class_values   | {Iris-setosa,Iris-versicolor,Iris-virginica}
+dependent_vartype         | {"character varying"}
 normalizing_const         | 1
 metrics_iters             | {3,6,9,10}
 </pre>
@@ -1214,20 +1252,19 @@
 SELECT * FROM iris_multi_model_info ORDER BY training_metrics_final DESC, training_loss_final;
 </pre>
 <pre class="result">
- mst_key | model_id |                                 compile_params                                  |      fit_params       |  model_type  |  model_size  |                         metrics_elapsed_time                          | metrics_type | training_metrics_final | training_loss_final |                             training_metrics                              |                               training_loss                               | validation_metrics_final | validation_loss_final |                            validation_metrics                             |                              validation_loss
----------+----------+---------------------------------------------------------------------------------+-----------------------+--------------+--------------+-----------------------------------------------------------------------+--------------+------------------------+---------------------+---------------------------------------------------------------------------+---------------------------------------------------------------------------+--------------------------+-----------------------+---------------------------------------------------------------------------+---------------------------------------------------------------------------
-       4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {37.0420558452606,78.2046208381653,116.242669820786,134.287139892578} | {accuracy}   |         0.975000023842 |      0.165132269263 | {0.75,0.958333313465118,0.958333313465118,0.975000023841858}              | {0.618549585342407,0.319452553987503,0.223872095346451,0.165132269263268} |           0.966666638851 |        0.213689729571 | {0.733333349227905,0.933333337306976,0.933333337306976,0.966666638851166} | {0.683791160583496,0.370491921901703,0.255890935659409,0.213689729571342}
-       2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {36.3931469917297,77.5780539512634,115.430645942688,133.599857807159} | {accuracy}   |         0.966666638851 |      0.277698725462 | {0.591666638851166,0.966666638851166,0.666666686534882,0.966666638851166} | {0.634598553180695,0.334936827421188,0.615665555000305,0.27769872546196}  |           0.966666638851 |         0.34405490756 | {0.5,0.966666638851166,0.566666662693024,0.966666638851166}               | {0.643225967884064,0.41021603345871,0.805291295051575,0.344054907560349}
-      10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {36.8482949733734,78.0155048370361,115.83317399025,134.079672813416}  | {accuracy}   |         0.958333313465 |      0.122385449708 | {0.883333325386047,0.941666662693024,0.858333349227905,0.958333313465118} | {0.291894346475601,0.146935686469078,0.270052850246429,0.122385449707508} |           0.933333337307 |        0.181496843696 | {0.766666650772095,0.866666674613953,0.899999976158142,0.933333337306976} | {0.395013928413391,0.245234906673431,0.301119148731232,0.181496843695641}
-       3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {37.2318170070648,78.3925468921661,116.45490694046,134.491376876831}  | {accuracy}   |         0.941666662693 |      0.193545326591 | {0.966666638851166,0.941666662693024,0.941666662693024,0.941666662693024} | {0.39665362238884,0.213271111249924,0.190151125192642,0.193545326590538}  |           0.933333337307 |        0.151459023356 | {1,0.966666638851166,0.933333337306976,0.933333337306976}                 | {0.464315593242645,0.198051139712334,0.138570576906204,0.151459023356438}
-       9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {37.6678929328918,78.820240020752,116.939878940582,134.959810972214}  | {accuracy}   |         0.925000011921 |      0.192344605923 | {0.824999988079071,0.774999976158142,0.966666638851166,0.925000011920929} | {0.434513121843338,0.326292037963867,0.131333693861961,0.192344605922699} |           0.899999976158 |        0.209528595209 | {0.800000011920929,0.766666650772095,0.966666638851166,0.899999976158142} | {0.52033931016922,0.344535797834396,0.170280396938324,0.209528595209122}
-       8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {38.0689258575439,79.4995639324188,117.36315202713,135.380483865738}  | {accuracy}   |         0.866666674614 |      0.390509605408 | {0.691666662693024,0.691666662693024,0.633333325386047,0.866666674613953} | {0.490214675664902,0.444783747196198,0.627961099147797,0.390509605407715} |           0.933333337307 |        0.376114845276 | {0.566666662693024,0.566666662693024,0.533333361148834,0.933333337306976} | {0.575542628765106,0.54660427570343,0.785183191299438,0.376114845275879}
-       5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {38.474328994751,79.9709329605103,117.766183853149,135.803887844086}  | {accuracy}   |         0.841666638851 |      0.576696753502 | {0.616666674613953,0.699999988079071,0.758333325386047,0.841666638851166} | {0.90448260307312,0.750164151191711,0.616493880748749,0.576696753501892}  |           0.899999976158 |        0.631914675236 | {0.666666686534882,0.699999988079071,0.733333349227905,0.899999976158142} | {0.871200919151306,0.780709445476532,0.665971457958221,0.631914675235748}
-      11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {36.6214678287506,77.7987759113312,115.631717920303,133.83836388588}  | {accuracy}   |         0.758333325386 |      0.881635427475 | {0.308333337306976,0.316666662693024,0.75,0.758333325386047}              | {1.12997460365295,1.02749967575073,0.923768699169159,0.881635427474976}   |           0.766666650772 |        0.878168046474 | {0.433333337306976,0.433333337306976,0.766666650772095,0.766666650772095} | {1.07487094402313,0.974115014076233,0.916269063949585,0.878168046474457}
-       7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {38.2849600315094,79.7524738311768,117.580325841904,135.606695890427} | {accuracy}   |         0.691666662693 |      0.444524824619 | {0.908333361148834,0.391666680574417,0.691666662693024,0.691666662693024} | {0.335082054138184,2.02327847480774,0.444351017475128,0.444524824619293}  |           0.566666662693 |        0.539750337601 | {0.800000011920929,0.266666680574417,0.566666662693024,0.566666662693024} | {0.433189332485199,2.3276960849762,0.534160375595093,0.539750337600708}
-       6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {38.6593668460846,80.1789360046387,117.957875013351,135.995815992355} | {accuracy}   |         0.683333337307 |      0.841839790344 | {0.316666662693024,0.366666674613953,0.666666686534882,0.683333337306976} | {1.07646071910858,0.963329672813416,0.87216705083847,0.841839790344238}   |           0.666666686535 |        0.840192914009 | {0.433333337306976,0.533333361148834,0.666666686534882,0.666666686534882} | {1.02845978736877,0.941896677017212,0.861787617206573,0.840192914009094}
-       1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {37.8553328514099,79.2480089664459,117.139881849289,135.155915975571} | {accuracy}   |         0.358333319426 |       1.11013436317 | {0.358333319425583,0.333333343267441,0.333333343267441,0.358333319425583} | {1.10554325580597,1.11694586277008,1.09756696224213,1.11013436317444}     |           0.233333334327 |         1.17629003525 | {0.233333334326744,0.333333343267441,0.333333343267441,0.233333334326744} | {1.16081762313843,1.14324629306793,1.11625325679779,1.1762900352478}
-      12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {37.4500079154968,78.6058378219604,116.700626850128,134.72905087471}  | {accuracy}   |         0.308333337307 |       1.06241953373 | {0.150000005960464,0.333333343267441,0.333333343267441,0.308333337306976} | {1.13338851928711,1.09694564342499,1.07030868530273,1.06241953372955}     |           0.433333337307 |         1.03659796715 | {0.16666667163372,0.333333343267441,0.433333337306976,0.433333337306976}  | {1.06262242794037,1.07252764701843,1.05843663215637,1.03659796714783}
+ mst_key | model_id |                                 compile_params                                  |      fit_params       |  model_type  | model_size |                         metrics_elapsed_time                          | metrics_type |        loss_type         | training_metrics_final | training_loss_final |                             training_metrics                              |                               training_loss                               | validation_metrics_final | validation_loss_final |                            validation_metrics                             |                              validation_loss
+---------+----------+---------------------------------------------------------------------------------+-----------------------+--------------+------------+-----------------------------------------------------------------------+--------------+--------------------------+------------------------+---------------------+---------------------------------------------------------------------------+---------------------------------------------------------------------------+--------------------------+-----------------------+---------------------------------------------------------------------------+------------------------------------------------------------------------------
+      10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.18359375 | {34.347088098526,69.4049510955811,105.285098075867,121.389348983765}  | {accuracy}   | categorical_crossentropy |      0.975000023841858 |   0.166684880852699 | {0.949999988079071,0.949999988079071,0.908333361148834,0.975000023841858} | {0.615781545639038,0.269571483135223,0.242876216769218,0.166684880852699} |                        1 |     0.107727721333504 | {1,1,0.966666638851166,1}                                                 | {0.620104968547821,0.227639734745026,0.155238434672356,0.107727721333504}
+       3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.75390625 | {33.7874810695648,68.735356092453,104.679323911667,120.741965055466}  | {accuracy}   | categorical_crossentropy |      0.958333313465118 |   0.151768147945404 | {0.958333313465118,0.816666662693024,0.941666662693024,0.958333313465118} | {0.232924744486809,0.356248170137405,0.160927340388298,0.151768147945404} |                        1 |    0.0933234840631485 | {1,0.866666674613953,0.966666638851166,1}                                 | {0.197644680738449,0.244517579674721,0.0969053283333778,0.0933234840631485}
+       4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.75390625 | {35.5827050209045,70.4194140434265,106.230206012726,122.374103069305} | {accuracy}   | categorical_crossentropy |      0.941666662693024 |   0.146068558096886 | {0.824999988079071,0.958333313465118,0.941666662693024,0.941666662693024} | {0.339193284511566,0.171411842107773,0.154963329434395,0.146068558096886} |                        1 |    0.0618178397417068 | {0.933333337306976,1,1,1}                                                 | {0.302341401576996,0.0991373136639595,0.0709080845117569,0.0618178397417068}
+       5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.75390625 | {36.3988909721375,71.1546559333801,107.219161987305,123.366652965546} | {accuracy}   | categorical_crossentropy |      0.908333361148834 |   0.656857788562775 | {0.358333319425583,0.683333337306976,0.766666650772095,0.908333361148834} | {0.996901094913483,0.852809607982635,0.694450318813324,0.656857788562775} |        0.899999976158142 |     0.630161166191101 | {0.233333334326744,0.600000023841858,0.633333325386047,0.899999976158142} | {1.05581676959991,0.876067101955414,0.700714349746704,0.630161166191101}
+       2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 0.75390625 | {35.0734770298004,69.949450969696,105.773796081543,121.901873111725}  | {accuracy}   | categorical_crossentropy |      0.866666674613953 |    0.26227542757988 | {0.558333337306976,0.683333337306976,0.708333313465118,0.866666674613953} | {0.755041897296906,0.551798105239868,0.504159986972809,0.26227542757988}  |        0.899999976158142 |     0.156955525279045 | {0.633333325386047,0.600000023841858,0.633333325386047,0.899999976158142} | {0.663675665855408,0.674827337265015,0.613502621650696,0.156955525279045}
+       6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.75390625 | {35.3355600833893,70.1827409267426,106.000793933868,122.135368108749} | {accuracy}   | categorical_crossentropy |      0.858333349227905 |   0.403681367635727 | {0.708333313465118,0.850000023841858,0.783333361148834,0.858333349227905} | {0.609176814556122,0.488206028938293,0.425172030925751,0.403681367635727} |        0.899999976158142 |     0.393621385097504 | {0.600000023841858,0.833333313465118,0.800000011920929,0.899999976158142} | {0.624664425849915,0.48302897810936,0.428876429796219,0.393621385097504}
+       9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.18359375 | {34.0762810707092,69.0332140922546,105.02664899826,121.034147024155}  | {accuracy}   | categorical_crossentropy |      0.816666662693024 |    0.40071240067482 | {0.716666638851166,0.975000023841858,0.941666662693024,0.816666662693024} | {0.530809044837952,0.112755678594112,0.178483173251152,0.40071240067482}  |        0.899999976158142 |     0.160617485642433 | {0.666666686534882,1,1,0.899999976158142}                                 | {0.510058879852295,0.0435655005276203,0.0271952152252197,0.160617485642433}
+      11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.18359375 | {36.664577960968,71.4262120723724,107.501835107803,123.646811962128}  | {accuracy}   | categorical_crossentropy |      0.816666662693024 |   0.683901190757751 | {0.358333319425583,0.574999988079071,0.758333325386047,0.816666662693024} | {0.969602465629578,0.828204095363617,0.7138671875,0.683901190757751}      |        0.833333313465118 |     0.720155417919159 | {0.233333334326744,0.466666668653488,0.666666686534882,0.833333313465118} | {1.04192113876343,0.878720223903656,0.755707621574402,0.720155417919159}
+       1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 0.75390625 | {35.8240220546722,70.6528959274292,106.680663108826,122.609847068787} | {accuracy}   | categorical_crossentropy |      0.683333337306976 |   0.479730814695358 | {0.675000011920929,0.683333337306976,0.683333337306976,0.683333337306976} | {0.507641136646271,0.467351347208023,0.507468938827515,0.479730814695358} |        0.600000023841858 |     0.503782331943512 | {0.600000023841858,0.600000023841858,0.600000023841858,0.600000023841858} | {0.504352450370789,0.448328793048859,0.561446607112885,0.503782331943512}
+       7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.18359375 | {33.5537171363831,68.2605030536652,104.443753004074,120.498863935471} | {accuracy}   | categorical_crossentropy |      0.683333337306976 |   0.940275907516479 | {0.733333349227905,0.641666650772095,0.933333337306976,0.683333337306976} | {0.419289141893387,0.746466338634491,0.25556743144989,0.940275907516479}  |        0.600000023841858 |       1.1002448797226 | {0.633333325386047,0.766666650772095,0.966666638851166,0.600000023841858} | {0.457020550966263,0.510054171085358,0.249405279755592,1.1002448797226}
+      12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.18359375 | {34.8103830814362,69.698233127594,105.547440052032,121.659696102142}  | {accuracy}   | categorical_crossentropy |      0.633333325386047 |   0.711242735385895 | {0.458333343267441,0.5,0.683333337306976,0.633333325386047}               | {1.71241450309753,0.98362809419632,0.776128530502319,0.711242735385895}   |        0.566666662693024 |     0.721668004989624 | {0.433333337306976,0.5,0.600000023841858,0.566666662693024}               | {1.30801546573639,0.950868666172028,0.841940879821777,0.721668004989624}
+       8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.18359375 | {36.0954039096832,70.921669960022,106.977710008621,123.112148046494}  | {accuracy}   | categorical_crossentropy |      0.358333319425583 |    1.10370969772339 | {0.324999988079071,0.358333319425583,0.358333319425583,0.358333319425583} | {1.10047388076782,1.09791541099548,1.09736382961273,1.10370969772339}     |        0.233333334326744 |      1.11518168449402 | {0.366666674613953,0.233333334326744,0.233333334326744,0.233333334326744} | {1.09480428695679,1.1150735616684,1.10827457904816,1.11518168449402}
 (12 rows)
 </pre>
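+
+To pick the best model from the run, order the info table by the
+validation metric (a sketch using the columns shown above):
+<pre class="example">
+SELECT mst_key, model_id, compile_params, fit_params, validation_metrics_final
+FROM iris_multi_model_info
+ORDER BY validation_metrics_final DESC, validation_loss_final
+LIMIT 1;
+</pre>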
 
@@ -1243,42 +1280,101 @@
                                     FALSE,             -- use gpus
                                     3                  -- mst_key to use
                                    );
-SELECT * FROM iris_predict ORDER BY id;
+SELECT * FROM iris_predict ORDER BY id, rank;
 </pre>
 <pre class="result">
- id  | prob_Iris-setosa | prob_Iris-versicolor | prob_Iris-virginica
------+------------------+----------------------+---------------------
-   9 |       0.99931216 |        0.00068789057 |       6.2587335e-10
-  18 |       0.99984336 |        0.00015656587 |        7.969957e-12
-  22 |        0.9998497 |        0.00015029701 |       6.4133347e-12
-  26 |        0.9995004 |        0.00049964694 |       2.2795305e-10
-  35 |       0.99964666 |        0.00035332117 |       9.4490485e-11
-  38 |       0.99964666 |        0.00035332117 |       9.4490485e-11
-  42 |        0.9985154 |         0.0014845316 |        5.293262e-09
-  43 |       0.99964476 |         0.0003552362 |        9.701174e-11
-  45 |        0.9997311 |        0.00026883607 |        3.076166e-11
-  46 |        0.9995486 |        0.00045140853 |       1.6814435e-10
-  50 |        0.9997856 |        0.00021441824 |       2.1316622e-11
-  53 |     9.837335e-06 |           0.97109175 |         0.028898403
-  60 |    0.00014028326 |           0.96552837 |         0.034331344
-  68 |    0.00087942625 |            0.9883348 |         0.010785843
-  77 |      6.08114e-06 |           0.94356424 |         0.056429718
-  78 |     7.116364e-07 |            0.8596206 |          0.14037873
-  79 |    1.3918722e-05 |           0.94052655 |          0.05945957
-  81 |    0.00045687397 |            0.9794796 |         0.020063542
-  82 |     0.0015463434 |           0.98768973 |         0.010763981
-  85 |    1.0929693e-05 |           0.87866926 |         0.121319845
-  95 |    6.3600986e-05 |           0.95264935 |         0.047287125
-  97 |    0.00020298029 |             0.981617 |         0.018180028
-  98 |    0.00019721613 |           0.98902065 |          0.01078211
- 113 |    1.0388683e-09 |           0.23626474 |           0.7637353
- 117 |     4.598902e-09 |           0.25669694 |           0.7433031
- 118 |    3.7139156e-11 |           0.13193987 |           0.8680601
- 127 |    2.1297862e-07 |             0.670349 |          0.32965073
- 136 |    7.1760774e-12 |           0.07074605 |            0.929254
- 143 |    1.2568385e-09 |          0.113820426 |           0.8861796
- 145 |      6.17019e-11 |          0.117578305 |          0.88242173
-(30 rows)
+ id  | class_name |   class_value   |     prob      | rank
+-----+------------+-----------------+---------------+------
+   8 | class_text | Iris-setosa     |    0.99961567 |    1
+   8 | class_text | Iris-versicolor | 0.00038426166 |    2
+   8 | class_text | Iris-virginica  | 5.4132368e-08 |    3
+  15 | class_text | Iris-setosa     |     0.9999677 |    1
+  15 | class_text | Iris-versicolor |  3.235117e-05 |    2
+  15 | class_text | Iris-virginica  | 7.5747697e-10 |    3
+  20 | class_text | Iris-setosa     |    0.99956757 |    1
+  20 | class_text | Iris-versicolor |  0.0004323488 |    2
+  20 | class_text | Iris-virginica  | 6.8698306e-08 |    3
+  21 | class_text | Iris-setosa     |    0.99978334 |    1
+  21 | class_text | Iris-versicolor | 0.00021666527 |    2
+  21 | class_text | Iris-virginica  |  1.741537e-08 |    3
+  22 | class_text | Iris-setosa     |    0.99955314 |    1
+  22 | class_text | Iris-versicolor | 0.00044669537 |    2
+  22 | class_text | Iris-virginica  |  8.287442e-08 |    3
+  27 | class_text | Iris-setosa     |      0.999393 |    1
+  27 | class_text | Iris-versicolor | 0.00060693553 |    2
+  27 | class_text | Iris-virginica  | 1.4125514e-07 |    3
+  28 | class_text | Iris-setosa     |     0.9997589 |    1
+  28 | class_text | Iris-versicolor | 0.00024109415 |    2
+  28 | class_text | Iris-virginica  | 2.3418018e-08 |    3
+  33 | class_text | Iris-setosa     |    0.99966764 |    1
+  33 | class_text | Iris-versicolor | 0.00033237415 |    2
+  33 | class_text | Iris-virginica  | 3.2712443e-08 |    3
+  37 | class_text | Iris-setosa     |     0.9999347 |    1
+  37 | class_text | Iris-versicolor | 6.5290136e-05 |    2
+  37 | class_text | Iris-virginica  | 2.6893372e-09 |    3
+  38 | class_text | Iris-setosa     |    0.99963343 |    1
+  38 | class_text | Iris-versicolor |  0.0003665546 |    2
+  38 | class_text | Iris-virginica  | 4.7346873e-08 |    3
+  46 | class_text | Iris-setosa     |     0.9995722 |    1
+  46 | class_text | Iris-versicolor | 0.00042761036 |    2
+  46 | class_text | Iris-virginica  | 8.4179504e-08 |    3
+  51 | class_text | Iris-versicolor |     0.9678325 |    1
+  51 | class_text | Iris-virginica  |   0.031179406 |    2
+  51 | class_text | Iris-setosa     |  0.0009880556 |    3
+  57 | class_text | Iris-versicolor |     0.7971095 |    1
+  57 | class_text | Iris-virginica  |    0.20281655 |    2
+  57 | class_text | Iris-setosa     |  7.402597e-05 |    3
+  58 | class_text | Iris-versicolor |    0.93808955 |    1
+  58 | class_text | Iris-virginica  |   0.045088027 |    2
+  58 | class_text | Iris-setosa     |   0.016822474 |    3
+  59 | class_text | Iris-versicolor |    0.95377713 |    1
+  59 | class_text | Iris-virginica  |   0.045621604 |    2
+  59 | class_text | Iris-setosa     |  0.0006012956 |    3
+  64 | class_text | Iris-versicolor |     0.8078371 |    1
+  64 | class_text | Iris-virginica  |    0.19210756 |    2
+  64 | class_text | Iris-setosa     | 5.5370016e-05 |    3
+  65 | class_text | Iris-versicolor |      0.946594 |    1
+  65 | class_text | Iris-virginica  |    0.04081257 |    2
+  65 | class_text | Iris-setosa     |   0.012593448 |    3
+  72 | class_text | Iris-versicolor |     0.9616088 |    1
+  72 | class_text | Iris-virginica  |      0.033955 |    2
+  72 | class_text | Iris-setosa     |   0.004436169 |    3
+  75 | class_text | Iris-versicolor |    0.96245867 |    1
+  75 | class_text | Iris-virginica  |    0.03556654 |    2
+  75 | class_text | Iris-setosa     |  0.0019747794 |    3
+  87 | class_text | Iris-versicolor |     0.9264334 |    1
+  87 | class_text | Iris-virginica  |    0.07328824 |    2
+  87 | class_text | Iris-setosa     | 0.00027841402 |    3
+  95 | class_text | Iris-versicolor |    0.85156035 |    1
+  95 | class_text | Iris-virginica  |    0.14813933 |    2
+  95 | class_text | Iris-setosa     | 0.00030031145 |    3
+  97 | class_text | Iris-versicolor |    0.87470025 |    1
+  97 | class_text | Iris-virginica  |     0.1248041 |    2
+  97 | class_text | Iris-setosa     |  0.0004957556 |    3
+  98 | class_text | Iris-versicolor |     0.9439469 |    1
+  98 | class_text | Iris-virginica  |    0.05491364 |    2
+  98 | class_text | Iris-setosa     |   0.001139452 |    3
+ 120 | class_text | Iris-virginica  |    0.58107394 |    1
+ 120 | class_text | Iris-versicolor |    0.41892523 |    2
+ 120 | class_text | Iris-setosa     | 8.7891783e-07 |    3
+ 121 | class_text | Iris-virginica  |    0.90112364 |    1
+ 121 | class_text | Iris-versicolor |   0.098876335 |    2
+ 121 | class_text | Iris-setosa     | 5.8106266e-09 |    3
+ 122 | class_text | Iris-virginica  |     0.8664512 |    1
+ 122 | class_text | Iris-versicolor |    0.13354866 |    2
+ 122 | class_text | Iris-setosa     |  8.255242e-08 |    3
+ 123 | class_text | Iris-virginica  |    0.90162355 |    1
+ 123 | class_text | Iris-versicolor |    0.09837647 |    2
+ 123 | class_text | Iris-setosa     |  2.745874e-10 |    3
+ 140 | class_text | Iris-virginica  |     0.7306292 |    1
+ 140 | class_text | Iris-versicolor |     0.2693706 |    2
+ 140 | class_text | Iris-setosa     | 1.9480031e-07 |    3
+ 144 | class_text | Iris-virginica  |    0.92295665 |    1
+ 144 | class_text | Iris-versicolor |    0.07704339 |    2
+ 144 | class_text | Iris-setosa     | 1.6369142e-09 |    3
+ 147 | class_text | Iris-virginica  |    0.69545543 |    1
+ 147 | class_text | Iris-versicolor |    0.30454406 |    2
+ 147 | class_text | Iris-setosa     | 4.5714978e-07 |    3
+(90 rows)
 </pre>
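+
+Since predictions are now reported as one row per class with a 'rank'
+column, the top prediction for each test example can be selected by
+filtering on rank (a sketch based on the iris_predict output above):
+<pre class="example">
+SELECT id, class_value, prob
+FROM iris_predict
+WHERE rank = 1
+ORDER BY id;
+</pre>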
 
 -# Warm start.  Next, use the warm_start parameter
@@ -1301,24 +1397,27 @@
 SELECT * FROM iris_multi_model_summary;
 </pre>
 <pre class="result">
+-[ RECORD 1 ]-------------+---------------------------------------------
 source_table              | iris_train_packed
 validation_table          | iris_test_packed
 model                     | iris_multi_model
 model_info                | iris_multi_model_info
-dependent_varname         | class_text
-independent_varname       | attributes
+dependent_varname         | {class_text}
+independent_varname       | {attributes}
 model_arch_table          | model_arch_library
+model_selection_table     | mst_table
+object_table              | 
 num_iterations            | 3
 metrics_compute_frequency | 1
 warm_start                | t
 name                      | Sophie L.
 description               | Simple MLP for iris dataset
-start_training_time       | 2019-12-16 20:07:41.488587
-end_training_time         | 2019-12-16 20:08:27.20651
-madlib_version            | 1.17.0
-num_classes               | 3
-class_values              | {Iris-setosa,Iris-versicolor,Iris-virginica}
-dependent_vartype         | character varying
+start_training_time       | 2021-02-05 01:17:19.432839
+end_training_time         | 2021-02-05 01:18:09.062384
+madlib_version            | 1.18.0-dev
+num_classes               | {1}
+class_text_class_values   | {Iris-setosa,Iris-versicolor,Iris-virginica}
+dependent_vartype         | {"character varying"}
 normalizing_const         | 1
 metrics_iters             | {1,2,3}
 </pre>
@@ -1327,20 +1426,20 @@
 SELECT * FROM iris_multi_model_info ORDER BY training_metrics_final DESC, training_loss_final;
 </pre>
 <pre class="result">
- mst_key | model_id |                                 compile_params                                  |      fit_params       |  model_type  |  model_size  |                 metrics_elapsed_time                 | metrics_type | training_metrics_final | training_loss_final |                    training_metrics                     |                      training_loss                       | validation_metrics_final | validation_loss_final |                   validation_metrics                    |                     validation_loss
----------+----------+---------------------------------------------------------------------------------+-----------------------+--------------+--------------+------------------------------------------------------+--------------+------------------------+---------------------+---------------------------------------------------------+----------------------------------------------------------+--------------------------+-----------------------+---------------------------------------------------------+---------------------------------------------------------
-       5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {19.451014995575,37.2563629150391,54.7182998657227}  | {accuracy}   |         0.975000023842 |      0.490673750639 | {0.958333313465118,0.691666662693024,0.975000023841858} | {0.541427075862885,0.517450392246246,0.490673750638962}  |           0.933333337307 |        0.557333409786 | {0.933333337306976,0.666666686534882,0.933333337306976} | {0.60710871219635,0.570206344127655,0.557333409786224}
-       9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.7900390625 | {18.2973220348358,36.3793680667877,54.0178129673004} | {accuracy}   |         0.966666638851 |     0.0894369781017 | {0.966666638851166,0.966666638851166,0.966666638851166} | {0.133233144879341,0.111788973212242,0.0894369781017303} |           0.899999976158 |        0.195293620229 | {0.933333337306976,0.966666638851166,0.899999976158142} | {0.156044512987137,0.132803827524185,0.195293620228767}
-       4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {17.6080539226532,35.6788699626923,53.3836889266968} | {accuracy}   |         0.966666638851 |      0.147051945329 | {0.908333361148834,0.958333313465118,0.966666638851166} | {0.225205257534981,0.168186634778976,0.147051945328712}  |           0.866666674614 |        0.250319689512 | {0.899999976158142,0.933333337306976,0.866666674613953} | {0.23467344045639,0.182851999998093,0.250319689512253}
-       8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {18.7529940605164,36.8255958557129,54.3704080581665} | {accuracy}   |         0.966666638851 |      0.244641214609 | {0.691666662693024,0.891666650772095,0.966666638851166} | {0.939713299274445,0.462556451559067,0.244641214609146}  |           0.966666638851 |        0.298279434443 | {0.566666662693024,0.966666638851166,0.966666638851166} | {1.30671143531799,0.412235885858536,0.29827943444252}
-      10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {17.4004180431366,35.4556438922882,53.1877279281616} | {accuracy}   |         0.958333313465 |      0.123381219804 | {0.949999988079071,0.766666650772095,0.958333313465118} | {0.0919980704784393,0.576169073581696,0.123381219804287} |           0.933333337307 |        0.203262642026 | {0.866666674613953,0.766666650772095,0.933333337306976} | {0.199721112847328,0.959742486476898,0.203262642025948}
-       3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {17.81547498703,35.8978669643402,53.5737180709839}   | {accuracy}   |         0.933333337307 |      0.150664463639 | {0.941666662693024,0.925000011920929,0.933333337306976} | {0.117781177163124,0.163000836968422,0.150664463639259}  |           0.833333313465 |        0.365329563618 | {0.866666674613953,0.833333313465118,0.833333313465118} | {0.249404579401016,0.375173389911652,0.365329563617706}
-       6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {19.686233997345,37.4543249607086,54.8708770275116}  | {accuracy}   |         0.858333349228 |      0.743227303028 | {0.675000011920929,0.708333313465118,0.858333349227905} | {0.808507084846497,0.774080872535706,0.743227303028107}  |           0.966666638851 |        0.770158529282 | {0.666666686534882,0.666666686534882,0.966666638851166} | {0.808504283428192,0.791898012161255,0.770158529281616}
-      11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {17.1583528518677,35.0312390327454,52.96133685112}   | {accuracy}   |         0.816666662693 |      0.739802956581 | {0.774999976158142,0.816666662693024,0.816666662693024} | {0.83727890253067,0.792884111404419,0.739802956581116}   |           0.800000011921 |        0.758302807808 | {0.766666650772095,0.800000011920929,0.800000011920929} | {0.837629973888397,0.801746726036072,0.758302807807922}
-       2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.2197265625 | {16.9146749973297,34.794900894165,52.7328250408173}  | {accuracy}   |         0.808333337307 |      0.303489625454 | {0.683333337306976,0.966666638851166,0.808333337306976} | {1.05107569694519,0.189959138631821,0.303489625453949}   |           0.866666674614 |        0.285375326872 | {0.666666686534882,0.966666638851166,0.866666674613953} | {1.01942157745361,0.238933652639389,0.285375326871872}
-      12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.7900390625 | {18.0590150356293,36.1394078731537,53.7930529117584} | {accuracy}   |         0.699999988079 |       1.02253305912 | {0.550000011920929,0.691666662693024,0.699999988079071} | {1.0493084192276,1.03803598880768,1.02253305912018}      |           0.666666686535 |         1.02013540268 | {0.633333325386047,0.600000023841858,0.666666686534882} | {1.03952574729919,1.03439521789551,1.02013540267944}
-       7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {19.2141709327698,37.0566499233246,54.5629329681396} | {accuracy}   |         0.691666662693 |      0.448221176863 | {0.691666662693024,0.691666662693024,0.691666662693024} | {0.447027385234833,0.444605946540833,0.448221176862717}  |           0.566666662693 |        0.555035352707 | {0.566666662693024,0.566666662693024,0.566666662693024} | {0.551217257976532,0.540408432483673,0.555035352706909}
-       1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.2197265625 | {18.501914024353,36.5938439369202,54.194118976593}   | {accuracy}   |         0.358333319426 |       1.09730923176 | {0.333333343267441,0.333333343267441,0.358333319425583} | {1.09999334812164,1.10405397415161,1.09730923175812}     |           0.233333334327 |         1.12532019615 | {0.333333343267441,0.333333343267441,0.233333334326744} | {1.12446486949921,1.13782525062561,1.12532019615173}
+  mst_key | model_id |                                 compile_params                                  |      fit_params       |  model_type  | model_size |                 metrics_elapsed_time                 | metrics_type |        loss_type         | training_metrics_final | training_loss_final |                    training_metrics                     |                      training_loss                      | validation_metrics_final | validation_loss_final |                   validation_metrics                    |                      validation_loss                      
+---------+----------+---------------------------------------------------------------------------------+-----------------------+--------------+------------+------------------------------------------------------+--------------+--------------------------+------------------------+---------------------+---------------------------------------------------------+---------------------------------------------------------+--------------------------+-----------------------+---------------------------------------------------------+-----------------------------------------------------------
+      10 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.18359375 | {15.5192940235138,31.2597029209137,47.3274641036987} | {accuracy}   | categorical_crossentropy |      0.966666638851166 |   0.122704714536667 | {0.925000011920929,0.808333337306976,0.966666638851166} | {0.172832280397415,0.354962348937988,0.122704714536667} |                        1 |    0.0603121444582939 | {1,0.866666674613953,1}                                 | {0.100369863212109,0.210344776511192,0.0603121444582939}
+       4 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.75390625 | {16.485249042511,32.207494020462,48.3005399703979}   | {accuracy}   | categorical_crossentropy |      0.958333313465118 |   0.104984156787395 | {0.949999988079071,0.866666674613953,0.958333313465118} | {0.126872479915619,0.29683381319046,0.104984156787395}  |                        1 |    0.0291607566177845 | {1,0.933333337306976,1}                                 | {0.0433581434190273,0.127146035432816,0.0291607566177845}
+       3 |        1 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.75390625 | {14.8933939933777,30.7211890220642,46.7588970661163} | {accuracy}   | categorical_crossentropy |      0.958333313465118 |   0.107885278761387 | {0.941666662693024,0.966666638851166,0.958333313465118} | {0.155905768275261,0.129546254873276,0.107885278761387} |                        1 |    0.0387893654406071 | {1,1,1}                                                 | {0.087645135819912,0.0574964620172977,0.0387893654406071}
+       7 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 1.18359375 | {14.657429933548,30.4864449501038,46.5082159042358}  | {accuracy}   | categorical_crossentropy |      0.958333313465118 |   0.215500101447105 | {0.891666650772095,0.649999976158142,0.958333313465118} | {0.2352494597435,0.490339070558548,0.215500101447105}   |                        1 |     0.194652214646339 | {0.899999976158142,0.766666650772095,1}                 | {0.235657200217247,0.335934072732925,0.194652214646339}
+       6 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 0.75390625 | {16.2535049915314,31.9760749340057,48.0661840438843} | {accuracy}   | categorical_crossentropy |      0.941666662693024 |   0.354692339897156 | {0.975000023841858,0.908333361148834,0.941666662693024} | {0.38957467675209,0.367432981729507,0.354692339897156}  |                        1 |     0.321579396724701 | {1,0.966666638851166,1}                                 | {0.348732680082321,0.345716863870621,0.321579396724701}
+       2 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 0.75390625 | {16.0124208927155,31.7479128837585,47.8348250389099} | {accuracy}   | categorical_crossentropy |      0.933333337306976 |   0.238343968987465 | {0.975000023841858,0.975000023841858,0.933333337306976} | {0.17331151664257,0.0775784701108932,0.238343968987465} |        0.966666638851166 |     0.182112962007523 | {0.966666638851166,1,0.966666638851166}                 | {0.114220358431339,0.0159573350101709,0.182112962007523}
+       9 |        2 | loss='categorical_crossentropy', optimizer='Adam(lr=0.01)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.18359375 | {15.1569800376892,30.9868409633636,47.0489699840546} | {accuracy}   | categorical_crossentropy |      0.908333361148834 |   0.254226505756378 | {0.858333349227905,0.833333313465118,0.908333361148834} | {0.450037389993668,0.500195503234863,0.254226505756378} |        0.966666638851166 |    0.0769383609294891 | {0.899999976158142,0.899999976158142,0.966666638851166} | {0.163723081350327,0.211927503347397,0.0769383609294891}
+      11 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 1.18359375 | {17.7095139026642,33.4267060756683,49.6263670921326} | {accuracy}   | categorical_crossentropy |      0.899999976158142 |   0.602623522281647 | {0.824999988079071,0.899999976158142,0.899999976158142} | {0.65915185213089,0.633350729942322,0.602623522281647}  |        0.899999976158142 |     0.626452326774597 | {0.833333313465118,0.899999976158142,0.899999976158142} | {0.695301294326782,0.661069989204407,0.626452326774597}
+       5 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=4,epochs=1 | madlib_keras | 0.75390625 | {17.4354989528656,33.1478018760681,49.3430600166321} | {accuracy}   | categorical_crossentropy |      0.741666674613953 |   0.532543838024139 | {0.916666686534882,0.875,0.741666674613953}             | {0.610130310058594,0.571656167507172,0.532543838024139} |        0.633333325386047 |     0.540634453296661 | {0.966666638851166,0.899999976158142,0.633333325386047} | {0.601030588150024,0.567581355571747,0.540634453296661}
+      12 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.001)',metrics=['accuracy'] | batch_size=8,epochs=1 | madlib_keras | 1.18359375 | {15.7747659683228,31.5215599536896,47.6013031005859} | {accuracy}   | categorical_crossentropy |      0.675000011920929 |   0.581513583660126 | {0.683333337306976,0.683333337306976,0.675000011920929} | {0.660648465156555,0.623969316482544,0.581513583660126} |        0.600000023841858 |     0.594986796379089 | {0.600000023841858,0.600000023841858,0.600000023841858} | {0.694275736808777,0.665920257568359,0.594986796379089}
+       1 |        1 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=4,epochs=1 | madlib_keras | 0.75390625 | {16.8983609676361,32.4361009597778,48.5357549190521} | {accuracy}   | categorical_crossentropy |      0.608333349227905 |   0.687089145183563 | {0.583333313465118,0.699999988079071,0.608333349227905} | {0.816642761230469,0.455933183431625,0.687089145183563} |        0.533333361148834 |     0.692391216754913 | {0.533333361148834,0.600000023841858,0.533333361148834} | {0.85290253162384,0.465981602668762,0.692391216754913}
+       8 |        2 | loss='categorical_crossentropy',optimizer='Adam(lr=0.1)',metrics=['accuracy']   | batch_size=8,epochs=1 | madlib_keras | 1.18359375 | {17.1899569034576,32.8988509178162,49.0880739688873} | {accuracy}   | categorical_crossentropy |      0.358333319425583 |    1.11154329776764 | {0.358333319425583,0.324999988079071,0.358333319425583} | {1.09858739376068,1.09980893135071,1.11154329776764}    |        0.233333334326744 |      1.16109442710876 | {0.233333334326744,0.366666674613953,0.233333334326744} | {1.12281405925751,1.10376691818237,1.16109442710876}
 (12 rows)
 </pre>
 Note that the loss and accuracy values pick up from where the previous run left off.
@@ -1383,7 +1482,7 @@
 segment to segment and training takes place in parallel [1,2]. If you have fewer model
 selection tuples to train than segments, then some
 segments may not be busy 100% of the time so speedup will not necessarily increase
-on a larger cluster. Inference (predict) is an embarrassingly parallel operation so
+on a larger cluster. Inference (prediction) is an embarrassingly parallel operation so
 inference runtimes will be proportionally faster as the number of segments increases.
 
 @anchor literature
diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras_gpu_info.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras_gpu_info.sql_in
index 66cbcc2..a35e73b 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras_gpu_info.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras_gpu_info.sql_in
@@ -20,7 +20,7 @@
  *
  * @file madlib_keras_gpu_info.sql_in
  *
- * @brief SQL functions for GPU configuration.
+ * @brief Utility function to report number and type of GPUs in the database cluster.
  * @date Nov 2019
  *
  *
@@ -33,9 +33,6 @@
 
 @brief Utility function to report number and type of GPUs in the database cluster.
 
-\warning <em> This MADlib method is still in early stage development.
-Interface and implementation are subject to change. </em>
-
 <div class="toc"><b>Contents</b><ul>
 <li class="level1"><a href="#get_gpu_config">GPU Configuration</a></li>
 <li class="level1"><a href="#example">Examples</a></li>
@@ -45,12 +42,13 @@
 
 This utility function reports the number and type of GPUs
 attached to hosts on the database cluster.
+
 This can be useful when determining which segments to use for
 training deep neural nets.  For example, for economic reasons
 you may wish to set up a heterogeneous cluster with GPUs only
 on some of the hosts, not all of them. This utility
 can help you identify where the GPUs are and direct the compute
-to those locations only in subsequent training and evaluation steps.
+to those locations only for model training.
 
 @anchor get_gpu_config
 @par GPU Configuration
diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras_model_selection.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras_model_selection.sql_in
index 325f355..b72e111 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras_model_selection.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras_model_selection.sql_in
@@ -20,7 +20,7 @@
  *
  * @file madlib_keras_model_selection.sql_in
  *
- * @brief Generate configurations for model selection (hyperparams, architectures)
+ * @brief Generate configurations for model architecture search and hyperparameter tuning.
  * @date August 2020
  *
  *
@@ -30,26 +30,21 @@
 /**
 @addtogroup grp_keras_setup_model_selection
 
-@brief Utility function to generate configurations for model architecture search
-and hyperparameter tuning.
-
-\warning <em> This MADlib method is still in early stage development.
-Interface and implementation are subject to change. </em>
+@brief Generate configurations for model architecture search and hyperparameter tuning.
 
 <div class="toc"><b>Contents</b><ul>
 <li class="level1"><a href="#gen_mst_configs">Generate Model Configurations</a></li>
 <li class="level1"><a href="#load_mst_table">Load Model Selection Table [Deprecated]</a></li>
 <li class="level1"><a href="#example">Examples</a></li>
-<li class="level1"><a href="#notes">Notes</a></li>
 <li class="level1"><a href="#related">Related Topics</a></li>
 </ul></div>
 
-This module generates model configurations
-for training multiple models at the same time
-using <a href="group__grp__keras__run__model__selection.html">Run Model Selection</a>.
+This module generates model configurations using grid search or random search.
+
+Once the configurations are defined, they can be used by the fit function
+in <a href="group__grp__keras__run__model__selection.html">Train Model Configurations</a>.
 By model configurations we mean both hyperparameters and
-model architectures. Grid search or random search
-can be used to generate the configurations.
+model architectures.
 The output table from this module
 defines the combinations of model architectures,
 compile and fit parameters to be trained in parallel.
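The grid expansion of parameter lists amounts to a Cartesian product over the per-parameter value lists. A minimal Python sketch of that expansion (the helper name `grid_configs` is illustrative, not part of MADlib):

```python
from itertools import product

def grid_configs(params_grid):
    # Expand a dict mapping parameter names to candidate value lists
    # into one dict per combination, i.e. a grid search over all values.
    keys = list(params_grid)
    return [dict(zip(keys, vals))
            for vals in product(*(params_grid[k] for k in keys))]
```

For example, two learning rates crossed with two batch sizes yields four configurations, mirroring how compile and fit parameter lists multiply out into model selection tuples.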
@@ -77,7 +72,7 @@
   <dt>model_arch_table</dt>
   <dd>VARCHAR. Table containing model architectures and weights.
   For more information on this table
-  refer to the module <a href="group__grp__keras__model__arch.html">Load Models</a>.
+  refer to the module <a href="group__grp__keras__model__arch.html">Define Model Architectures</a>.
   </dd>
 
   <dt>model_selection_table</dt>
@@ -119,13 +114,22 @@
   which are very sensitive to changes near 1.  It has the effect of producing more values near 1
   than regular log-based sampling.
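To make the difference concrete, here is a hedged Python sketch of log-based sampling versus near-one sampling (assumed behavior shown for illustration only, not MADlib's exact implementation):

```python
import math
import random

def sample_log(lo, hi):
    # Regular log-based sampling: uniform in log space between lo and hi.
    return math.exp(random.uniform(math.log(lo), math.log(hi)))

def sample_log_near_one(lo, hi):
    # Sample the distance from 1 on a log scale, so draws cluster near 1.
    # Useful for parameters such as momentum that are sensitive near 1.
    return 1.0 - sample_log(1.0 - hi, 1.0 - lo)
```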
 
-  In the case of grid search, omit the sample type and just put the grid points in the list.
-  For custom loss functions, custom metrics, and custom top k categorical accuracy,
-  list the custom function name and provide the name of the
-  table where the serialized Python objects reside using the
-  parameter 'object_table' below. See the examples section later on this page.
-  For more information on custom functions, please
-  see <a href="group__grp__custom__function.html">Load Custom Functions</a>.
+  In the case of grid search, omit the sampling type and just put the grid points in the list.
+
+  @note
+  - Custom loss functions and custom metrics can be used as defined in
+  <a href="group__grp__custom__function.html">Define Custom Functions.</a>
+  List the custom function name and provide the name of the table where the 
+  serialized Python objects reside using the parameter 'object_table' below.
+  - The following loss function is
+  not supported: <em>sparse_categorical_crossentropy</em>.
+  The following metrics are not
+  supported: <em>sparse_categorical_accuracy, sparse_top_k_categorical_accuracy</em>.
+  - The Keras accuracy parameter <em>top_k_categorical_accuracy</em> returns top 5 accuracy by 
+  default.  If you want a different top k value, use the helper function
+  <a href="group__grp__custom__function.html#top_k_function">Top k Accuracy Function</a> 
+  to create a custom
+  Python function to compute the top k accuracy that you want.
   </dd>
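The top k accuracy computation itself is simple; a plain-Python sketch (the helper name is hypothetical, shown only to clarify what top_k_categorical_accuracy measures):

```python
def top_k_accuracy(y_true, y_pred, k=5):
    # Fraction of samples whose true class index appears among the
    # k highest-scoring predicted classes (Keras defaults to k=5).
    hits = 0
    for true_idx, scores in zip(y_true, y_pred):
        top_k = sorted(range(len(scores)),
                       key=scores.__getitem__, reverse=True)[:k]
        hits += true_idx in top_k
    return hits / len(y_true)
```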
 
   <dt>fit_params_grid</dt>
@@ -142,6 +146,10 @@
     }
   $$
   </pre>
+  @note
+  Callbacks are not currently supported except for TensorBoard,
+  which you can specify in the usual way,
+  e.g., 'callbacks': ['[TensorBoard(log_dir="/tmp/logs/fit")]']
   </dd>
 
   <dt>search_type</dt>
@@ -171,6 +179,10 @@
   <dd>VARCHAR, default: NULL. Name of the table containing
   Python objects in the case that custom loss functions or
   custom metrics are specified in the 'compile_params_grid'.
+  Note that this table must be created by
+  the <a href="group__grp__custom__function.html">Define Custom Functions</a> method.
+  Do not include a schema name in this parameter, since the schema is automatically
+  taken from the function's associated MADlib schema.
   </dd>
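Conceptually, the object table stores custom functions as serialized bytes that are later reconstructed on the database side. A rough stdlib sketch of that round trip (MADlib uses dill rather than pickle, and all names here are illustrative):

```python
import pickle

def my_custom_loss_placeholder(y_true, y_pred):
    # Stand-in for a custom function; real custom losses operate on tensors.
    return (y_true - y_pred) ** 2

def serialize_fn(fn):
    # Serialize the function so it can be stored in a bytea column.
    return pickle.dumps(fn)

def deserialize_fn(payload):
    # Reconstruct the callable from its serialized form.
    return pickle.loads(payload)
```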
 
 </dl>
@@ -243,7 +255,7 @@
   <dt>model_arch_table</dt>
   <dd>VARCHAR. Table containing model architectures and weights.
   For more information on this table
-  refer to <a href="group__grp__keras__model__arch.html">Load Model</a>.
+  refer to <a href="group__grp__keras__model__arch.html">Define Model Architectures</a>.
   </dd>
 
   <dt>model_selection_table</dt>
@@ -271,7 +283,9 @@
   <dt>fit_params_list</dt>
   <dd>VARCHAR[].  Array of fit parameters to be tested.  Each element
   of the array should consist of a string of fit parameters
-  exactly as it is to be passed to Keras.
+  exactly as it is to be passed to Keras.  Callbacks are not currently supported
+  except for TensorBoard, which you can specify in the usual way,
+  e.g., callbacks=[TensorBoard(log_dir="/tmp/logs/fit")]
   </dd>
 
   <dt>object_table (optional)</dt>
@@ -288,9 +302,9 @@
 so we first create a model architecture table with two different models.  Use Keras to define
 a model architecture with 1 hidden layer:
 <pre class="example">
-import keras
-from keras.models import Sequential
-from keras.layers import Dense
+from tensorflow import keras
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense
 model1 = Sequential()
 model1.add(Dense(10, activation='relu', input_shape=(4,)))
 model1.add(Dense(10, activation='relu'))
@@ -677,7 +691,7 @@
 Let's say we have a table 'custom_function_table' that contains a custom loss
 function called 'my_custom_loss' and a custom accuracy function
 called 'my_custom_accuracy' based
-on <a href="group__grp__custom__function.html">Load Custom Functions.</a>
+on <a href="group__grp__custom__function.html">Define Custom Functions.</a>
 Generate the model configurations with:
 <pre class="example">
 DROP TABLE IF EXISTS mst_table, mst_table_summary;
@@ -724,7 +738,7 @@
 (16 rows)
 </pre>
 Similarly, if you created a custom top k categorical accuracy function 'top_3_accuracy'
-in <a href="group__grp__custom__function.html">Load Custom Functions</a>
+in <a href="group__grp__custom__function.html">Define Custom Functions</a>
 you can generate the model configurations as:
 <pre class="example">
 DROP TABLE IF EXISTS mst_table, mst_table_summary;
@@ -820,11 +834,6 @@
 (1 row)
 </pre>
 
-@anchor notes
-@par Notes
-
-1. TBD
-
 @anchor related
 @par Related Topics
 
diff --git a/src/ports/postgres/modules/utilities/minibatch_preprocessing.sql_in b/src/ports/postgres/modules/utilities/minibatch_preprocessing.sql_in
index 58668a1..b00aee7 100644
--- a/src/ports/postgres/modules/utilities/minibatch_preprocessing.sql_in
+++ b/src/ports/postgres/modules/utilities/minibatch_preprocessing.sql_in
@@ -18,7 +18,7 @@
  * under the License.
  *
  * @file minibatch_preprocessing.sql_in
- * @brief TODO
+ * @brief Utility that prepares input data for use by models that support mini-batch as an optimization option.
  * @date Mar 2018
  *
  */
@@ -48,6 +48,9 @@
 uses more than one training
 example at a time, typically resulting in faster and smoother convergence [1].
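To make the idea concrete, here is a minimal mini-batch gradient descent sketch for a one-parameter linear model (illustrative only; this preprocessor only formats the data, and training itself happens in the downstream modules):

```python
import random

def minibatch_sgd(data, lr=0.1, batch_size=4, epochs=50):
    # Fit y = w * x by minimizing squared loss, updating w once per
    # mini-batch rather than once per example. Shuffles data in place.
    w = 0.0
    for _ in range(epochs):
        random.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad
    return w
```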
 
+@note This preprocessor should not be used for deep learning methods.  Please refer
+to the section on <a href="group__grp__dl.html">Deep Learning</a> for more information.
+
 @brief
 Utility that prepares input data for use by models that support
 mini-batch as an optimization option.