add clarification in DL user docs re GPU memory release
diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras.sql_in
index e4794a3..75fa56a 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras.sql_in
@@ -84,9 +84,20 @@
 called "Predict BYOM" below, where "BYOM" stands for "Bring Your Own Model."
 
 Note that the following MADlib functions are targeting a specific Keras
-version (2.2.4) with a specific Tensorflow kernel version (1.14).
+version (2.2.4) with a specific TensorFlow kernel version (1.14).
 Using a newer or older version may or may not work as intended.
 
+@note CUDA GPU memory cannot be released until the process holding it terminates.
+When a MADlib deep learning function is called with GPUs, Greenplum internally
+creates a process (called a slice) that calls TensorFlow to do the computation.
+This process holds the GPU memory until one of two things happens: the query
+finishes and the user logs out of the Postgres client/session, or the query
+finishes and the user waits for the timeout set by `gp_vmem_idle_resource_timeout`,
+whose default value is 18 sec [8]. The recommendation is therefore to either
+log out and reconnect to the session after every GPU query, or
+wait for `gp_vmem_idle_resource_timeout` to expire before running another GPU
+query (you can also set this timeout to a lower value).
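+For example, a session could lower this timeout so that idle resources
+(including GPU memory) are released sooner after a query. This is a sketch;
+note that on recent Greenplum versions this parameter is specified in
+milliseconds, so check the unit for your version [8]:
+<pre class="example">
+-- Release idle session resources (and GPU memory) after ~5 seconds of idle time.
+SET gp_vmem_idle_resource_timeout = 5000;
+</pre>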
+
 @anchor keras_fit
 @par Fit
 The fit (training) function has the following format:
@@ -1620,6 +1631,8 @@
 Yuhao Zhang, and Arun Kumar, Technical Report, Computer Science and Engineering, University of California,
 San Diego https://adalabucsd.github.io/papers/TR_2019_Cerebro.pdf.
 
+[8] Greenplum Database server configuration parameters https://gpdb.docs.pivotal.io/latest/ref_guide/config_params/guc-list.html
+
 @anchor related
 @par Related Topics
 
diff --git a/src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in b/src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in
index cd58d93..b929724 100644
--- a/src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in
+++ b/src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in
@@ -94,6 +94,17 @@
 This is not the case for GPDB 6+ where disk space is released during the
 fit multiple query.
 
+@note CUDA GPU memory cannot be released until the process holding it terminates.
+When a MADlib deep learning function is called with GPUs, Greenplum internally
+creates a process (called a slice) that calls TensorFlow to do the computation.
+This process holds the GPU memory until one of two things happens: the query
+finishes and the user logs out of the Postgres client/session, or the query
+finishes and the user waits for the timeout set by `gp_vmem_idle_resource_timeout`,
+whose default value is 18 sec [8]. The recommendation is therefore to either
+log out and reconnect to the session after every GPU query, or
+wait for `gp_vmem_idle_resource_timeout` to expire before running another GPU
+query (you can also set this timeout to a lower value).
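+For example, a session could lower this timeout so that idle resources
+(including GPU memory) are released sooner after a query. This is a sketch;
+note that on recent Greenplum versions this parameter is specified in
+milliseconds, so check the unit for your version [8]:
+<pre class="example">
+-- Release idle session resources (and GPU memory) after ~5 seconds of idle time.
+SET gp_vmem_idle_resource_timeout = 5000;
+</pre>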
+
 @anchor keras_fit
 @par Fit
 The fit (training) function has the following format:
@@ -1381,10 +1392,12 @@
 Geoffrey Hinton with Nitish Srivastava and Kevin Swersky,
 http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
 
-[6] Deep learning section of Apache MADlib wiki, https://cwiki.apache.org/confluence/display/MADLIB/Deep+Learning
+[6] Deep learning section of Apache MADlib wiki https://cwiki.apache.org/confluence/display/MADLIB/Deep+Learning
 
 [7] Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT Press, 2016.
 
+[8] Greenplum Database server configuration parameters https://gpdb.docs.pivotal.io/latest/ref_guide/config_params/guc-list.html
+
 @anchor related
 @par Related Topics