src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in - madlib - Git at Google

 /* ----------------------------------------------------------------------- *//**
  *
  * @file assoc_rules.sql_in
  *
  * @brief The \ref assoc_rules function computes association rules for a given
  * set of data. The data is assumed to have two dimensions; items (between which
  * we are trying to discover associations), and a transaction id. This tranaction
  * id groups the items by event and could also be a user id, date, etc. depending
  * on the context of the data. This function assumes the data is stored in two
  * columns with one transaction id and one item per row.

  * @date June 2011
  * @date August 2012
  *
  * @sa For a brief introduction to the association rules implementation, see the module
  * description \ref grp_assoc_rules.
  *
  *//* ----------------------------------------------------------------------- */

 m4_include(`SQLCommon.m4')


 /**
 @addtogroup grp_assoc_rules

 <div class="toc"><b>Contents</b>
 <ul>
 <li><a href="#rules">Rules</a></li>
 <li><a href="#algorithm">Apriori Algorithm</a></li>
 <li><a href="#syntax">Function Syntax</a></li>
 <li><a href="#examples">Examples</a></li>
 <li><a href="#notes">Notes</a></li>
 <li><a href="#literature">Literature</a></li>
 <li><a href="#related">Related Topics</a></li>
 </ul>
 </div>

 @brief Computes association rules for a given set of data.

 This module implements the association rules data mining technique on a
 transactional data set. Given the names of a table and the columns, minimum
 support and confidence values, this function generates all single and
 multidimensional association rules that meet the minimum thresholds.

 Association rule mining is a widely used technique for discovering relationships
 between variables in a large data set (e.g., items in a store that are commonly
 purchased together). The classic market basket analysis example using
 association rules is the "beer and diapers" rule. According to data mining urban
 legend, a study of customer purchase behavior in a supermarket found that men
 often purchased beer and diapers together. After making this discovery, the
 managers strategically placed beer and diapers closer together on the shelves
 and saw a dramatic increase in sales. In addition to market basket analysis,
 association rules are also used in bioinformatics, web analytics, and several
 other fields.

 This type of data mining algorithm uses transactional data. Every transaction
 event has a unique identifier, and each transaction consists of a set of
 items (or itemset). Purchases are considered binary (either it was purchased or
 not), and this implementation does not take into consideration the quantity of
 each item. For the MADlib association rules function, it is assumed that the
 data is stored in two columns with one item and transaction id per row.
 Transactions with multiple items will span multiple rows with one row per item.

 <pre>
     trans_id | product
     ---------+---------
            1 | 1
            1 | 2
            1 | 3
            1 | 4
            2 | 3
            2 | 4
            2 | 5
            3 | 1
            3 | 4
            3 | 6
     ...
 </pre>

 @anchor rules
 @par Rules

 Association rules take the form "If X, then Y", where X and Y are non-empty
 itemsets. X and Y are called the antecedent and consequent, or the left-hand-side
 and right-hand-side, of the rule respectively. Using our previous example,
 the association rule may state "If {diapers}, then {beer}" with .2 support and
 .85 confidence.

 The following metrics are defined for any given itemset "X".
 - Count: The number of transactions that contain X

 - Support: The ratio of transactions that contain X to all transactions, T
 \f[
 S (X) = \frac{Total X}{Total transactions}
 \f]

 Given any association rule "If X, then Y", the association rules function will
 also calculate the following metrics:
 - Count: The number of transactions that contain X,Y

 - Support: The ratio of transactions that contain X,Y to all transactions, T
 \f[
 S (X \Rightarrow Y) = \frac{Total(X \cup Y)}{Total transactions}
 \f]

 - Confidence: The ratio of transactions that contain \f$ X,Y \f$ to
 transactions that contain \f$ X \f$. One could view this metric as the
 conditional probability of \f$ Y \f$ , given \f$ X \f$ . \f$ P(Y|X) \f$
 \f[
 C (X \Rightarrow Y) = \frac{s(X \cap Y )}{s(X)}
 \f]

 - Lift: The ratio of observed support of \f$ X,Y \f$ to the expected support of
 \f$ X,Y \f$ , assuming \f$ X \f$ and \f$ Y \f$ are independent.
 \f[
 L (X \Rightarrow Y) = \frac{s(X \cap Y )}{s(X) \cdot s(Y)}
 \f]

 - Conviction: The ratio of expected support of \f$ X \f$ occurring without
 \f$ Y \f$ assuming \f$ X \f$ and \f$ \neg Y \f$ are independent, to the
 observed support of \f$ X \f$ occuring without \f$ Y \f$. If conviction is
 greater than 1, then this metric shows that incorrect predictions ( \f$ X
 \Rightarrow Y \f$ ) occur less often than if these two actions were independent.
 This metric can be viewed as the ratio that the association rule would be
 incorrect if the actions were independent (i.e. a conviction of 1.5 indicates
 that if the variables were independent, this rule would be incorrect 50% more
 often.)

 \f[
 Conv (X \Rightarrow Y) = \frac{1 - S(Y)}{1 - C(X \Rightarrow Y)}
 \f]

 @anchor algorithm
 @par Apriori Algorithm

 Although there are many algorithms that generate association rules, the classic
 algorithm is called Apriori [1] which we have implemented in this module. It is a
 breadth-first search, as opposed to depth-first searches like Eclat. Frequent
 itemsets of order \f$ n \f$ are generated from sets of order \f$ n - 1 \f$.
 Using the downward closure property, all sets must have frequent subsets. There
 are two steps in this algorithm; generating frequent itemsets, and using these
 itemsets to construct the association rules. A simplified version of the
 algorithm is as follows, and assumes a minimum level of support and confidence
 is provided:

 \e Initial \e step
 -# Generate all itemsets of order 1.
 -# Eliminate itemsets that have support less than minimum support.

 \e Main \e algorithm
 -# For \f$ n \ge 2 \f$, generate itemsets of order \f$ n \f$ by combining the
 itemsets of order \f$ n - 1 \f$.
 This is done by doing the union of two itemsets that have identical items except one.
 -# Eliminate itemsets that have (n-1) order subsets with insufficient support.
 -# Eliminate itemsets with insufficient support.
 -# Repeat until itemsets cannot be generated, or maximum itemset size is exceeded.

 \e Association \e rule \e generation

 Given a frequent itemset \f$ A \f$ generated from the Apriori algorithm, and all
 subsets \f$ B \f$ , we generate rules such that \f$ B \Rightarrow (A - B) \f$
 meets minimum confidence requirements.

 @note Beware of combinatorial explosion.  The Apriori algorithm can potentially
 generate a huge number of rules, even for fairly simple data sets, resulting
 in run times that are unreasonably long.  To avoid this, it is recommended
 to cap the maximum itemset size to a small number to start with, then
 increase it gradually.  Similarly, <em>max_LHS_size</em> and <em>max_RHS_size</em>
 limit the number of items on the LHS and RHS of the rules
 and can significantly reduce run times.
 <em>Support</em> and <em>confidence</em> values are
 parameters that can also be used to control rule generation.

 @anchor syntax
 @par Function Syntax
 Association rules has the following syntax:
 <pre class="syntax">
 assoc_rules( support,
              confidence,
              tid_col,
              item_col,
              input_table,
              output_schema,
              verbose,
              max_itemset_size,
              max_LHS_size,
              max_RHS_size
            );</pre>
 This generates all association rules that satisfy the specified minimum
 <em>support</em> and <em>confidence</em>.

 \b Arguments
 <dl class="arglist">
   <dt>support</dt>
   <dd>Minimum level of support needed for each itemset to be included in result.</dd>

   <dt>confidence</dt>
   <dd>Minimum level of confidence needed for each rule to be included in result.</dd>

   <dt>tid_col</dt>
   <dd>Name of the column storing the transaction ids.</dd>

   <dt>item_col</dt>
   <dd>Name of the column storing the products.</dd>

   <dt>input_table</dt>
   <dd>Name of the table containing the input data.

   The input data is expected to be of the following form:
 <pre>{TABLE|VIEW} <em>input_table</em> (
     <em>trans_id</em> INTEGER,
     <em>product</em> TEXT
 )</pre>

   The algorithm maps the product names to consecutive integer ids starting at 1.
   If they are already structured this way, then the ids will not change.
   </dd>

   <dt>output_schema</dt>
   <dd>The name of the schema where the final results will be stored.
   The schema must be created before calling the function.  Alternatively, use
   <tt>NULL</tt> to output to the current schema.

   The results containing the rules, support, count, confidence, lift, and
   conviction are stored in the table \c assoc_rules in the schema
   specified by \c output_schema.

   The table has the following columns.
     <table class="output">
       <tr>
         <th>ruleid</th>
         <td>integer</td>
       </tr>
       <tr>
         <th>pre</th>
         <td>text</td>
       </tr>
       <tr>
         <th>post</th>
         <td>text</td>
       </tr>
       <tr>
         <th>count</th>
         <td>integer</td>
       </tr>
       <tr>
         <th>support</th>
         <td>double</td>
       </tr>
       <tr>
         <th>confidence</th>
         <td>double</td>
       </tr>
       <tr>
         <th>lift</th>
         <td>double</td>
       </tr>
       <tr>
         <th>conviction</th>
         <td>double</td>
       </tr>
     </table>

   On Greenplum Database, the table is distributed by the \c ruleid column.

   The \c pre and \c post columns are the itemsets of left and right hand sides of the
   association rule respectively. The \c support, \c confidence, \c lift, and
   \c conviction columns are calculated as described earlier.
   </dd>

   <dt>verbose (optional)</dt>
   <dd>BOOLEAN, default: FALSE. Determines if details are printed for each iteration
   as the algorithm progresses.</dd>

   <dt>max_itemset_size (optional)</dt>
   <dd>INTEGER, default: 10. Determines the maximum size of frequent
   itemsets that are used for generating association rules. Must be 2 or more.
   This parameter can be used to reduce run time for data sets where itemset size is large,
   which is a common situation. If your query is not returning or is running too long,
   try using a lower value for this parameter.</dd>


   <dt>max_LHS_size (optional)</dt>
   <dd>INTEGER, default: NULL. Determines the maximum size of the left hand side
   of the rule. Must be 1 or more.
   This parameter can be used to reduce run time.</dd>


   <dt>max_RHS_size (optional)</dt>
   <dd>INTEGER, default: NULL. Determines the maximum size of the right hand side
   of the rule. Must be 1 or more.
   This parameter can be used to reduce run time.  For example, setting to 1
   can significantly reduce run time if this makes sense for your use case.
   (The <em>apriori</em> algorithm in the R package <em>arules</em> [2] only
   supports a RHS of 1.)</dd>
 </dl>


 @anchor examples
 @examp

 Let's look at some sample transactional data and generate association rules.

 -# Create an input dataset:
 <pre class="example">
 DROP TABLE IF EXISTS test_data;
 CREATE TABLE test_data (
     trans_id INT,
     product TEXT
 );
 INSERT INTO test_data VALUES (1, 'beer');
 INSERT INTO test_data VALUES (1, 'diapers');
 INSERT INTO test_data VALUES (1, 'chips');
 INSERT INTO test_data VALUES (2, 'beer');
 INSERT INTO test_data VALUES (2, 'diapers');
 INSERT INTO test_data VALUES (3, 'beer');
 INSERT INTO test_data VALUES (3, 'diapers');
 INSERT INTO test_data VALUES (4, 'beer');
 INSERT INTO test_data VALUES (4, 'chips');
 INSERT INTO test_data VALUES (5, 'beer');
 INSERT INTO test_data VALUES (6, 'beer');
 INSERT INTO test_data VALUES (6, 'diapers');
 INSERT INTO test_data VALUES (6, 'chips');
 INSERT INTO test_data VALUES (7, 'beer');
 INSERT INTO test_data VALUES (7, 'diapers');
 </pre>

 -# Let \f$ min(support) = .25 \f$ and \f$ min(confidence) = .5 \f$, and the
 output schema is set to \c NULL indicating output to the current schema.
 In this example we set verbose to
 TRUE so that we have some insight into progress of the function. We
 can now generate association rules as follows:
 <pre class="example">
 SELECT * FROM madlib.assoc_rules( .25,            -- Support
                                   .5,             -- Confidence
                                   'trans_id',     -- Transaction id col
                                   'product',      -- Product col
                                   'test_data',    -- Input data
                                   NULL,           -- Output schema
                                   TRUE            -- Verbose output
                                 );
 </pre>
 Result (iteration details not shown):
 <pre class="result">
  output_schema | output_table | total_rules |   total_time
 ---------------+--------------+-------------+-----------------
  public        | assoc_rules  |           7 | 00:00:00.569254
 (1 row)
 </pre>
 The association rules are stored in the assoc_rules table:
 <pre class="example">
 SELECT * FROM assoc_rules
 ORDER BY support DESC, confidence DESC;
 </pre>
 Result:
 <pre class="result">
  ruleid |       pre       |      post      | count |      support      |    confidence     |       lift        |    conviction
 --------+-----------------+----------------+-------+-------------------+-------------------+-------------------+-------------------
       2 | {diapers}       | {beer}         |     5 | 0.714285714285714 |                 1 |                 1 |                 0
       6 | {beer}          | {diapers}      |     5 | 0.714285714285714 | 0.714285714285714 |                 1 |                 1
       5 | {chips}         | {beer}         |     3 | 0.428571428571429 |                 1 |                 1 |                 0
       4 | {chips,diapers} | {beer}         |     2 | 0.285714285714286 |                 1 |                 1 |                 0
       1 | {chips}         | {diapers,beer} |     2 | 0.285714285714286 | 0.666666666666667 | 0.933333333333333 | 0.857142857142857
       7 | {chips}         | {diapers}      |     2 | 0.285714285714286 | 0.666666666666667 | 0.933333333333333 | 0.857142857142857
       3 | {beer,chips}    | {diapers}      |     2 | 0.285714285714286 | 0.666666666666667 | 0.933333333333333 | 0.857142857142857
 (7 rows)
 </pre>

 -# Limit association rules generated from itemsets of size at most 2.  This parameter is
 a good way to reduce long run times.
 <pre class="example">
 SELECT * FROM madlib.assoc_rules( .25,            -- Support
                                   .5,             -- Confidence
                                   'trans_id',     -- Transaction id col
                                   'product',      -- Product col
                                   'test_data',    -- Input data
                                   NULL,           -- Output schema
                                   TRUE,           -- Verbose output
                                   2               -- Max itemset size
                                 );
 </pre>
 Result (iteration details not shown):
 <pre class="result">
  output_schema | output_table | total_rules |   total_time
 ---------------+--------------+-------------+-----------------
  public        | assoc_rules  |           4 | 00:00:00.565176
 (1 row)
 </pre>
 The association rules are again stored in the assoc_rules table:
 <pre class="example">
 SELECT * FROM assoc_rules
 ORDER BY support DESC, confidence DESC;
 </pre>
 Result:
 <pre class="result">
  ruleid |    pre    |   post    | count |      support      |    confidence     |       lift        |    conviction
 --------+-----------+-----------+-------+-------------------+-------------------+-------------------+-------------------
       1 | {diapers} | {beer}    |     5 | 0.714285714285714 |                 1 |                 1 |                 0
       2 | {beer}    | {diapers} |     5 | 0.714285714285714 | 0.714285714285714 |                 1 |                 1
       3 | {chips}   | {beer}    |     3 | 0.428571428571429 |                 1 |                 1 |                 0
       4 | {chips}   | {diapers} |     2 | 0.285714285714286 | 0.666666666666667 | 0.933333333333333 | 0.857142857142857
 (4 rows)
 </pre>

 -# Post-processing can now be done on the output table in the case that
 you want to filter the results.  For example, if you want any single item on the left hand side
 and a particular item on the right hand side:
 <pre class="example">
 SELECT * FROM assoc_rules WHERE array_upper(pre,1) = 1 AND post = array['beer'];
 </pre>
 Result:
 <pre class="result">
  ruleid |    pre    |  post  | count |      support      | confidence | lift | conviction
 --------+-----------+--------+-------+-------------------+------------+------+------------
       1 | {diapers} | {beer} |     5 | 0.714285714285714 |          1 |    1 |          0
       3 | {chips}   | {beer} |     3 | 0.428571428571429 |          1 |    1 |          0
 (2 rows)
 </pre>


 -# Limit the size of right hand side to 1.  This parameter is a good way to
 reduce long run times.
 <pre class="example">
 SELECT * FROM madlib.assoc_rules( .25,            -- Support
                                   .5,             -- Confidence
                                   'trans_id',     -- Transaction id col
                                   'product',      -- Product col
                                   'test_data',    -- Input data
                                   NULL,           -- Output schema
                                   TRUE,           -- Verbose output
                                   NULL,           -- Max itemset size
                                   NULL,           -- Max LHS size
                                   1               -- Max RHS size
                                 );
 </pre>
 Result (iteration details not shown):
 <pre class="result">
  output_schema | output_table | total_rules |   total_time
 ---------------+--------------+-------------+-----------------
  public        | assoc_rules  |           6 | 00:00:00.031362
 (1 row)
 </pre>
 The association rules are again stored in the assoc_rules table:
 <pre class="example">
 SELECT * FROM assoc_rules
 ORDER BY support DESC, confidence DESC;
 </pre>
 Result:
 <pre class="result">
  ruleid |       pre       |   post    | count |      support      |    confidence     |       lift        |    conviction
 --------+-----------------+-----------+-------+-------------------+-------------------+-------------------+-------------------
       4 | {diapers}       | {beer}    |     5 | 0.714285714285714 |                 1 |                 1 |                 0
       3 | {beer}          | {diapers} |     5 | 0.714285714285714 | 0.714285714285714 |                 1 |                 1
       1 | {chips}         | {beer}    |     3 | 0.428571428571429 |                 1 |                 1 |                 0
       6 | {diapers,chips} | {beer}    |     2 | 0.285714285714286 |                 1 |                 1 |                 0
       2 | {chips}         | {diapers} |     2 | 0.285714285714286 | 0.666666666666667 | 0.933333333333333 | 0.857142857142857
       5 | {beer,chips}    | {diapers} |     2 | 0.285714285714286 | 0.666666666666667 | 0.933333333333333 | 0.857142857142857
 (6 rows)
 </pre>


 @anchor notes
 @par Notes

 The association rules function always creates a table named \c assoc_rules.
 Make a copy of this table before running the function again if you would
 like to keep multiple association rule tables.  This behavior will be improved
 in a later release.

 @anchor literature
 @literature

 [1] https://en.wikipedia.org/wiki/Apriori_algorithm

 [2] https://cran.r-project.org/web/packages/arules/arules.pdf

 @anchor related
 @par Related Topics

 File assoc_rules.sql_in documenting the SQL function.

 */

 /*
  * @brief The result data type for the association rule API
  *
  * output_schema the name of the output schema.
  * output_table the name of the output table.
  * total_rules the number of total rules.
  * total_time the running time.
  */
 DROP TYPE IF EXISTS MADLIB_SCHEMA.assoc_rules_results CASCADE;
 CREATE TYPE MADLIB_SCHEMA.assoc_rules_results AS
     (
     output_schema TEXT,
     output_table TEXT,
     total_rules INT,
     total_time INTERVAL
 );

 /*
  * @brief Given the text form of a closed frequent pattern (cfp), this function
  * generates the association rules for that pattern. We use text format
  * because text values are hash joinable. The output is a set of text
  * array. For example, assuming the input pattern is '1,2,3'.
  * The result rules:
  * array['1', '2,3']
  * array['2', '1,3']
  * array['3', '1,2']
  * array['1,2', '3']
  * array['1,3', '2']
  * array['2,3', '1']
  * Note that two meaningless rules will be excluded:
  * array['1,2,3', NULL]
  * array[NULL, '1,2,3']
  *
  * @param arg 1 The text form of a closed frequent pattern.
  * @param arg 2 The number of items in the pattern.
  *
  * @return A set of text array. Each array has two elements, corresponding to
  * the left and right parts of an association rule.
  *
  */
 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.gen_rules_from_cfp
     (
     TEXT,
     INT,
     INT,
     INT
     )
 RETURNS SETOF TEXT[] AS 'MODULE_PATHNAME'
 LANGUAGE C STRICT IMMUTABLE
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `NO SQL', `');


 /**
  *
  * @param support minimum level of support needed for each itemset to
  * be included in result
  * @param confidence minimum level of confidence needed for each rule to
  * be included in result
  * @param tid_col name of the column storing the transaction ids
  * @param item_col name of the column storing the products
  * @param input_table name of the table where the data is stored
  * @param output_schema name of the schema where the final results will be stored
  * @param verbose determining if output contains comments
  *
  * @returns The schema and table name containing association rules,
  * and total number of rules found.
  *
  * This function computes the association rules between products in a data set.
  * It reads the name of the table, the column names of the product and ids, and
  * computes ssociation rules using the Apriori algorithm, and subject to the
  * support and confidence constraints as input by the user. This version of
  * association rules has verbose functionality. When verbose is true, output of
  * function includes iteration steps and comments on Apriori algorithm steps.
  *
  */
 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.assoc_rules
     (
     support FLOAT8,
     confidence FLOAT8,
     tid_col TEXT,
     item_col TEXT,
     input_table TEXT,
     output_schema TEXT,
     verbose BOOLEAN,
     max_itemset_size INTEGER,
     max_lhs_size INTEGER,
     max_rhs_size INTEGER
    )
 RETURNS MADLIB_SCHEMA.assoc_rules_results
 AS $$
     PythonFunctionBodyOnly(`assoc_rules', `assoc_rules')
     with AOControl(False):
         plpy.execute("SET client_min_messages = error;")
         # schema_madlib comes from PythonFunctionBodyOnly
         return assoc_rules.assoc_rules(schema_madlib,
                                        support,
                                        confidence,
                                        tid_col,
                                        item_col,
                                        input_table,
                                        output_schema,
                                        verbose,
                                        max_itemset_size,
                                        max_lhs_size,
                                        max_rhs_size
                                        );

 $$ LANGUAGE plpythonu
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.assoc_rules
     (
     support FLOAT8,
     confidence FLOAT8,
     tid_col TEXT,
     item_col TEXT,
     input_table TEXT,
     output_schema TEXT,
     verbose BOOLEAN,
     max_itemset_size INTEGER,
     max_LHS_size INTEGER
    )
 RETURNS MADLIB_SCHEMA.assoc_rules_results
 AS $$
     PythonFunctionBodyOnly(`assoc_rules', `assoc_rules')
     with AOControl(False):
         plpy.execute("SET client_min_messages = error;")
         # schema_madlib comes from PythonFunctionBodyOnly
         return assoc_rules.assoc_rules(schema_madlib,
                                        support,
                                        confidence,
                                        tid_col,
                                        item_col,
                                        input_table,
                                        output_schema,
                                        verbose,
                                        max_itemset_size,
                                        max_lhs_size,
                                        None
                                        );

 $$ LANGUAGE plpythonu
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');


 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.assoc_rules
     (
     support FLOAT8,
     confidence FLOAT8,
     tid_col TEXT,
     item_col TEXT,
     input_table TEXT,
     output_schema TEXT,
     verbose BOOLEAN,
     max_itemset_size INTEGER
    )
 RETURNS MADLIB_SCHEMA.assoc_rules_results
 AS $$
     PythonFunctionBodyOnly(`assoc_rules', `assoc_rules')
     with AOControl(False):
         plpy.execute("SET client_min_messages = error;")
         # schema_madlib comes from PythonFunctionBodyOnly
         return assoc_rules.assoc_rules(schema_madlib,
                                        support,
                                        confidence,
                                        tid_col,
                                        item_col,
                                        input_table,
                                        output_schema,
                                        verbose,
                                        max_itemset_size,
                                        None,
                                        None
                                        );

 $$ LANGUAGE plpythonu
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');


 /**
  *
  * @brief The short form of the above function with vobose removed.
  *
  */
 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.assoc_rules
     (
     support FLOAT8,
     confidence FLOAT8,
     tid_col TEXT,
     item_col TEXT,
     input_table TEXT,
     output_schema TEXT
     )
 RETURNS MADLIB_SCHEMA.assoc_rules_results
 AS $$

     PythonFunctionBodyOnly(`assoc_rules', `assoc_rules')
     plpy.execute("SET client_min_messages = error;")
     with AOControl(False):
         # schema_madlib comes from PythonFunctionBodyOnly
         return assoc_rules.assoc_rules(schema_madlib,
                                        support,
                                        confidence,
                                        tid_col,
                                        item_col,
                                        input_table,
                                        output_schema,
                                        False,
                                        10,
                                        None,
                                        None);

 $$ LANGUAGE plpythonu
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');


 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.assoc_rules
     (
     support FLOAT8,
     confidence FLOAT8,
     tid_col TEXT,
     item_col TEXT,
     input_table TEXT,
     output_schema TEXT,
     verbose BOOLEAN
     )
 RETURNS MADLIB_SCHEMA.assoc_rules_results
 AS $$
     PythonFunctionBodyOnly(`assoc_rules', `assoc_rules')
     plpy.execute("SET client_min_messages = error;")
     with AOControl(False):
         # schema_madlib comes from PythonFunctionBodyOnly
         return assoc_rules.assoc_rules(schema_madlib,
                                        support,
                                        confidence,
                                        tid_col,
                                        item_col,
                                        input_table,
                                        output_schema,
                                        verbose,
                                        10,
                                        None,
                                        None);
 $$ LANGUAGE plpythonu
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');

 --------------------------------------------------------------------------

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.assoc_rules(message TEXT)
 RETURNS text AS $$
 PythonFunction(assoc_rules, assoc_rules, assoc_rules_help_message)
 $$ language plpythonu
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `CONTAINS SQL', `');

 CREATE OR REPLACE FUNCTION MADLIB_SCHEMA.assoc_rules()
 RETURNS text AS $$
     SELECT MADLIB_SCHEMA.assoc_rules('');
 $$ language SQL
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `CONTAINS SQL', `');