
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
 <head>
 <meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
 <meta http-equiv="X-UA-Compatible" content="IE=9"/>
 <meta name="generator" content="Doxygen 1.8.4"/>
 <title>MADlib: hypothesis_tests.sql_in File Reference</title>
 <link href="tabs.css" rel="stylesheet" type="text/css"/>
 <script type="text/javascript" src="jquery.js"></script>
 <script type="text/javascript" src="dynsections.js"></script>
 <link href="navtree.css" rel="stylesheet" type="text/css"/>
 <script type="text/javascript" src="resize.js"></script>
 <script type="text/javascript" src="navtree.js"></script>
 <script type="text/javascript">
   $(document).ready(initResizable);
   $(window).load(resizeHeight);
 </script>
 <link href="search/search.css" rel="stylesheet" type="text/css"/>
 <script type="text/javascript" src="search/search.js"></script>
 <script type="text/javascript">
   $(document).ready(function() { searchBox.OnSelectItem(0); });
 </script>
 <script type="text/x-mathjax-config">
   MathJax.Hub.Config({
     extensions: ["tex2jax.js", "TeX/AMSmath.js", "TeX/AMSsymbols.js"],
     jax: ["input/TeX","output/HTML-CSS"],
 });
 </script><script src="../mathjax/MathJax.js"></script>
 <link href="doxygen.css" rel="stylesheet" type="text/css" />
 </head>
 <body>
 <div id="top"><!-- do not remove this div, it is closed by doxygen! -->
 <div id="titlearea">
 <table cellspacing="0" cellpadding="0">
  <tbody>
  <tr style="height: 56px;">
   <td style="padding-left: 0.5em;">
    <div id="projectname">MADlib
    &#160;<span id="projectnumber">1.0</span> <span style="font-size:10pt; font-style:italic"><a href="../latest/./hypothesis__tests_8sql__in.html"> A newer version is available</a></span>
    </div>
    <div id="projectbrief">User Documentation</div>
   </td>
  </tr>
  </tbody>
 </table>
 </div>
 <!-- end header part -->
 <!-- Generated by Doxygen 1.8.4 -->
 <script type="text/javascript">
 var searchBox = new SearchBox("searchBox", "search",false,'Search');
 </script>
   <div id="navrow1" class="tabs">
     <ul class="tablist">
       <li><a href="index.html"><span>Main&#160;Page</span></a></li>
       <li><a href="modules.html"><span>Modules</span></a></li>
       <li>
         <div id="MSearchBox" class="MSearchBoxInactive">
         <span class="left">
           <img id="MSearchSelect" src="search/mag_sel.png"
                onmouseover="return searchBox.OnSearchSelectShow()"
                onmouseout="return searchBox.OnSearchSelectHide()"
                alt=""/>
           <input type="text" id="MSearchField" value="Search" accesskey="S"
                onfocus="searchBox.OnSearchFieldFocus(true)"
                onblur="searchBox.OnSearchFieldFocus(false)"
                onkeyup="searchBox.OnSearchFieldChange(event)"/>
           </span><span class="right">
             <a id="MSearchClose" href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" border="0" src="search/close.png" alt=""/></a>
           </span>
         </div>
       </li>
     </ul>
   </div>
 </div><!-- top -->
 <div id="side-nav" class="ui-resizable side-nav-resizable">
   <div id="nav-tree">
     <div id="nav-tree-contents">
       <div id="nav-sync" class="sync"></div>
     </div>
   </div>
   <div id="splitbar" style="-moz-user-select:none;"
        class="ui-resizable-handle">
   </div>
 </div>
 <script type="text/javascript">
 $(document).ready(function(){initNavTree('hypothesis__tests_8sql__in.html','');});
 </script>
 <div id="doc-content">
 <!-- window showing the filter options -->
 <div id="MSearchSelectWindow"
      onmouseover="return searchBox.OnSearchSelectShow()"
      onmouseout="return searchBox.OnSearchSelectHide()"
      onkeydown="return searchBox.OnSearchSelectKey(event)">
 <a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(0)"><span class="SelectionMark">&#160;</span>All</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(1)"><span class="SelectionMark">&#160;</span>Files</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(2)"><span class="SelectionMark">&#160;</span>Functions</a><a class="SelectItem" href="javascript:void(0)" onclick="searchBox.OnSelectItem(3)"><span class="SelectionMark">&#160;</span>Groups</a></div>

 <!-- iframe showing the search results (closed by default) -->
 <div id="MSearchResultsWindow">
 <iframe src="javascript:void(0)" frameborder="0"
         name="MSearchResults" id="MSearchResults">
 </iframe>
 </div>

 <div class="header">
   <div class="summary">
 <a href="#func-members">Functions</a>  </div>
   <div class="headertitle">
 <div class="title">hypothesis_tests.sql_in File Reference</div>  </div>
 </div><!--header-->
 <div class="contents">

 <p>SQL functions for statistical hypothesis tests.
 <a href="#details">More...</a></p>

 <p><a href="hypothesis__tests_8sql__in_source.html">Go to the source code of this file.</a></p>
 <table class="memberdecls">
 <tr class="heading"><td colspan="2"><h2 class="groupheader"><a name="func-members"></a>
 Functions</h2></td></tr>
 <tr class="memitem:ae7197f66a085f53d71167ac0a9029567"><td class="memItemLeft" align="right" valign="top">aggregate t_test_result&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="hypothesis__tests_8sql__in.html#ae7197f66a085f53d71167ac0a9029567">t_test_one</a> (float8 value)</td></tr>
 <tr class="memdesc:ae7197f66a085f53d71167ac0a9029567"><td class="mdescLeft">&#160;</td><td class="mdescRight">Perform one-sample or dependent paired Student t-test.  <a href="#ae7197f66a085f53d71167ac0a9029567">More...</a><br/></td></tr>
 <tr class="separator:ae7197f66a085f53d71167ac0a9029567"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a74e6ef1522197957b5aa35bf67004364"><td class="memItemLeft" align="right" valign="top">aggregate t_test_result&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="hypothesis__tests_8sql__in.html#a74e6ef1522197957b5aa35bf67004364">t_test_two_pooled</a> (boolean first, float8 value)</td></tr>
 <tr class="memdesc:a74e6ef1522197957b5aa35bf67004364"><td class="mdescLeft">&#160;</td><td class="mdescRight">Perform two-sample pooled (i.e., equal variances) Student t-test.  <a href="#a74e6ef1522197957b5aa35bf67004364">More...</a><br/></td></tr>
 <tr class="separator:a74e6ef1522197957b5aa35bf67004364"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:aa95e5a0c8b4841c113c84a393b8b4868"><td class="memItemLeft" align="right" valign="top">aggregate t_test_result&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="hypothesis__tests_8sql__in.html#aa95e5a0c8b4841c113c84a393b8b4868">t_test_two_unpooled</a> (boolean first, float8 value)</td></tr>
 <tr class="memdesc:aa95e5a0c8b4841c113c84a393b8b4868"><td class="mdescLeft">&#160;</td><td class="mdescRight">Perform unpooled (i.e., unequal variances) t-test (also known as Welch's t-test)  <a href="#aa95e5a0c8b4841c113c84a393b8b4868">More...</a><br/></td></tr>
 <tr class="separator:aa95e5a0c8b4841c113c84a393b8b4868"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a8f90d2f805a6ab3034f80a5967dffa1d"><td class="memItemLeft" align="right" valign="top">aggregate f_test_result&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="hypothesis__tests_8sql__in.html#a8f90d2f805a6ab3034f80a5967dffa1d">f_test</a> (boolean first, float8 value)</td></tr>
 <tr class="memdesc:a8f90d2f805a6ab3034f80a5967dffa1d"><td class="mdescLeft">&#160;</td><td class="mdescRight">Perform Fisher F-test.  <a href="#a8f90d2f805a6ab3034f80a5967dffa1d">More...</a><br/></td></tr>
 <tr class="separator:a8f90d2f805a6ab3034f80a5967dffa1d"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:afc6a7ac3eada83df681bc6efeddfd9eb"><td class="memItemLeft" align="right" valign="top">aggregate chi2_test_result&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="hypothesis__tests_8sql__in.html#afc6a7ac3eada83df681bc6efeddfd9eb">chi2_gof_test</a> (bigint observed, float8 expected=1, bigint df=0)</td></tr>
 <tr class="memdesc:afc6a7ac3eada83df681bc6efeddfd9eb"><td class="mdescLeft">&#160;</td><td class="mdescRight">Perform Pearson's chi-squared goodness-of-fit test.  <a href="#afc6a7ac3eada83df681bc6efeddfd9eb">More...</a><br/></td></tr>
 <tr class="separator:afc6a7ac3eada83df681bc6efeddfd9eb"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:af45ae9d1275d385bbacd18bff688ba7f"><td class="memItemLeft" align="right" valign="top">aggregate ks_test_result&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="hypothesis__tests_8sql__in.html#af45ae9d1275d385bbacd18bff688ba7f">ks_test</a> (boolean first, float8 value, bigint m, bigint n)</td></tr>
 <tr class="memdesc:af45ae9d1275d385bbacd18bff688ba7f"><td class="mdescLeft">&#160;</td><td class="mdescRight">Perform Kolmogorov-Smirnov test.  <a href="#af45ae9d1275d385bbacd18bff688ba7f">More...</a><br/></td></tr>
 <tr class="separator:af45ae9d1275d385bbacd18bff688ba7f"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a32cdc58e8a5d149dd90304805de07fbd"><td class="memItemLeft" align="right" valign="top">aggregate mw_test_result&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="hypothesis__tests_8sql__in.html#a32cdc58e8a5d149dd90304805de07fbd">mw_test</a> (boolean first, float8 value)</td></tr>
 <tr class="memdesc:a32cdc58e8a5d149dd90304805de07fbd"><td class="mdescLeft">&#160;</td><td class="mdescRight">Perform Mann-Whitney test.  <a href="#a32cdc58e8a5d149dd90304805de07fbd">More...</a><br/></td></tr>
 <tr class="separator:a32cdc58e8a5d149dd90304805de07fbd"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:afea2309e99477df6ebbfbcea11272507"><td class="memItemLeft" align="right" valign="top">aggregate wsr_test_result&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="hypothesis__tests_8sql__in.html#afea2309e99477df6ebbfbcea11272507">wsr_test</a> (float8 value, float8 precision=-1)</td></tr>
 <tr class="memdesc:afea2309e99477df6ebbfbcea11272507"><td class="mdescLeft">&#160;</td><td class="mdescRight">Perform Wilcoxon-Signed-Rank test.  <a href="#afea2309e99477df6ebbfbcea11272507">More...</a><br/></td></tr>
 <tr class="separator:afea2309e99477df6ebbfbcea11272507"><td class="memSeparator" colspan="2">&#160;</td></tr>
 <tr class="memitem:a2d8a6b8665dcc5002c06e3c56379d791"><td class="memItemLeft" align="right" valign="top">aggregate one_way_anova_result&#160;</td><td class="memItemRight" valign="bottom"><a class="el" href="hypothesis__tests_8sql__in.html#a2d8a6b8665dcc5002c06e3c56379d791">one_way_anova</a> (integer group, float8 value)</td></tr>
 <tr class="memdesc:a2d8a6b8665dcc5002c06e3c56379d791"><td class="mdescLeft">&#160;</td><td class="mdescRight">Perform one-way analysis of variance.  <a href="#a2d8a6b8665dcc5002c06e3c56379d791">More...</a><br/></td></tr>
 <tr class="separator:a2d8a6b8665dcc5002c06e3c56379d791"><td class="memSeparator" colspan="2">&#160;</td></tr>
 </table>
 <a name="details" id="details"></a><h2 class="groupheader">Detailed Description</h2>
 <div class="textblock"><dl class="section see"><dt>See Also</dt><dd>For an overview of hypothesis-test functions, see the module description <a class="el" href="group__grp__stats__tests.html">Hypothesis Tests</a>. </dd></dl>

 <p>Definition in file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>
 </div><h2 class="groupheader">Function Documentation</h2>
 <a class="anchor" id="afc6a7ac3eada83df681bc6efeddfd9eb"></a>
 <div class="memitem">
 <div class="memproto">
       <table class="memname">
         <tr>
           <td class="memname">aggregate chi2_test_result chi2_gof_test </td>
           <td>(</td>
           <td class="paramtype">bigint&#160;</td>
           <td class="paramname"><em>observed</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>expected</em> = <code>1</code>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">bigint&#160;</td>
           <td class="paramname"><em>df</em> = <code>0</code>&#160;</td>
         </tr>
         <tr>
           <td></td>
           <td>)</td>
           <td></td><td></td>
         </tr>
       </table>
 </div><div class="memdoc">
 <p>Let \( n_1, \dots, n_k \) be a realization of a (vector) random variable \( N = (N_1, \dots, N_k) \) that follows the multinomial distribution with parameters \( k \) and \( p = (p_1, \dots, p_k) \). Test the null hypothesis \( H_0 : p = p^0 \).</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">observed</td><td>Number \( n_i \) of observations of the current event/row </td></tr>
     <tr><td class="paramname">expected</td><td>Expected number of observations of current event/row. This number is not required to be normalized. That is, \( p^0_i \) will be taken as <code>expected</code> divided by <code>sum(expected)</code>. Hence, if this parameter is not specified, chi2_gof_test() will by default use \( p^0 = (\frac 1k, \dots, \frac 1k) \), i.e., test that \( p \) is a discrete uniform distribution. </td></tr>
     <tr><td class="paramname">df</td><td>Degrees of freedom. This is the number of events reduced by the degrees of freedom lost by using the observed numbers for defining the expected number of observations. If this parameter is 0, the degrees of freedom are taken as \( (k - 1) \).</td></tr>
   </table>
   </dd>
 </dl>
 <dl class="section return"><dt>Returns</dt><dd>A composite value as follows. Let \( n = \sum_{i=1}^k n_i \).<ul>
 <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp">
 \[ \chi^2 = \sum_{i=1}^k \frac{(n_i - np_i)^2}{np_i} \]
 </p>
  The corresponding random variable is approximately chi-squared distributed with <code>df</code> degrees of freedom.</li>
 <li><code>df BIGINT</code> - Degrees of freedom</li>
 <li><code>p_value FLOAT8</code> - Approximate p-value, i.e., \( \Pr[X^2 \geq \chi^2 \mid p = p^0] \). Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a230513b6b549d5b445cbacbdbab42c15">chi_squared_cdf</a>(statistic))</code>.</li>
 <li><code>phi FLOAT8</code> - Phi coefficient, i.e., \( \phi = \sqrt{\frac{\chi^2}{n}} \)</li>
 <li><code>contingency_coef FLOAT8</code> - Contingency coefficient, i.e., \( \sqrt{\frac{\chi^2}{n + \chi^2}} \)</li>
 </ul>
 </dd></dl>
 <dl class="section user"><dt>Usage:</dt><dd><ul>
 <li>Test null hypothesis that all possible outcomes of a categorical variable are equally likely: <pre>SELECT (chi2_gof_test(<em>observed</em>, 1, NULL)).* FROM <em>source</em></pre></li>
 <li>Test null hypothesis that two categorical variables are independent. Such data is often shown in a <em>contingency table</em> (also known as <em>crosstab</em>). A crosstab is a matrix where possible values for the first variable correspond to rows and values for the second variable to columns. The matrix elements are the observation frequencies of the joint occurrence of the respective values. <a class="el" href="hypothesis__tests_8sql__in.html#afc6a7ac3eada83df681bc6efeddfd9eb" title="Perform Pearson&#39;s chi-squared goodness-of-fit test. ">chi2_gof_test()</a> assumes that the crosstab is stored in normalized form, i.e., there are three columns <code><em>var1</em></code>, <code><em>var2</em></code>, <code><em>observed</em></code>. <pre>SELECT (chi2_gof_test(<em>observed</em>, expected, deg_freedom)).*
 FROM (
     SELECT
         <em>observed</em>,
         sum(<em>observed</em>) OVER (PARTITION BY var1)::DOUBLE PRECISION
             * sum(<em>observed</em>) OVER (PARTITION BY var2) AS expected
     FROM <em>source</em>
 ) p, (
     SELECT
         (count(DISTINCT <em>var1</em>) - 1) * (count(DISTINCT <em>var2</em>) - 1) AS deg_freedom
     FROM <em>source</em>
 ) q;</pre> </li>
 </ul>
 </dd></dl>
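 <dl class="section user"><dt>Example:</dt><dd><p>The following is a minimal sketch, not part of the original reference. It assumes MADlib is installed in a schema named <code>madlib</code>; the table <code>die_rolls</code> and its data are hypothetical. The first query calls the aggregate with the documented defaults passed explicitly, and the second recomputes the statistic from the formula above in plain SQL so the two can be compared.</p>
 <pre>-- Hypothetical data: observed counts for the six faces of a die (60 rolls).
CREATE TEMP TABLE die_rolls (face INTEGER, observed BIGINT);
INSERT INTO die_rolls VALUES
    (1, 8), (2, 12), (3, 9), (4, 11), (5, 6), (6, 14);

-- Uniform null hypothesis via the aggregate (expected = 1, df = 0 are the
-- documented defaults; the schema name "madlib" is an assumption).
SELECT (madlib.chi2_gof_test(observed, 1, 0)).* FROM die_rolls;

-- The same statistic by hand: each face has expected count e = 60 / 6 = 10,
-- so chi2 = 42 / 10, about 4.2, with df = k - 1 = 5.
SELECT sum((observed - e)^2 / e) AS chi2, count(*) - 1 AS df
FROM die_rolls,
     (SELECT sum(observed)::FLOAT8 / count(*) AS e FROM die_rolls) totals;</pre>
 </dd></dl>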

 <p>Definition at line <a class="el" href="hypothesis__tests_8sql__in_source.html#l00549">549</a> of file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>

 </div>
 </div>
 <a class="anchor" id="a8f90d2f805a6ab3034f80a5967dffa1d"></a>
 <div class="memitem">
 <div class="memproto">
       <table class="memname">
         <tr>
           <td class="memname">aggregate f_test_result f_test </td>
           <td>(</td>
           <td class="paramtype">boolean&#160;</td>
           <td class="paramname"><em>first</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>value</em>&#160;</td>
         </tr>
         <tr>
           <td></td>
           <td>)</td>
           <td></td><td></td>
         </tr>
       </table>
 </div><div class="memdoc">
 <p>Given realizations \( x_1, \dots, x_m \) and \( y_1, \dots, y_n \) of i.i.d. random variables \( X_1, \dots, X_m \sim N(\mu_X, \sigma_X^2) \) and \( Y_1, \dots, Y_n \sim N(\mu_Y, \sigma_Y^2) \) with unknown parameters \( \mu_X, \mu_Y, \sigma_X^2, \) and \( \sigma_Y^2 \), test the null hypotheses \( H_0 : \sigma_X \leq \sigma_Y \) and \( H_0 : \sigma_X = \sigma_Y \).</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">first</td><td>Indicator whether <code>value</code> is from first sample \( x_1, \dots, x_m \) (if <code>TRUE</code>) or from second sample \( y_1, \dots, y_n \) (if <code>FALSE</code>) </td></tr>
     <tr><td class="paramname">value</td><td>Value of random variate \( x_i \) or \( y_i \)</td></tr>
   </table>
   </dd>
 </dl>
 <dl class="section return"><dt>Returns</dt><dd>A composite value as follows. We denote by \( \bar x, \bar y \) the sample means and by \( s_X^2, s_Y^2 \) the sample variances.<ul>
 <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp">
 \[ f = \frac{s_Y^2}{s_X^2} \]
 </p>
  The corresponding random variable is F-distributed with \( (n - 1) \) degrees of freedom in the numerator and \( (m - 1) \) degrees of freedom in the denominator.</li>
 <li><code>df1 BIGINT</code> - Degrees of freedom in the numerator \( (n - 1) \)</li>
 <li><code>df2 BIGINT</code> - Degrees of freedom in the denominator \( (m - 1) \)</li>
 <li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value. In detail, the result is \( \Pr[F \geq f \mid \sigma_X = \sigma_Y] \), which is a lower bound on \( \Pr[F \geq f \mid \sigma_X \leq \sigma_Y] \). Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a6c5b3e35531e44098f9d0cbef14cb8a6">fisher_f_cdf</a>(statistic))</code>.</li>
 <li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., \( 2 \cdot \min \{ p, 1 - p \} \) where \( p = \Pr[ F \geq f \mid \sigma_X = \sigma_Y] \). Computed as <code>(2 * min(p_value_one_sided, 1. - p_value_one_sided))</code>.</li>
 </ul>
 </dd></dl>
 <dl class="section user"><dt>Usage:</dt><dd><ul>
 <li>Test null hypothesis that the variance of the first sample is at most (or equal to, respectively) the variance of the second sample: <pre>SELECT (f_test(<em>first</em>, <em>value</em>)).* FROM <em>source</em></pre> </li>
 </ul>
 </dd></dl>
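 <dl class="section user"><dt>Example:</dt><dd><p>A small sketch under stated assumptions: MADlib is taken to live in a schema named <code>madlib</code>, and the table <code>f_demo</code> with columns <code>is_first</code> and <code>val</code> is hypothetical. The second query recomputes \( f = s_Y^2 / s_X^2 \) and the degrees of freedom with built-in aggregates only, which may help when checking the aggregate's output.</p>
 <pre>-- Hypothetical data: first sample 1..5 (m = 5), second sample 2,4,6,8 (n = 4).
CREATE TEMP TABLE f_demo (is_first BOOLEAN, val FLOAT8);
INSERT INTO f_demo VALUES
    (TRUE, 1), (TRUE, 2), (TRUE, 3), (TRUE, 4), (TRUE, 5),
    (FALSE, 2), (FALSE, 4), (FALSE, 6), (FALSE, 8);

-- F-test via the aggregate (schema name "madlib" is an assumption).
SELECT (madlib.f_test(is_first, val)).* FROM f_demo;

-- By hand: s_X^2 = 2.5 and s_Y^2 = 20/3, so f = 8/3, about 2.667,
-- with df1 = n - 1 = 3 and df2 = m - 1 = 4.
SELECT var_samp(CASE WHEN NOT is_first THEN val END)
         / var_samp(CASE WHEN is_first THEN val END) AS f,
       count(CASE WHEN NOT is_first THEN 1 END) - 1 AS df1,
       count(CASE WHEN is_first THEN 1 END) - 1 AS df2
FROM f_demo;</pre>
 </dd></dl>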

 <p>Definition at line <a class="el" href="hypothesis__tests_8sql__in_source.html#l00419">419</a> of file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>

 </div>
 </div>
 <a class="anchor" id="af45ae9d1275d385bbacd18bff688ba7f"></a>
 <div class="memitem">
 <div class="memproto">
       <table class="memname">
         <tr>
           <td class="memname">aggregate ks_test_result ks_test </td>
           <td>(</td>
           <td class="paramtype">boolean&#160;</td>
           <td class="paramname"><em>first</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>value</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">bigint&#160;</td>
           <td class="paramname"><em>m</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">bigint&#160;</td>
           <td class="paramname"><em>n</em>&#160;</td>
         </tr>
         <tr>
           <td></td>
           <td>)</td>
           <td></td><td></td>
         </tr>
       </table>
 </div><div class="memdoc">
 <p>Given realizations \( x_1, \dots, x_m \) and \( y_1, \dots, y_n \) of i.i.d. random variables \( X_1, \dots, X_m \) and i.i.d. \( Y_1, \dots, Y_n \), respectively, test the null hypothesis that the underlying distribution functions \( F_X, F_Y \) are identical, i.e., \( H_0 : F_X = F_Y \).</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">first</td><td>Determines whether the value belongs to the first (if <code>TRUE</code>) or the second sample (if <code>FALSE</code>) </td></tr>
     <tr><td class="paramname">value</td><td>Value of random variate \( x_i \) or \( y_i \) </td></tr>
     <tr><td class="paramname">m</td><td>Size \( m \) of the first sample. See usage instructions below. </td></tr>
     <tr><td class="paramname">n</td><td>Size \( n \) of the second sample. See usage instructions below.</td></tr>
   </table>
   </dd>
 </dl>
 <dl class="section return"><dt>Returns</dt><dd>A composite value.<ul>
 <li><code>statistic FLOAT8</code> - Kolmogorov–Smirnov statistic <p class="formulaDsp">
 \[ d = \max_{t \in \mathbb R} |F_x(t) - F_y(t)| \]
 </p>
  where \( F_x(t) := \frac 1m |\{ i \mid x_i \leq t \}| \) and \( F_y \) (defined likewise) are the empirical distribution functions.</li>
 <li><code>k_statistic FLOAT8</code> - Kolmogorov statistic \( k = \left( r + 0.12 + \frac{0.11}{r} \right) \cdot d \) where \( r = \sqrt{\frac{m n}{m+n}}. \) Then \( k \) is approximately Kolmogorov distributed.</li>
 <li><code>p_value FLOAT8</code> - Approximate p-value, i.e., an approximate value for \( \Pr[D \geq d \mid F_X = F_Y] \). Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#aeef43f74f583bdff17bd074d9c0d9607">kolmogorov_cdf</a>(k_statistic))</code>.</li>
 </ul>
 </dd></dl>
 <dl class="section user"><dt>Usage:</dt><dd><ul>
 <li>Test null hypothesis that two samples stem from the same distribution: <pre>SELECT (ks_test(<em>first</em>, <em>value</em>,
     (SELECT count(<em>value</em>) FROM <em>source</em> WHERE <em>first</em>),
     (SELECT count(<em>value</em>) FROM <em>source</em> WHERE NOT <em>first</em>)
     ORDER BY <em>value</em>
 )).* FROM <em>source</em></pre></li>
 </ul>
 </dd></dl>
 <dl class="section note"><dt>Note</dt><dd>This aggregate must be used as an ordered aggregate (<code>ORDER BY <em>value</em></code>) and will raise an exception if values are not ordered. </dd></dl>
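 <dl class="section user"><dt>Example:</dt><dd><p>An illustrative sketch, assuming MADlib is installed in a schema named <code>madlib</code>; the table <code>ks_demo</code> and its tie-free data are hypothetical. The first query follows the usage pattern above with concrete names. The second computes the statistic \( d \) directly from the empirical distribution functions using window functions.</p>
 <pre>-- Hypothetical, tie-free data: m = 5 values in the first sample, n = 4 in the second.
CREATE TEMP TABLE ks_demo (is_first BOOLEAN, val FLOAT8);
INSERT INTO ks_demo VALUES
    (TRUE, 1), (TRUE, 2), (TRUE, 3), (TRUE, 4), (TRUE, 5),
    (FALSE, 3.5), (FALSE, 4.5), (FALSE, 6), (FALSE, 8);

-- Ordered aggregate call (schema name "madlib" is an assumption).
SELECT (madlib.ks_test(is_first, val,
    (SELECT count(val) FROM ks_demo WHERE is_first),
    (SELECT count(val) FROM ks_demo WHERE NOT is_first)
    ORDER BY val
)).* FROM ks_demo;

-- By hand: running counts up to and including each value give F_x and F_y.
-- For this data the largest gap occurs at t = 3 (0.6 versus 0), so d = 0.6.
SELECT max(abs(fx - fy)) AS d
FROM (
    SELECT sum(CASE WHEN is_first THEN 1 ELSE 0 END)
               OVER (ORDER BY val)::FLOAT8 / m AS fx,
           sum(CASE WHEN is_first THEN 0 ELSE 1 END)
               OVER (ORDER BY val)::FLOAT8 / n AS fy
    FROM ks_demo,
         (SELECT count(CASE WHEN is_first THEN 1 END) AS m,
                 count(CASE WHEN NOT is_first THEN 1 END) AS n
          FROM ks_demo) sizes
) ecdf;</pre>
 </dd></dl>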

 <p>Definition at line <a class="el" href="hypothesis__tests_8sql__in_source.html#l00655">655</a> of file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>

 </div>
 </div>
 <a class="anchor" id="a32cdc58e8a5d149dd90304805de07fbd"></a>
 <div class="memitem">
 <div class="memproto">
       <table class="memname">
         <tr>
           <td class="memname">aggregate mw_test_result mw_test </td>
           <td>(</td>
           <td class="paramtype">boolean&#160;</td>
           <td class="paramname"><em>first</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>value</em>&#160;</td>
         </tr>
         <tr>
           <td></td>
           <td>)</td>
           <td></td><td></td>
         </tr>
       </table>
 </div><div class="memdoc">
 <p>Given realizations \( x_1, \dots, x_m \) and \( y_1, \dots, y_n \) of i.i.d. random variables \( X_1, \dots, X_m \) and i.i.d. \( Y_1, \dots, Y_n \), respectively, test the null hypothesis that the underlying distributions are equal, i.e., \( H_0 : \forall i,j: \Pr[X_i &gt; Y_j] + \frac{\Pr[X_i = Y_j]}{2} = \frac 12 \).</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">first</td><td>Determines whether the value belongs to the first (if <code>TRUE</code>) or the second sample (if <code>FALSE</code>) </td></tr>
     <tr><td class="paramname">value</td><td>Value of random variate \( x_i \) or \( y_i \)</td></tr>
   </table>
   </dd>
 </dl>
 <dl class="section return"><dt>Returns</dt><dd>A composite value.<ul>
 <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp">
 \[ z = \frac{u - \frac{mn}{2}}{\sqrt{\frac{mn(m+n+1)}{12}}} \]
 </p>
  where \( u \) is the u-statistic computed as follows. The z-statistic is approximately standard normally distributed.</li>
 <li><code>u_statistic FLOAT8</code> - Statistic \( u = \min \{ u_x, u_y \} \) where <p class="formulaDsp">
 \[ u_x = mn + \binom{m+1}{2} - \sum_{i=1}^m r_{x,i} \]
 </p>
  where <p class="formulaDsp">
 \[ r_{x,i} = |\{ j \mid x_j &lt; x_i \}| + |\{ j \mid y_j &lt; x_i \}| + \frac{|\{ j \mid x_j = x_i \}| + |\{ j \mid y_j = x_i \}| + 1}{2} \]
 </p>
  is defined as the rank of \( x_i \) in the combined list of all \( m+n \) observations. For ties, the average rank of all equal values is used.</li>
 <li><code>p_value_one_sided FLOAT8</code> - Approximate one-sided p-value, i.e., an approximate value for \( \Pr[Z \geq z \mid H_0] \). Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#aebcd34ad7b1ca4b31d9699112c9a3b90">normal_cdf</a>(z_statistic))</code>.</li>
 <li><code>p_value_two_sided FLOAT8</code> - Approximate two-sided p-value, i.e., an approximate value for \( \Pr[|Z| \geq |z| \mid H_0] \). Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#aebcd34ad7b1ca4b31d9699112c9a3b90">normal_cdf</a>(-abs(z_statistic)))</code>.</li>
 </ul>
 </dd></dl>
 <dl class="section user"><dt>Usage:</dt><dd><ul>
 <li>Test null hypothesis that two samples stem from the same distribution: <pre>SELECT (mw_test(<em>first</em>, <em>value</em> ORDER BY <em>value</em>)).* FROM <em>source</em></pre></li>
 </ul>
 </dd></dl>
 <dl class="section note"><dt>Note</dt><dd>This aggregate must be used as an ordered aggregate (<code>ORDER BY <em>value</em></code>) and will raise an exception if values are not ordered. </dd></dl>
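 <dl class="section user"><dt>Example:</dt><dd><p>A sketch for illustration only. It assumes a schema named <code>madlib</code> and a hypothetical table <code>mw_demo</code>. The second query derives \( u_x \) and the analogously defined \( u_y \) from average ranks in plain SQL, following the rank definition above; the smaller of the two is the u-statistic.</p>
 <pre>-- Hypothetical data: m = 5 values in the first sample, n = 4 in the second.
CREATE TEMP TABLE mw_demo (is_first BOOLEAN, val FLOAT8);
INSERT INTO mw_demo VALUES
    (TRUE, 1), (TRUE, 2), (TRUE, 3), (TRUE, 4), (TRUE, 5),
    (FALSE, 3.5), (FALSE, 4.5), (FALSE, 6), (FALSE, 8);

-- Ordered aggregate call (schema name "madlib" is an assumption).
SELECT (madlib.mw_test(is_first, val ORDER BY val)).* FROM mw_demo;

-- By hand: r is the average rank in the combined sample (ties share the mean
-- of their ranks). Here the rank sums are 18 and 27, so u_x = 17 and u_y = 3.
SELECT m * n + m * (m + 1) / 2.0 - rx AS u_x,
       m * n + n * (n + 1) / 2.0 - ry AS u_y
FROM (
    SELECT count(CASE WHEN is_first THEN 1 END) AS m,
           count(CASE WHEN NOT is_first THEN 1 END) AS n,
           sum(CASE WHEN is_first THEN r ELSE 0 END) AS rx,
           sum(CASE WHEN is_first THEN 0 ELSE r END) AS ry
    FROM (
        SELECT is_first,
               rank() OVER (ORDER BY val)
                   + (count(*) OVER (PARTITION BY val) - 1) / 2.0 AS r
        FROM mw_demo
    ) ranked
) sums;</pre>
 </dd></dl>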

 <p>Definition at line <a class="el" href="hypothesis__tests_8sql__in_source.html#l00744">744</a> of file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>

 </div>
 </div>
 <a class="anchor" id="a2d8a6b8665dcc5002c06e3c56379d791"></a>
 <div class="memitem">
 <div class="memproto">
       <table class="memname">
         <tr>
           <td class="memname">aggregate one_way_anova_result one_way_anova </td>
           <td>(</td>
           <td class="paramtype">integer&#160;</td>
           <td class="paramname"><em>group</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>value</em>&#160;</td>
         </tr>
         <tr>
           <td></td>
           <td>)</td>
           <td></td><td></td>
         </tr>
       </table>
 </div><div class="memdoc">
 <p>Given realizations \( x_{1,1}, \dots, x_{1, n_1}, x_{2,1}, \dots, x_{2,n_2}, \dots, x_{k,n_k} \) of i.i.d. random variables \( X_{i,j} \sim N(\mu_i, \sigma^2) \) with unknown parameters \( \mu_1, \dots, \mu_k \) and \( \sigma^2 \), test the null hypothesis \( H_0 : \mu_1 = \dots = \mu_k \).</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">group</td><td>Group which <code>value</code> is from. Note that <code>group</code> can assume arbitrary values and is not limited to a contiguous range of integers. </td></tr>
     <tr><td class="paramname">value</td><td>Value of random variate \( x_{i,j} \)</td></tr>
   </table>
   </dd>
 </dl>
 <dl class="section return"><dt>Returns</dt><dd>A composite value as follows. Let \( n := \sum_{i=1}^k n_i \) be the total size of all samples. Denote by \( \bar x \) the grand mean, by \( \overline{x_i} \) the group sample means, and by \( s_i^2 \) the group sample variances.<ul>
 <li><code>sum_squares_between DOUBLE PRECISION</code> - sum of squares between the group means, i.e., \( \mathit{SS}_b = \sum_{i=1}^k n_i (\overline{x_i} - \bar x)^2. \)</li>
 <li><code>sum_squares_within DOUBLE PRECISION</code> - sum of squares within the groups, i.e., \( \mathit{SS}_w = \sum_{i=1}^k (n_i - 1) s_i^2. \)</li>
 <li><code>df_between BIGINT</code> - degrees of freedom for between-group variation \( (k-1) \)</li>
 <li><code>df_within BIGINT</code> - degrees of freedom for within-group variation \( (n-k) \)</li>
 <li><code>mean_squares_between DOUBLE PRECISION</code> - mean square between groups, i.e., \( s_b^2 := \frac{\mathit{SS}_b}{k-1} \)</li>
 <li><code>mean_squares_within DOUBLE PRECISION</code> - mean square within groups, i.e., \( s_w^2 := \frac{\mathit{SS}_w}{n-k} \)</li>
 <li><code>statistic DOUBLE PRECISION</code> - Statistic computed as <p class="formulaDsp">
 \[ f = \frac{s_b^2}{s_w^2}. \]
 </p>
  This statistic is Fisher F-distributed with \( (k-1) \) degrees of freedom in the numerator and \( (n-k) \) degrees of freedom in the denominator.</li>
 <li><code>p_value DOUBLE PRECISION</code> - p-value, i.e., \( \Pr[ F \geq f \mid H_0] \).</li>
 </ul>
 </dd></dl>
 <dl class="section user"><dt>Usage:</dt><dd><ul>
 <li>Test null hypothesis that the means of all samples are equal: <pre>SELECT (one_way_anova(<em>group</em>, <em>value</em>)).* FROM <em>source</em></pre> </li>
 </ul>
 </dd></dl>
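 <dl class="section user"><dt>Example:</dt><dd><p>A minimal sketch, assuming MADlib is installed in a schema named <code>madlib</code>; the table <code>anova_demo</code> and its columns <code>grp</code> and <code>val</code> are hypothetical (<code>group</code> itself is a reserved word and would need quoting as a column name). The second query recomputes \( \mathit{SS}_b \), \( \mathit{SS}_w \), and \( f \) from the definitions above.</p>
 <pre>-- Hypothetical data: three groups of three values each (k = 3, n = 9).
CREATE TEMP TABLE anova_demo (grp INTEGER, val FLOAT8);
INSERT INTO anova_demo VALUES
    (1, 1), (1, 2), (1, 3),
    (2, 2), (2, 3), (2, 4),
    (3, 5), (3, 6), (3, 7);

-- One-way ANOVA via the aggregate (schema name "madlib" is an assumption).
SELECT (madlib.one_way_anova(grp, val)).* FROM anova_demo;

-- By hand: group means 2, 3, 6 and grand mean 11/3 give ss_b = 26;
-- every group variance is 1, so ss_w = 6 and f = (26 / 2) / (6 / 6) = 13.
SELECT ss_b, ss_w, (ss_b / (k - 1)) / (ss_w / (n - k)) AS f
FROM (
    SELECT sum(n_i * (mean_i - grand_mean)^2) AS ss_b,
           sum((n_i - 1) * var_i) AS ss_w,
           count(*) AS k,
           sum(n_i) AS n
    FROM (SELECT count(*) AS n_i, avg(val) AS mean_i, var_samp(val) AS var_i
          FROM anova_demo GROUP BY grp) per_group,
         (SELECT avg(val) AS grand_mean FROM anova_demo) overall
) s;</pre>
 </dd></dl>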

 <p>Definition at line <a class="el" href="hypothesis__tests_8sql__in_source.html#l00987">987</a> of file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>

 </div>
 </div>
 <a class="anchor" id="ae7197f66a085f53d71167ac0a9029567"></a>
 <div class="memitem">
 <div class="memproto">
       <table class="memname">
         <tr>
           <td class="memname">aggregate t_test_result t_test_one </td>
           <td>(</td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>value</em>)</td><td></td>
           <td></td>
         </tr>
       </table>
 </div><div class="memdoc">
 <p>Given realizations \( x_1, \dots, x_n \) of i.i.d. random variables \( X_1, \dots, X_n \sim N(\mu, \sigma^2) \) with unknown parameters \( \mu \) and \( \sigma^2 \), test the null hypotheses \( H_0 : \mu \leq 0 \) and \( H_0 : \mu = 0 \).</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">value</td><td>Value of random variate \( x_i \)</td></tr>
   </table>
   </dd>
 </dl>
 <dl class="section return"><dt>Returns</dt><dd>A composite value as follows. We denote by \( \bar x \) the sample mean and by \( s^2 \) the sample variance.<ul>
 <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp">
 \[ t = \frac{\sqrt n \cdot \bar x}{s} \]
 </p>
  The corresponding random variable is Student-t distributed with \( (n - 1) \) degrees of freedom.</li>
 <li><code>df FLOAT8</code> - Degrees of freedom \( (n - 1) \)</li>
 <li><code>p_value_one_sided FLOAT8</code> - Lower bound on one-sided p-value. In detail, the result is \( \Pr[\bar X \geq \bar x \mid \mu = 0] \), which is a lower bound on \( \Pr[\bar X \geq \bar x \mid \mu \leq 0] \). Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(statistic))</code>.</li>
 <li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., \( \Pr[ |\bar X| \geq |\bar x| \mid \mu = 0] \). Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(-abs(statistic)))</code>.</li>
 </ul>
 </dd></dl>
 <dl class="section user"><dt>Usage:</dt><dd><ul>
 <li>One-sample t-test: Test null hypothesis that the mean of a sample is at most (or equal to, respectively) \( \mu_0 \): <pre>SELECT (t_test_one(<em>value</em> - <em>mu_0</em>)).* FROM <em>source</em></pre></li>
 <li>Dependent paired t-test: Test null hypothesis that the mean difference between the first and second value in each pair is at most (or equal to, respectively) \( \mu_0 \): <pre>SELECT (t_test_one(<em>first</em> - <em>second</em> - <em>mu_0</em>)).*
               FROM <em>source</em></pre> </li>
 </ul>
 </dd></dl>
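 <p>As a cross-check on the statistic defined above, here is a minimal Python sketch of the same computation. It is not MADlib code: the function name <code>t_statistic_one_sample</code> and the sample data are made up for illustration, and the sample variance is assumed to use the unbiased \( n - 1 \) divisor. The p-values are left out because they need a Student-t CDF.</p>

```python
import math


def t_statistic_one_sample(values, mu_0=0.0):
    """Return (t, df) for a one-sample t-test against mu_0.

    Hypothetical helper; it mirrors t_test_one(value - mu_0).
    """
    shifted = [v - mu_0 for v in values]
    n = len(shifted)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(shifted) / n
    # Assumption: unbiased sample variance s^2 (divisor n - 1).
    variance = sum((v - mean) ** 2 for v in shifted) / (n - 1)
    # t = sqrt(n) * mean / s, with n - 1 degrees of freedom.
    t = math.sqrt(n) * mean / math.sqrt(variance)
    return t, n - 1


t, df = t_statistic_one_sample([5.1, 4.9, 5.3, 5.2, 4.8], mu_0=5.0)
print(round(t, 3), df)
```

 <p>Under these assumptions the values returned should correspond to the <code>statistic</code> and <code>df</code> fields of the composite result.</p>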

 <p>Definition at line <a class="el" href="hypothesis__tests_8sql__in_source.html#l00224">224</a> of file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>

 </div>
 </div>
 <a class="anchor" id="a74e6ef1522197957b5aa35bf67004364"></a>
 <div class="memitem">
 <div class="memproto">
       <table class="memname">
         <tr>
           <td class="memname">aggregate t_test_result t_test_two_pooled </td>
           <td>(</td>
           <td class="paramtype">boolean&#160;</td>
           <td class="paramname"><em>first</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>value</em>&#160;</td>
         </tr>
         <tr>
           <td></td>
           <td>)</td>
           <td></td><td></td>
         </tr>
       </table>
 </div><div class="memdoc">
 <p>Given realizations \( x_1, \dots, x_n \) and \( y_1, \dots, y_m \) of i.i.d. random variables \( X_1, \dots, X_n \sim N(\mu_X, \sigma^2) \) and \( Y_1, \dots, Y_m \sim N(\mu_Y, \sigma^2) \) with unknown parameters \( \mu_X, \mu_Y, \) and \( \sigma^2 \), test the null hypotheses \( H_0 : \mu_X \leq \mu_Y \) and \( H_0 : \mu_X = \mu_Y \).</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">first</td><td>Indicator whether <code>value</code> is from first sample \( x_1, \dots, x_n \) (if <code>TRUE</code>) or from second sample \( y_1, \dots, y_m \) (if <code>FALSE</code>) </td></tr>
     <tr><td class="paramname">value</td><td>Value of random variate \( x_i \) or \( y_i \)</td></tr>
   </table>
   </dd>
 </dl>
 <dl class="section return"><dt>Returns</dt><dd>A composite value as follows. We denote by \( \bar x, \bar y \) the sample means and by \( s_X^2, s_Y^2 \) the sample variances.<ul>
 <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp">
 \[ t = \frac{\bar x - \bar y}{s_p \sqrt{1/n + 1/m}} \]
 </p>
  where <p class="formulaDsp">
 \[ s_p^2 = \frac{\sum_{i=1}^n (x_i - \bar x)^2 + \sum_{i=1}^m (y_i - \bar y)^2} {n + m - 2} \]
 </p>
  is the <em>pooled variance</em>. The corresponding random variable is Student-t distributed with \( (n + m - 2) \) degrees of freedom.</li>
 <li><code>df FLOAT8</code> - Degrees of freedom \( (n + m - 2) \)</li>
 <li><code>p_value_one_sided FLOAT8</code> - Upper bound on one-sided p-value. In detail, the result is \( \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X = \mu_Y] \), which is an upper bound on \( \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X \leq \mu_Y] \) because this probability increases with \( \mu_X - \mu_Y \). Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(statistic))</code>.</li>
 <li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., \( \Pr[ |\bar X - \bar Y| \geq |\bar x - \bar y| \mid \mu_X = \mu_Y] \). Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(-abs(statistic)))</code>.</li>
 </ul>
 </dd></dl>
 <dl class="section user"><dt>Usage:</dt><dd><ul>
 <li>Two-sample pooled t-test: Test null hypothesis that the mean of the first sample is at most (or equal to, respectively) the mean of the second sample: <pre>SELECT (t_test_two_pooled(<em>first</em>, <em>value</em>)).* FROM <em>source</em></pre> </li>
 </ul>
 </dd></dl>
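 <p>The pooled statistic above can be reproduced with a few lines of plain Python. This is an illustrative sketch only, not the MADlib implementation; the function name <code>t_statistic_pooled</code> and the two small samples are assumptions chosen for the example.</p>

```python
import math


def t_statistic_pooled(xs, ys):
    """Return (t, df) for the two-sample pooled t-test.

    Hypothetical helper; xs plays the role of the rows with first = TRUE,
    ys the rows with first = FALSE.
    """
    n, m = len(xs), len(ys)
    if n == 0 or m == 0 or n + m < 3:
        raise ValueError("need both samples and at least three values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / m
    # Sums of squared deviations from each sample mean.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    # Pooled variance s_p^2 with n + m - 2 degrees of freedom.
    pooled_variance = (ss_x + ss_y) / (n + m - 2)
    t = (mean_x - mean_y) / (math.sqrt(pooled_variance) * math.sqrt(1 / n + 1 / m))
    return t, n + m - 2


t, df = t_statistic_pooled([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(round(t, 4), df)
```

 <p>The sketch pools the two sums of squares before dividing, which is only appropriate when both samples are assumed to share the same variance \( \sigma^2 \).</p>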

 <p>Definition at line <a class="el" href="hypothesis__tests_8sql__in_source.html#l00300">300</a> of file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>

 </div>
 </div>
 <a class="anchor" id="aa95e5a0c8b4841c113c84a393b8b4868"></a>
 <div class="memitem">
 <div class="memproto">
       <table class="memname">
         <tr>
           <td class="memname">aggregate t_test_result t_test_two_unpooled </td>
           <td>(</td>
           <td class="paramtype">boolean&#160;</td>
           <td class="paramname"><em>first</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>value</em>&#160;</td>
         </tr>
         <tr>
           <td></td>
           <td>)</td>
           <td></td><td></td>
         </tr>
       </table>
 </div><div class="memdoc">
 <p>Given realizations \( x_1, \dots, x_n \) and \( y_1, \dots, y_m \) of i.i.d. random variables \( X_1, \dots, X_n \sim N(\mu_X, \sigma_X^2) \) and \( Y_1, \dots, Y_m \sim N(\mu_Y, \sigma_Y^2) \) with unknown parameters \( \mu_X, \mu_Y, \sigma_X^2, \) and \( \sigma_Y^2 \), test the null hypotheses \( H_0 : \mu_X \leq \mu_Y \) and \( H_0 : \mu_X = \mu_Y \).</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">first</td><td>Indicator whether <code>value</code> is from first sample \( x_1, \dots, x_n \) (if <code>TRUE</code>) or from second sample \( y_1, \dots, y_m \) (if <code>FALSE</code>) </td></tr>
     <tr><td class="paramname">value</td><td>Value of random variate \( x_i \) or \( y_i \)</td></tr>
   </table>
   </dd>
 </dl>
 <dl class="section return"><dt>Returns</dt><dd>A composite value as follows. We denote by \( \bar x, \bar y \) the sample means and by \( s_X^2, s_Y^2 \) the sample variances.<ul>
 <li><code>statistic FLOAT8</code> - Statistic <p class="formulaDsp">
 \[ t = \frac{\bar x - \bar y}{\sqrt{s_X^2/n + s_Y^2/m}} \]
 </p>
  The corresponding random variable is approximately Student-t distributed with <p class="formulaDsp">
 \[ \frac{(s_X^2 / n + s_Y^2 / m)^2}{(s_X^2 / n)^2/(n-1) + (s_Y^2 / m)^2/(m-1)} \]
 </p>
  degrees of freedom (Welch–Satterthwaite formula).</li>
 <li><code>df FLOAT8</code> - Degrees of freedom (as above)</li>
 <li><code>p_value_one_sided FLOAT8</code> - Upper bound on one-sided p-value. In detail, the result is \( \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X = \mu_Y] \), which is an upper bound on \( \Pr[\bar X - \bar Y \geq \bar x - \bar y \mid \mu_X \leq \mu_Y] \) because this probability increases with \( \mu_X - \mu_Y \). Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(statistic))</code>.</li>
 <li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., \( \Pr[ |\bar X - \bar Y| \geq |\bar x - \bar y| \mid \mu_X = \mu_Y] \). Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#a5322531131074c23a2dbf067ee504ef7">students_t_cdf</a>(-abs(statistic)))</code>.</li>
 </ul>
 </dd></dl>
 <dl class="section user"><dt>Usage:</dt><dd><ul>
 <li>Two-sample unpooled t-test: Test null hypothesis that the mean of the first sample is at most (or equal to, respectively) the mean of the second sample: <pre>SELECT (t_test_two_unpooled(<em>first</em>, <em>value</em>)).* FROM <em>source</em></pre> </li>
 </ul>
 </dd></dl>
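 <p>To make the Welch-Satterthwaite formula above concrete, the following Python sketch computes the unpooled statistic and its approximate degrees of freedom. It is a reference calculation under stated assumptions, not MADlib code: the function name <code>t_statistic_unpooled</code> and the data are hypothetical, and each sample variance uses the unbiased divisor.</p>

```python
import math


def t_statistic_unpooled(xs, ys):
    """Return (t, df) for the two-sample unpooled (Welch) t-test.

    Hypothetical helper; df follows the Welch-Satterthwaite formula.
    """
    n, m = len(xs), len(ys)
    if n < 2 or m < 2:
        raise ValueError("need at least two values per sample")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / m
    # Assumption: unbiased sample variances s_X^2 and s_Y^2.
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (m - 1)
    a = var_x / n  # s_X^2 / n
    b = var_y / m  # s_Y^2 / m
    t = (mean_x - mean_y) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n - 1) + b ** 2 / (m - 1))
    return t, df


t, df = t_statistic_unpooled([1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0])
print(round(t, 4), round(df, 4))
```

 <p>Unlike the pooled test, the degrees of freedom here are generally not an integer, which is why <code>df</code> is reported as <code>FLOAT8</code>.</p>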

 <p>Definition at line <a class="el" href="hypothesis__tests_8sql__in_source.html#l00364">364</a> of file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>

 </div>
 </div>
 <a class="anchor" id="afea2309e99477df6ebbfbcea11272507"></a>
 <div class="memitem">
 <div class="memproto">
       <table class="memname">
         <tr>
           <td class="memname">aggregate wsr_test_result wsr_test </td>
           <td>(</td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>value</em>, </td>
         </tr>
         <tr>
           <td class="paramkey"></td>
           <td></td>
           <td class="paramtype">float8&#160;</td>
           <td class="paramname"><em>precision</em> = <code>-1</code>&#160;</td>
         </tr>
         <tr>
           <td></td>
           <td>)</td>
           <td></td><td></td>
         </tr>
       </table>
 </div><div class="memdoc">
 <p>Given realizations \( x_1, \dots, x_n \) of i.i.d. random variables \( X_1, \dots, X_n \) with unknown mean \( \mu \), test the null hypotheses \( H_0 : \mu \leq 0 \) and \( H_0 : \mu = 0 \).</p>
 <dl class="params"><dt>Parameters</dt><dd>
   <table class="params">
     <tr><td class="paramname">value</td><td>Value of random variate \( x_i \). Values of 0 are ignored (i.e., they do not count towards \( n \)). </td></tr>
     <tr><td class="paramname">precision</td><td>The precision \( \epsilon_i \) with which <code>value</code> is known. The precision determines the handling of ties. The current value \( v_i \) is regarded as a tie with the previous value \( v_{i-1} \) if \( v_i - \epsilon_i \leq \max_{j=1, \dots, i-1} v_j + \epsilon_j \). If <code>precision</code> is negative, then it will be treated as <code>value * 2^(-52)</code>. (Note that \( 2^{-52} \) is the machine epsilon for type <code>DOUBLE PRECISION</code>.)</td></tr>
   </table>
   </dd>
 </dl>
 <dl class="section return"><dt>Returns</dt><dd>A composite value:<ul>
 <li><code>statistic FLOAT8</code> - statistic computed as follows. Let \( w^+ = \sum_{i \mid x_i &gt; 0} r_i \) and \( w^- = \sum_{i \mid x_i &lt; 0} r_i \) be the <em>signed rank sums</em> where <p class="formulaDsp">
 \[ r_i = \bigl| \{ j \mid |x_j| &lt; |x_i| \} \bigr| + \frac{\bigl| \{ j \mid |x_j| = |x_i| \} \bigr| + 1}{2}. \]
 </p>
  The Wilcoxon signed-rank statistic is \( w = \min \{ w^+, w^- \} \).</li>
 <li><code>rank_sum_pos FLOAT8</code> - rank sum of all positive values, i.e., \( w^+ \)</li>
 <li><code>rank_sum_neg FLOAT8</code> - rank sum of all negative values, i.e., \( w^- \)</li>
 <li><code>num BIGINT</code> - number \( n \) of non-zero values</li>
 <li><code>z_statistic FLOAT8</code> - z-statistic <p class="formulaDsp">
 \[ z = \frac{w^+ - \frac{n(n+1)}{4}} {\sqrt{\frac{n(n+1)(2n+1)}{24} - \sum_{i=1}^n \frac{t_i^2 - 1}{48}}} \]
 </p>
  where \( t_i \) is the number of values with absolute value equal to \( |x_i| \). The corresponding random variable is approximately standard normally distributed.</li>
 <li><code>p_value_one_sided FLOAT8</code> - One-sided p-value, i.e., \( \Pr[Z \geq z \mid \mu \leq 0] \). Computed as <code>(1.0 - <a class="el" href="prob_8sql__in.html#aebcd34ad7b1ca4b31d9699112c9a3b90">normal_cdf</a>(z_statistic))</code>.</li>
 <li><code>p_value_two_sided FLOAT8</code> - Two-sided p-value, i.e., \( \Pr[ |Z| \geq |z| \mid \mu = 0] \). Computed as <code>(2 * <a class="el" href="prob_8sql__in.html#aebcd34ad7b1ca4b31d9699112c9a3b90">normal_cdf</a>(-abs(z_statistic)))</code>.</li>
 </ul>
 </dd></dl>
 <dl class="section user"><dt>Usage:</dt><dd><ul>
 <li>One-sample test: Test null hypothesis that the mean of a sample is at most (or equal to, respectively) \( \mu_0 \): <pre>SELECT (wsr_test(<em>value</em> - <em>mu_0</em> ORDER BY abs(<em>value</em>))).* FROM <em>source</em></pre></li>
 <li>Dependent paired test: Test null hypothesis that the mean difference between the first and second value in a pair is at most (or equal to, respectively) \( \mu_0 \): <pre>SELECT (wsr_test(<em>first</em> - <em>second</em> - <em>mu_0</em> ORDER BY abs(<em>first</em> - <em>second</em>))).* FROM <em>source</em></pre> If correctly determining ties is important (e.g., you may want to do so when comparing to software products that take <code>first</code>, <code>second</code>, and <code>mu_0</code> as individual parameters), supply the precision parameter. This can be done as follows: <pre>SELECT (wsr_test(
     <em>first</em> - <em>second</em> - <em>mu_0</em>,
     3 * 2^(-52) * greatest(<em>first</em>, <em>second</em>, <em>mu_0</em>)
     ORDER BY abs(<em>first</em> - <em>second</em>)
 )).* FROM <em>source</em></pre> Here \( 2^{-52} \) is the machine epsilon, which we scale to the magnitude of the input data and multiply by 3 because we have a sum with three terms.</li>
 </ul>
 </dd></dl>
 <dl class="section note"><dt>Note</dt><dd>This aggregate must be used as an ordered aggregate (<code>ORDER BY abs(<em>value</em>)</code>) and will raise an exception if the absolute values are not ordered. </dd></dl>
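 <p>The rank sums and z-statistic described above can be illustrated with a small Python sketch. It is a plain quadratic-time reference calculation, not the ordered aggregate that MADlib implements: the function name <code>wilcoxon_signed_rank</code> and the data are hypothetical, and ties are detected by exact equality of absolute values, so the <code>precision</code> parameter is not modelled.</p>

```python
import math


def wilcoxon_signed_rank(values):
    """Return the signed-rank sums, w, n and z for the given values.

    Hypothetical helper; zeros are dropped, ties use exact equality.
    """
    nonzero = [v for v in values if v != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("need at least one non-zero value")
    abs_values = [abs(v) for v in nonzero]
    rank_sum_pos = 0.0
    rank_sum_neg = 0.0
    tie_term = 0.0
    for v in nonzero:
        smaller = sum(1 for a in abs_values if a < abs(v))
        equal = sum(1 for a in abs_values if a == abs(v))  # t_i, counts v itself
        rank = smaller + (equal + 1) / 2  # average rank within a tie group
        if v > 0:
            rank_sum_pos += rank
        else:
            rank_sum_neg += rank
        tie_term += (equal ** 2 - 1) / 48
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term
    z = (rank_sum_pos - n * (n + 1) / 4) / math.sqrt(variance)
    return {
        "statistic": min(rank_sum_pos, rank_sum_neg),
        "rank_sum_pos": rank_sum_pos,
        "rank_sum_neg": rank_sum_neg,
        "num": n,
        "z_statistic": z,
    }


result = wilcoxon_signed_rank([1.5, -0.5, 2.0, 0.0, -1.5, 3.0])
print(result)
```

 <p>In this example the values 1.5 and -1.5 tie in absolute value and share the average rank 2.5, and the zero is dropped before ranking.</p>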

 <p>Definition at line <a class="el" href="hypothesis__tests_8sql__in_source.html#l00872">872</a> of file <a class="el" href="hypothesis__tests_8sql__in_source.html">hypothesis_tests.sql_in</a>.</p>

 </div>
 </div>
 </div><!-- contents -->
 </div><!-- doc-content -->
 <!-- start footer part -->
 <div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
   <ul>
     <li class="navelem"><a class="el" href="dir_68267d1309a1af8e8297ef4c3efbcdba.html">src</a></li><li class="navelem"><a class="el" href="dir_efbcf68973d247bbf15f9eecae7f24e3.html">ports</a></li><li class="navelem"><a class="el" href="dir_a4a48839224ef8488facbffa8a397967.html">postgres</a></li><li class="navelem"><a class="el" href="dir_dc596537ad427a4d866006d1a3e1fe29.html">modules</a></li><li class="navelem"><a class="el" href="dir_505cd743a8a717435eca324f49291a46.html">stats</a></li><li class="navelem"><a class="el" href="hypothesis__tests_8sql__in.html">hypothesis_tests.sql_in</a></li>
     <li class="footer">Generated on Tue Sep 10 2013 15:48:04 for MADlib by
     <a href="http://www.doxygen.org/index.html">
     <img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.4 </li>
   </ul>
 </div>
 </body>
 </html>