blob: 407aa29955cf4927b691e3167f2d9d9c0f8d9e22 [file] [log] [blame]
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="copyright" content="(C) Copyright 2025" />
<meta name="DC.rights.owner" content="(C) Copyright 2025" />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="STDDEV, STDDEV_SAMP, STDDEV_POP Functions" />
<meta name="DC.Relation" scheme="URI" content="../topics/impala_aggregate_functions.html" />
<meta name="prodname" content="Impala" />
<meta name="prodname" content="Impala" />
<meta name="version" content="Impala 3.4.x" />
<meta name="version" content="Impala 3.4.x" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="stddev" />
<link rel="stylesheet" type="text/css" href="../commonltr.css" />
<title>STDDEV, STDDEV_SAMP, STDDEV_POP Functions</title>
</head>
<body id="stddev">
<h1 class="title topictitle1" id="ariaid-title1">STDDEV, STDDEV_SAMP, STDDEV_POP Functions</h1>
<div class="body conbody">
<p class="p">
An aggregate function that
<a class="xref" href="http://en.wikipedia.org/wiki/Standard_deviation" target="_blank">standard
deviation</a> of a set of numbers.
</p>
<p class="p">
<strong class="ph b">Syntax:</strong>
</p>
<pre class="pre codeblock"><code>{ STDDEV | STDDEV_SAMP | STDDEV_POP } ([DISTINCT | ALL] <var class="keyword varname">expression</var>)</code></pre>
<p class="p">
This function works with any numeric data type.
</p>
<p class="p">
<strong class="ph b">Return type:</strong> <code class="ph codeph">DOUBLE</code> in Impala 2.0 and higher;
<code class="ph codeph">STRING</code> in earlier releases
</p>
<p class="p">
This function is typically used in mathematical formulas related to probability distributions.
</p>
<p class="p">
The <code class="ph codeph">STDDEV_POP()</code> and <code class="ph codeph">STDDEV_SAMP()</code> functions compute the population
standard deviation and sample standard deviation, respectively, of the input values.
(<code class="ph codeph">STDDEV()</code> is an alias for <code class="ph codeph">STDDEV_SAMP()</code>.) Both functions evaluate all input
rows matched by the query. The difference is that <code class="ph codeph">STDDEV_SAMP()</code> is scaled by
<code class="ph codeph">1/(N-1)</code> while <code class="ph codeph">STDDEV_POP()</code> is scaled by <code class="ph codeph">1/N</code>.
</p>
<p class="p">
If no input rows match the query, the result of any of these functions is <code class="ph codeph">NULL</code>. If a single
input row matches the query, the result of any of these functions is <code class="ph codeph">"0.0"</code>.
</p>
<p class="p">
<strong class="ph b">Examples:</strong>
</p>
<p class="p">
This example demonstrates how <code class="ph codeph">STDDEV()</code> and <code class="ph codeph">STDDEV_SAMP()</code> return the same
result, while <code class="ph codeph">STDDEV_POP()</code> uses a slightly different calculation to reflect that the input
data is considered part of a larger <span class="q">"population"</span>.
</p>
<pre class="pre codeblock"><code>[localhost:21000] &gt; select stddev(score) from test_scores;
+---------------+
| stddev(score) |
+---------------+
| 28.5 |
+---------------+
[localhost:21000] &gt; select stddev_samp(score) from test_scores;
+--------------------+
| stddev_samp(score) |
+--------------------+
| 28.5 |
+--------------------+
[localhost:21000] &gt; select stddev_pop(score) from test_scores;
+-------------------+
| stddev_pop(score) |
+-------------------+
| 28.4858 |
+-------------------+
</code></pre>
<p class="p">
This example demonstrates that, because the return value of these aggregate functions is a
<code class="ph codeph">STRING</code>, you must currently convert the result with <code class="ph codeph">CAST</code>.
</p>
<pre class="pre codeblock"><code>[localhost:21000] &gt; create table score_stats as select cast(stddev(score) as decimal(7,4)) `standard_deviation`, cast(variance(score) as decimal(7,4)) `variance` from test_scores;
+-------------------+
| summary |
+-------------------+
| Inserted 1 row(s) |
+-------------------+
[localhost:21000] &gt; desc score_stats;
+--------------------+--------------+---------+
| name | type | comment |
+--------------------+--------------+---------+
| standard_deviation | decimal(7,4) | |
| variance | decimal(7,4) | |
+--------------------+--------------+---------+
</code></pre>
<p class="p">
<strong class="ph b">Restrictions:</strong>
</p>
<p class="p">
This function cannot be used in an analytic context. That is, the
<code class="ph codeph">OVER()</code> clause is not allowed at all with this function.
</p>
<p class="p">
<strong class="ph b">Related information:</strong>
</p>
<p class="p">
The <code class="ph codeph">STDDEV()</code>, <code class="ph codeph">STDDEV_POP()</code>, and <code class="ph codeph">STDDEV_SAMP()</code> functions
compute the standard deviation (square root of the variance) based on the results of
<code class="ph codeph">VARIANCE()</code>, <code class="ph codeph">VARIANCE_POP()</code>, and <code class="ph codeph">VARIANCE_SAMP()</code>
respectively. See <a class="xref" href="impala_variance.html#variance">VARIANCE, VARIANCE_SAMP, VARIANCE_POP, VAR_SAMP, VAR_POP Functions</a> for details about the variance property.
</p>
</div>
<div class="related-links">
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_aggregate_functions.html">Impala Aggregate Functions</a></div>
</div>
</div></body>
</html>