| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
| <html><head><title>R: Calculates the approximate quantiles of a numerical column of...</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| <link rel="stylesheet" type="text/css" href="R.css"> |
| |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> |
| <script>hljs.initHighlightingOnLoad();</script> |
| </head><body> |
| |
| <table width="100%" summary="page for approxQuantile {SparkR}"><tr><td>approxQuantile {SparkR}</td><td align="right">R Documentation</td></tr></table> |
| |
| <h2>Calculates the approximate quantiles of a numerical column of a SparkDataFrame</h2> |
| |
| <h3>Description</h3> |
| |
| <p>Calculates the approximate quantiles of a numerical column of a SparkDataFrame. |
| The result of this algorithm has the following deterministic bound: |
| if the SparkDataFrame has N elements and we request the quantile at probability p up to |
| error err, then the algorithm returns a sample x from the SparkDataFrame such that the |
| <em>exact</em> rank of x is close to (p * N). More precisely, |
| floor((p - err) * N) <= rank(x) <= ceil((p + err) * N). |
| This method implements a variation of the Greenwald-Khanna algorithm (with some speed |
| optimizations). The algorithm was first presented in |
| <a href="http://dx.doi.org/10.1145/375663.375670">Space-efficient Online Computation of Quantile Summaries</a> |
| by Greenwald and Khanna. |
| </p> |
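
<p>The rank bound can be checked by hand on a small local vector. The following is a
minimal base R sketch and does not call SparkR: the helper <code>rank_window</code> and the
sample values are hypothetical and only evaluate the inequality stated above for a given
N, p, and err.</p>

<pre><code class="r"># Hypothetical helper (not part of SparkR): the range of exact ranks that
# the documented bound allows for probability p with error err.
rank_window <- function(n, p, err) {
  c(lower = floor((p - err) * n), upper = ceiling((p + err) * n))
}

values <- 1:1000   # stand-in for a numerical column with N = 1000
w <- rank_window(length(values), p = 0.5, err = 0.01)
print(w)

# Any element whose exact rank falls inside the window satisfies the bound.
candidate <- 497
r <- sum(values <= candidate)   # exact rank of the candidate within values
stopifnot(r >= w[["lower"]], r <= w[["upper"]])
</code></pre>

<p>With err set to 0 the window shrinks to ranks floor(p * N) through ceil(p * N), which
corresponds to requesting exact quantiles.</p>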
| |
| |
| <h3>Usage</h3> |
| |
| <pre> |
| ## S4 method for signature 'SparkDataFrame,character,numeric,numeric' |
| approxQuantile(x, col, |
| probabilities, relativeError) |
| </pre> |
| |
| |
| <h3>Arguments</h3> |
| |
| <table summary="R argblock"> |
| <tr valign="top"><td><code>x</code></td> |
| <td> |
| <p>A SparkDataFrame.</p> |
| </td></tr> |
| <tr valign="top"><td><code>col</code></td> |
| <td> |
| <p>The name of the numerical column.</p> |
| </td></tr> |
| <tr valign="top"><td><code>probabilities</code></td> |
| <td> |
| <p>A numeric vector of quantile probabilities. Each value must lie in [0, 1]. |
| For example, 0 is the minimum, 0.5 is the median, and 1 is the maximum.</p> |
| </td></tr> |
| <tr valign="top"><td><code>relativeError</code></td> |
| <td> |
| <p>The relative target precision to achieve (>= 0). If set to zero, |
| the exact quantiles are computed, which could be very expensive. |
| Note that values greater than 1 are accepted but give the same result as 1.</p> |
| </td></tr> |
| </table> |
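
<p>How <code>relativeError</code> might be varied in practice can be sketched as follows.
This assumes a working Spark installation and the SparkR package. The column name
<code>v</code> and the generated data are illustrative and not part of the original
documentation. No output is shown because the approximate results depend on the data.</p>

<pre><code class="r">library(SparkR)
sparkR.session()   # assumes Spark is installed and reachable

# Illustrative data: a single numerical column named "v" (hypothetical name).
df <- createDataFrame(data.frame(v = as.numeric(1:10000)))

# relativeError = 0 requests exact quantiles, which can be expensive.
exact <- approxQuantile(df, "v", c(0.25, 0.5, 0.75), 0)

# A small positive relativeError lets the rank of each returned value
# deviate from the requested rank by roughly relativeError * N.
approximate <- approxQuantile(df, "v", c(0.25, 0.5, 0.75), 0.01)

sparkR.session.stop()
</code></pre>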
| |
| |
| <h3>Value</h3> |
| |
| <p>The approximate quantiles at the given probabilities. |
| </p> |
| |
| |
| <h3>Note</h3> |
| |
| <p>approxQuantile since 2.0.0 |
| </p> |
| |
| |
| <h3>See Also</h3> |
| |
| <p>Other stat functions: <code><a href="corr.html">corr</a></code>, |
| <code><a href="corr.html">corr,Column-method</a></code>, |
| <code><a href="corr.html">corr,SparkDataFrame-method</a></code>; |
| <code><a href="cov.html">cov</a></code>, |
| <code><a href="cov.html">cov,SparkDataFrame-method</a></code>, |
| <code><a href="cov.html">cov,characterOrColumn-method</a></code>, |
| <code><a href="cov.html">covar_samp</a></code>, |
| <code><a href="cov.html">covar_samp,characterOrColumn,characterOrColumn-method</a></code>; |
| <code><a href="crosstab.html">crosstab</a></code>, |
| <code><a href="crosstab.html">crosstab,SparkDataFrame,character,character-method</a></code>; |
| <code><a href="freqItems.html">freqItems</a></code>, |
| <code><a href="freqItems.html">freqItems,SparkDataFrame,character-method</a></code>; |
| <code><a href="sampleBy.html">sampleBy</a></code>, |
| <code><a href="sampleBy.html">sampleBy,SparkDataFrame,character,list,numeric-method</a></code> |
| </p> |
| |
| |
| <h3>Examples</h3> |
| |
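| <pre><code class="r">## Not run:  |
| ##D sparkR.session() |
| ##D df <- read.json("/path/to/file.json") |
| ##D # relativeError = 0.0 requests the exact median and 80th percentile of column "key" |
| ##D quantiles <- approxQuantile(df, "key", c(0.5, 0.8), 0.0) |
| ## End(Not run) |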
| </code></pre> |
| |
| |
| <hr><div align="center">[Package <em>SparkR</em> version 2.1.1 <a href="00Index.html">Index</a>]</div> |
| </body></html> |