<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Finding frequent items for columns, possibly with false...</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="R.css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
<script>hljs.initHighlightingOnLoad();</script>
</head><body>
<table width="100%" summary="page for freqItems {SparkR}"><tr><td>freqItems {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>
<h2>Finding frequent items for columns, possibly with false positives</h2>
<h3>Description</h3>
<p>Finds frequent items in the given columns, possibly with false positives.
Uses the frequent element count algorithm proposed by Karp, Schenker, and Papadimitriou, described in
<a href="http://dx.doi.org/10.1145/762471.762473">http://dx.doi.org/10.1145/762471.762473</a>.
</p>
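<p>To illustrate why false positives can occur, the following is a minimal base R sketch of the counter-based idea behind frequent element counting. It is not SparkR's internal implementation: the function name <code>freq_candidates</code> is hypothetical, the input is assumed to be a plain vector without missing values, and the sketch keeps at most <code>ceiling(1 / support) - 1</code> counters. Every item whose relative frequency exceeds <code>support</code> is guaranteed to be returned, but infrequent items may be returned as well.</p>
<pre><code class="r"># Sketch of counter-based frequent element counting (assumption: not SparkR's own code).
# freq_candidates() is a hypothetical helper used only for illustration.
freq_candidates &lt;- function(items, support = 0.01) {
  stopifnot(support &gt; 0, support &lt;= 1)
  k &lt;- ceiling(1 / support)
  counts &lt;- integer(0)  # named integer vector: item -&gt; counter
  for (item in as.character(items)) {
    if (item %in% names(counts)) {
      # Item is already tracked: increment its counter.
      counts[[item]] &lt;- counts[[item]] + 1L
    } else if (length(counts) &lt; k - 1) {
      # A free counter is available: start tracking the item.
      counts[[item]] &lt;- 1L
    } else {
      # No free counter: decrement every counter and drop those that reach zero.
      counts &lt;- counts - 1L
      counts &lt;- counts[counts &gt; 0L]
    }
  }
  # Candidates: a superset of the items with relative frequency above support.
  names(counts)
}

# &quot;a&quot; occurs in 4 of 7 positions, which is above the 0.5 threshold.
freq_candidates(c(&quot;a&quot;, &quot;b&quot;, &quot;a&quot;, &quot;c&quot;, &quot;a&quot;, &quot;d&quot;, &quot;a&quot;), support = 0.5)  # &quot;a&quot;

# False positive: &quot;c&quot; occurs in only 1 of 3 positions but is still returned.
freq_candidates(c(&quot;a&quot;, &quot;b&quot;, &quot;c&quot;), support = 0.5)  # &quot;c&quot;
</code></pre>
<p>A second pass that counts the exact occurrences of each candidate would remove the false positives; the single-pass sketch trades that accuracy for bounded memory.</p>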
<h3>Usage</h3>
<pre>
## S4 method for signature 'SparkDataFrame,character'
freqItems(x, cols, support = 0.01)
</pre>
<h3>Arguments</h3>
<table summary="R argblock">
<tr valign="top"><td><code>x</code></td>
<td>
<p>A SparkDataFrame.</p>
</td></tr>
<tr valign="top"><td><code>cols</code></td>
<td>
<p>A vector of column names in which to search for frequent items.</p>
</td></tr>
<tr valign="top"><td><code>support</code></td>
<td>
<p>(Optional) The minimum frequency for an item to be considered <code>frequent</code>.
Should be greater than 1e-4. Defaults to <code>support = 0.01</code>.</p>
</td></tr>
</table>
<h3>Value</h3>
<p>A local R data.frame containing the frequent items found in each column.
</p>
<h3>Note</h3>
<p>freqItems since 1.6.0
</p>
<h3>See Also</h3>
<p>Other stat functions: <code><a href="approxQuantile.html">approxQuantile</a></code>,
<code><a href="corr.html">corr</a></code>, <code><a href="cov.html">cov</a></code>,
<code><a href="crosstab.html">crosstab</a></code>, <code><a href="sampleBy.html">sampleBy</a></code>
</p>
<h3>Examples</h3>
<pre><code class="r">## Not run:
##D df &lt;- read.json(&quot;/path/to/file.json&quot;)
##D fi &lt;- freqItems(df, c(&quot;title&quot;, &quot;gender&quot;))
## End(Not run)
</code></pre>
<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.4.0 <a href="00Index.html">Index</a>]</div>
</body></html>