<!DOCTYPE html><html><head><title>R: PrefixSpan</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css">
<script type="text/javascript">
const macros = { "\\R": "\\textsf{R}", "\\code": "\\texttt"};
function processMathHTML() {
var l = document.getElementsByClassName('reqn');
for (let e of l) { katex.render(e.textContent, e, { throwOnError: false, macros }); }
return;
}</script>
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.js"
onload="processMathHTML();"></script>
<link rel="stylesheet" type="text/css" href="R.css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
<script>hljs.initHighlightingOnLoad();</script>
</head><body><div class="container">
<table style="width: 100%;"><tr><td>spark.findFrequentSequentialPatterns {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>
<h2>PrefixSpan</h2>
<h3>Description</h3>
<p>A parallel PrefixSpan algorithm to mine frequent sequential patterns.
<code>spark.findFrequentSequentialPatterns</code> returns a complete set of frequent sequential
patterns.
For more details, see
<a href="https://spark.apache.org/docs/latest/mllib-frequent-pattern-mining.html#prefixspan">
PrefixSpan</a>.
</p>
<h3>Usage</h3>
<pre><code class='language-R'>spark.findFrequentSequentialPatterns(data, ...)
## S4 method for signature 'SparkDataFrame'
spark.findFrequentSequentialPatterns(
data,
minSupport = 0.1,
maxPatternLength = 10L,
maxLocalProjDBSize = 32000000L,
sequenceCol = "sequence"
)
</code></pre>
<h3>Arguments</h3>
<table>
<tr style="vertical-align: top;"><td><code>data</code></td>
<td>
<p>A SparkDataFrame.</p>
</td></tr>
<tr style="vertical-align: top;"><td><code>...</code></td>
<td>
<p>Additional argument(s) passed to the method.</p>
</td></tr>
<tr style="vertical-align: top;"><td><code>minSupport</code></td>
<td>
<p>Minimal support level: the minimum fraction of input sequences in which a pattern must appear for it to be considered frequent.</p>
</td></tr>
<tr style="vertical-align: top;"><td><code>maxPatternLength</code></td>
<td>
<p>Maximal pattern length. Frequent patterns longer than this are not included in the results.</p>
</td></tr>
<tr style="vertical-align: top;"><td><code>maxLocalProjDBSize</code></td>
<td>
<p>Maximum number of items (including delimiters used in the internal
storage format) allowed in a projected database before local
processing.</p>
</td></tr>
<tr style="vertical-align: top;"><td><code>sequenceCol</code></td>
<td>
<p>Name of the sequence column in the dataset.</p>
</td></tr>
</table>
<h3>Value</h3>
<p>A complete set of frequent sequential patterns in the input sequences of itemsets.
The returned <code>SparkDataFrame</code> contains a sequence column and its corresponding
frequency. Its schema is
<code>sequence: ArrayType(ArrayType(T))</code>, <code>freq: integer</code>,
where T is the item type.
</p>
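<p>As a sketch of how the result can be inspected (assuming an active Spark session and
a <code>df</code> built as in the Examples section; the result is an ordinary
<code>SparkDataFrame</code>, so the usual SparkR verbs apply):</p>
<pre><code class="language-r"># Hypothetical sketch, not run: mine patterns and examine them.
patterns &lt;- spark.findFrequentSequentialPatterns(df, minSupport = 0.5)

head(patterns)    # first rows of the sequence / freq columns
count(patterns)   # total number of frequent patterns found

# Keep only patterns supported by at least two input sequences.
showDF(filter(patterns, patterns$freq &gt;= 2))
</code></pre>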
<h3>Note</h3>
<p>spark.findFrequentSequentialPatterns(SparkDataFrame) since 3.0.0
</p>
<h3>Examples</h3>
<pre><code class="language-r">## Not run:
##D df &lt;- createDataFrame(list(list(list(list(1L, 2L), list(3L))),
##D                            list(list(list(1L), list(3L, 2L), list(1L, 2L))),
##D                            list(list(list(1L, 2L), list(5L))),
##D                            list(list(list(6L)))),
##D                       schema = c(&quot;sequence&quot;))
##D frequency &lt;- spark.findFrequentSequentialPatterns(df, minSupport = 0.5, maxPatternLength = 5L,
##D                                                   maxLocalProjDBSize = 32000000L)
##D showDF(frequency)
##D showDF(frequency)
## End(Not run)
</code></pre>
<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 3.2.3 <a href="00Index.html">Index</a>]</div>
</div>
</body></html>