<!DOCTYPE html><html><head><title>R: Repartition by range</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css">
<script type="text/javascript">
const macros = { "\\R": "\\textsf{R}", "\\code": "\\texttt"};
function processMathHTML() {
var l = document.getElementsByClassName('reqn');
for (let e of l) { katex.render(e.textContent, e, { throwOnError: false, macros }); }
return;
}</script>
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.js"
onload="processMathHTML();"></script>
<link rel="stylesheet" type="text/css" href="R.css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
<script>hljs.initHighlightingOnLoad();</script>
</head><body><div class="container">
<table style="width: 100%;"><tr><td>repartitionByRange {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>
<h2>Repartition by range</h2>
<h3>Description</h3>
<p>Range partitioning can be requested in one of two ways:
</p>
<ul>
<li><p>1. Return a new SparkDataFrame range partitioned by
the given column(s) into <code>numPartitions</code> partitions.
</p>
</li>
<li><p>2. Return a new SparkDataFrame range partitioned by the given column(s),
using <code>spark.sql.shuffle.partitions</code> as the number of partitions.
</p>
</li></ul>
<p>At least one partition-by expression must be specified.
When no explicit sort order is specified, &quot;ascending nulls first&quot; is assumed.
</p>
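<p>As an illustration of the sort-order rule, the sketch below partitions a small local data set first with the default ordering and then with an explicit descending order. The data and the column names <code>id</code> and <code>score</code> are hypothetical and chosen only for this example; it assumes a local Spark installation is available to <code>sparkR.session()</code>.
</p>
<pre><code class="r">library(SparkR)
sparkR.session()

# Hypothetical local data; column names are illustrative only.
df &lt;- createDataFrame(data.frame(id = 1:100, score = (1:100) %% 7))

# No explicit sort order: "ascending nulls first" is assumed for id.
byIdAsc &lt;- repartitionByRange(df, 4L, col = df$id)

# Explicit sort order: descending on score, then ascending on id.
byScoreDesc &lt;- repartitionByRange(df, 4L, col = desc(df$score), df$id)

# Inspect how many partitions the result has.
getNumPartitions(byScoreDesc)

sparkR.session.stop()
</code></pre>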
<h3>Usage</h3>
<pre><code class='language-R'>repartitionByRange(x, ...)
## S4 method for signature 'SparkDataFrame'
repartitionByRange(x, numPartitions = NULL, col = NULL, ...)
</code></pre>
<h3>Arguments</h3>
<table>
<tr style="vertical-align: top;"><td><code>x</code></td>
<td>
<p>a SparkDataFrame.</p>
</td></tr>
<tr style="vertical-align: top;"><td><code>...</code></td>
<td>
<p>additional column(s) to be used in the range partitioning.</p>
</td></tr>
<tr style="vertical-align: top;"><td><code>numPartitions</code></td>
<td>
<p>the number of partitions to use. If <code>NULL</code>, <code>spark.sql.shuffle.partitions</code> is used.</p>
</td></tr>
<tr style="vertical-align: top;"><td><code>col</code></td>
<td>
<p>the column by which the range partitioning will be performed.</p>
</td></tr>
</table>
<h3>Details</h3>
<p>For performance reasons, this method uses sampling to estimate the range boundaries.
The output may therefore differ between runs, since sampling can return different values.
The sample size can be controlled by the config
<code>spark.sql.execution.rangeExchange.sampleSizePerPartition</code>.
</p>
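<p>A minimal sketch of adjusting the sample size before repartitioning follows. The value <code>500</code> and the data are arbitrary assumptions made for illustration. A larger sample may give more stable range boundaries at the cost of extra sampling work, but the split is still not guaranteed to be identical between runs.
</p>
<pre><code class="r">library(SparkR)
sparkR.session()

# Raise the per-partition sample size used to estimate range boundaries.
# The value 500 is arbitrary and chosen only for illustration.
sql(&quot;SET spark.sql.execution.rangeExchange.sampleSizePerPartition=500&quot;)

# Hypothetical data with a single numeric column.
df &lt;- createDataFrame(data.frame(id = 1:1000))
newDF &lt;- repartitionByRange(df, 8L, col = df$id)

# Count rows per partition; the exact split can vary between runs.
withPid &lt;- withColumn(newDF, &quot;pid&quot;, spark_partition_id())
head(count(groupBy(withPid, &quot;pid&quot;)))

sparkR.session.stop()
</code></pre>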
<h3>Note</h3>
<p>repartitionByRange since 2.4.0
</p>
<h3>See Also</h3>
<p><a href="../../SparkR/help/repartition.html">repartition</a>, <a href="../../SparkR/help/coalesce.html">coalesce</a>
</p>
<p>Other SparkDataFrame functions:
<code><a href="../../SparkR/help/SparkDataFrame-class.html">SparkDataFrame-class</a></code>,
<code><a href="../../SparkR/help/agg.html">agg</a>()</code>,
<code><a href="../../SparkR/help/alias.html">alias</a>()</code>,
<code><a href="../../SparkR/help/arrange.html">arrange</a>()</code>,
<code><a href="../../SparkR/help/as.data.frame.html">as.data.frame</a>()</code>,
<code><a href="../../SparkR/help/attach+2CSparkDataFrame-method.html">attach,SparkDataFrame-method</a></code>,
<code><a href="../../SparkR/help/broadcast.html">broadcast</a>()</code>,
<code><a href="../../SparkR/help/cache.html">cache</a>()</code>,
<code><a href="../../SparkR/help/checkpoint.html">checkpoint</a>()</code>,
<code><a href="../../SparkR/help/coalesce.html">coalesce</a>()</code>,
<code><a href="../../SparkR/help/collect.html">collect</a>()</code>,
<code><a href="../../SparkR/help/colnames.html">colnames</a>()</code>,
<code><a href="../../SparkR/help/coltypes.html">coltypes</a>()</code>,
<code><a href="../../SparkR/help/createOrReplaceTempView.html">createOrReplaceTempView</a>()</code>,
<code><a href="../../SparkR/help/crossJoin.html">crossJoin</a>()</code>,
<code><a href="../../SparkR/help/cube.html">cube</a>()</code>,
<code><a href="../../SparkR/help/dapplyCollect.html">dapplyCollect</a>()</code>,
<code><a href="../../SparkR/help/dapply.html">dapply</a>()</code>,
<code><a href="../../SparkR/help/describe.html">describe</a>()</code>,
<code><a href="../../SparkR/help/dim.html">dim</a>()</code>,
<code><a href="../../SparkR/help/distinct.html">distinct</a>()</code>,
<code><a href="../../SparkR/help/dropDuplicates.html">dropDuplicates</a>()</code>,
<code><a href="../../SparkR/help/dropna.html">dropna</a>()</code>,
<code><a href="../../SparkR/help/drop.html">drop</a>()</code>,
<code><a href="../../SparkR/help/dtypes.html">dtypes</a>()</code>,
<code><a href="../../SparkR/help/exceptAll.html">exceptAll</a>()</code>,
<code><a href="../../SparkR/help/except.html">except</a>()</code>,
<code><a href="../../SparkR/help/explain.html">explain</a>()</code>,
<code><a href="../../SparkR/help/filter.html">filter</a>()</code>,
<code><a href="../../SparkR/help/first.html">first</a>()</code>,
<code><a href="../../SparkR/help/gapplyCollect.html">gapplyCollect</a>()</code>,
<code><a href="../../SparkR/help/gapply.html">gapply</a>()</code>,
<code><a href="../../SparkR/help/getNumPartitions.html">getNumPartitions</a>()</code>,
<code><a href="../../SparkR/help/group_by.html">group_by</a>()</code>,
<code><a href="../../SparkR/help/head.html">head</a>()</code>,
<code><a href="../../SparkR/help/hint.html">hint</a>()</code>,
<code><a href="../../SparkR/help/histogram.html">histogram</a>()</code>,
<code><a href="../../SparkR/help/insertInto.html">insertInto</a>()</code>,
<code><a href="../../SparkR/help/intersectAll.html">intersectAll</a>()</code>,
<code><a href="../../SparkR/help/intersect.html">intersect</a>()</code>,
<code><a href="../../SparkR/help/isLocal.html">isLocal</a>()</code>,
<code><a href="../../SparkR/help/isStreaming.html">isStreaming</a>()</code>,
<code><a href="../../SparkR/help/join.html">join</a>()</code>,
<code><a href="../../SparkR/help/limit.html">limit</a>()</code>,
<code><a href="../../SparkR/help/localCheckpoint.html">localCheckpoint</a>()</code>,
<code><a href="../../SparkR/help/merge.html">merge</a>()</code>,
<code><a href="../../SparkR/help/mutate.html">mutate</a>()</code>,
<code><a href="../../SparkR/help/ncol.html">ncol</a>()</code>,
<code><a href="../../SparkR/help/nrow.html">nrow</a>()</code>,
<code><a href="../../SparkR/help/persist.html">persist</a>()</code>,
<code><a href="../../SparkR/help/printSchema.html">printSchema</a>()</code>,
<code><a href="../../SparkR/help/randomSplit.html">randomSplit</a>()</code>,
<code><a href="../../SparkR/help/rbind.html">rbind</a>()</code>,
<code><a href="../../SparkR/help/rename.html">rename</a>()</code>,
<code><a href="../../SparkR/help/repartition.html">repartition</a>()</code>,
<code><a href="../../SparkR/help/rollup.html">rollup</a>()</code>,
<code><a href="../../SparkR/help/sample.html">sample</a>()</code>,
<code><a href="../../SparkR/help/saveAsTable.html">saveAsTable</a>()</code>,
<code><a href="../../SparkR/help/schema.html">schema</a>()</code>,
<code><a href="../../SparkR/help/selectExpr.html">selectExpr</a>()</code>,
<code><a href="../../SparkR/help/select.html">select</a>()</code>,
<code><a href="../../SparkR/help/showDF.html">showDF</a>()</code>,
<code><a href="../../SparkR/help/show.html">show</a>()</code>,
<code><a href="../../SparkR/help/storageLevel.html">storageLevel</a>()</code>,
<code><a href="../../SparkR/help/str.html">str</a>()</code>,
<code><a href="../../SparkR/help/subset.html">subset</a>()</code>,
<code><a href="../../SparkR/help/summary.html">summary</a>()</code>,
<code><a href="../../SparkR/help/take.html">take</a>()</code>,
<code><a href="../../SparkR/help/toJSON.html">toJSON</a>()</code>,
<code><a href="../../SparkR/help/unionAll.html">unionAll</a>()</code>,
<code><a href="../../SparkR/help/unionByName.html">unionByName</a>()</code>,
<code><a href="../../SparkR/help/union.html">union</a>()</code>,
<code><a href="../../SparkR/help/unpersist.html">unpersist</a>()</code>,
<code><a href="../../SparkR/help/withColumn.html">withColumn</a>()</code>,
<code><a href="../../SparkR/help/withWatermark.html">withWatermark</a>()</code>,
<code><a href="../../SparkR/help/with.html">with</a>()</code>,
<code><a href="../../SparkR/help/write.df.html">write.df</a>()</code>,
<code><a href="../../SparkR/help/write.jdbc.html">write.jdbc</a>()</code>,
<code><a href="../../SparkR/help/write.json.html">write.json</a>()</code>,
<code><a href="../../SparkR/help/write.orc.html">write.orc</a>()</code>,
<code><a href="../../SparkR/help/write.parquet.html">write.parquet</a>()</code>,
<code><a href="../../SparkR/help/write.stream.html">write.stream</a>()</code>,
<code><a href="../../SparkR/help/write.text.html">write.text</a>()</code>
</p>
<h3>Examples</h3>
<pre><code class="r">## Not run:
##D sparkR.session()
##D path &lt;- &quot;path/to/file.json&quot;
##D df &lt;- read.json(path)
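##D # Partition by col1, then col2; uses spark.sql.shuffle.partitions partitions
##D newDF &lt;- repartitionByRange(df, col = df$col1, df$col2)
##D # Same partitioning columns, with an explicit count of 3 partitions
##D newDF &lt;- repartitionByRange(df, 3L, col = df$col1, df$col2)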
## End(Not run)
</code></pre>
<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 3.2.2 <a href="00Index.html">Index</a>]</div>
</div>
</body></html>