<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Create a SparkDataFrame representing the database table...</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="R.css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
<script>hljs.initHighlightingOnLoad();</script>
</head><body>
<table width="100%" summary="page for read.jdbc {SparkR}"><tr><td>read.jdbc {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>
<h2>Create a SparkDataFrame representing the database table accessible via JDBC URL</h2>
<h3>Description</h3>
<p>Creates a SparkDataFrame representing the table named <code>tableName</code> in the database
accessible via the given JDBC URL. Additional JDBC database connection properties can be passed
as named arguments (see <code>...</code>).
</p>
<h3>Usage</h3>
<pre>
read.jdbc(url, tableName, partitionColumn = NULL, lowerBound = NULL,
  upperBound = NULL, numPartitions = 0L, predicates = list(), ...)
</pre>
<h3>Arguments</h3>
<table summary="R argblock">
<tr valign="top"><td><code>url</code></td>
<td>
<p>JDBC database url of the form <code>jdbc:subprotocol:subname</code></p>
</td></tr>
<tr valign="top"><td><code>tableName</code></td>
<td>
<p>the name of the table in the external database</p>
</td></tr>
<tr valign="top"><td><code>partitionColumn</code></td>
<td>
<p>the name of a column of integral type that will be used for partitioning</p>
</td></tr>
<tr valign="top"><td><code>lowerBound</code></td>
<td>
<p>the minimum value of <code>partitionColumn</code> used to decide partition stride</p>
</td></tr>
<tr valign="top"><td><code>upperBound</code></td>
<td>
<p>the maximum value of <code>partitionColumn</code> used to decide partition stride</p>
</td></tr>
<tr valign="top"><td><code>numPartitions</code></td>
<td>
<p>the number of partitions. This, along with <code>lowerBound</code> (inclusive) and
<code>upperBound</code> (exclusive), forms the partition strides for the generated WHERE
clause expressions used to split the column <code>partitionColumn</code> evenly
(see the illustration after this table). This defaults to
<code>SparkContext.defaultParallelism</code> when unset.</p>
</td></tr>
<tr valign="top"><td><code>predicates</code></td>
<td>
<p>a list of conditions in the where clause; each one defines one partition</p>
</td></tr>
<tr valign="top"><td><code>...</code></td>
<td>
<p>additional JDBC database connection named properties.</p>
</td></tr>
</table>
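<p>As a rough illustration of how the stride parameters interact (a sketch with hypothetical
values, not an exact reproduction of Spark's internal clause generation):
</p>
<pre><code class="r">## Illustration only, not run; all values are hypothetical.
## With lowerBound = 0, upperBound = 10000 and numPartitions = 4, the stride is
## (upperBound - lowerBound) / numPartitions = (10000 - 0) / 4 = 2500, so the
## generated WHERE clauses cover ranges of partitionColumn roughly like:
##   partition 1:  partitionColumn &lt;  2500  (plus NULL values)
##   partition 2:  2500 &lt;= partitionColumn &lt; 5000
##   partition 3:  5000 &lt;= partitionColumn &lt; 7500
##   partition 4:  partitionColumn &gt;= 7500
## Note that lowerBound and upperBound only shape the stride; rows outside
## [lowerBound, upperBound) are still read by the first and last partitions.
</code></pre>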
<h3>Details</h3>
<p>Only one of <code>partitionColumn</code> or <code>predicates</code> should be set. Partitions
of the table will be retrieved in parallel, based either on <code>numPartitions</code> or on the
predicates.
</p>
<p>Don't create too many partitions in parallel on a large cluster; otherwise Spark might crash
your external database system.
</p>
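<p>For instance, column-based partitioning with an explicit <code>numPartitions</code> and
extra connection properties passed through <code>...</code> might look like the following
sketch (not run; the URL, driver class, table and column names are placeholders):
</p>
<pre><code class="r">## Sketch only: assumes a PostgreSQL JDBC driver on the classpath and a table
## &quot;events&quot; with an integral &quot;id&quot; column; adapt the names to your own database.
sparkR.session()
jdbcUrl &lt;- &quot;jdbc:postgresql://localhost:5432/databasename&quot;
df &lt;- read.jdbc(jdbcUrl, &quot;events&quot;,
                partitionColumn = &quot;id&quot;, lowerBound = 0, upperBound = 10000,
                numPartitions = 4,
                user = &quot;username&quot;, password = &quot;password&quot;,
                driver = &quot;org.postgresql.Driver&quot;)
</code></pre>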
<h3>Value</h3>
<p>A SparkDataFrame representing the database table.
</p>
<h3>Note</h3>
<p>read.jdbc since 2.0.0
</p>
<h3>Examples</h3>
<pre><code class="r">## Not run:
##D sparkR.session()
##D jdbcUrl &lt;- &quot;jdbc:mysql://localhost:3306/databasename&quot;
##D df &lt;- read.jdbc(jdbcUrl, &quot;table&quot;, predicates = list(&quot;field&lt;=123&quot;), user = &quot;username&quot;)
##D df2 &lt;- read.jdbc(jdbcUrl, &quot;table2&quot;, partitionColumn = &quot;index&quot;, lowerBound = 0,
##D                  upperBound = 10000, user = &quot;username&quot;, password = &quot;password&quot;)
## End(Not run)
</code></pre>
<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.1 <a href="00Index.html">Index</a>]</div>
</body></html>