<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Get the existing SparkSession or initialize a new...</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="R.css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
<script>hljs.initHighlightingOnLoad();</script>
</head><body>
<table width="100%" summary="page for sparkR.session {SparkR}"><tr><td>sparkR.session {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>
<h2>Get the existing SparkSession or initialize a new SparkSession.</h2>
<h3>Description</h3>
<p>SparkSession is the entry point into SparkR. <code>sparkR.session</code> gets the existing
SparkSession or initializes a new SparkSession.
Additional Spark properties can be set in <code>...</code>, and these named parameters take priority
over the values set in <code>master</code>, <code>appName</code>, and the named list <code>sparkConfig</code>.
</p>
<h3>Usage</h3>
<pre>
sparkR.session(master = "", appName = "SparkR",
  sparkHome = Sys.getenv("SPARK_HOME"), sparkConfig = list(),
  sparkJars = "", sparkPackages = "", enableHiveSupport = TRUE, ...)
</pre>
<h3>Arguments</h3>
<table summary="R argblock">
<tr valign="top"><td><code>master</code></td>
<td>
<p>the Spark master URL.</p>
</td></tr>
<tr valign="top"><td><code>appName</code></td>
<td>
<p>application name to register with cluster manager.</p>
</td></tr>
<tr valign="top"><td><code>sparkHome</code></td>
<td>
<p>Spark Home directory.</p>
</td></tr>
<tr valign="top"><td><code>sparkConfig</code></td>
<td>
<p>named list of Spark configuration to set on worker nodes.</p>
</td></tr>
<tr valign="top"><td><code>sparkJars</code></td>
<td>
<p>character vector of jar files to pass to the worker nodes.</p>
</td></tr>
<tr valign="top"><td><code>sparkPackages</code></td>
<td>
<p>character vector of package coordinates of the form <code>&quot;groupId:artifactId:version&quot;</code>.</p>
</td></tr>
<tr valign="top"><td><code>enableHiveSupport</code></td>
<td>
<p>whether to enable Hive support; falls back to a session without Hive support if Spark was
not built with Hive. Once set, this cannot be turned off on an existing session.</p>
</td></tr>
<tr valign="top"><td><code>...</code></td>
<td>
<p>named Spark properties passed to the method.</p>
</td></tr>
</table>
<h3>Details</h3>
<p>When called in an interactive session, this method checks for a Spark installation and, if none is
found, downloads and caches one automatically. Alternatively, <code>install.spark</code> can be
called manually.
</p>
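<p>As a sketch, installing Spark manually before starting a session could look like the following;
the <code>hadoopVersion</code> value is only illustrative and should match your environment:
</p>
<pre><code class="r">## Not run:
##D # Download and cache a Spark distribution explicitly,
##D # instead of relying on the automatic check at session start
##D install.spark(hadoopVersion = &quot;2.7&quot;)
##D sparkR.session()
## End(Not run)
</code></pre>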
<p>A default warehouse is created automatically in the current directory when a managed table is
created, for example via a <code>sql</code> statement such as <code>CREATE TABLE</code>. To change
the location of the warehouse, pass the named parameter <code>spark.sql.warehouse.dir</code> when
creating the SparkSession. Along with the warehouse, an accompanying metastore may also be created
automatically in the current directory when a new SparkSession is initialized with
<code>enableHiveSupport</code> set to <code>TRUE</code>, which is the default. For more details, refer to Hive configuration at
<a href="http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables">http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables</a>.
</p>
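<p>For instance, the warehouse location could be moved out of the current directory at session
creation as sketched below; the path is only illustrative:
</p>
<pre><code class="r">## Not run:
##D # Named Spark properties passed in ... are applied to the new session,
##D # so managed tables are stored under the given directory
##D sparkR.session(spark.sql.warehouse.dir = &quot;/tmp/spark-warehouse&quot;)
## End(Not run)
</code></pre>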
<p>For details on how to initialize and use SparkR, refer to SparkR programming guide at
<a href="http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession">http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession</a>.
</p>
<h3>Note</h3>
<p>sparkR.session since 2.0.0
</p>
<h3>Examples</h3>
<pre><code class="r">## Not run:
##D sparkR.session()
##D df &lt;- read.json(path)
##D
##D sparkR.session(&quot;local[2]&quot;, &quot;SparkR&quot;, &quot;/home/spark&quot;)
##D sparkR.session(&quot;yarn-client&quot;, &quot;SparkR&quot;, &quot;/home/spark&quot;,
##D                list(spark.executor.memory=&quot;4g&quot;),
##D                c(&quot;one.jar&quot;, &quot;two.jar&quot;, &quot;three.jar&quot;),
##D                c(&quot;com.databricks:spark-avro_2.11:2.0.1&quot;))
##D sparkR.session(spark.master = &quot;yarn-client&quot;, spark.executor.memory = &quot;4g&quot;)
## End(Not run)
</code></pre>
<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 2.3.1 <a href="00Index.html">Index</a>]</div>
</body></html>