 <!DOCTYPE html><html><head><title>R: Get the existing SparkSession or initialize a new...</title>
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
 <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css">
 <script type="text/javascript">
 const macros = { "\\R": "\\textsf{R}", "\\code": "\\texttt"};
 function processMathHTML() {
     var l = document.getElementsByClassName('reqn');
     for (let e of l) { katex.render(e.textContent, e, { throwOnError: false, macros }); }
     return;
 }</script>
 <script defer src="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.js"
     onload="processMathHTML();"></script>
 <link rel="stylesheet" type="text/css" href="R.css" />

 <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
 <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
 <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
 <script>hljs.initHighlightingOnLoad();</script>
 </head><body><div class="container">

 <table style="width: 100%;"><tr><td>sparkR.session {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>

 <h2>Get the existing SparkSession or initialize a new SparkSession.</h2>

 <h3>Description</h3>

 <p>SparkSession is the entry point into SparkR. <code>sparkR.session</code> gets the existing
 SparkSession or initializes a new SparkSession.
 Additional Spark properties can be set in <code>...</code>; these named parameters take priority
 over the values in <code>master</code>, <code>appName</code>, and the named list <code>sparkConfig</code>.
 </p>
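 <p>As a minimal sketch of this precedence (assuming a local Spark installation is available and
 no SparkSession is running yet; the memory values are illustrative only), a property passed in
 <code>...</code> is expected to override the same key given in <code>sparkConfig</code>:
 </p>

 <pre><code class="r">library(SparkR)

 # Assumption: no session exists yet; otherwise the existing session is returned.
 sparkR.session(
   master = "local[1]",
   sparkConfig = list(spark.executor.memory = "1g"),
   spark.executor.memory = "2g"   # named parameter passed through `...`
 )

 # Inspect the effective value of the property for the running session.
 sparkR.conf("spark.executor.memory")

 sparkR.session.stop()
 </code></pre>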


 <h3>Usage</h3>

 <pre><code class='language-R'>sparkR.session(
   master = "",
   appName = "SparkR",
   sparkHome = Sys.getenv("SPARK_HOME"),
   sparkConfig = list(),
   sparkJars = "",
   sparkPackages = "",
   enableHiveSupport = TRUE,
   ...
 )
 </code></pre>


 <h3>Arguments</h3>

 <table>
 <tr style="vertical-align: top;"><td><code>master</code></td>
 <td>
 <p>the Spark master URL.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>appName</code></td>
 <td>
 <p>application name to register with the cluster manager.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>sparkHome</code></td>
 <td>
 <p>Spark Home directory.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>sparkConfig</code></td>
 <td>
 <p>named list of Spark configuration to set on worker nodes.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>sparkJars</code></td>
 <td>
 <p>character vector of jar files to pass to the worker nodes.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>sparkPackages</code></td>
 <td>
 <p>character vector of Maven package coordinates.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>enableHiveSupport</code></td>
 <td>
 <p>whether to enable Hive support; falls back to a session without Hive if Spark was not
 built with Hive support. Once set, this cannot be turned off on an existing session.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>...</code></td>
 <td>
 <p>named Spark properties passed to the method.</p>
 </td></tr>
 </table>


 <h3>Details</h3>

 <p>When called in an interactive session, this method checks for a Spark installation and, if
 none is found, downloads and caches Spark automatically. Alternatively, <code>install.spark</code> can
 be called manually.
 </p>
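 <p>A manual installation might look like the sketch below. It assumes network access to a
 download mirror and relies on the defaults of <code>install.spark</code> for the Hadoop version,
 mirror, and cache directory; the variable name <code>spark_dir</code> is illustrative.
 </p>

 <pre><code class="r">library(SparkR)

 # Download and cache Spark with the default settings (skipped if already cached).
 spark_dir &lt;- install.spark()

 # Assumption: the directory returned above is usable as sparkHome.
 sparkR.session(master = "local[1]", sparkHome = spark_dir)

 sparkR.session.stop()
 </code></pre>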
 <p>A default warehouse is created automatically in the current directory when a managed table is
 created, for example via the <code>sql</code> statement <code>CREATE TABLE</code>. To change the location of the
 warehouse, pass the named parameter <code>spark.sql.warehouse.dir</code> to <code>sparkR.session</code>. Along with
 the warehouse, an accompanying metastore may also be created automatically in the current
 directory when a new SparkSession is initialized with <code>enableHiveSupport</code> set to
 <code>TRUE</code>, which is the default. For more details, refer to the Hive configuration at
 <a href="http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables">http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables</a>.
 </p>
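 <p>The following sketch shows one way to relocate the warehouse. The directory under
 <code>tempdir()</code> and the table name <code>demo_tbl</code> are hypothetical, and
 <code>enableHiveSupport = FALSE</code> is used here on the assumption that it keeps a Hive
 metastore from being created in the current directory.
 </p>

 <pre><code class="r">library(SparkR)

 # Hypothetical warehouse location, chosen only to keep the sketch self-contained.
 warehouse_dir &lt;- file.path(tempdir(), "spark-warehouse")

 # Assumption: no session exists yet, so the setting applies to the new session.
 sparkR.session(
   master = "local[1]",
   enableHiveSupport = FALSE,
   spark.sql.warehouse.dir = warehouse_dir
 )

 # Creating a managed table should place its data under warehouse_dir.
 sql("CREATE TABLE demo_tbl (id INT) USING parquet")
 list.files(warehouse_dir)

 sparkR.session.stop()
 </code></pre>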
 <p>For details on how to initialize and use SparkR, refer to the SparkR programming guide at
 <a href="http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession">http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession</a>.
 </p>


 <h3>Note</h3>

 <p>sparkR.session since 2.0.0
 </p>


 <h3>Examples</h3>

 <pre><code class="r">## Not run:
 ##D sparkR.session()
 ##D df &lt;- read.json(path)
 ##D
 ##D sparkR.session(&quot;local[2]&quot;, &quot;SparkR&quot;, &quot;/home/spark&quot;)
 ##D sparkR.session(&quot;yarn&quot;, &quot;SparkR&quot;, &quot;/home/spark&quot;,
 ##D                list(spark.executor.memory=&quot;4g&quot;, spark.submit.deployMode=&quot;client&quot;),
 ##D                c(&quot;one.jar&quot;, &quot;two.jar&quot;, &quot;three.jar&quot;),
 ##D                c(&quot;com.databricks:spark-avro_2.12:2.0.1&quot;))
 ##D sparkR.session(spark.master = &quot;yarn&quot;, spark.submit.deployMode = &quot;client&quot;,
 ##D                spark.executor.memory = &quot;4g&quot;)
 ## End(Not run)
 </code></pre>


 <hr /><div style="text-align: center;">[Package <em>SparkR</em> version 3.2.2 <a href="00Index.html">Index</a>]</div>
 </div>
 </body></html>