| <!DOCTYPE html><html><head><title>R: Get the existing SparkSession or initialize a new...</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> |
| <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css"> |
| <script type="text/javascript"> |
| const macros = { "\\R": "\\textsf{R}", "\\code": "\\texttt"}; |
| function processMathHTML() { |
| var l = document.getElementsByClassName('reqn'); |
| for (let e of l) { katex.render(e.textContent, e, { throwOnError: false, macros }); } |
| return; |
| }</script> |
| <script defer src="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.js" |
| onload="processMathHTML();"></script> |
| <link rel="stylesheet" type="text/css" href="R.css" /> |
| |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> |
| <script>hljs.initHighlightingOnLoad();</script> |
| </head><body><div class="container"> |
| |
| <table style="width: 100%;"><tr><td>sparkR.session {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> |
| |
| <h2>Get the existing SparkSession or initialize a new SparkSession.</h2> |
| |
| <h3>Description</h3> |
| |
<p>SparkSession is the entry point into SparkR. <code>sparkR.session</code> gets the existing
SparkSession or initializes a new SparkSession.
Additional Spark properties can be set in <code>...</code>, and these named parameters take priority
over values set in <code>master</code>, <code>appName</code>, or the named list <code>sparkConfig</code>.
</p>
| |
| |
| <h3>Usage</h3> |
| |
| <pre><code class='language-R'>sparkR.session( |
| master = "", |
| appName = "SparkR", |
| sparkHome = Sys.getenv("SPARK_HOME"), |
| sparkConfig = list(), |
| sparkJars = "", |
| sparkPackages = "", |
| enableHiveSupport = TRUE, |
| ... |
| ) |
| </code></pre> |
| |
| |
| <h3>Arguments</h3> |
| |
| <table> |
| <tr style="vertical-align: top;"><td><code>master</code></td> |
| <td> |
| <p>the Spark master URL.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>appName</code></td> |
| <td> |
| <p>application name to register with cluster manager.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>sparkHome</code></td> |
| <td> |
| <p>Spark Home directory.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>sparkConfig</code></td> |
| <td> |
| <p>named list of Spark configuration to set on worker nodes.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>sparkJars</code></td> |
| <td> |
| <p>character vector of jar files to pass to the worker nodes.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>sparkPackages</code></td> |
| <td> |
<p>character vector of package coordinates.</p>
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>enableHiveSupport</code></td> |
| <td> |
<p>enable support for Hive; if Spark was not built with Hive support, this falls back to a
session without it. Once enabled, Hive support cannot be turned off for an existing session</p>
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>...</code></td> |
| <td> |
| <p>named Spark properties passed to the method.</p> |
| </td></tr> |
| </table> |
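
<p>The interplay between <code>sparkConfig</code> and properties passed in <code>...</code> follows the
precedence stated in the Description; a minimal sketch (not run; requires a local Spark
installation, and the memory values are illustrative only):
</p>

<pre><code class='language-R'>## Not run:
##D # The named property in `...` takes priority over the same key in
##D # sparkConfig, so the session starts with spark.executor.memory = "4g".
##D sparkR.session(
##D   master = "local[2]",
##D   sparkConfig = list(spark.executor.memory = "2g"),
##D   spark.executor.memory = "4g"
##D )
## End(Not run)
</code></pre>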
| |
| |
| <h3>Details</h3> |
| |
<p>When called in an interactive session, this method checks for a Spark installation and, if one
is not found, downloads and caches it automatically. Alternatively, <code>install.spark</code> can
be called manually.
</p>
<p>A default warehouse is created automatically in the current directory when a managed table is
created, for example via a <code>sql</code> statement such as <code>CREATE TABLE</code>. To change the
location of the warehouse, set the named parameter <code>spark.sql.warehouse.dir</code> when creating
the SparkSession. Along with the warehouse, an accompanying metastore may also be created
automatically in the current directory when a new SparkSession is initialized with
<code>enableHiveSupport</code> set to <code>TRUE</code>, which is the default. For more details, refer to the
Hive configuration at
<a href="http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables">http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables</a>.
</p>
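
<p>The warehouse location described above can be overridden at session creation by passing
<code>spark.sql.warehouse.dir</code> as a named property; a minimal sketch (not run; requires a local
Spark installation, and the path shown is a placeholder, not a required location):
</p>

<pre><code class='language-R'>## Not run:
##D # Place the SQL warehouse (and, with Hive support enabled, the
##D # metastore) under a chosen directory instead of the current
##D # working directory.
##D sparkR.session(
##D   master = "local[2]",
##D   appName = "warehouse-example",
##D   spark.sql.warehouse.dir = "/tmp/spark-warehouse"  # placeholder path
##D )
## End(Not run)
</code></pre>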
<p>For details on how to initialize and use SparkR, refer to the SparkR programming guide at
<a href="http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession">http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession</a>.
</p>
| |
| |
| <h3>Note</h3> |
| |
| <p>sparkR.session since 2.0.0 |
| </p> |
| |
| |
| <h3>Examples</h3> |
| |
| <pre><code class="r">## Not run: |
| ##D sparkR.session() |
| ##D df <- read.json(path) |
| ##D |
| ##D sparkR.session("local[2]", "SparkR", "/home/spark") |
| ##D sparkR.session("yarn", "SparkR", "/home/spark", |
| ##D list(spark.executor.memory="4g", spark.submit.deployMode="client"), |
| ##D c("one.jar", "two.jar", "three.jar"), |
| ##D c("com.databricks:spark-avro_2.12:2.0.1")) |
| ##D sparkR.session(spark.master = "yarn", spark.submit.deployMode = "client", |
| ##D spark.executor.memory = "4g") |
| ## End(Not run) |
| </code></pre> |
| |
| |
| <hr /><div style="text-align: center;">[Package <em>SparkR</em> version 3.2.2 <a href="00Index.html">Index</a>]</div> |
| </div> |
| </body></html> |