
 <!DOCTYPE html><html><head><title>R: Generalized Linear Models (R-compliant)</title>
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
 <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css">
 <script type="text/javascript">
 const macros = { "\\R": "\\textsf{R}", "\\code": "\\texttt"};
 function processMathHTML() {
     var l = document.getElementsByClassName('reqn');
     for (let e of l) { katex.render(e.textContent, e, { throwOnError: false, macros }); }
     return;
 }</script>
 <script defer src="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.js"
     onload="processMathHTML();"></script>
 <link rel="stylesheet" type="text/css" href="R.css" />

 <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
 <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
 <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
 <script>hljs.initHighlightingOnLoad();</script>
 </head><body><div class="container">

 <table style="width: 100%;"><tr><td>glm,formula,ANY,SparkDataFrame-method {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>

 <h2>Generalized Linear Models (R-compliant)</h2>

 <h3>Description</h3>

 <p>Fits a generalized linear model on a SparkDataFrame, with an interface similar to R's <code>glm()</code>.
 </p>


 <h3>Usage</h3>

 <pre><code class='language-R'>## S4 method for signature 'formula,ANY,SparkDataFrame'
 glm(
   formula,
   family = gaussian,
   data,
   epsilon = 1e-06,
   maxit = 25,
   weightCol = NULL,
   var.power = 0,
   link.power = 1 - var.power,
   stringIndexerOrderType = c("frequencyDesc", "frequencyAsc", "alphabetDesc",
     "alphabetAsc"),
   offsetCol = NULL
 )
 </code></pre>


 <h3>Arguments</h3>

 <table>
 <tr style="vertical-align: top;"><td><code>formula</code></td>
 <td>
 <p>a symbolic description of the model to be fitted. Currently only a few formula
 operators are supported, including '~', '.', ':', '+', and '-'.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>family</code></td>
 <td>
 <p>a description of the error distribution and link function to be used in the model.
 This can be a character string naming a family function, a family function or
 the result of a call to a family function. See R's family documentation at
 <a href="https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html">https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html</a>.
 Currently these families are supported: <code>binomial</code>, <code>gaussian</code>,
 <code>poisson</code>, <code>Gamma</code>, and <code>tweedie</code>.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>data</code></td>
 <td>
 <p>the training data: a SparkDataFrame, or data in a form accepted by R's glm.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>epsilon</code></td>
 <td>
 <p>positive convergence tolerance for the iterations.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>maxit</code></td>
 <td>
 <p>integer giving the maximum number of iteratively reweighted least squares (IRLS) iterations.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>weightCol</code></td>
 <td>
 <p>the weight column name. If this is not set or <code>NULL</code>, we treat all instance
 weights as 1.0.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>var.power</code></td>
 <td>
 <p>the index of the power variance function in the Tweedie family.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>link.power</code></td>
 <td>
 <p>the index of the power link function in the Tweedie family.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>stringIndexerOrderType</code></td>
 <td>
 <p>how to order categories of a string feature column. This determines
 the base level of a string feature, because the last category after
 ordering is dropped when encoding strings. Supported options
 are &quot;frequencyDesc&quot;, &quot;frequencyAsc&quot;, &quot;alphabetDesc&quot;, and
 &quot;alphabetAsc&quot;. The default value is &quot;frequencyDesc&quot;. When the
 ordering is set to &quot;alphabetDesc&quot;, this drops the same category
 as R when encoding strings.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>offsetCol</code></td>
 <td>
 <p>the offset column name. If this is not set or empty, we treat all instance
 offsets as 0.0. The feature specified as offset has a constant coefficient of
 1.0.</p>
 </td></tr>
 </table>
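
 <p>As a minimal sketch of how several of these arguments combine: the code assumes a
 running Spark session and reuses R's built-in <code>Titanic</code> table. The values
 chosen for <code>var.power</code> and <code>link.power</code> are illustrative only,
 and the variable names are hypothetical.</p>

 <pre><code class="r">library(SparkR)
 sparkR.session()

 # Build a SparkDataFrame from R's built-in Titanic table.
 df &lt;- createDataFrame(as.data.frame(Titanic))

 # Tweedie family: var.power indexes the power variance function and
 # link.power indexes the power link function (values are illustrative).
 tweedieModel &lt;- glm(Freq ~ Sex + Age, family = &quot;tweedie&quot;, data = df,
                     var.power = 1.2, link.power = 0)

 # Order string categories alphabetically (descending) so that the dropped
 # base level is the same one R drops when encoding strings.
 gaussModel &lt;- glm(Freq ~ Sex + Age, family = &quot;gaussian&quot;, data = df,
                   stringIndexerOrderType = &quot;alphabetDesc&quot;)
 </code></pre>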


 <h3>Value</h3>

 <p><code>glm</code> returns a fitted generalized linear model.
 </p>
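
 <p>One way the fitted model might then be used is sketched here, assuming a running
 Spark session. It relies on SparkR's <code>summary</code> and <code>predict</code>
 methods for fitted models; the variable names are hypothetical.</p>

 <pre><code class="r">library(SparkR)
 sparkR.session()

 df &lt;- createDataFrame(as.data.frame(Titanic))
 model &lt;- glm(Freq ~ Sex + Age, family = &quot;gaussian&quot;, data = df)

 # Inspect coefficients and fit statistics.
 summary(model)

 # Score a SparkDataFrame; predictions are added in a &quot;prediction&quot; column.
 fitted &lt;- predict(model, df)
 head(select(fitted, &quot;Freq&quot;, &quot;prediction&quot;))
 </code></pre>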


 <h3>Note</h3>

 <p>glm since 1.5.0
 </p>


 <h3>See Also</h3>

 <p><a href="../../SparkR/help/spark.glm.html">spark.glm</a>
 </p>


 <h3>Examples</h3>

 <pre><code class="r">## Not run:
 ##D sparkR.session()
 ##D t &lt;- as.data.frame(Titanic)
 ##D df &lt;- createDataFrame(t)
 ##D model &lt;- glm(Freq ~ Sex + Age, df, family = &quot;gaussian&quot;)
 ##D summary(model)
 ## End(Not run)
 </code></pre>
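
 <p>A further sketch, under the same assumption of a running Spark session, passes a
 family function instead of a string and sets the convergence controls and a weight
 column. The column name <code>w</code> and the tolerance and iteration values are
 made up for illustration.</p>

 <pre><code class="r">library(SparkR)
 sparkR.session()

 df &lt;- createDataFrame(as.data.frame(Titanic))

 # Add a constant weight column; this is equivalent to leaving weightCol unset,
 # since all instance weights are then treated as 1.0.
 df &lt;- withColumn(df, &quot;w&quot;, lit(1.0))

 # Poisson regression on the counts, with a tighter tolerance and more iterations.
 model &lt;- glm(Freq ~ Sex + Age, family = poisson, data = df,
              epsilon = 1e-08, maxit = 50, weightCol = &quot;w&quot;)
 summary(model)
 </code></pre>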


 <hr /><div style="text-align: center;">[Package <em>SparkR</em> version 3.2.2 <a href="00Index.html">Index</a>]</div>
 </div>
 </body></html>