| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
| <html><head><title>R: Generalized Linear Models (R-compliant)</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| <link rel="stylesheet" type="text/css" href="R.css"> |
| |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> |
| <script>hljs.initHighlightingOnLoad();</script> |
| </head><body> |
| |
| <table width="100%" summary="page for glm {SparkR}"><tr><td>glm {SparkR}</td><td align="right">R Documentation</td></tr></table> |
| |
| <h2>Generalized Linear Models (R-compliant)</h2> |
| |
| <h3>Description</h3> |
| |
| <p>Fits a generalized linear model, similarly to R's glm(). |
| </p> |
| |
| |
| <h3>Usage</h3> |
| |
| <pre> |
| glm(formula, family = gaussian, data, weights, subset, na.action, |
| start = NULL, etastart, mustart, offset, control = list(...), |
| model = TRUE, method = "glm.fit", x = FALSE, y = TRUE, |
| contrasts = NULL, ...) |
| |
| ## S4 method for signature 'formula,ANY,SparkDataFrame' |
| glm(formula, family = gaussian, data, |
| epsilon = 1e-06, maxit = 25, weightCol = NULL) |
| </pre> |
| |
| |
| <h3>Arguments</h3> |
| |
| <table summary="R argblock"> |
| <tr valign="top"><td><code>formula</code></td> |
| <td> |
| <p>a symbolic description of the model to be fitted. Currently only a few formula |
| operators are supported, including '~', '.', ':', '+', and '-'.</p> |
| </td></tr> |
| <tr valign="top"><td><code>family</code></td> |
| <td> |
| <p>a description of the error distribution and link function to be used in the model. |
| This can be a character string naming a family function, a family function or |
| the result of a call to a family function. See R's family documentation at |
| <a href="https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html">https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html</a>. |
| Currently these families are supported: <code>binomial</code>, <code>gaussian</code>, |
| <code>Gamma</code>, and <code>poisson</code>.</p> |
| </td></tr> |
| <tr valign="top"><td><code>data</code></td> |
| <td> |
| <p>a SparkDataFrame for training, or the <code>data</code> argument accepted by R's <code>glm</code>.</p> |
| </td></tr> |
| <tr valign="top"><td><code>weights</code></td> |
| <td> |
| <p>an optional vector of ‘prior weights’ to be used |
| in the fitting process. Should be <code>NULL</code> or a numeric vector.</p> |
| </td></tr> |
| <tr valign="top"><td><code>subset</code></td> |
| <td> |
| <p>an optional vector specifying a subset of observations |
| to be used in the fitting process.</p> |
| </td></tr> |
| <tr valign="top"><td><code>na.action</code></td> |
| <td> |
| <p>a function which indicates what should happen |
| when the data contain <code>NA</code>s. The default is set by |
| the <code>na.action</code> setting of <code><a href="../../base/html/options.html">options</a></code>, and is |
| <code><a href="../../stats/html/na.fail.html">na.fail</a></code> if that is unset. The ‘factory-fresh’ |
| default is <code><a href="nafunctions.html">na.omit</a></code>. Another possible value is |
| <code>NULL</code>, no action. Value <code><a href="../../stats/html/na.fail.html">na.exclude</a></code> can be useful.</p> |
| </td></tr> |
| <tr valign="top"><td><code>start</code></td> |
| <td> |
| <p>starting values for the parameters in the linear predictor.</p> |
| </td></tr> |
| <tr valign="top"><td><code>etastart</code></td> |
| <td> |
| <p>starting values for the linear predictor.</p> |
| </td></tr> |
| <tr valign="top"><td><code>mustart</code></td> |
| <td> |
| <p>starting values for the vector of means.</p> |
| </td></tr> |
| <tr valign="top"><td><code>offset</code></td> |
| <td> |
| <p>this can be used to specify an <EM>a priori</EM> known |
| component to be included in the linear predictor during fitting. |
| This should be <code>NULL</code> or a numeric vector of length equal to |
| the number of cases. One or more <code><a href="../../stats/html/offset.html">offset</a></code> terms can be |
| included in the formula instead or as well, and if more than one is |
| specified their sum is used. See <code><a href="../../stats/html/model.extract.html">model.offset</a></code>.</p> |
| </td></tr> |
| <tr valign="top"><td><code>control</code></td> |
| <td> |
| <p>a list of parameters for controlling the fitting |
| process. For <code>glm.fit</code> this is passed to |
| <code><a href="../../stats/html/glm.control.html">glm.control</a></code>.</p> |
| </td></tr> |
| <tr valign="top"><td><code>model</code></td> |
| <td> |
| <p>a logical value indicating whether the <EM>model frame</EM> |
| should be included as a component of the returned value.</p> |
| </td></tr> |
| <tr valign="top"><td><code>method</code></td> |
| <td> |
| <p>the method to be used in fitting the model. The default |
| method <code>"glm.fit"</code> uses iteratively reweighted least squares |
| (IWLS): the alternative <code>"model.frame"</code> returns the model frame |
| and does no fitting. |
| </p> |
| <p>A user-supplied fitting function can be given either as a function |
| or as a character string naming a function; it must take |
| the same arguments as <code>glm.fit</code>. If specified as a character |
| string it is looked up from within the <span class="pkg">stats</span> namespace. |
| </p> |
| </td></tr> |
| <tr valign="top"><td><code>x,y</code></td> |
| <td> |
| <p>For <code>glm</code>: logical values indicating whether the response vector |
| and model matrix used in the fitting process should be returned as |
| components of the returned value.</p> |
| </td></tr> |
| <tr valign="top"><td><code>contrasts</code></td> |
| <td> |
| <p>an optional list. See the <code>contrasts.arg</code> |
| of <code>model.matrix.default</code>.</p> |
| </td></tr> |
| <tr valign="top"><td><code>...</code></td> |
| <td> |
| |
| <p>For <code>glm</code>: arguments to be used to form the default |
| <code>control</code> argument if it is not supplied directly. |
| </p> |
| <p>For <code>weights</code>: further arguments passed to or from other methods. |
| </p> |
| </td></tr> |
| <tr valign="top"><td><code>epsilon</code></td> |
| <td> |
| <p>positive convergence tolerance of iterations.</p> |
| </td></tr> |
| <tr valign="top"><td><code>maxit</code></td> |
| <td> |
| <p>integer giving the maximal number of IRLS iterations.</p> |
| </td></tr> |
| <tr valign="top"><td><code>weightCol</code></td> |
| <td> |
| <p>the weight column name. If this is not set or <code>NULL</code>, we treat all instance |
| weights as 1.0.</p> |
| </td></tr> |
| </table> |
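| |
| <p>As a rough illustration of what <code>epsilon</code> and <code>maxit</code> control, the following |
| base R sketch runs IRLS for a Poisson model with a log link. It is not SparkR's or |
| <code>glm.fit</code>'s implementation: the function name <code>irls_poisson</code>, the simulated data, |
| and the relative-deviance stopping rule are assumptions made for illustration, and Spark's |
| own convergence test may differ.</p> |
| |
| <pre><code class="r"># Illustrative IRLS for a Poisson GLM with log link (hypothetical helper) |
| irls_poisson <- function(X, y, epsilon = 1e-6, maxit = 25) { |
|   beta <- rep(0, ncol(X))          # start from all-zero coefficients |
|   dev_old <- Inf |
|   converged <- FALSE |
|   for (iter in seq_len(maxit)) { |
|     eta <- drop(X %*% beta)        # linear predictor |
|     mu <- exp(eta)                 # inverse of the log link |
|     z <- eta + (y - mu) / mu       # working response |
|     w <- mu                        # working weights for Poisson/log |
|     # weighted least squares step: solve (X'WX) beta = X'Wz |
|     beta <- drop(solve(crossprod(X, w * X), crossprod(X, w * z))) |
|     mu_new <- exp(drop(X %*% beta)) |
|     # Poisson deviance at the updated coefficients |
|     dev <- 2 * sum(ifelse(y > 0, y * log(y / mu_new), 0) - (y - mu_new)) |
|     # stop once the relative change in deviance falls below epsilon |
|     if (abs(dev - dev_old) / (abs(dev) + 0.1) < epsilon) { |
|       converged <- TRUE |
|       break |
|     } |
|     dev_old <- dev |
|   } |
|   list(coefficients = beta, iterations = iter, converged = converged) |
| } |
| |
| # Simulated data (assumed for this sketch only) |
| set.seed(1) |
| n <- 200 |
| x <- runif(n) |
| y <- rpois(n, exp(0.5 + x)) |
| X <- cbind(1, x)                   # intercept column plus one predictor |
| fit <- irls_poisson(X, y) |
| fit$coefficients |
| </code></pre> |
| |
| <p>With <code>maxit = 1</code> the loop ends before the tolerance can be met, so <code>converged</code> is |
| <code>FALSE</code>; a smaller <code>epsilon</code> generally costs more iterations.</p> |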
| |
| |
| <h3>Value</h3> |
| |
| <p><code>glm</code> returns a fitted generalized linear model. |
| </p> |
| |
| |
| <h3>Note</h3> |
| |
| <p>glm since 1.5.0 |
| </p> |
| |
| |
| <h3>See Also</h3> |
| |
| <p><a href="spark.glm.html">spark.glm</a> |
| </p> |
| |
| |
| <h3>Examples</h3> |
| |
| <pre><code class="r">## Not run: |
| ##D sparkR.session() |
| ##D data(iris) |
| ##D df <- createDataFrame(iris) |
| ##D model <- glm(Sepal_Length ~ Sepal_Width, family = "gaussian", data = df) |
| ##D summary(model) |
| ## End(Not run) |
| </code></pre> |
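| |
| <p>A further sketch, assuming a local copy of <code>iris</code> with an illustrative weight column |
| <code>w</code> (the column name and the weight values are hypothetical), shows how the |
| <code>epsilon</code>, <code>maxit</code> and <code>weightCol</code> arguments of the SparkDataFrame method |
| might be passed:</p> |
| |
| <pre><code class="r">## Not run:  |
| ##D sparkR.session() |
| ##D irisW <- iris |
| ##D # Hypothetical weights: setosa rows count twice as much as the others |
| ##D irisW$w <- ifelse(irisW$Species == "setosa", 2, 1) |
| ##D df <- createDataFrame(irisW) |
| ##D # Tighter tolerance and a higher iteration cap than the defaults |
| ##D model <- glm(Sepal_Length ~ Sepal_Width, family = "gaussian", data = df, |
| ##D              epsilon = 1e-8, maxit = 50, weightCol = "w") |
| ##D summary(model) |
| ##D # predict() returns a SparkDataFrame with an added "prediction" column |
| ##D head(predict(model, df)) |
| ## End(Not run) |
| </code></pre> |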
| |
| |
| <hr><div align="center">[Package <em>SparkR</em> version 2.1.1 <a href="00Index.html">Index</a>]</div> |
| </body></html> |