| <!DOCTYPE html><html><head><title>R: Generalized Linear Models (R-compliant)</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" /> |
| <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css"> |
| <script type="text/javascript"> |
| const macros = { "\\R": "\\textsf{R}", "\\code": "\\texttt"}; |
| function processMathHTML() { |
| var l = document.getElementsByClassName('reqn'); |
| for (let e of l) { katex.render(e.textContent, e, { throwOnError: false, macros }); } |
| return; |
| }</script> |
| <script defer src="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.js" |
| onload="processMathHTML();"></script> |
| <link rel="stylesheet" type="text/css" href="R.css" /> |
| |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> |
| <script>hljs.initHighlightingOnLoad();</script> |
| </head><body><div class="container"> |
| |
| <table style="width: 100%;"><tr><td>glm,formula,ANY,SparkDataFrame-method {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table> |
| |
| <h2>Generalized Linear Models (R-compliant)</h2> |
| |
| <h3>Description</h3> |
| |
| <p>Fits a generalized linear model, similar to R's <code>glm()</code>. |
| </p> |
| |
| |
| <h3>Usage</h3> |
| |
| <pre><code class='language-R'>## S4 method for signature 'formula,ANY,SparkDataFrame' |
| glm( |
| formula, |
| family = gaussian, |
| data, |
| epsilon = 1e-06, |
| maxit = 25, |
| weightCol = NULL, |
| var.power = 0, |
| link.power = 1 - var.power, |
| stringIndexerOrderType = c("frequencyDesc", "frequencyAsc", "alphabetDesc", |
| "alphabetAsc"), |
| offsetCol = NULL |
| ) |
| </code></pre> |
| |
| |
| <h3>Arguments</h3> |
| |
| <table> |
| <tr style="vertical-align: top;"><td><code>formula</code></td> |
| <td> |
| <p>a symbolic description of the model to be fitted. Currently only a few formula |
| operators are supported, including '~', '.', ':', '+', and '-'.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>family</code></td> |
| <td> |
| <p>a description of the error distribution and link function to be used in the model. |
| This can be a character string naming a family function, a family function, or |
| the result of a call to a family function. See the R family documentation at |
| <a href="https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html">https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html</a>. |
| The following families are currently supported: <code>binomial</code>, <code>gaussian</code>, |
| <code>poisson</code>, <code>Gamma</code>, and <code>tweedie</code>.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>data</code></td> |
| <td> |
| <p>a SparkDataFrame or R's glm data for training.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>epsilon</code></td> |
| <td> |
| <p>positive convergence tolerance of iterations.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>maxit</code></td> |
| <td> |
| <p>integer giving the maximal number of IRLS iterations.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>weightCol</code></td> |
| <td> |
| <p>the weight column name. If this is not set or <code>NULL</code>, we treat all instance |
| weights as 1.0.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>var.power</code></td> |
| <td> |
| <p>the index of the power variance function in the Tweedie family.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>link.power</code></td> |
| <td> |
| <p>the index of the power link function in the Tweedie family.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>stringIndexerOrderType</code></td> |
| <td> |
| <p>how to order the categories of a string feature column. This determines |
| the base level of a string feature, because the last category |
| after ordering is dropped when encoding strings. Supported options |
| are "frequencyDesc", "frequencyAsc", "alphabetDesc", and |
| "alphabetAsc". The default value is "frequencyDesc". When the |
| ordering is set to "alphabetDesc", the dropped category is the same |
| one that R drops when encoding strings.</p> |
| </td></tr> |
| <tr style="vertical-align: top;"><td><code>offsetCol</code></td> |
| <td> |
| <p>the offset column name. If this is not set or empty, we treat all instance |
| offsets as 0.0. The feature specified as offset has a constant coefficient of |
| 1.0.</p> |
| </td></tr> |
| </table> |
| |
| |
| <h3>Value</h3> |
| |
| <p><code>glm</code> returns a fitted generalized linear model. |
| </p> |
| |
| |
| <h3>Note</h3> |
| |
| <p>glm since 1.5.0 |
| </p> |
| |
| |
| <h3>See Also</h3> |
| |
| <p><a href="../../SparkR/help/spark.glm.html">spark.glm</a> |
| </p> |
| |
| |
| <h3>Examples</h3> |
| |
| <pre><code class="r">## Not run: |
| ##D sparkR.session() |
| ##D t &lt;- as.data.frame(Titanic) |
| ##D df &lt;- createDataFrame(t) |
| ##D model &lt;- glm(Freq ~ Sex + Age, df, family = "gaussian") |
| ##D summary(model) |
| ## End(Not run) |
| </code></pre> |
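| |
| <p>The sketch below illustrates how the <code>var.power</code>, <code>link.power</code>, and |
| <code>stringIndexerOrderType</code> arguments described above might be used. It assumes a |
| local Spark installation that <code>sparkR.session()</code> can start; the variable names |
| <code>tweedieModel</code> and <code>rOrderModel</code> and the value <code>var.power = 1.2</code> |
| are chosen only for illustration.</p> |
| |
| <pre><code class="r"># Assumes SparkR is installed and a local Spark runtime is available |
| library(SparkR) |
| sparkR.session() |
| |
| # Titanic is a built-in R contingency table; flatten it to a data.frame |
| t &lt;- as.data.frame(Titanic) |
| df &lt;- createDataFrame(t) |
| |
| # Tweedie family: variance power 1.2, and link.power = 0 (log link) |
| tweedieModel &lt;- glm(Freq ~ Sex + Age, data = df, family = "tweedie", |
| var.power = 1.2, link.power = 0) |
| summary(tweedieModel) |
| |
| # "alphabetDesc" ordering drops the same base category as R's glm |
| rOrderModel &lt;- glm(Freq ~ Sex + Age, data = df, family = "gaussian", |
| stringIndexerOrderType = "alphabetDesc") |
| summary(rOrderModel) |
| |
| sparkR.session.stop() |
| </code></pre> |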
| |
| |
| <hr /><div style="text-align: center;">[Package <em>SparkR</em> version 3.2.2 <a href="00Index.html">Index</a>]</div> |
| </div> |
| </body></html> |