| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
| <html><head><title>R: Generalized Linear Models</title> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| <link rel="stylesheet" type="text/css" href="R.css"> |
| |
| <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css"> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script> |
| <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script> |
| <script>hljs.initHighlightingOnLoad();</script> |
| </head><body> |
| |
| <table width="100%" summary="page for spark.glm {SparkR}"><tr><td>spark.glm {SparkR}</td><td align="right">R Documentation</td></tr></table> |
| |
| <h2>Generalized Linear Models</h2> |
| |
| <h3>Description</h3> |
| |
| <p>Fits generalized linear model against a SparkDataFrame. |
| Users can call <code>summary</code> to print a summary of the fitted model, <code>predict</code> to make |
| predictions on new data, and <code>write.ml</code>/<code>read.ml</code> to save/load fitted models. |
| </p> |
| |
| |
| <h3>Usage</h3> |
| |
| <pre> |
| spark.glm(data, formula, ...) |
| |
| ## S4 method for signature 'SparkDataFrame,formula' |
| spark.glm(data, formula, family = gaussian, |
| tol = 1e-06, maxIter = 25, weightCol = NULL, regParam = 0) |
| |
| ## S4 method for signature 'GeneralizedLinearRegressionModel' |
| summary(object) |
| |
| ## S3 method for class 'summary.GeneralizedLinearRegressionModel' |
| print(x, ...) |
| |
| ## S4 method for signature 'GeneralizedLinearRegressionModel' |
| predict(object, newData) |
| |
| ## S4 method for signature 'GeneralizedLinearRegressionModel,character' |
| write.ml(object, path, |
| overwrite = FALSE) |
| </pre> |
| |
| |
| <h3>Arguments</h3> |
| |
| <table summary="R argblock"> |
| <tr valign="top"><td><code>data</code></td> |
| <td> |
| <p>a SparkDataFrame for training.</p> |
| </td></tr> |
| <tr valign="top"><td><code>formula</code></td> |
| <td> |
| <p>a symbolic description of the model to be fitted. Currently only a few formula |
| operators are supported, including '~', '.', ':', '+', and '-'.</p> |
| </td></tr> |
| <tr valign="top"><td><code>...</code></td> |
| <td> |
| <p>additional arguments passed to the method.</p> |
| </td></tr> |
| <tr valign="top"><td><code>family</code></td> |
| <td> |
| <p>a description of the error distribution and link function to be used in the model. |
| This can be a character string naming a family function, a family function or |
| the result of a call to a family function. Refer R family at |
| <a href="https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html">https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html</a>. |
| Currently these families are supported: <code>binomial</code>, <code>gaussian</code>, |
| <code>Gamma</code>, and <code>poisson</code>.</p> |
| </td></tr> |
| <tr valign="top"><td><code>tol</code></td> |
| <td> |
| <p>positive convergence tolerance of iterations.</p> |
| </td></tr> |
| <tr valign="top"><td><code>maxIter</code></td> |
| <td> |
| <p>integer giving the maximal number of IRLS iterations.</p> |
| </td></tr> |
| <tr valign="top"><td><code>weightCol</code></td> |
| <td> |
| <p>the weight column name. If this is not set or <code>NULL</code>, we treat all instance |
| weights as 1.0.</p> |
| </td></tr> |
| <tr valign="top"><td><code>regParam</code></td> |
| <td> |
| <p>regularization parameter for L2 regularization.</p> |
| </td></tr> |
| <tr valign="top"><td><code>object</code></td> |
| <td> |
| <p>a fitted generalized linear model.</p> |
| </td></tr> |
| <tr valign="top"><td><code>x</code></td> |
| <td> |
| <p>summary object of fitted generalized linear model returned by <code>summary</code> function.</p> |
| </td></tr> |
| <tr valign="top"><td><code>newData</code></td> |
| <td> |
| <p>a SparkDataFrame for testing.</p> |
| </td></tr> |
| <tr valign="top"><td><code>path</code></td> |
| <td> |
| <p>the directory where the model is saved.</p> |
| </td></tr> |
| <tr valign="top"><td><code>overwrite</code></td> |
| <td> |
| <p>overwrites or not if the output path already exists. Default is FALSE |
| which means throw exception if the output path exists.</p> |
| </td></tr> |
| </table> |
| |
| |
| <h3>Value</h3> |
| |
| <p><code>spark.glm</code> returns a fitted generalized linear model. |
| </p> |
| <p><code>summary</code> returns summary information of the fitted model, which is a list. |
| The list of components includes at least the <code>coefficients</code> (coefficients matrix, which includes |
| coefficients, standard error of coefficients, t value and p value), |
| <code>null.deviance</code> (null/residual degrees of freedom), <code>aic</code> (AIC) |
| and <code>iter</code> (number of iterations IRLS takes). If there are collinear columns in the data, |
| the coefficients matrix only provides coefficients. |
| </p> |
| <p><code>predict</code> returns a SparkDataFrame containing predicted labels in a column named |
| "prediction". |
| </p> |
| |
| |
| <h3>Note</h3> |
| |
| <p>spark.glm since 2.0.0 |
| </p> |
| <p>summary(GeneralizedLinearRegressionModel) since 2.0.0 |
| </p> |
| <p>print.summary.GeneralizedLinearRegressionModel since 2.0.0 |
| </p> |
| <p>predict(GeneralizedLinearRegressionModel) since 1.5.0 |
| </p> |
| <p>write.ml(GeneralizedLinearRegressionModel, character) since 2.0.0 |
| </p> |
| |
| |
| <h3>See Also</h3> |
| |
| <p><a href="glm.html">glm</a>, <a href="read.ml.html">read.ml</a> |
| </p> |
| |
| |
| <h3>Examples</h3> |
| |
| <pre><code class="r">## Not run: |
| ##D sparkR.session() |
| ##D data(iris) |
| ##D df <- createDataFrame(iris) |
| ##D model <- spark.glm(df, Sepal_Length ~ Sepal_Width, family = "gaussian") |
| ##D summary(model) |
| ##D |
| ##D # fitted values on training data |
| ##D fitted <- predict(model, df) |
| ##D head(select(fitted, "Sepal_Length", "prediction")) |
| ##D |
| ##D # save fitted model to input path |
| ##D path <- "path/to/model" |
| ##D write.ml(model, path) |
| ##D |
| ##D # can also read back the saved model and print |
| ##D savedModel <- read.ml(path) |
| ##D summary(savedModel) |
| ## End(Not run) |
| </code></pre> |
| |
| |
| <hr><div align="center">[Package <em>SparkR</em> version 2.1.1 <a href="00Index.html">Index</a>]</div> |
| </body></html> |