site/docs/3.2.3/api/R/spark.svmLinear.html - spark-website - Git at Google

 <!DOCTYPE html><html><head><title>R: Linear SVM Model</title>
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
 <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css">
 <script type="text/javascript">
 const macros = { "\\R": "\\textsf{R}", "\\code": "\\texttt"};
 function processMathHTML() {
     var l = document.getElementsByClassName('reqn');
     for (let e of l) { katex.render(e.textContent, e, { throwOnError: false, macros }); }
     return;
 }</script>
 <script defer src="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.js"
     onload="processMathHTML();"></script>
 <link rel="stylesheet" type="text/css" href="R.css" />

 <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
 <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
 <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
 <script>hljs.initHighlightingOnLoad();</script>
 </head><body><div class="container">

 <table style="width: 100%;"><tr><td>spark.svmLinear {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>

 <h2>Linear SVM Model</h2>

 <h3>Description</h3>

 <p>Fits a linear SVM model against a SparkDataFrame, similar to svm in e1071 package.
 Currently only supports binary classification model with linear kernel.
 Users can print, make predictions on the produced model and save the model to the input path.
 </p>


 <h3>Usage</h3>

 <pre><code class='language-R'>spark.svmLinear(data, formula, ...)

 ## S4 method for signature 'SparkDataFrame,formula'
 spark.svmLinear(
   data,
   formula,
   regParam = 0,
   maxIter = 100,
   tol = 1e-06,
   standardization = TRUE,
   threshold = 0,
   weightCol = NULL,
   aggregationDepth = 2,
   handleInvalid = c("error", "keep", "skip")
 )

 ## S4 method for signature 'LinearSVCModel'
 predict(object, newData)

 ## S4 method for signature 'LinearSVCModel'
 summary(object)

 ## S4 method for signature 'LinearSVCModel,character'
 write.ml(object, path, overwrite = FALSE)
 </code></pre>


 <h3>Arguments</h3>

 <table>
 <tr style="vertical-align: top;"><td><code>data</code></td>
 <td>
 <p>SparkDataFrame for training.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>formula</code></td>
 <td>
 <p>A symbolic description of the model to be fitted. Currently only a few formula
 operators are supported, including '~', '.', ':', '+', '-', '*', and '^'.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>...</code></td>
 <td>
 <p>additional arguments passed to the method.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>regParam</code></td>
 <td>
 <p>The regularization parameter. Only supports L2 regularization currently.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>maxIter</code></td>
 <td>
 <p>Maximum iteration number.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>tol</code></td>
 <td>
 <p>Convergence tolerance of iterations.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>standardization</code></td>
 <td>
 <p>Whether to standardize the training features before fitting the model.
 The coefficients of models will be always returned on the original scale,
 so it will be transparent for users. Note that with/without
 standardization, the models should be always converged to the same
 solution when no regularization is applied.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>threshold</code></td>
 <td>
 <p>The threshold in binary classification applied to the linear model prediction.
 This threshold can be any real number, where Inf will make all predictions 0.0
 and -Inf will make all predictions 1.0.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>weightCol</code></td>
 <td>
 <p>The weight column name.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>aggregationDepth</code></td>
 <td>
 <p>The depth for treeAggregate (greater than or equal to 2). If the
 dimensions of features or the number of partitions are large, this param
 could be adjusted to a larger size.
 This is an expert parameter. Default value should be good for most cases.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>handleInvalid</code></td>
 <td>
 <p>How to handle invalid data (unseen labels or NULL values) in features and
 label column of string type.
 Supported options: &quot;skip&quot; (filter out rows with invalid data),
 &quot;error&quot; (throw an error), &quot;keep&quot; (put invalid data in
 a special additional bucket, at index numLabels). Default
 is &quot;error&quot;.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>object</code></td>
 <td>
 <p>a LinearSVCModel fitted by <code>spark.svmLinear</code>.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>newData</code></td>
 <td>
 <p>a SparkDataFrame for testing.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>path</code></td>
 <td>
 <p>The directory where the model is saved.</p>
 </td></tr>
 <tr style="vertical-align: top;"><td><code>overwrite</code></td>
 <td>
 <p>Overwrites or not if the output path already exists. Default is FALSE
 which means throw exception if the output path exists.</p>
 </td></tr>
 </table>


 <h3>Value</h3>

 <p><code>spark.svmLinear</code> returns a fitted linear SVM model.
 </p>
 <p><code>predict</code> returns the predicted values based on a LinearSVCModel.
 </p>
 <p><code>summary</code> returns summary information of the fitted model, which is a list.
 The list includes <code>coefficients</code> (coefficients of the fitted model),
 <code>numClasses</code> (number of classes), <code>numFeatures</code> (number of features).
 </p>


 <h3>Note</h3>

 <p>spark.svmLinear since 2.2.0
 </p>
 <p>predict(LinearSVCModel) since 2.2.0
 </p>
 <p>summary(LinearSVCModel) since 2.2.0
 </p>
 <p>write.ml(LogisticRegression, character) since 2.2.0
 </p>


 <h3>Examples</h3>

 <pre><code class="language-r">## Not run:
 ##D sparkR.session()
 ##D t &lt;- as.data.frame(Titanic)
 ##D training &lt;- createDataFrame(t)
 ##D model &lt;- spark.svmLinear(training, Survived ~ ., regParam = 0.5)
 ##D summary &lt;- summary(model)
 ##D
 ##D # fitted values on training data
 ##D fitted &lt;- predict(model, training)
 ##D
 ##D # save fitted model to input path
 ##D path &lt;- &quot;path/to/model&quot;
 ##D write.ml(model, path)
 ##D
 ##D # can also read back the saved model and predict
 ##D # Note that summary deos not work on loaded model
 ##D savedModel &lt;- read.ml(path)
 ##D summary(savedModel)
 ## End(Not run)
 </code></pre>


 <hr /><div style="text-align: center;">[Package <em>SparkR</em> version 3.2.3 <a href="00Index.html">Index</a>]</div>
 </div>
 </body></html>
	<!DOCTYPE html><html><head><title>R: Linear SVM Model</title>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
	<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
	<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.css">
	<script type="text/javascript">
	const macros = { "\\R": "\\textsf{R}", "\\code": "\\texttt"};
	function processMathHTML() {
	var l = document.getElementsByClassName('reqn');
	for (let e of l) { katex.render(e.textContent, e, { throwOnError: false, macros }); }
	return;
	}</script>
	<script defer src="https://cdn.jsdelivr.net/npm/katex@0.15.3/dist/katex.min.js"
	onload="processMathHTML();"></script>
	<link rel="stylesheet" type="text/css" href="R.css" />

	<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/styles/github.min.css">
	<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/highlight.min.js"></script>
	<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/8.3/languages/r.min.js"></script>
	<script>hljs.initHighlightingOnLoad();</script>
	</head><body><div class="container">

	<table style="width: 100%;"><tr><td>spark.svmLinear {SparkR}</td><td style="text-align: right;">R Documentation</td></tr></table>

	<h2>Linear SVM Model</h2>

	<h3>Description</h3>

	<p>Fits a linear SVM model against a SparkDataFrame, similar to svm in e1071 package.
	Currently only supports binary classification model with linear kernel.
	Users can print, make predictions on the produced model and save the model to the input path.
	</p>


	<h3>Usage</h3>

	<pre><code class='language-R'>spark.svmLinear(data, formula, ...)

	## S4 method for signature 'SparkDataFrame,formula'
	spark.svmLinear(
	data,
	formula,
	regParam = 0,
	maxIter = 100,
	tol = 1e-06,
	standardization = TRUE,
	threshold = 0,
	weightCol = NULL,
	aggregationDepth = 2,
	handleInvalid = c("error", "keep", "skip")
	)

	## S4 method for signature 'LinearSVCModel'
	predict(object, newData)

	## S4 method for signature 'LinearSVCModel'
	summary(object)

	## S4 method for signature 'LinearSVCModel,character'
	write.ml(object, path, overwrite = FALSE)
	</code></pre>


	<h3>Arguments</h3>

	<table>
	<tr style="vertical-align: top;"><td><code>data</code></td>
	<td>
	<p>SparkDataFrame for training.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>formula</code></td>
	<td>
	<p>A symbolic description of the model to be fitted. Currently only a few formula
	operators are supported, including '~', '.', ':', '+', '-', '*', and '^'.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>...</code></td>
	<td>
	<p>additional arguments passed to the method.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>regParam</code></td>
	<td>
	<p>The regularization parameter. Only supports L2 regularization currently.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>maxIter</code></td>
	<td>
	<p>Maximum iteration number.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>tol</code></td>
	<td>
	<p>Convergence tolerance of iterations.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>standardization</code></td>
	<td>
	<p>Whether to standardize the training features before fitting the model.
	The coefficients of models will be always returned on the original scale,
	so it will be transparent for users. Note that with/without
	standardization, the models should be always converged to the same
	solution when no regularization is applied.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>threshold</code></td>
	<td>
	<p>The threshold in binary classification applied to the linear model prediction.
	This threshold can be any real number, where Inf will make all predictions 0.0
	and -Inf will make all predictions 1.0.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>weightCol</code></td>
	<td>
	<p>The weight column name.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>aggregationDepth</code></td>
	<td>
	<p>The depth for treeAggregate (greater than or equal to 2). If the
	dimensions of features or the number of partitions are large, this param
	could be adjusted to a larger size.
	This is an expert parameter. Default value should be good for most cases.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>handleInvalid</code></td>
	<td>
	<p>How to handle invalid data (unseen labels or NULL values) in features and
	label column of string type.
	Supported options: "skip" (filter out rows with invalid data),
	"error" (throw an error), "keep" (put invalid data in
	a special additional bucket, at index numLabels). Default
	is "error".</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>object</code></td>
	<td>
	<p>a LinearSVCModel fitted by <code>spark.svmLinear</code>.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>newData</code></td>
	<td>
	<p>a SparkDataFrame for testing.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>path</code></td>
	<td>
	<p>The directory where the model is saved.</p>
	</td></tr>
	<tr style="vertical-align: top;"><td><code>overwrite</code></td>
	<td>
	<p>Overwrites or not if the output path already exists. Default is FALSE
	which means throw exception if the output path exists.</p>
	</td></tr>
	</table>


	<h3>Value</h3>

	<p><code>spark.svmLinear</code> returns a fitted linear SVM model.
	</p>
	<p><code>predict</code> returns the predicted values based on a LinearSVCModel.
	</p>
	<p><code>summary</code> returns summary information of the fitted model, which is a list.
	The list includes <code>coefficients</code> (coefficients of the fitted model),
	<code>numClasses</code> (number of classes), <code>numFeatures</code> (number of features).
	</p>


	<h3>Note</h3>

	<p>spark.svmLinear since 2.2.0
	</p>
	<p>predict(LinearSVCModel) since 2.2.0
	</p>
	<p>summary(LinearSVCModel) since 2.2.0
	</p>
	<p>write.ml(LogisticRegression, character) since 2.2.0
	</p>


	<h3>Examples</h3>

	<pre><code class="language-r">## Not run:
	##D sparkR.session()
	##D t <- as.data.frame(Titanic)
	##D training <- createDataFrame(t)
	##D model <- spark.svmLinear(training, Survived ~ ., regParam = 0.5)
	##D summary <- summary(model)
	##D
	##D # fitted values on training data
	##D fitted <- predict(model, training)
	##D
	##D # save fitted model to input path
	##D path <- "path/to/model"
	##D write.ml(model, path)
	##D
	##D # can also read back the saved model and predict
	##D # Note that summary deos not work on loaded model
	##D savedModel <- read.ml(path)
	##D summary(savedModel)
	## End(Not run)
	</code></pre>


	<hr /><div style="text-align: center;">[Package <em>SparkR</em> version 3.2.3 <a href="00Index.html">Index</a>]</div>
	</div>
	</body></html>