layout: doc-page
title: Ordinary Least Squares Regression
The OrdinaryLeastSquares regressor in Mahout implements a closed-form solution to Ordinary Least Squares. This is in stark contrast to many “big data machine learning” frameworks, which implement a stochastic approach. From the user's perspective this difference can be reduced to:
In this example we disable the “calculate common statistics” parameter, so the summary will NOT contain the coefficient of determination (R-squared) or the Mean Squared Error.
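
    // Assumes a session such as the Mahout Spark shell, where drmParallelize,
    // dense and an implicit distributed context are already in scope.
    import org.apache.mahout.math.algorithms.regression.OrdinaryLeastSquares

    // Each row holds four predictors followed by the response in the last column.
    val drmData = drmParallelize(dense(
      (2, 2, 10.5, 10, 29.509541), // Apple Cinnamon Cheerios
      (1, 2, 12,   12, 18.042851), // Cap'n'Crunch
      (1, 1, 12,   13, 22.736446), // Cocoa Puffs
      (2, 1, 11,   13, 32.207582), // Froot Loops
      (1, 2, 12,   11, 21.871292), // Honey Graham Ohs
      (2, 1, 16,   8,  36.187559), // Wheaties Honey Gold
      (6, 2, 17,   1,  50.764999), // Cheerios
      (3, 2, 13,   7,  40.400208), // Clusters
      (3, 3, 13,   4,  45.811716)), numPartitions = 2)

    // Columns 0 to 3 are the predictors; column 4 is the response.
    val drmX = drmData(::, 0 until 4)
    val drmY = drmData(::, 4 until 5)

    // Fit without the common statistics (R-squared, Mean Squared Error).
    val model = new OrdinaryLeastSquares[Int]().fit(drmX, drmY, 'calcCommonStatistics -> false)
    println(model.summary)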
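
To make the term "closed form" concrete, the sketch below solves the normal equations (XᵀX)β = Xᵀy directly, in plain Scala, on the same rows as the example above. It is only an illustration of the idea and not Mahout's implementation: the object `ClosedFormOls`, its `solve` and `fit` helpers, and the prepended intercept column are all assumptions made for this sketch, and it runs in-core on a small array rather than on a distributed matrix.

```scala
object ClosedFormOls {
  type Matrix = Array[Array[Double]]

  // Same rows as the Mahout example: four predictors, then the response.
  val data: Matrix = Array(
    Array(2.0, 2.0, 10.5, 10.0, 29.509541),
    Array(1.0, 2.0, 12.0, 12.0, 18.042851),
    Array(1.0, 1.0, 12.0, 13.0, 22.736446),
    Array(2.0, 1.0, 11.0, 13.0, 32.207582),
    Array(1.0, 2.0, 12.0, 11.0, 21.871292),
    Array(2.0, 1.0, 16.0, 8.0, 36.187559),
    Array(6.0, 2.0, 17.0, 1.0, 50.764999),
    Array(3.0, 2.0, 13.0, 7.0, 40.400208),
    Array(3.0, 3.0, 13.0, 4.0, 45.811716))

  // Solve A * x = b by Gaussian elimination with partial pivoting.
  def solve(a: Matrix, b: Array[Double]): Array[Double] = {
    val n = b.length
    // Work on an augmented copy [A | b] so the inputs stay untouched.
    val m = Array.tabulate(n, n + 1)((i, j) => if (j < n) a(i)(j) else b(i))
    for (col <- 0 until n) {
      // Swap in the row with the largest pivot for numerical stability.
      val pivot = (col until n).maxBy(r => math.abs(m(r)(col)))
      val tmp = m(col); m(col) = m(pivot); m(pivot) = tmp
      for (row <- col + 1 until n) {
        val factor = m(row)(col) / m(col)(col)
        for (j <- col to n) m(row)(j) -= factor * m(col)(j)
      }
    }
    // Back substitution on the upper-triangular system.
    val x = new Array[Double](n)
    for (i <- n - 1 to 0 by -1) {
      val known = (i + 1 until n).map(j => m(i)(j) * x(j)).sum
      x(i) = (m(i)(n) - known) / m(i)(i)
    }
    x
  }

  // Closed-form OLS: build X^T X and X^T y, then solve for beta in one step.
  def fit(x: Matrix, y: Array[Double]): Array[Double] = {
    val p = x.head.length
    val xtx = Array.tabulate(p, p)((i, j) => x.map(r => r(i) * r(j)).sum)
    val xty = Array.tabulate(p)(i => x.indices.map(k => x(k)(i) * y(k)).sum)
    solve(xtx, xty)
  }

  def main(args: Array[String]): Unit = {
    // Prepend a column of ones so beta(0) acts as the intercept (an assumption of this sketch).
    val features = data.map(r => 1.0 +: r.take(4))
    val target = data.map(_(4))
    val beta = fit(features, target)
    println(beta.mkString("beta = [", ", ", "]"))
  }
}
```

The coefficients come out of a single linear solve, with no learning rate, no iteration count and no random initialisation. That is the practical difference from a stochastic method, which would refine a guess over many passes.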