---
title: Hyperparameter Tuning
---
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
A PredictionIO engine is instantiated by a set of parameters. These parameters
define which algorithm is to be used, as well as supply the parameters for the
algorithm itself. This naturally raises the question of how to choose the best
set of parameters.

The evaluation module streamlines the process of *tuning* the engine to the
best parameter set and deploying it.
## Quick Start
We demonstrate the evaluation with
[the classification template](/templates/classification/quickstart/).
The classification template uses a Naive Bayes algorithm that has a smoothing
parameter. We evaluate the prediction quality against different parameter
values to find the best value, and then deploy the engine with it.
### Edit the AppId
Edit MyClassification/src/main/scala/***Evaluation.scala*** to specify the
*appId* you used to import the data.
```scala
object EngineParamsList extends EngineParamsGenerator {
...
private[this] val baseEP = EngineParams(
dataSourceParams = DataSourceParams(appId = <YOUR_APP_ID>, evalK = Some(5)))
...
}
```
### Build and run the evaluation
To run an evaluation, the command `pio eval` is used. It takes two
mandatory parameter,
1. the `Evaluation` object, which tells PredictionIO the engine and metric we use
for the evaluation; and
2. the `EngineParamsGenerator`, which contains a list of engine params to test
against.
The following command kickstarts the evaluation
workflow for the classification template.
```
$ pio build
...
$ pio eval org.example.classification.AccuracyEvaluation org.example.classification.EngineParamsList
```
You will see the following output:
```
...
[INFO] [CoreWorkflow$] runEvaluation started
...
[INFO] [MetricEvaluator] Iteration 0
[INFO] [MetricEvaluator] EngineParams: {"dataSourceParams":{"":{"appId":19,"evalK":5}},"preparatorParams":{"":{}},"algorithmParamsList":[{"naive":{"lambda":10.0}}],"servingParams":{"":{}}}
[INFO] [MetricEvaluator] Result: MetricScores(0.9281045751633987,List())
[INFO] [MetricEvaluator] Iteration 1
[INFO] [MetricEvaluator] EngineParams: {"dataSourceParams":{"":{"appId":19,"evalK":5}},"preparatorParams":{"":{}},"algorithmParamsList":[{"naive":{"lambda":100.0}}],"servingParams":{"":{}}}
[INFO] [MetricEvaluator] Result: MetricScores(0.9150326797385621,List())
[INFO] [MetricEvaluator] Iteration 2
[INFO] [MetricEvaluator] EngineParams: {"dataSourceParams":{"":{"appId":19,"evalK":5}},"preparatorParams":{"":{}},"algorithmParamsList":[{"naive":{"lambda":1000.0}}],"servingParams":{"":{}}}
[INFO] [MetricEvaluator] Result: MetricScores(0.4444444444444444,List())
[INFO] [MetricEvaluator] Writing best variant params to disk...
[INFO] [CoreWorkflow$] Updating evaluation instance with result: MetricEvaluatorResult:
# engine params evaluated: 3
Optimal Engine Params:
{
"dataSourceParams":{
"":{
"appId":19,
"evalK":5
}
},
"preparatorParams":{
"":{
}
},
"algorithmParamsList":[
{
"naive":{
"lambda":10.0
}
}
],
"servingParams":{
"":{
}
}
}
Metrics:
org.example.classification.Accuracy: 0.9281045751633987
The best variant params can be found in best.json
[INFO] [CoreWorkflow$] runEvaluation completed
```
The console prints the evaluation metric score for each set of engine params,
and finally pretty-prints the optimal engine params.
Among the three sets of engine params we evaluated, *lambda = 10.0* yields the
highest accuracy score of ~0.9281.
### Deploy the best engine parameter
The evaluation module also writes the best engine params to disk as
`best.json`. We can train and deploy this specific engine variant using the
extra parameter `-v`. For example:
```bash
$ pio train -v best.json
...
[INFO] [CoreWorkflow$] Training completed successfully.
$ pio deploy -v best.json
...
[INFO] [HttpListener] Bound to localhost/127.0.0.1:8000
[INFO] [MasterActor] Bind successful. Ready to serve.
```
At this point, we have successfully deployed the best engine variant we found
through the evaluation process.
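To sanity-check the deployment, you can send a query to the engine, just as in
the template's quick start (which uses an SDK or `curl`). The following is a
minimal, illustrative Scala sketch; it assumes the engine listens on
`localhost:8000` and that `Query` is the features-array class shown in
`DataSource.scala` later in this guide.

```scala
import java.net.{HttpURLConnection, URL}
import scala.io.Source

// Send one query to the deployed engine's REST endpoint and print the
// predicted result. Illustrative sketch only, not part of the template.
object QueryDeployedEngine extends App {
  val conn = new URL("http://localhost:8000/queries.json")
    .openConnection().asInstanceOf[HttpURLConnection]
  conn.setRequestMethod("POST")
  conn.setRequestProperty("Content-Type", "application/json")
  conn.setDoOutput(true)
  // Three numeric attributes, matching the features used by the template.
  conn.getOutputStream.write("""{"features": [2, 0, 0]}""".getBytes("UTF-8"))
  // The response body is a JSON PredictedResult such as {"label": 1.0}.
  println(Source.fromInputStream(conn.getInputStream).mkString)
}
```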
## Detailed Explanation
An engine often depends on a number of parameters. For example, the Naive
Bayes classification algorithm has a smoothing parameter that makes the model
more adaptive to unseen data. In contrast to parameters that are *learnt* by
the machine learning algorithm, this smoothing parameter *teaches* the
algorithm how to work. Such parameters are therefore usually called
*hyperparameters*.
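In the classification template, for instance, the smoothing parameter is
passed to the algorithm as a controller parameter. The sketch below shows
roughly what such a parameter class looks like; the actual definition lives in
the template's algorithm source, and the import path may differ across
PredictionIO versions.

```scala
import org.apache.predictionio.controller.Params

// Hyperparameter of the Naive Bayes algorithm: the additive smoothing
// strength. It is supplied to the engine rather than learned from data.
case class AlgorithmParams(lambda: Double) extends Params
```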
In PredictionIO, we always take a holistic view of an engine. An engine is
composed of a set of ***DAS*** controllers, as well as the necessary
parameters for the controllers themselves.
In an evaluation, we attempt to find the best hyperparameters for an
*engine*, which we call ***engine params***. Using engine params we can
deploy a complete engine.
This section demonstrates how to use PredictionIO's evaluation module to
select the optimal engine params while ensuring the model doesn't overfit.
## The Evaluation Design
The PredictionIO evaluation module searches for the best engine params for an
engine.

Given a set of engine params, we instantiate an engine and evaluate it with
existing data.
The data is split into two sets: a training set and a validation set.
The training set is used to train the engine, which is then deployed using the
same steps described in earlier sections.
We query the engine with queries constructed from the validation set, and
compare the predicted values in the responses with the actual values contained
in the validation set.
We define a ***metric*** to compare the ***predicted result*** returned from
the engine with the ***actual result*** obtained from the validation data.
The goal is to maximize the metric score.

This process is repeated many times with a series of engine params.
At the end, PredictionIO returns the best engine params.
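Conceptually, the evaluation module performs a loop like the sketch below. The
names here are purely illustrative and are not PredictionIO's actual API; the
real implementation lives in the core workflow and operates on the engine's
DAS controllers.

```scala
// Illustrative-only sketch of the tuning loop: for every candidate engine
// params, train on each training split, score its validation split with the
// metric, average the scores, and keep the best-scoring candidate.
def tune[EP, Model, Q, P, A](
    engineParamsList: Seq[EP],
    folds: EP => Seq[(Model, Seq[(Q, A)])],
    predict: (Model, Q) => P,
    metric: (Q, P, A) => Double): (EP, Double) =
  engineParamsList.map { ep =>
    val scores = for {
      (model, validation) <- folds(ep)
      (query, actual)     <- validation
    } yield metric(query, predict(model, query), actual)
    (ep, scores.sum / scores.size)
  }.maxBy(_._2)
```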
We demonstrate the evaluation with
[the classification template](/templates/classification/quickstart/).
## Evaluation Data Generation
In evaluation data generation, the goal is to generate a sequence of (training,
validation) data tuples. A common way is to use a *k-fold* generation process.
The data set is split into *k* folds. We generate *k* tuples of training and
validation sets; for each tuple, the training set takes *k - 1* of the folds
and the validation set takes the remaining fold.
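The split itself is simple: point *i* goes to the validation set of fold
*i mod k* and to the training set of every other fold. Below is a hedged,
non-Spark sketch of the same idea on a plain Scala collection; the template's
Spark version appears in `DataSource.scala` further down.

```scala
// k-fold split: each fold pairs a training set (k - 1 folds) with a
// validation set (the remaining fold), assigned by index modulo k.
def kFold[T](data: Seq[T], k: Int): Seq[(Seq[T], Seq[T])] = {
  val indexed = data.zipWithIndex
  (0 until k).map { fold =>
    val training   = indexed.collect { case (x, i) if i % k != fold => x }
    val validation = indexed.collect { case (x, i) if i % k == fold => x }
    (training, validation)
  }
}
```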
To enable evaluation data generation, we need to define the ***actual result***
and implement the method for generating the (training, validation) data tuples.
### Actual Result
In MyClassification/src/main/scala/***Engine.scala***, the `ActualResult` class
defines the ***actual result***:
```scala
class ActualResult(
val label: Double
) extends Serializable
```
This class is used to store the actual label of the data (in contrast to
`PredictedResult`, which is the output of the engine).
### Implement Data Generation Method in DataSource
In MyClassification/src/main/scala/***DataSource.scala***, the method
`readEval` reads and selects data from the data store and returns a
sequence of (training, validation) data tuples.
```scala
class DataSource(val dsp: DataSourceParams)
extends PDataSource[TrainingData, EmptyEvaluationInfo, Query, ActualResult] {
...
override
def readEval(sc: SparkContext)
: Seq[(TrainingData, EmptyEvaluationInfo, RDD[(Query, ActualResult)])] = {
require(!dsp.evalK.isEmpty, "DataSourceParams.evalK must not be None")
// The following code reads the data from data store. It is equivalent to
// the readTraining method. We copy-and-paste the exact code here for
// illustration purpose, a recommended approach is to factor out this logic
// into a helper function and have both readTraining and readEval call the
// helper.
val eventsDb = Storage.getPEvents()
val labeledPoints: RDD[LabeledPoint] = eventsDb.aggregateProperties(
appId = dsp.appId,
entityType = "user",
// only keep entities with these required properties defined
required = Some(List("plan", "attr0", "attr1", "attr2")))(sc)
// aggregateProperties() returns RDD pair of
// entity ID and its aggregated properties
.map { case (entityId, properties) =>
try {
LabeledPoint(properties.get[Double]("plan"),
Vectors.dense(Array(
properties.get[Double]("attr0"),
properties.get[Double]("attr1"),
properties.get[Double]("attr2")
))
)
} catch {
case e: Exception => {
logger.error(s"Failed to get properties ${properties} of" +
s" ${entityId}. Exception: ${e}.")
throw e
}
}
}.cache()
// End of reading from data store
// K-fold splitting
val evalK = dsp.evalK.get
val indexedPoints: RDD[(LabeledPoint, Long)] = labeledPoints.zipWithIndex
(0 until evalK).map { idx =>
val trainingPoints = indexedPoints.filter(_._2 % evalK != idx).map(_._1)
val testingPoints = indexedPoints.filter(_._2 % evalK == idx).map(_._1)
(
new TrainingData(trainingPoints),
new EmptyEvaluationInfo(),
testingPoints.map {
p => (new Query(p.features.toArray), new ActualResult(p.label))
}
)
}
}
}
```
The `readEval` method returns a sequence of (`TrainingData`, `EvaluationInfo`,
`RDD[(Query, ActualResult)]`) tuples.
`TrainingData` is the same class we use for deployment,
`RDD[(Query, ActualResult)]` is the validation set, and `EvaluationInfo` can be
used to hold some global evaluation data; it is not used in the current
example.
The first part of the method reads and transforms data from the data store; it
is equivalent to the existing `readTraining` method. At the end of that part,
the variable `labeledPoints` contains the complete dataset from which we
generate the (training, validation) sequence.
The remainder of the method is the *k-fold* logic. `zipWithIndex` gives each
data point a unique index, and whether a point belongs to the training or the
validation set of a fold depends on that index *modulo* `evalK`.
For each point in the validation set, we construct the `Query` and
`ActualResult` that are used to validate the engine.
## Evaluation Metrics
We define a `Metric` which gives a *score* to engine params. The higher the
score, the better the engine params are.
In this template, we use the accuracy score, which measures the proportion of
correct predictions among all validation points.

In MyClassification/src/main/scala/**Evaluation.scala**, the class
`Accuracy` implements the *accuracy* score.
It extends the base helper class `AverageMetric`, which calculates the average
score over all *(Query, PredictedResult, ActualResult)* tuples.
```scala
case class Accuracy()
extends AverageMetric[EmptyEvaluationInfo, Query, PredictedResult, ActualResult] {
def calculate(query: Query, predicted: PredictedResult, actual: ActualResult)
: Double = (if (predicted.label == actual.label) 1.0 else 0.0)
}
```
Then, implement an `Evaluation` object to define the engine and metric
used in this evaluation.
```scala
object AccuracyEvaluation extends Evaluation {
engineMetric = (ClassificationEngine(), new Accuracy())
}
```
## Parameters Generation
The last component is to specify the list of engine params we want to evaluate.
In this guide, we discuss the simplest method: specifying an explicit list of
engine params to be evaluated.
In MyClassification/src/main/scala/**Evaluation.scala**, the object
`EngineParamsList` specifies the engine params list to be used.
```scala
object EngineParamsList extends EngineParamsGenerator {
// Define list of EngineParams used in Evaluation
// First, we define the base engine params. It specifies the appId from which
// the data is read, and a evalK parameter is used to define the
// cross-validation.
private[this] val baseEP = EngineParams(
dataSourceParams = DataSourceParams(appId = 18, evalK = Some(5)))
// Second, we specify the engine params list by explicitly listing all
// algorithm parameters. In this case, we evaluate 3 engine params, each with
// a different algorithm params value.
engineParamsList = Seq(
baseEP.copy(algorithmParamsList = Seq(("naive", AlgorithmParams(10.0)))),
baseEP.copy(algorithmParamsList = Seq(("naive", AlgorithmParams(100.0)))),
baseEP.copy(algorithmParamsList = Seq(("naive", AlgorithmParams(1000.0)))))
}
```
A good practice is to first define a base engine params object that contains
the common parameters used in all evaluations (here, the data source params).
From the base params, we construct the list of engine params we want to
evaluate by adding or replacing controller parameters. In this case, we
generate three engine params, each with a different smoothing parameter.
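If you want to sweep more values, the same pattern scales by generating the
list programmatically instead of enumerating each entry by hand. The following
is a minimal sketch assuming the same imports and classes as the
`Evaluation.scala` shown above; `GridEngineParamsList` is a hypothetical name.

```scala
// Hypothetical variant of EngineParamsList: build the candidate list from a
// sequence of lambda values rather than listing each engine params explicitly.
object GridEngineParamsList extends EngineParamsGenerator {
  private[this] val baseEP = EngineParams(
    dataSourceParams = DataSourceParams(appId = 18, evalK = Some(5)))

  engineParamsList = Seq(1.0, 10.0, 100.0, 1000.0).map { lambda =>
    baseEP.copy(algorithmParamsList = Seq(("naive", AlgorithmParams(lambda))))
  }
}
```

You would then pass `org.example.classification.GridEngineParamsList` as the
second argument to `pio eval`.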
## Running the Evaluation
All that remains is to run the evaluation. Recapping the quick start section
above: the `pio eval` command kick-starts the evaluation, and the result can
be seen in the console.
```
$ pio build
...
$ pio eval org.example.classification.AccuracyEvaluation org.example.classification.EngineParamsList
```
You will see the following output:
```
...
[INFO] [CoreWorkflow$] runEvaluation started
...
[INFO] [MetricEvaluator] Iteration 0
[INFO] [MetricEvaluator] EngineParams: {"dataSourceParams":{"":{"appId":19,"evalK":5}},"preparatorParams":{"":{}},"algorithmParamsList":[{"naive":{"lambda":10.0}}],"servingParams":{"":{}}}
[INFO] [MetricEvaluator] Result: MetricScores(0.9281045751633987,List())
[INFO] [MetricEvaluator] Iteration 1
[INFO] [MetricEvaluator] EngineParams: {"dataSourceParams":{"":{"appId":19,"evalK":5}},"preparatorParams":{"":{}},"algorithmParamsList":[{"naive":{"lambda":100.0}}],"servingParams":{"":{}}}
[INFO] [MetricEvaluator] Result: MetricScores(0.9150326797385621,List())
[INFO] [MetricEvaluator] Iteration 2
[INFO] [MetricEvaluator] EngineParams: {"dataSourceParams":{"":{"appId":19,"evalK":5}},"preparatorParams":{"":{}},"algorithmParamsList":[{"naive":{"lambda":1000.0}}],"servingParams":{"":{}}}
[INFO] [MetricEvaluator] Result: MetricScores(0.4444444444444444,List())
[INFO] [MetricEvaluator] Writing best variant params to disk...
[INFO] [CoreWorkflow$] Updating evaluation instance with result: MetricEvaluatorResult:
# engine params evaluated: 3
Optimal Engine Params:
{
"dataSourceParams":{
"":{
"appId":19,
"evalK":5
}
},
"preparatorParams":{
"":{
}
},
"algorithmParamsList":[
{
"naive":{
"lambda":10.0
}
}
],
"servingParams":{
"":{
}
}
}
Metrics:
org.example.classification.Accuracy: 0.9281045751633987
The best variant params can be found in best.json
[INFO] [CoreWorkflow$] runEvaluation completed
```
## Notes
- We deliberately do not mention the ***test set*** in this hyperparameter
tuning guide.
In the machine learning literature, the ***test set*** is a separate piece of
data used to evaluate the final engine params output by the evaluation
process. This guarantees that no information in the training / validation sets
is *leaked* into the engine params, which would yield a biased outcome. With
PredictionIO, there are multiple ways of conducting robust tuning; we will
cover this topic in coming sections.