The code of an engine consists of D-A-S-E components:
The Data Source reads data from an input source and transforms it into the desired format. The Data Preparator preprocesses the data and forwards it to the algorithm for model training.
The Algorithm component includes the machine learning algorithm and its parameter settings, which together determine how a predictive model is constructed.
The Serving component takes prediction queries and returns prediction results. If the engine has multiple algorithms, Serving will combine the results into one. Additionally, business-specific logic can be added in Serving to further customize the final returned results.
An Evaluation Metric quantifies prediction accuracy with a numerical score. It can be used for comparing algorithms or algorithm parameter settings.
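The division of labor among the components can be sketched in a few lines of framework-free Python. This is only an illustration of the roles described above, not the PredictionIO API: every class, method, and data value below is a hypothetical stand-in, and the "model" is a deliberately trivial least-squares slope.

```python
# Toy, framework-free sketch of the DASE roles. None of these names are
# PredictionIO APIs; they are hypothetical stand-ins for illustration.
from statistics import mean


class DataSource:
    """Reads raw records and returns them in the format the engine expects."""

    def read_training(self):
        # Hypothetical in-memory input: (feature, label) pairs.
        return [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


class Preparator:
    """Preprocesses training data before it reaches the algorithm."""

    def prepare(self, training_data):
        # Drop records with a missing feature or label.
        return [(x, y) for x, y in training_data if x is not None and y is not None]


class Algorithm:
    """Builds a model from prepared data and answers queries with it."""

    def train(self, prepared_data):
        # The "model" is the least-squares slope of a line through the origin.
        numerator = sum(x * y for x, y in prepared_data)
        denominator = sum(x * x for x, _ in prepared_data)
        return numerator / denominator

    def predict(self, model, query):
        return model * query


class Serving:
    """Combines the per-algorithm predictions into one final result."""

    def serve(self, query, predictions):
        return mean(predictions)


def mean_absolute_error(predicted, actual):
    """Evaluation metric: a single numerical score, lower is better."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


source, preparator, algorithm, serving = DataSource(), Preparator(), Algorithm(), Serving()

# Training path: Data Source -> Data Preparator -> Algorithm.
model = algorithm.train(preparator.prepare(source.read_training()))

# Query path: Algorithm predicts, Serving produces the final result.
result = serving.serve(4.0, [algorithm.predict(model, 4.0)])

# Evaluation: score predictions against known answers.
score = mean_absolute_error([algorithm.predict(model, x) for x in (1.0, 2.0)], [2.0, 4.0])
```

Keeping each role behind its own small interface is what makes it possible to swap one component, for example a different Serving, without touching the others.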
Apache PredictionIO helps you modularize these components so that you can build, for example, several Serving components for an engine and choose which one to deploy when you create the engine.
The main functions of an engine are:
An engine puts all DASE components into a deployable state by specifying:
One Data Source
One Data Preparator
One or more Algorithm(s)
One Serving
INFO: If more than one algorithm is specified, the prediction result of each algorithm's model is passed to Serving, which combines them into a single result.
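As a rough sketch of this composition, the Python below wires one data source, one preparator, two algorithms, and one serving function into a single engine object. The `Engine` and `Algorithm` dataclasses, their field names, and the averaging ensemble are assumptions made for illustration; they do not mirror PredictionIO's actual classes.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical type aliases for this sketch; they are not PredictionIO types.
TrainFn = Callable[[Sequence[float]], float]   # prepared data -> model
PredictFn = Callable[[float, float], float]    # (model, query) -> prediction


@dataclass
class Algorithm:
    train: TrainFn
    predict: PredictFn


@dataclass
class Engine:
    read: Callable[[], Sequence[float]]                    # one Data Source
    prepare: Callable[[Sequence[float]], Sequence[float]]  # one Data Preparator
    algorithms: List[Algorithm]                            # one or more Algorithms
    serve: Callable[[float, List[float]], float]           # one Serving

    def train(self) -> List[float]:
        prepared = self.prepare(self.read())
        # Each algorithm trains its own model on the same prepared data.
        return [algo.train(prepared) for algo in self.algorithms]

    def query(self, models: List[float], q: float) -> float:
        # Every algorithm predicts; Serving reduces the predictions to one result.
        predictions = [a.predict(m, q) for a, m in zip(self.algorithms, models)]
        return self.serve(q, predictions)


# Two toy algorithms: one learns the mean of the data, the other the maximum.
mean_algo = Algorithm(train=lambda d: sum(d) / len(d), predict=lambda m, q: m + q)
max_algo = Algorithm(train=lambda d: max(d), predict=lambda m, q: m + q)

engine = Engine(
    read=lambda: [1.0, 2.0, 3.0],
    prepare=lambda data: [x for x in data if x >= 0],
    algorithms=[mean_algo, max_algo],
    serve=lambda q, preds: sum(preds) / len(preds),  # simple averaging ensemble
)

models = engine.train()
answer = engine.query(models, 1.0)
```

Averaging is only one possible way to combine results; Serving could equally pick one algorithm's output or apply business-specific rules.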
Each Engine processes data and constructs predictive models independently. Therefore, every engine serves its own set of prediction results. For example, you may deploy two engines for your mobile application: one for recommending news to users and another one for suggesting new friends to users.
The following graph shows the workflow of DASE components when pio train is run.
The following graph shows the workflow of DASE components when a REST query is received by a deployed engine.
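From the client side, a query is a JSON document sent over HTTP. The sketch below builds such a request with Python's standard library. The URL assumes an engine deployed locally on port 8000 with a `queries.json` endpoint, and the `user` and `num` fields are hypothetical examples; the fields a real engine accepts are defined by that engine's own query format.

```python
import json
import urllib.request

# Assumed for illustration: host, port, and query fields. The fields a real
# engine accepts are defined by that engine's own query format.
ENGINE_URL = "http://localhost:8000/queries.json"


def build_query_request(query: dict, url: str = ENGINE_URL) -> urllib.request.Request:
    """Builds (but does not send) a JSON POST request carrying one query."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_query(query: dict, url: str = ENGINE_URL) -> dict:
    """Sends the query to a running engine and decodes the JSON response."""
    with urllib.request.urlopen(build_query_request(query, url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no running engine; send_query does.
request = build_query_request({"user": "1", "num": 4})
```

Calling `send_query({"user": "1", "num": 4})` would only succeed while an engine is actually deployed and listening at the assumed address.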
Please see Implement DASE for DASE implementation details.
Please refer to the following templates and their how-to guides for concrete examples.