---
title: Developing Engines with IntelliJ IDEA
---
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
## Prerequisites
This documentation assumes that you have a fully functional PredictionIO setup.
If you have not installed PredictionIO yet, please follow [these
instructions](/install/).
The following instructions have been tested with IntelliJ IDEA 2018.2.2
Community Edition.
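If you want a quick sanity check of that setup before wiring up IntelliJ, `pio
status` verifies the installation and its configured storage backends:
```
# Run from any terminal; the configured backends must be running.
pio status
```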
## Preparing IntelliJ for Engine Development
### Installing IntelliJ Scala Plugin
First of all, you will need to install the [Scala
plugin](https://plugins.jetbrains.com/plugin/?id=1347) if you have not already
done so.
Go to the *Preferences* menu item, and look for *Plugins*. You should see
the following screen.
![IntelliJ Plugins](/images/intellij/intelliJ-scala-plugin.png)
Click *Install JetBrains plugin...*, then search for *Scala*. You should arrive
at something similar to the following.
![Scala Plugin](/images/intellij/intellij-scala-plugin-2.png)
Click the green *Install plugin* button to install the plugin. Restart IntelliJ
IDEA if asked to do so.
### Setting Up the Engine Directory
Create an engine directory from a template. This requires that you download a
template that you wish to start from or modify.
Follow template [install](/start/download) and [deploy](/start/deploy)
instructions or go through the [Quick
Start](/templates/recommendation/quickstart/) if you are planning to modify a
recommender. Build, train, and deploy the engine once to confirm that everything
is configured properly.
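As a sanity check, a typical first pass from inside the engine template
directory looks like the sketch below (app creation and event import from the
Quick Start are assumed to have been done already):
```
pio build    # compile the engine and register it with PredictionIO
pio train    # train a model and save an engine instance
pio deploy   # serve the engine, by default at http://localhost:8000
```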
From IntelliJ IDEA, choose *File* > *New* > *Project from Existing Sources...*.
When asked to select a directory to import, browse to the engine directory that
you created above and proceed. Make sure you pick *Import project from external
model* > *SBT*, then proceed to finish.
You should be able to build the project at this point. To run and debug your
template, continue on to the rest of the steps.
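If you prefer the command line over IntelliJ's *Build Project* action, the same
check can be run from the *Terminal* tab with sbt (assuming `sbt` is on your
`PATH`; IntelliJ's bundled sbt shell works as well):
```
# Run from the engine template directory.
sbt compile
```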
### Optional: Issues with Snappy on macOS
If you are running on macOS and run into the following [known
issue](http://bit.ly/12Abtvn), follow the steps in this section.
Edit `build.sbt` and add the following under `libraryDependencies`:
```
"org.xerial.snappy" % "snappy-java" % "1.1.1.7"
```
![Updating build.sbt](/images/intellij/intellij-buildsbt.png)
When you are done editing, IntelliJ should prompt you to import new changes,
unless you have already enabled auto-import. Import the change for it to take
effect.
### Module Settings
INFO: IntelliJ will recreate module settings whenever it re-imports changes to
your project. You will need to repeat this section whenever that happens.
Because of the way the `pio` command loads required classes at runtime, you must
add them manually in the module settings for *Run/Debug Configurations* to work
properly.
Right click on the project and click *Open Module Settings*. Hit the **+**
button right below the list of dependencies, and select *JARs or
directories...*.
The first JAR that you need to add is the `pio-assembly-<%= data.versions.pio
%>.jar` that contains all necessary classes. It can be found inside the
`dist/lib` directory of your PredictionIO source installation directory (if you
have built from sources) or the `lib` directory of your PredictionIO binary
installation directory.
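For example, assuming hypothetical installation paths, the assembly JAR can be
located like this:
```
# Built from sources (hypothetical path):
ls ~/apache-predictionio/dist/lib/pio-assembly-*.jar
# Binary installation (hypothetical path):
ls ~/PredictionIO/lib/pio-assembly-*.jar
```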
Next, you will need to make sure some configuration files from your PredictionIO
installation can be found during runtime. Add the `conf` directory of your
PredictionIO installation directory. When asked about the category of the
directory, pick *Classes*.
Finally, you will need to add storage classes. The exact list of JARs that you
will need to add depends on your storage configuration (a quick way to check this
is shown after the list below). These JARs can be found
inside the `dist/lib/spark` directory of your PredictionIO source installation
directory (if you have built from sources) or the `lib/spark` directory of your
PredictionIO binary installation directory.
* `pio-data-elasticsearch-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses Elasticsearch.
* `pio-data-hbase-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses Apache HBase.
* `pio-data-hdfs-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses HDFS.
* `pio-data-jdbc-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses JDBC. Notice that you must also add any
additional JDBC driver JARs.
* `pio-data-localfs-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses the local filesystem.
* `pio-data-s3-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses Amazon Web Services S3.
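If you are unsure which backends your configuration uses, inspecting the
storage source types in `conf/pio-env.sh` is a quick way to check (the
installation path below is hypothetical):
```
# Prints lines such as PIO_STORAGE_SOURCES_<NAME>_TYPE=jdbc, elasticsearch, hbase, ...
grep PIO_STORAGE_SOURCES ~/PredictionIO/conf/pio-env.sh
```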
Make sure to change all these additions to *Runtime* scope. The following shows
an example that uses the JDBC storage backend with the PostgreSQL driver.
![Example module settings for a JDBC and PostgreSQL
configuration](/images/intellij/intellij-module-settings.png)
## Running and Debugging in IntelliJ IDEA
### Simulating `pio train`
Create a new *Run/Debug Configuration* by going to *Run* > *Edit
Configurations...*. Click on the **+** button and select *Application*. Name it
`pio train` and put in the following:
* Main class:
```
org.apache.predictionio.workflow.CreateWorkflow
```
* VM options (a filled-in example follows this list):
```
-Dspark.master=local -Dlog4j.configuration=file:/<your_pio_path>/conf/log4j.properties -Dpio.log.dir=<path_of_log_file>
```
* Program arguments:
```
--engine-id dummy --engine-version dummy --engine-variant engine.json --env dummy=dummy
```
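For instance, with a hypothetical PredictionIO installation at
`/home/alice/PredictionIO`, the VM options from the list above might look like:
```
-Dspark.master=local -Dlog4j.configuration=file:/home/alice/PredictionIO/conf/log4j.properties -Dpio.log.dir=/home/alice/PredictionIO/log
```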
Make sure *Working directory* is set to the base directory of the template that
you are working on.
Click the folder button to the right of *Environment variables*, and paste the
relevant values from `conf/pio-env.sh` in your PredictionIO installation
directory. The following shows an example using JDBC and PostgreSQL.
![Example environment variables
settings](/images/intellij/pio-train-env-vars.png)
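Your values should come from your actual `conf/pio-env.sh`; purely as a
reference, the following mirrors the PostgreSQL defaults shipped with the
`pio-env.sh` template:
```
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL
PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio
```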
Make sure *Include dependencies with "Provided" scope* is checked.
The end result should look similar to this.
![Run Configuration](/images/intellij/pio-train.png)
Save and you can run or debug `pio train` with the new configuration!
### Simulating `pio deploy`
For `pio deploy`, simply duplicate the previous configuration and replace the
following settings.
* Main class:
```
org.apache.predictionio.workflow.CreateServer
```
* Program arguments:
```
--engineInstanceId <id_from_pio_train> --engine-variant engine.json
```
### Executing a Query
You can execute a query with any of the PredictionIO SDKs. For a recommender
that has been trained with the sample MovieLens dataset, the easiest query is
probably a `curl` one. Start by running or debugging your `pio deploy`
configuration so the service is waiting for the query. Then go to the *Terminal*
tab at the very bottom of the IntelliJ IDEA window and enter the `curl` request:
```
curl -H "Content-Type: application/json" -d '{ "user": "1", "num": 4 }' http://localhost:8000/queries.json
```
This should return something like:
```
{"itemScores":[
{"item":"52","score":9.582509402541834},
{"item":"95","score":8.017236650368387},
{"item":"89","score":6.975951244053634},
{"item":"34","score":6.857457277981334}
]}
```
INFO: If execution pauses at a breakpoint, the query is likely to end in a
connection timeout. To see the data that would have been returned, place a
breakpoint where the response is created, or run the query with no breakpoints.