---
title: Developing Engines with IntelliJ IDEA
---
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
## Prerequisites
This documentation assumes that you have a fully functional PredictionIO setup.
If you have not installed PredictionIO yet, please follow [these
instructions](/install/).
The following instructions have been tested with IntelliJ IDEA 2018.2.2
Community Edition.
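If you want a quick sanity check of that setup before wiring up IntelliJ, `pio
status` verifies the installation and its configured storage backends:
```
# Run from any terminal; the configured backends must be running.
pio status
```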
## Preparing IntelliJ for Engine Development
### Installing IntelliJ Scala Plugin
First of all, you will need to install the [Scala
plugin](https://plugins.jetbrains.com/plugin/?id=1347) if you have not already
done so.
Go to the *Preferences* menu item, and look for *Plugins*. You should see
the following screen.
![IntelliJ Plugins](/images/intellij/intelliJ-scala-plugin.png)
Click *Install JetBrains plugin...*, then search for *Scala*. You should arrive
at something similar to the following.
![Scala Plugin](/images/intellij/intellij-scala-plugin-2.png)
Click the green *Install plugin* button to install the plugin. Restart IntelliJ
IDEA if asked to do so.
### Setting Up the Engine Directory
Create an engine directory from a template. This requires that you download a
template that you wish to start from or modify.
Follow template [install](/start/download) and [deploy](/start/deploy)
instructions or go through the [Quick
Start](/templates/recommendation/quickstart/) if you are planning to modify a
recommender. Build, train, and deploy the engine once to confirm that everything
is configured properly.
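As a sanity check, a typical first pass from inside the engine template
directory looks like the sketch below (app creation and event import from the
Quick Start are assumed to have been done already):
```
pio build    # compile the engine and register it with PredictionIO
pio train    # train a model and save an engine instance
pio deploy   # serve the engine, by default at http://localhost:8000
```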
From IntelliJ IDEA, choose *File* > *New* > *Project from Existing Sources...*.
When asked to select a directory to import, browse to the engine directory that
you created above and proceed. Make sure you pick *Import project from external
model* > *SBT*, then proceed to finish.
You should be able to build the project at this point. To run and debug your
template, continue on to the rest of the steps.
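If you prefer the command line over IntelliJ's *Build Project* action, the same
check can be run from the *Terminal* tab with sbt (assuming `sbt` is on your
`PATH`; IntelliJ's bundled sbt shell works as well):
```
# Run from the engine template directory.
sbt compile
```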
### Optional: Issues with Snappy on macOS
If you are running on macOS and run into the following [known
issue](http://bit.ly/12Abtvn), follow the steps in this section.
Edit `build.sbt` and add the following under `libraryDependencies`:
```
"org.xerial.snappy" % "snappy-java" % "1.1.1.7"
```
![Updating build.sbt](/images/intellij/intellij-buildsbt.png)
When you are done editing, IntelliJ should prompt you to import new changes,
unless you have already enabled auto-import. Import the change for it to take
effect.
### Module Settings
INFO: IntelliJ will recreate module settings whenever it re-imports changes to
your project. You will need to repeat this section whenever that happens.
Because of the way the `pio` command loads required classes at runtime, you must
add them manually in the module settings for *Run/Debug Configurations* to work
properly.
Right click on the project and click *Open Module Settings*. Hit the **+**
button right below the list of dependencies, and select *JARs or
directories...*.
The first JAR that you need to add is the `pio-assembly-<%= data.versions.pio
%>.jar` that contains all necessary classes. It can be found inside the
`dist/lib` directory of your PredictionIO source installation directory (if you
have built from sources) or the `lib` directory of your PredictionIO binary
installation directory.
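For example, assuming hypothetical installation paths, the assembly JAR can be
located like this:
```
# Built from sources (hypothetical path):
ls ~/apache-predictionio/dist/lib/pio-assembly-*.jar
# Binary installation (hypothetical path):
ls ~/PredictionIO/lib/pio-assembly-*.jar
```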
Next, you will need to make sure some configuration files from your PredictionIO
installation can be found during runtime. Add the `conf` directory of your
PredictionIO installation directory. When asked about the category of the
directory, pick *Classes*.
Finally, you will need to add storage classes. The exact list of JARs that you
will need to add depends on your storage configuration (a quick way to check this
is shown after the list below). These JARs can be found
inside the `dist/lib/spark` directory of your PredictionIO source installation
directory (if you have built from sources) or the `lib/spark` directory of your
PredictionIO binary installation directory.
* `pio-data-elasticsearch-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses Elasticsearch.
* `pio-data-hbase-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses Apache HBase.
* `pio-data-hdfs-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses HDFS.
* `pio-data-jdbc-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses JDBC. Notice that you must also add any
additional JDBC driver JARs.
* `pio-data-localfs-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses the local filesystem.
* `pio-data-s3-assembly-<%= data.versions.pio %>.jar`
Add this JAR if your configuration uses Amazon Web Services S3.
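If you are unsure which backends your configuration uses, inspecting the
storage source types in `conf/pio-env.sh` is a quick way to check (the
installation path below is hypothetical):
```
# Prints lines such as PIO_STORAGE_SOURCES_<NAME>_TYPE=jdbc, elasticsearch, hbase, ...
grep PIO_STORAGE_SOURCES ~/PredictionIO/conf/pio-env.sh
```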
Make sure to change all these additions to *Runtime* scope. The following shows
an example that uses the JDBC storage backend with the PostgreSQL driver.
![Example module settings for a JDBC and PostgreSQL
configuration](/images/intellij/intellij-module-settings.png)
## Running and Debugging in IntelliJ IDEA
### Simulating `pio train`
Create a new *Run/Debug Configuration* by going to *Run* > *Edit
Configurations...*. Click on the **+** button and select *Application*. Name it
`pio train` and put in the following:
* Main class:
```
org.apache.predictionio.workflow.CreateWorkflow
```
* VM options (a filled-in example follows this list):
```
-Dspark.master=local -Dlog4j.configuration=file:/<your_pio_path>/conf/log4j.properties -Dpio.log.dir=<path_of_log_file>
```
* Program arguments:
```
--engine-id dummy --engine-version dummy --engine-variant engine.json --env dummy=dummy
```
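For instance, with a hypothetical PredictionIO installation at
`/home/alice/PredictionIO`, the VM options from the list above might look like:
```
-Dspark.master=local -Dlog4j.configuration=file:/home/alice/PredictionIO/conf/log4j.properties -Dpio.log.dir=/home/alice/PredictionIO/log
```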
Make sure *Working directory* is set to the base directory of the template that
you are working on.
Click the folder button to the right of *Environment variables*, and paste the
relevant values from `conf/pio-env.sh` in your PredictionIO installation
directory. The following shows an example using JDBC and PostgreSQL.
![Example environment variables
settings](/images/intellij/pio-train-env-vars.png)
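Your values should come from your actual `conf/pio-env.sh`; purely as a
reference, the following mirrors the PostgreSQL defaults shipped with the
`pio-env.sh` template:
```
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL
PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio
```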
Make sure *Include dependencies with "Provided" scope* is checked.
The end result should look similar to this.
![Run Configuration](/images/intellij/pio-train.png)
Save and you can run or debug `pio train` with the new configuration!
### Simulating `pio deploy`
For `pio deploy`, simply duplicate the previous configuration and replace the
following settings.
* Main class:
```
org.apache.predictionio.workflow.CreateServer
```
* Program arguments:
```
--engineInstanceId <id_from_pio_train> --engine-variant engine.json
```
### Executing a Query
You can execute a query with any of the PredictionIO SDKs. For a recommender
that has been trained with the sample MovieLens dataset, the easiest query is
probably a `curl` one. Start by running or debugging your `pio deploy`
configuration so the service is waiting for the query. Then go to the *Terminal*
tab at the very bottom of the IntelliJ IDEA window and enter the `curl` request:
```
curl -H "Content-Type: application/json" -d '{ "user": "1", "num": 4 }' http://localhost:8000/queries.json
```
This should return something like:
```
{"itemScores":[
{"item":"52","score":9.582509402541834},
{"item":"95","score":8.017236650368387},
{"item":"89","score":6.975951244053634},
{"item":"34","score":6.857457277981334}
]}
```
INFO: If execution pauses at a breakpoint, the query is likely to end in a
connection timeout. To see the data that would have been returned, place a
breakpoint where the response is created, or run the query with no breakpoints.