| --- |
| title: Developing Engines with IntelliJ IDEA |
| --- |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| ## Prerequisites |
| |
| This documentation assumes that you have a fully functional PredictionIO setup. |
| If you have not installed PredictionIO yet, please follow [these |
| instructions](/install/). |
| |
| The following instructions have been tested with IntelliJ IDEA 2018.2.2 |
| Community Edition. |
| |
| |
| ## Preparing IntelliJ for Engine Development |
| |
| |
| ### Installing IntelliJ Scala Plugin |
| |
| First of all, you will need to install the [Scala |
| plugin](https://plugins.jetbrains.com/plugin/?id=1347) if you have not already |
| done so. |
| |
| Go to the *Preferences* menu item, and look for *Plugins*. You should see |
| the following screen. |
| |
| ![IntelliJ Plugins](/images/intellij/intelliJ-scala-plugin.png) |
| |
| Click *Install JetBrains plugin...*, the search for *Scala*. You should arrive |
| at something similar to the following. |
| |
| ![Scala Plugin](/images/intellij/intellij-scala-plugin-2.png) |
| |
| Click the green *Install plugin* button to install the plugin. Restart IntelliJ |
| IDEA if asked to do so. |
| |
| |
| ### Setting Up the Engine Directory |
| |
| Create an engine directory from a template. This requires that you download a |
| template that you wish to start from or modify. |
| |
| Follow template [install](/start/download) and [deploy](/start/deploy) |
| instructions or go through the [Quick |
| Start](/templates/recommendation/quickstart/) if you are planning to modify a |
| recommender. Make sure to build, train, and deploy the engine to make sure all |
| is configured properly. |
| |
| From IntelliJ IDEA, choose *File* > *New* > *Project from Existing Sources...*. |
| When asked to select a directory to import, browse to the engine directory that |
| you downloaded too and proceed. Make sure you pick *Import project from external |
| model* > *SBT*, then proceed to finish. |
| |
| You should be able to build the project at this point. To run and debug your |
| template, continue on to the rest of the steps. |
| |
| |
| ### Optional: Issues with Snappy on macOS |
| |
| If you are running on macOS and run into the following [known |
| issue](http://bit.ly/12Abtvn), follow steps in this section. |
| |
| Edit `build.sbt` and add the following under `libraryDependencies` |
| |
| ``` |
| "org.xerial.snappy" % "snappy-java" % "1.1.1.7" |
| ``` |
| |
| ![Updating build.sbt](/images/intellij/intellij-buildsbt.png) |
| |
| When you are done editing, IntelliJ should prompt you to import new changes, |
| unless you have already enabled auto import. Import this change to make it |
| effective. |
| |
| |
| ### Module Settings |
| |
| INFO: IntelliJ will recreate module settings whenever it imports changes of your |
| project. You will need to repeat this section whenever that happens. |
| |
| Due to the way how `pio` command sources required classes during runtime, it is |
| necessary to add them manually in module settings for *Run/Debug Configurations* |
| to work properly. |
| |
| Right click on the project and click *Open Module Settings*. Hit the **+** |
| button right below the list of dependencies, and select *JARs or |
| directories...*. |
| |
| The first JAR that you need to add is the `pio-assembly-<%= data.versions.pio |
| %>.jar` that contains all necessary classes. It can be found inside the |
| `dist/lib` directory of your PredictionIO source installation directory (if you |
| have built from sources) or the `lib` directory of your PredictionIO binary |
| installation directory. |
| |
| Next, you will need to make sure some configuration files from your PredictionIO |
| installation can be found during runtime. Add the `conf` directory of your |
| PredictionIO installation directory. When asked about categories of the |
| directory, pick *Classes*. |
| |
| Finally, you will need to add storage classes. The exact list of JARs that you |
| will need to add depends on your storage configuration. These JARs can be found |
| inside the `dist/lib/spark` directory of your PredictionIO source installation |
| directory (if you have built from sources) or the `lib/spark` directory of your |
| PredictionIO binary installation directory. |
| |
| * `pio-data-elasticsearch-assembly-<%= data.versions.pio %>.jar` |
| |
| Add this JAR if your configuration uses Elasticsearch. |
| |
| * `pio-data-hbase-assembly-<%= data.versions.pio %>.jar` |
| |
| Add this JAR if your configuration uses Apache HBase. |
| |
| * `pio-data-hdfs-assembly-<%= data.versions.pio %>.jar` |
| |
| Add this JAR if your configuration uses HDFS. |
| |
| * `pio-data-jdbc-assembly-<%= data.versions.pio %>.jar` |
| |
| Add this JAR if your configuration uses JDBC. Notice that you must also add any |
| additional JDBC driver JARs. |
| |
| * `pio-data-localfs-assembly-<%= data.versions.pio %>.jar` |
| |
| Add this JAR if your configuration uses local filesystem. |
| |
| * `pio-data-s3-assembly-<%= data.versions.pio %>.jar` |
| |
| Add this JAR if your configuration uses Amazon Web Services S3. |
| |
| Make sure to change all these additions to *Runtime* scope. The following shows |
| an example that uses the JDBC storage backend with PostgreSQL driver. |
| |
| ![Example module settings for a JDBC and PostgreSQL |
| configuration](/images/intellij/intellij-module-settings.png) |
| |
| |
| ## Running and Debugging in IntelliJ IDEA |
| |
| |
| ### Simulating `pio train` |
| |
| Create a new *Run/Debug Configuration* by going to *Run* > *Edit |
| Configurations...*. Click on the **+** button and select *Application*. Name it |
| `pio train` and put in the following: |
| |
| * Main class: |
| |
| ``` |
| org.apache.predictionio.workflow.CreateWorkflow |
| ``` |
| |
| * VM options: |
| |
| ``` |
| -Dspark.master=local -Dlog4j.configuration=file:/<your_pio_path>/conf/log4j.properties -Dpio.log.dir=<path_of_log_file> |
| ``` |
| |
| * Program arguments: |
| |
| ``` |
| --engine-id dummy --engine-version dummy --engine-variant engine.json --env dummy=dummy |
| ``` |
| |
| Make sure *Working directory* is set to the base directory of the template that |
| you are working on. |
| |
| Click the folder button to the right of *Environment variables*, and paste the |
| relevant values from `conf/pio-env.sh` in your PredictionIO installation |
| directory. The following shows an example using JDBC and PostgreSQL. |
| |
| ![Example environment variables |
| settings](/images/intellij/pio-train-env-vars.png) |
| |
| Make sure *Include dependencies with "Provided" scope* is checked. |
| |
| The end result should look something similar to this. |
| |
| ![Run Configuration](/images/intellij/pio-train.png) |
| |
| Save and you can run or debug `pio train` with the new configuration! |
| |
| |
| ### Simulating `pio deploy` |
| |
| For `pio deploy`, simply duplicate the previous configuration and replace with |
| the following. |
| |
| * Main class: |
| |
| ``` |
| org.apache.predictionio.workflow.CreateServer |
| ``` |
| |
| * Program Arguments: |
| |
| ``` |
| --engineInstanceId <id_from_pio_train> --engine-variant engine.json |
| ``` |
| |
| |
| ### Executing a Query |
| |
| You can execute a query with the correct SDK. For a recommender that has been |
| trained with the sample MovieLens dataset perhaps the easiest query is a `curl` |
| one. Start by running or debuging your `pio deploy` config so the service is |
| waiting for the query. Then go to the "Terminal" tab at the very bottom of the |
| IntelliJ IDEA window and enter the `curl` request: |
| |
| ``` |
| curl -H "Content-Type: application/json" -d '{ "user": "1", "num": 4 }' http://localhost:8000/queries.json |
| ``` |
| |
| This should return something like: |
| |
| ``` |
| {"itemScores":[ |
| {"item":"52","score":9.582509402541834}, |
| {"item":"95","score":8.017236650368387}, |
| {"item":"89","score":6.975951244053634}, |
| {"item":"34","score":6.857457277981334} |
| ]} |
| ``` |
| |
| INFO: If you hit a breakpoint you are likely to get a connection timeout. To see |
| the data that would have been returned, just place a breakpoint where the |
| response is created or run the query with no breakpoints. |