blob: b5dae35ef69817ca68bdecbba56a768374f04198 [file] [log] [blame]
---
title: Quick Start - Recommendation Engine Template
---
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
## Overview
This Recommendation Engine Template has integrated **Apache Spark MLlib**'s
Collaborative Filtering algorithm by default. You can customize it easily to fit
your specific needs.
We are going to show you how to create your own recommendation engine for
production use based on this template.
## Usage
### Event Data Requirements
By default, the template requires the following events to be collected:
- user 'rate' item events
- user 'buy' item events
NOTE: You can customize to use other event.
### Input Query
- user ID
- num of recommended items
### Output PredictedResult
- a ranked list of recommended itemIDs
## 1. Install and Run PredictionIO
<%= partial 'shared/quickstart/install' %>
## 2. Create a new Engine from an Engine Template
<%= partial 'shared/quickstart/create_engine', locals: { engine_name: 'MyRecommendation', template_name: 'Recommendation Engine Template', template_repo: 'apache/predictionio-template-recommender' } %>
## 3. Generate an App ID and Access Key
<%= partial 'shared/quickstart/create_app' %>
## 4. Collecting Data
Next, let's collect some training data. By default,
the Recommendation Engine Template supports 2 types of events: **rate** and
**buy**. A user can give a rating score to an item or buy an item. This template requires user-view-item and user-buy-item events.
INFO: This template can easily be customized to consider more user events such as *like*, *dislike* etc.
<%= partial 'shared/quickstart/collect_data' %>
### Example **rate** event
A user (ID "u0") gives an item (ID "i0") a rating of 5 at `2014-11-02T09:39:45.618-08:00` (current time will be used if eventTime is not specified)
Run the following `curl` command to send the `rate` event:
<div class="tabs">
<div data-tab="REST API" data-lang="json">
```
$ curl -i -X POST http://localhost:7070/events.json?accessKey=$ACCESS_KEY \
-H "Content-Type: application/json" \
-d '{
"event" : "rate",
"entityType" : "user",
"entityId" : "u0",
"targetEntityType" : "item",
"targetEntityId" : "i0",
"properties" : {
"rating" : 5
}
"eventTime" : "2014-11-02T09:39:45.618-08:00"
}'
```
</div>
<div data-tab="Python SDK" data-lang="python">
```python
import predictionio
client = predictionio.EventClient(
access_key=<ACCESS KEY>,
url=<URL OF EVENTSERVER>,
threads=5,
qsize=500
)
# A user rates an item
client.create_event(
event="rate",
entity_type="user",
entity_id=<USER ID>,
target_entity_type="item",
target_entity_id=<ITEM ID>,
properties= { "rating" : float(<RATING>) }
)
```
</div>
<div data-tab="PHP SDK" data-lang="php">
```php
<?php
require_once("vendor/autoload.php");
use predictionio\EventClient;
$client = new EventClient(<ACCESS KEY>, <URL OF EVENTSERVER>);
// A user rates an item
$client->createEvent(array(
'event' => 'rate',
'entityType' => 'user',
'entityId' => <USER ID>,
'targetEntityType' => 'item',
'targetEntityId' => <ITEM ID>,
'properties' => array('rating'=> <RATING>)
));
?>
```
</div>
<div data-tab="Ruby SDK" data-lang="ruby">
```ruby
# Create a client object.
client = PredictionIO::EventClient.new(<ACCESS KEY>, <URL OF EVENTSERVER>)
# A user rates an item.
client.create_event(
'rate',
'user',
<USER ID>, {
'targetEntityType' => 'item',
'targetEntityId' => <ITEM ID>,
'properties' => { 'rating' => <RATING (float)> }
}
)
```
</div>
<div data-tab="Java SDK" data-lang="java">
```java
import org.apache.predictionio.Event;
import org.apache.predictionio.EventClient;
EventClient client = new EventClient(<ACCESS KEY>, <URL OF EVENTSERVER>);
// A user rates an item
Event rateEvent = new Event()
.event("rate")
.entityType("user")
.entityId(<USER_ID>)
.targetEntityType("item")
.targetEntityId(<ITEM_ID>)
.property("rating", new Float(<RATING>));
client.createEvent(rateEvent);
```
</div>
</div>
### Example **buy** event
A user (ID "u1") buys an item (ID "i2") at `2014-11-10T12:34:56.123-08:00` (current time will be used if eventTime is not specified)
Run the following `curl` command to send the `buy` event:
<div class="tabs">
<div data-tab="REST API" data-lang="json">
```
$ curl -i -X POST http://localhost:7070/events.json?accessKey=$ACCESS_KEY \
-H "Content-Type: application/json" \
-d '{
"event" : "buy",
"entityType" : "user",
"entityId" : "u1",
"targetEntityType" : "item",
"targetEntityId" : "i2",
"eventTime" : "2014-11-10T12:34:56.123-08:00"
}'
```
</div>
<div data-tab="Python SDK" data-lang="python">
```python
# A user buys an item
client.create_event(
event="buy",
entity_type="user",
entity_id=<USER ID>,
target_entity_type="item",
target_entity_id=<ITEM ID>
)
```
</div>
<div data-tab="PHP SDK" data-lang="php">
```php
<?php
// A user buys an item
$client->createEvent(array(
'event' => 'buy',
'entityType' => 'user',
'entityId' => <USER ID>,
'targetEntityType' => 'item',
'targetEntityId' => <ITEM ID>
));
?>
```
</div>
<div data-tab="Ruby SDK" data-lang="ruby">
```ruby
# A user buys an item.
client.create_event(
'buy',
'user',
<USER ID>, {
'targetEntityType' => 'item',
'targetEntityId' => <ITEM ID>
}
)
```
</div>
<div data-tab="Java SDK" data-lang="java">
```java
// A user buys an item
Event buyEvent = new Event()
.event("buy")
.entityType("user")
.entityId(<USER_ID>)
.targetEntityType("item")
.targetEntityId(<ITEM_ID>);
client.createEvent(buyEvent);
```
</div>
</div>
<%= partial 'shared/quickstart/query_eventserver' %>
### Import More Sample Data
<%= partial 'shared/quickstart/import_sample_data' %>
A Python import script `import_eventserver.py` is provided in the template to import the data to
Event Server using Python SDK. Please upgrade to the latest Python SDK.
<%= partial 'shared/quickstart/install_python_sdk' %>
Execute the following to import the data:
WARNING: These commands must be executed in the Engine directory, for example: `MyRecomendation`.
```
$ cd MyRecommendation
$ curl https://raw.githubusercontent.com/apache/spark/master/data/mllib/sample_movielens_data.txt --create-dirs -o data/sample_movielens_data.txt
$ python data/import_eventserver.py --access_key $ACCESS_KEY
```
You should see the following output:
```
Importing data...
1501 events are imported.
```
Now the movie ratings data is stored as events inside the Event Store.
<%= partial 'shared/quickstart/query_eventserver_short' %>
INFO: By default, the template train the model with "rate" events (explicit rating). You can customize the engine to [read other custom events](/templates/recommendation/reading-custom-events/) and [handle events of implicit preference (such as, view, buy)](/templates/recommendation/training-with-implicit-preference/)
## 5. Deploy the Engine as a Service
<%= partial 'shared/quickstart/deploy_enginejson', locals: { engine_name: 'MyRecommendation' } %>
<%= partial 'shared/quickstart/deploy', locals: { engine_name: 'MyRecommendation' } %>
## 6. Use the Engine
Now, You can try to retrieve predicted results. To recommend 4 movies to user
whose id is 1, you send this JSON `{ "user": "1", "num": 4 }` to the deployed
engine and it will return a JSON of the recommended movies. Simply send a query
by making a HTTP request or through the `EngineClient` of an SDK.
With the deployed engine running, open another terminal and run the following `curl` command or use SDK to send the query:
<div class="tabs">
<div data-tab="REST API" data-lang="json">
```
$ curl -H "Content-Type: application/json" \
-d '{ "user": "1", "num": 4 }' http://localhost:8000/queries.json
```
</div>
<div data-tab="Python SDK" data-lang="python">
```python
import predictionio
engine_client = predictionio.EngineClient(url="http://localhost:8000")
print engine_client.send_query({"user": "1", "num": 4})
```
</div>
<div data-tab="PHP SDK" data-lang="php">
```php
<?php
require_once("vendor/autoload.php");
use predictionio\EngineClient;
$client = new EngineClient('http://localhost:8000');
$response = $client->sendQuery(array('user'=> 1, 'num'=> 4));
print_r($response);
?>
```
</div>
<div data-tab="Ruby SDK" data-lang="ruby">
```ruby
# Create client object.
client = PredictionIO::EngineClient.new(<ENGINE DEPLOY URL>)
# Query PredictionIO.
response = client.send_query('user' => <USER ID>, 'num' => <NUMBER (integer)>)
puts response
```
</div>
<div data-tab="Java SDK" data-lang="java">
```java
import com.google.common.collect.ImmutableMap;
import com.google.gson.JsonObject;
import org.apache.predictionio.EngineClient;
// create client object
EngineClient engineClient = new EngineClient(<ENGINE DEPLOY URL>);
// query
JsonObject response = engineClient.sendQuery(ImmutableMap.<String, Object>of(
"user", "1",
"num", 4
));
```
</div>
</div>
The following is sample JSON response:
```
{
"itemScores":[
{"item":"22","score":4.072304374729956},
{"item":"62","score":4.058482414005789},
{"item":"75","score":4.046063009943821},
{"item":"68","score":3.8153661512945325}
]
}
```
Congratulations, *MyRecommendation* is now running!
<%= partial 'shared/quickstart/production' %>
#### [Next: DASE Components Explained](/templates/recommendation/dase/)