commit | 61bf40c03c6507143db681669b4399eceb9b5616 | |
---|---|---|
author | DO YUNG YOON <steamshon@apache.org> | Fri Aug 11 13:37:48 2017 +0900 |
committer | DO YUNG YOON <steamshon@apache.org> | Fri Aug 11 13:37:48 2017 +0900 |
tree | d55cf911d71d4bba82f9dd19174641f35bce7392 | |
parent | 542a3f76aef9399cc94b85667c54a8894ba07e9b | |
[S2GRAPH-159]: simplified path resolution that fixes the issue

JIRA: [S2GRAPH-159] https://issues.apache.org/jira/browse/S2GRAPH-159

Pull Request: Closes #118

Author: Sergio Fernández <sergio@wikier.org>
S2Graph is a graph database designed to handle transactional graph processing at scale. Its REST API allows you to store, manage and query relational information using edge and vertex representations in a fully asynchronous and non-blocking manner. This document covers some basic concepts and terms of S2Graph and helps you get a feel for the S2Graph API.
To build S2Graph from source, install JDK 8 and SBT, then run the following command in the project root:
sbt package
This will create a distribution of S2Graph that is ready to be deployed.
The distribution can be found in target/apache-s2graph-$version-incubating-bin.
Once you have extracted the downloaded binary release of S2Graph, or built it from source as described above, the following files and directories should be present in the directory.
DISCLAIMER
LICENCE               # the Apache License 2.0
NOTICE
bin                   # scripts to manage the lifecycle of S2Graph
conf                  # configuration files
lib                   # contains the binary
logs                  # application logs
var                   # application data
This directory layout contains all the binaries and scripts required to launch S2Graph. The directories logs and var may not be present initially; they are created once S2Graph is launched.
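A quick way to confirm the layout is to check for the entries that should always exist. The sketch below is only an illustration: the function name check_s2graph_layout is hypothetical, and the entry list is copied from the listing above, leaving out logs and var because they appear only after the first launch.

```shell
# Hypothetical helper: report which expected entries are missing from an
# extracted or built S2Graph distribution directory.
# Returns 0 when everything is present, 1 otherwise.
check_s2graph_layout() {
  dir="$1"
  missing=0
  # Entry names are taken from the directory listing in this README.
  for entry in DISCLAIMER LICENCE NOTICE bin conf lib; do
    if [ ! -e "$dir/$entry" ]; then
      echo "missing: $entry"
      missing=1
    fi
  done
  return "$missing"
}
```

Pass the distribution directory as the only argument, for example the target/apache-s2graph-$version-incubating-bin directory produced by the build.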
The following will launch S2Graph, using HBase in standalone mode for data storage and H2 for metadata storage.
sh bin/start-s2graph.sh
To connect to a remote HBase cluster or use MySQL as the metastore, refer to the instructions in conf/application.conf. S2Graph is tested on HBase versions 0.98, 1.0, 1.1, and 1.2 (https://hub.docker.com/r/harisekhon/hbase/tags/).
Here is what you can find in each subproject.

- s2core: The core library, containing the data abstractions for graph entities, storage adapters and utilities.
- s2rest_play: The REST server built with Play framework, providing the write and query API.
- s2rest_netty: The REST server built directly using Netty, implementing only the query API.
- loader: A collection of Spark jobs for bulk loading streaming data into S2Graph.
- spark: Spark utilities for loader and s2counter_loader.
- s2counter_core: The core library providing data structures and logic for s2counter_loader.
- s2counter_loader: Spark streaming jobs that consume Kafka WAL logs and calculate various top-K results on-the-fly.
- s2graph_gremlin: Gremlin plugin for TinkerPop users.

The first three projects are for OLTP-style workloads, currently the main target of S2Graph. The other four projects could be helpful for OLAP-style or streaming workloads, especially for integrating S2Graph with Apache Spark and/or Kafka. Note that the latter four projects are currently out of date; we plan to update them and provide documentation in upcoming releases.
Once the S2Graph server has been set up, you can start sending HTTP queries to the server to create a graph and pour some data into it. This tutorial goes over a simple toy problem to give a sense of what S2Graph's API looks like. bin/example.sh contains the example code below.
The toy problem is to create a timeline feature for a simple social media service, like a simplified version of Facebook's timeline. Using simple S2Graph queries it is possible to keep track of each user's friends and their posts.
The following POST query will create a service named “KakaoFavorites”.
curl -XPOST localhost:9000/graphs/createService -H 'Content-Type: Application/json' -d ' {"serviceName": "KakaoFavorites", "compressionAlgorithm" : "gz"} '
To make sure the service was created correctly, run the following:
curl -XGET localhost:9000/graphs/getService/KakaoFavorites
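Every write and query call in this tutorial uses the same curl pattern, so it can be convenient to wrap it once. The following is a minimal sketch, not part of S2Graph: the function name s2post and the variables S2GRAPH_URL and S2GRAPH_DRY_RUN are hypothetical, and the default address simply mirrors the localhost:9000 used in the examples here.

```shell
# Assumed default: the server address used throughout this tutorial.
S2GRAPH_URL="${S2GRAPH_URL:-localhost:9000}"

# Hypothetical helper: POST a JSON body to an S2Graph endpoint.
# Arguments: endpoint path (e.g. graphs/createService), JSON body.
# With S2GRAPH_DRY_RUN=1 it prints the request instead of sending it.
s2post() {
  path="$1"
  body="$2"
  if [ "${S2GRAPH_DRY_RUN:-0}" = "1" ]; then
    printf 'POST %s/%s\n%s\n' "$S2GRAPH_URL" "$path" "$body"
  else
    curl -XPOST "$S2GRAPH_URL/$path" \
      -H 'Content-Type: Application/json' -d "$body"
  fi
}
```

With this in place, the createService call above becomes `s2post graphs/createService '{"serviceName": "KakaoFavorites", "compressionAlgorithm" : "gz"}'`. The dry-run mode is there so a request can be inspected before it reaches the server.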
In S2Graph, relationships are organized as labels. Create a label called friends using the following createLabel API call:
curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: Application/json' -d ' { "label": "friends", "srcServiceName": "KakaoFavorites", "srcColumnName": "userName", "srcColumnType": "string", "tgtServiceName": "KakaoFavorites", "tgtColumnName": "userName", "tgtColumnType": "string", "isDirected": "false", "indices": [], "props": [], "consistencyLevel": "strong" } '
Check if the label has been created correctly:
curl -XGET localhost:9000/graphs/getLabel/friends
Now that the label friends is ready, we can store the friendship data. Entries of a label are called edges, and you can add edges with the edges/insert API:
curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: Application/json' -d ' [ {"from":"Elmo","to":"Big Bird","label":"friends","props":{},"timestamp":1444360152477}, {"from":"Elmo","to":"Ernie","label":"friends","props":{},"timestamp":1444360152478}, {"from":"Elmo","to":"Bert","label":"friends","props":{},"timestamp":1444360152479}, {"from":"Cookie Monster","to":"Grover","label":"friends","props":{},"timestamp":1444360152480}, {"from":"Cookie Monster","to":"Kermit","label":"friends","props":{},"timestamp":1444360152481}, {"from":"Cookie Monster","to":"Oscar","label":"friends","props":{},"timestamp":1444360152482} ] '
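Writing the edge array by hand gets tedious as the data grows. As a rough sketch, the body can be generated from a plain list of pairs. The function name edges_json and the "from|to" input format are hypothetical conventions chosen for this example; the JSON fields are the ones used in the call above, and timestamps simply count up from a starting value as they do there.

```shell
# Hypothetical helper: turn "from|to" lines on stdin into an edges/insert body.
# Arguments: label name, timestamp (epoch milliseconds) for the first edge.
# Names are not JSON-escaped, so they must not contain quotes or backslashes.
edges_json() {
  awk -F'|' -v label="$1" -v ts="$2" '
    BEGIN { printf "[" }
    NF == 2 {
      # %.0f keeps millisecond timestamps intact in awk variants whose
      # %d is limited to 32-bit integers.
      printf "%s\n  {\"from\":\"%s\",\"to\":\"%s\",\"label\":\"%s\",\"props\":{},\"timestamp\":%.0f}", sep, $1, $2, label, ts + n
      sep = ","
      n++
    }
    END { printf "\n]\n" }
  '
}
```

For example, `printf 'Elmo|Big Bird\nElmo|Ernie\n' | edges_json friends 1444360152477` prints a two-element array whose output can be passed to curl with `-d`.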
Query the friends of Elmo with the getEdges API:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Elmo"}], "steps": [ {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]} ] } '
Now query friends of Cookie Monster:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Cookie Monster"}], "steps": [ {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]} ] } '
To store the URLs that each user posts, we will need a new label, post:
curl -XPOST localhost:9000/graphs/createLabel -H 'Content-Type: Application/json' -d ' { "label": "post", "srcServiceName": "KakaoFavorites", "srcColumnName": "userName", "srcColumnType": "string", "tgtServiceName": "KakaoFavorites", "tgtColumnName": "url", "tgtColumnType": "string", "isDirected": "true", "indices": [], "props": [], "consistencyLevel": "strong" } '
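The friends and post labels differ only in their name, target column, and isDirected value. A small template makes that explicit. This is a sketch under that observation only: label_json is a hypothetical name, and every other field is hard-coded to the values used in this tutorial rather than derived from the S2Graph API.

```shell
# Hypothetical helper: print a createLabel body for the KakaoFavorites service.
# Arguments: label name, target column name, "true" or "false" for isDirected.
label_json() {
  cat <<EOF
{
  "label": "$1",
  "srcServiceName": "KakaoFavorites",
  "srcColumnName": "userName",
  "srcColumnType": "string",
  "tgtServiceName": "KakaoFavorites",
  "tgtColumnName": "$2",
  "tgtColumnType": "string",
  "isDirected": "$3",
  "indices": [],
  "props": [],
  "consistencyLevel": "strong"
}
EOF
}
```

Here `label_json friends userName false` reproduces the first label and `label_json post url true` the second; either output can be sent with `curl ... -d "$(label_json post url true)"`.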
Now, insert some posts of the users:
curl -XPOST localhost:9000/graphs/edges/insert -H 'Content-Type: Application/json' -d ' [ {"from":"Big Bird","to":"www.kakaocorp.com/en/main","label":"post","props":{},"timestamp":1444360152477}, {"from":"Big Bird","to":"github.com/kakao/s2graph","label":"post","props":{},"timestamp":1444360152478}, {"from":"Ernie","to":"groups.google.com/forum/#!forum/s2graph","label":"post","props":{},"timestamp":1444360152479}, {"from":"Grover","to":"hbase.apache.org/forum/#!forum/s2graph","label":"post","props":{},"timestamp":1444360152480}, {"from":"Kermit","to":"www.playframework.com","label":"post","props":{},"timestamp":1444360152481}, {"from":"Oscar","to":"www.scala-lang.org","label":"post","props":{},"timestamp":1444360152482} ] '
Query posts of Big Bird:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Big Bird"}], "steps": [ {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 10}]} ] } '
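The getEdges calls so far differ only in the source id and the label being followed. The sketch below builds that body from arguments. The name get_edges_query is hypothetical, and the direction, offset, and limit values are fixed to the ones used in this tutorial. Passing more than one label emits one step per label, in order.

```shell
# Hypothetical helper: print a getEdges body that starts at one user and
# follows the given labels in order, one step per label.
# Usage: get_edges_query USER LABEL [LABEL...]
get_edges_query() {
  user="$1"
  shift
  steps=""
  for label in "$@"; do
    step="{\"step\": [{\"label\": \"$label\", \"direction\": \"out\", \"offset\": 0, \"limit\": 10}]}"
    # Join steps with ", " once the list is non-empty.
    steps="${steps:+$steps, }$step"
  done
  printf '{"srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"%s"}], "steps": [%s]}\n' "$user" "$steps"
}
```

For instance, `get_edges_query "Big Bird" post` prints a body equivalent to the query above, which can then be passed to curl with `-d`.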
So far, we have created two labels, friends and post, and stored some edges in them. This should be enough for creating the timeline feature! The following two-step query will return the URLs for Elmo's timeline, which are the posts of Elmo's friends:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Elmo"}], "steps": [ {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]}, {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 10}]} ] } '
Also try Cookie Monster's timeline:
curl -XPOST localhost:9000/graphs/getEdges -H 'Content-Type: Application/json' -d ' { "srcVertices": [{"serviceName": "KakaoFavorites", "columnName": "userName", "id":"Cookie Monster"}], "steps": [ {"step": [{"label": "friends", "direction": "out", "offset": 0, "limit": 10}]}, {"step": [{"label": "post", "direction": "out", "offset": 0, "limit": 10}]} ] } '
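To see what the two-step traversal is expected to reach, it can help to walk the same edges locally. The following is a toy model only, not how S2Graph executes queries: the timeline function and the EDGES variable are hypothetical, the data is copied from the insert calls in this tutorial, and the model ignores offsets, limits, the undirected setting of the friends label, and the server's response format.

```shell
# Edge data copied from the edges/insert calls in this tutorial: label|from|to
EDGES='friends|Elmo|Big Bird
friends|Elmo|Ernie
friends|Elmo|Bert
friends|Cookie Monster|Grover
friends|Cookie Monster|Kermit
friends|Cookie Monster|Oscar
post|Big Bird|www.kakaocorp.com/en/main
post|Big Bird|github.com/kakao/s2graph
post|Ernie|groups.google.com/forum/#!forum/s2graph
post|Grover|hbase.apache.org/forum/#!forum/s2graph
post|Kermit|www.playframework.com
post|Oscar|www.scala-lang.org'

# Hypothetical local model of the two-step query: follow "friends" edges
# out of the user, then "post" edges out of each friend.
timeline() {
  printf '%s\n' "$EDGES" | awk -F'|' -v user="$1" '
    # Step 1: collect the targets of the friends edges leaving the user.
    $1 == "friends" && $2 == user { friend[$3] = 1 }
    # Remember every post edge so the result does not depend on row order.
    $1 == "post" { n++; who[n] = $2; url[n] = $3 }
    # Step 2: print the posts whose author was reached in step 1.
    END { for (i = 1; i <= n; i++) if (who[i] in friend) print url[i] }
  '
}
```

Running `timeline Elmo` lists the posts of Big Bird and Ernie (Bert has none), and `timeline "Cookie Monster"` lists the posts of Grover, Kermit, and Oscar.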
The example above is by no means a full-blown social network timeline, but it gives you an idea of how to represent, store and query graph data with S2Graph.