| //// |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| //// |
| [[graph]] |
| The Graph |
| ========= |
| |
| image::gremlin-standing.png[width=125] |
| |
| Features |
| -------- |
| |
| A `Feature` implementation describes the capabilities of a `Graph` instance. This interface is implemented by graph |
| system providers for two purposes: |
| |
| . It tells users the capabilities of their `Graph` instance. |
| . It allows the features they do comply with to be tested against the Gremlin Test Suite (tests for features that are not supported are "ignored"). |
| |
| The following example in the Gremlin Console shows how to print all the features of a `Graph`: |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerGraph.open() |
| graph.features() |
| ---- |
| |
| A common pattern for using features is to check their support prior to performing an operation: |
| |
| [gremlin-groovy] |
| ---- |
| graph.features().graph().supportsTransactions() |
| graph.features().graph().supportsTransactions() ? g.tx().commit() : "no tx" |
| ---- |
| |
| TIP: To ensure provider-agnostic code, always check feature support prior to usage of a particular function. In that |
| way, the application can behave gracefully in case a particular implementation is provided at runtime that does not |
| support a function being accessed. |
| |
| WARNING: Assignments of a `GraphStrategy` can alter the base features of a `Graph` in dynamic ways, such that checks |
| against a `Feature` may not always reflect the behavior exhibited when the `GraphStrategy` is in use. |
| |
| [[vertex-properties]] |
| Vertex Properties |
| ----------------- |
| |
| image:vertex-properties.png[width=215,float=left] TinkerPop3 introduces the concept of a `VertexProperty<V>`. All the |
| properties of a `Vertex` are a `VertexProperty`. A `VertexProperty` implements `Property` and as such, it has a |
| key/value pair. However, `VertexProperty` also implements `Element` and thus, can have a collection of key/value |
| pairs. Moreover, while an `Edge` can only have one property of key "name" (for example), a `Vertex` can have multiple |
| "name" properties. With the inclusion of vertex properties, two features are introduced which ultimately advance the |
| graph modeler's toolkit: |
| |
| . Multiple properties (*multi-properties*): a vertex property key can have multiple values. For example, a vertex can have |
| multiple "name" properties. |
| . Properties on properties (*meta-properties*): a vertex property can have properties (i.e. a vertex property can |
| have key/value data associated with it). |
| |
| Possible use cases for meta-properties: |
| |
| . *Permissions*: Vertex properties can have key/value ACL-type permission information associated with them. |
| . *Auditing*: When a vertex property is manipulated, it can have key/value information attached to it saying who the |
| creator, deletor, etc. are. |
| . *Provenance*: The "name" of a vertex can be declared by multiple users. For example, there may be multiple spellings |
| of a name from different sources. |
| |
| A running example using vertex properties is provided below to demonstrate and explain the API. |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerGraph.open() |
| g = graph.traversal(standard()) |
| v = g.addV('name','marko','name','marko a. rodriguez').next() |
| g.V(v).properties('name').count() <1> |
| v.property(list, 'name', 'm. a. rodriguez') <2> |
| g.V(v).properties('name').count() |
| g.V(v).properties() |
| g.V(v).properties('name') |
| g.V(v).properties('name').hasValue('marko') |
| g.V(v).properties('name').hasValue('marko').property('acl','private') <3> |
| g.V(v).properties('name').hasValue('marko a. rodriguez') |
| g.V(v).properties('name').hasValue('marko a. rodriguez').property('acl','public') |
| g.V(v).properties('name').has('acl','public').value() |
| g.V(v).properties('name').has('acl','public').drop() <4> |
| g.V(v).properties('name').has('acl','public').value() |
| g.V(v).properties('name').has('acl','private').value() |
| g.V(v).properties() |
| g.V(v).properties().properties() <5> |
| g.V(v).properties().property('date',2014) <6> |
| g.V(v).properties().property('creator','stephen') |
| g.V(v).properties().properties() |
| g.V(v).properties('name').valueMap() |
| g.V(v).property('name','okram') <7> |
| g.V(v).properties('name') |
| g.V(v).values('name') <8> |
| ---- |
| |
| <1> A vertex can have zero or more properties with the same key associated with it. |
| <2> If a property is added with a cardinality of `Cardinality.list`, an additional property with the provided key will be added. |
| <3> A vertex property can have standard key/value properties attached to it. |
| <4> Vertex property removal is identical to property removal. |
| <5> It is possible to get the properties of a vertex property. |
| <6> A vertex property can have any number of key/value properties attached to it. |
| <7> `property(...)` will remove all existing properties with the given key before adding the new single property (see `VertexProperty.Cardinality`). |
| <8> If only the value of a property is needed, then `values()` can be used. |
| |
| If the concept of vertex properties is difficult to grasp, then it may be best to think of vertex properties in terms |
| of "literal vertices." A vertex can have an edge to a "literal vertex" that has a single key/value pair (e.g. |
| "value=okram"). The edge that points to that literal vertex has an edge-label of "name." The properties on the edge |
| represent the literal vertex's properties. The "literal vertex" cannot have any other edges to it (only one from the |
| associated vertex). |
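
The multi-property and meta-property structure described above can also be pictured as a plain data model. The following is a minimal sketch only: `ToyVertex` and `ToyVertexProperty` are hypothetical stand-ins invented for this illustration and are not TinkerPop classes. It assumes that list cardinality appends a property under a key and that single cardinality replaces all properties under that key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT TinkerPop classes.
final class ToyVertexProperty {
    final String key;
    final Object value;
    // Meta-properties: key/value data attached to this property itself.
    final Map<String, Object> meta = new LinkedHashMap<>();

    ToyVertexProperty(String key, Object value) {
        this.key = key;
        this.value = value;
    }
}

final class ToyVertex {
    // Multi-properties: one key maps to a list of property objects.
    private final Map<String, List<ToyVertexProperty>> properties = new LinkedHashMap<>();

    // Roughly models list cardinality: append another property under the same key.
    ToyVertexProperty addProperty(String key, Object value) {
        ToyVertexProperty p = new ToyVertexProperty(key, value);
        properties.computeIfAbsent(key, k -> new ArrayList<>()).add(p);
        return p;
    }

    // Roughly models single cardinality: drop existing properties for the key first.
    ToyVertexProperty setProperty(String key, Object value) {
        properties.remove(key);
        return addProperty(key, value);
    }

    List<ToyVertexProperty> properties(String key) {
        return properties.getOrDefault(key, new ArrayList<>());
    }
}

final class VertexPropertyDemo {
    public static void main(String[] args) {
        ToyVertex v = new ToyVertex();
        v.addProperty("name", "marko").meta.put("acl", "private");
        v.addProperty("name", "marko a. rodriguez").meta.put("acl", "public");
        System.out.println(v.properties("name").size()); // two "name" properties
        v.setProperty("name", "okram");
        System.out.println(v.properties("name").size()); // a single "name" property
    }
}
```

The sketch leaves out everything a real graph provides (identifiers, traversal, removal of individual properties); it only shows why a vertex property needs to be an object in its own right rather than a bare value.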
| |
| [[the-crew-toy-graph]] |
| TIP: A toy graph demonstrating all of the new TinkerPop3 graph structure features is available at |
| `TinkerFactory.createTheCrew()` and `data/tinkerpop-crew*`. This graph demonstrates multi-properties and meta-properties. |
| |
| .TinkerPop Crew |
| image::the-crew-graph.png[width=685] |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V().as('a'). |
| properties('location').as('b'). |
| hasNot('endTime').as('c'). |
| select('a','b','c').by('name').by(value).by('startTime') // determine the current location of each person |
| g.V().has('name','gremlin').inE('uses'). |
| order().by('skill',incr).as('a'). |
| outV().as('b'). |
| select('a','b').by('skill').by('name') // rank the users of gremlin by their skill level |
| ---- |
| |
| Graph Variables |
| --------------- |
| |
| TinkerPop3 introduces the concept of `Graph.Variables`. Variables are key/value pairs associated with the graph |
| itself -- in essence, a `Map<String,Object>`. These variables are intended to store metadata about the graph. Example |
| use cases include: |
| |
| * *Schema information*: What do the namespace prefixes resolve to and when was the schema last modified? |
| * *Global permissions*: What are the access rights for particular groups? |
| * *System user information*: Who are the admins of the system? |
| |
| An example of graph variables in use is presented below: |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerGraph.open() |
| graph.variables() |
| graph.variables().set('systemAdmins',['stephen','peter','pavel']) |
| graph.variables().set('systemUsers',['matthias','marko','josh']) |
| graph.variables().keys() |
| graph.variables().get('systemUsers') |
| graph.variables().get('systemUsers').get() |
| graph.variables().remove('systemAdmins') |
| graph.variables().keys() |
| ---- |
| |
| IMPORTANT: Graph variables are not intended to be subject to heavy, concurrent mutation nor to be used in complex |
| computations. The intention is to have a location to store data about the graph for administrative purposes. |
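
The console session above suggests that `get` returns an optional wrapper that must be unwrapped with a second `get()`. A minimal sketch of that shape follows, assuming nothing more than a map behind the scenes; `ToyVariables` is a hypothetical class written for this illustration and is not the TinkerPop implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in that mimics the shape of the console calls; NOT the TinkerPop class.
final class ToyVariables {
    private final Map<String, Object> store = new LinkedHashMap<>();

    void set(String key, Object value) {
        store.put(key, value);
    }

    // A missing key yields an empty Optional rather than null.
    @SuppressWarnings("unchecked")
    <R> Optional<R> get(String key) {
        return Optional.ofNullable((R) store.get(key));
    }

    void remove(String key) {
        store.remove(key);
    }

    Set<String> keys() {
        return store.keySet();
    }
}

final class VariablesDemo {
    public static void main(String[] args) {
        ToyVariables vars = new ToyVariables();
        vars.set("systemUsers", Arrays.asList("matthias", "marko", "josh"));
        System.out.println(vars.get("systemUsers").isPresent());  // key exists
        System.out.println(vars.get("systemAdmins").isPresent()); // key was never set
    }
}
```

Returning an optional forces callers to handle the missing-key case explicitly, which suits metadata that may or may not have been written by an administrator.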
| |
| [[transactions]] |
| Graph Transactions |
| ------------------ |
| |
| image:gremlin-coins.png[width=100,float=right] A link:http://en.wikipedia.org/wiki/Database_transaction[database transaction] |
| represents a unit of work to execute against the database. Transactions are controlled by an implementation of the |
| `Transaction` interface and that object can be obtained from the `Graph` interface using the `tx()` method. It is |
| important to note that the `Transaction` object does not represent a "transaction" itself. It merely exposes the |
| methods for working with transactions (e.g. committing, rolling back, etc). |
| |
| Most `Graph` implementations that `supportsTransactions` will implement an "automatic" `ThreadLocal` transaction, |
| which means that when a read or write occurs after the `Graph` is instantiated, a transaction is automatically |
| started within that thread. There is no need to manually call a method to "create" or "start" a transaction. Simply |
| modify the graph as required and call `graph.tx().commit()` to apply changes or `graph.tx().rollback()` to undo them. |
| When the next read or write action occurs against the graph, a new transaction will be started within that current |
| thread of execution. |
| |
| When using transactions in this fashion, especially in a web application (e.g. a REST server), it is important to ensure |
| that transactions do not leak from one request to the next. In other words, unless a client is somehow bound via |
| session to process every request on the same server thread, every request must be committed or rolled back at the end |
| of the request. Encapsulating a transaction within the request ensures that a future request processed |
| on a server thread starts in a fresh transactional state and does not have access to the remains of one from an |
| earlier request. A good strategy is to roll back a transaction at the start of a request, so that even if |
| a transactional leak does occur between requests, each request still begins with a fresh transaction. |
| |
| TIP: The `tx()` method is on the `Graph` interface, but it is also available on the `TraversalSource` spawned from a |
| `Graph`. Calls to `TraversalSource.tx()` are proxied through to the underlying `Graph` as a convenience. |
| |
| Configuring |
| ~~~~~~~~~~~ |
| |
| Determining when a transaction starts is dependent upon the behavior assigned to the `Transaction`. It is up to the |
| `Graph` implementation to determine the default behavior and unless the implementation doesn't allow it, the behavior |
| itself can be altered via these `Transaction` methods: |
| |
| [source,java] |
| ---- |
| public Transaction onReadWrite(final Consumer<Transaction> consumer); |
| |
| public Transaction onClose(final Consumer<Transaction> consumer); |
| ---- |
| |
| Providing a `Consumer` function to `onReadWrite` allows definition of how a transaction starts when a read or a write |
| occurs. `Transaction.READ_WRITE_BEHAVIOR` contains pre-defined `Consumer` functions to supply to the `onReadWrite` |
| method. It has two options: |
| |
| * `AUTO` - automatic transactions where the transaction is started implicitly by the read or write operation |
| * `MANUAL` - manual transactions where it is up to the user to explicitly open a transaction, throwing an exception |
| if the transaction is not open |
| |
| Providing a `Consumer` function to `onClose` allows configuration of how a transaction is handled when |
| `Transaction.close()` is called. `Transaction.CLOSE_BEHAVIOR` has several pre-defined options that can be supplied to |
| this method: |
| |
| * `COMMIT` - automatically commit an open transaction |
| * `ROLLBACK` - automatically roll back an open transaction |
| * `MANUAL` - throw an exception if a transaction is open, forcing the user to explicitly close the transaction |
| |
| IMPORTANT: As transactions are `ThreadLocal` in nature, so are the transaction configurations for `onReadWrite` and |
| `onClose`. |
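
To make the role of the `Consumer` concrete, here is a minimal sketch of how a read/write hook of this kind could be wired up. `ToyTransaction` and its `AUTO` and `MANUAL` constants are hypothetical and written only for this illustration; they are not TinkerPop's `Transaction` or `READ_WRITE_BEHAVIOR`, and the sketch ignores the `ThreadLocal` scoping mentioned above.

```java
import java.util.function.Consumer;

// Hypothetical model of the configuration hook described above; NOT TinkerPop's Transaction.
final class ToyTransaction {
    // AUTO: open the transaction implicitly when a read or write arrives.
    static final Consumer<ToyTransaction> AUTO = tx -> {
        if (!tx.isOpen()) tx.open();
    };
    // MANUAL: refuse to proceed unless the caller opened the transaction first.
    static final Consumer<ToyTransaction> MANUAL = tx -> {
        if (!tx.isOpen()) throw new IllegalStateException("Open a transaction before reading or writing");
    };

    private boolean open = false;
    private Consumer<ToyTransaction> readWriteBehavior = AUTO;

    ToyTransaction onReadWrite(Consumer<ToyTransaction> consumer) {
        this.readWriteBehavior = consumer;
        return this;
    }

    boolean isOpen() { return open; }

    void open() { open = true; }

    void commit() { open = false; }

    // Every read or write first runs the configured behavior.
    void readWrite() { readWriteBehavior.accept(this); }
}

final class ToyTransactionDemo {
    public static void main(String[] args) {
        ToyTransaction tx = new ToyTransaction();
        tx.readWrite();                          // AUTO opens the transaction implicitly
        System.out.println(tx.isOpen());
        tx.commit();
        tx.onReadWrite(ToyTransaction.MANUAL);
        try {
            tx.readWrite();                      // MANUAL rejects the call: nothing is open
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the behavior is just a function of the transaction, a provider or user could in principle supply a custom `Consumer` with different rules, which is the flexibility the two methods are meant to expose.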
| |
| Once there is an understanding of how transactions are configured, most of the rest of the `Transaction` interface |
| is self-explanatory. Note that <<neo4j-gremlin,Neo4j-Gremlin>> is used for the examples to follow as TinkerGraph does |
| not support transactions. |
| |
| [source,groovy] |
| ---- |
| gremlin> graph = Neo4jGraph.open('/tmp/neo4j') |
| ==>neo4jgraph[EmbeddedGraphDatabase [/tmp/neo4j]] |
| gremlin> graph.features() |
| ==>FEATURES |
| > GraphFeatures |
| >-- Transactions: true <1> |
| >-- Computer: false |
| >-- Persistence: true |
| ... |
| gremlin> graph.tx().onReadWrite(Transaction.READ_WRITE_BEHAVIOR.AUTO) <2> |
| ==>org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph$Neo4jTransaction@1c067c0d |
| gremlin> graph.addVertex("name","stephen") <3> |
| ==>v[0] |
| gremlin> graph.tx().commit() <4> |
| ==>null |
| gremlin> graph.tx().onReadWrite(Transaction.READ_WRITE_BEHAVIOR.MANUAL) <5> |
| ==>org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph$Neo4jTransaction@1c067c0d |
| gremlin> graph.tx().isOpen() |
| ==>false |
| gremlin> graph.addVertex("name","marko") <6> |
| Open a transaction before attempting to read/write the transaction |
| gremlin> graph.tx().open() <7> |
| ==>null |
| gremlin> graph.addVertex("name","marko") <8> |
| ==>v[1] |
| gremlin> graph.tx().commit() |
| ==>null |
| ---- |
| |
| <1> Check `features` to ensure that the graph supports transactions. |
| <2> By default, `Neo4jGraph` is configured with "automatic" transactions, so it is set here for demonstration purposes only. |
| <3> When the vertex is added, the transaction is automatically started. From this point, more mutations can be staged |
| or other read operations executed in the context of that open transaction. |
| <4> Calling `commit` finalizes the transaction. |
| <5> Change transaction behavior to require manual control. |
| <6> Adding a vertex now results in failure because the transaction was not explicitly opened. |
| <7> Explicitly open a transaction. |
| <8> Adding a vertex now succeeds as the transaction was manually opened. |
| |
| NOTE: It may be important to consult the documentation of the `Graph` implementation you are using when it comes to the |
| specifics of how transactions will behave. TinkerPop allows some latitude in this area and implementations may not have |
| the exact same behaviors and link:https://en.wikipedia.org/wiki/ACID[ACID] guarantees. |
| |
| Retries |
| ~~~~~~~ |
| |
| There are times when transactions fail. Failure may be indicative of some permanent condition, but other failures |
| might simply require the transaction to be retried for possible future success. The `Transaction` object also exposes |
| a method for executing automatic transaction retries: |
| |
| [gremlin-groovy] |
| ---- |
| graph = Neo4jGraph.open('/tmp/neo4j') |
| graph.tx().submit {it.addVertex("name","josh")}.retry(10) |
| graph.tx().submit {it.addVertex("name","daniel")}.exponentialBackoff(10) |
| graph.close() |
| ---- |
| |
| As shown above, the `submit` method takes a `Function<Graph, R>` which is the unit of work to execute and possibly |
| retry on failure. The method returns a `Transaction.Workload` object which has a number of default methods for common |
| retry strategies. It is also possible to supply a custom retry function if a default one does not suit the required |
| purpose. |
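
For readers unfamiliar with the idea behind an exponential backoff retry, the following stand-alone sketch shows the general technique. It is not TinkerPop's `Transaction.Workload` and does not show how a custom retry function is registered with it; `RetrySketch`, `retryWithBackoff` and its parameters are hypothetical names chosen for this illustration.

```java
import java.util.function.Supplier;

// Generic retry helper; a stand-alone sketch, NOT TinkerPop's Transaction.Workload.
final class RetrySketch {
    static <R> R retryWithBackoff(Supplier<R> work, int maxTries, long initialDelayMs) {
        if (maxTries < 1) throw new IllegalArgumentException("maxTries must be at least 1");
        long delay = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            try {
                return work.get();               // success: return immediately
            } catch (RuntimeException e) {
                last = e;                        // remember the most recent failure
                if (attempt < maxTries) {
                    try {
                        Thread.sleep(delay);     // wait before the next attempt
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("Interrupted while waiting to retry", ie);
                    }
                    delay *= 2;                  // double the wait after each failure
                }
            }
        }
        throw last;                              // every attempt failed
    }
}

final class RetryDemo {
    public static void main(String[] args) {
        int[] calls = {0};
        String result = RetrySketch.retryWithBackoff(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("transient failure");
            return "committed";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a transactional setting, the unit of work would presumably also need to roll back the failed transaction before the next attempt; that step is omitted here because it depends on the `Graph` implementation.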
| |
| Threaded Transactions |
| ~~~~~~~~~~~~~~~~~~~~~ |
| |
| Most `Graph` implementations that support transactions do so in a `ThreadLocal` manner, where the current transaction |
| is bound to the current thread of execution. Consider the following example to demonstrate: |
| |
| [source,java] |
| ---- |
| graph.addVertex("name","stephen"); |
| |
| Thread t1 = new Thread(() -> { |
| graph.addVertex("name","josh"); |
| }); |
| |
| Thread t2 = new Thread(() -> { |
| graph.addVertex("name","marko"); |
| }); |
| |
| t1.start(); |
| t2.start(); |
| |
| t1.join(); |
| t2.join(); |
| |
| graph.tx().commit(); |
| ---- |
| |
| The above code shows three vertices added to `graph` in three different threads: the current thread, `t1` and |
| `t2`. One might expect that, by the time this body of code finished executing, there would be three vertices |
| persisted to the `Graph`. However, given the `ThreadLocal` nature of transactions, there really were three separate |
| transactions created in that body of code (i.e. one for each thread of execution) and the only one committed was the |
| first call to `addVertex` in the primary thread of execution. The other two calls to that method within `t1` and `t2` |
| were never committed and thus orphaned. |
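
The orphaned-write behavior can be reproduced without any graph database by modeling pending writes as a per-thread buffer. The sketch below is a simplified model under that assumption; `ToyThreadLocalStore` is a hypothetical class for illustration and says nothing about how a particular `Graph` implements its transactions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy store with thread-bound pending writes; a simplified model, NOT a TinkerPop Graph.
final class ToyThreadLocalStore {
    private final List<String> committed = Collections.synchronizedList(new ArrayList<>());
    // Each thread sees its own pending buffer, mimicking a ThreadLocal transaction.
    private final ThreadLocal<List<String>> pending = ThreadLocal.withInitial(ArrayList::new);

    void add(String name) {
        pending.get().add(name);
    }

    // Commits only the calling thread's pending writes.
    void commit() {
        committed.addAll(pending.get());
        pending.get().clear();
    }

    List<String> committed() {
        return new ArrayList<>(committed);
    }
}

final class ThreadLocalDemo {
    static List<String> run() {
        ToyThreadLocalStore store = new ToyThreadLocalStore();
        store.add("stephen");
        Thread t1 = new Thread(() -> store.add("josh"));
        Thread t2 = new Thread(() -> store.add("marko"));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        store.commit(); // only the calling thread's buffer is committed
        return store.committed();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [stephen]
    }
}
```

The writes made in `t1` and `t2` stay in buffers that no other thread can reach, which mirrors why the two vertices in the example above are never committed.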
| |
| A `Graph` that `supportsThreadedTransactions` is one that allows for a `Graph` to operate outside of that constraint, |
| thus allowing multiple threads to operate within the same transaction. Therefore, if there was a need to have three |
| different threads operating within the same transaction, the above code could be re-written as follows: |
| |
| [source,java] |
| ---- |
| Graph threaded = graph.tx().createThreadedTx(); |
| threaded.addVertex("name","stephen"); |
| |
| Thread t1 = new Thread(() -> { |
| threaded.addVertex("name","josh"); |
| }); |
| |
| Thread t2 = new Thread(() -> { |
| threaded.addVertex("name","marko"); |
| }); |
| |
| t1.start(); |
| t2.start(); |
| |
| t1.join(); |
| t2.join(); |
| |
| threaded.tx().commit(); |
| ---- |
| |
| In the above case, the call to `graph.tx().createThreadedTx()` creates a new `Graph` instance that is unbound from the |
| `ThreadLocal` transaction, thus allowing each thread to operate on it in the same context. In this case, there would |
| be three separate vertices persisted to the `Graph`. |
| |
| Gremlin I/O |
| ----------- |
| |
| image:gremlin-io.png[width=250,float=right] The task of getting data in and out of `Graph` instances is the job of |
| the Gremlin I/O packages. Gremlin I/O provides two interfaces for reading and writing `Graph` instances: `GraphReader` |
| and `GraphWriter`. These interfaces expose methods that support: |
| |
| * Reading and writing an entire `Graph` |
| * Reading and writing a `Traversal<Vertex>` as adjacency list format |
| * Reading and writing a single `Vertex` (with and without associated `Edge` objects) |
| * Reading and writing a single `Edge` |
| * Reading and writing a single `VertexProperty` |
| * Reading and writing a single `Property` |
| * Reading and writing an arbitrary `Object` |
| |
| In all cases, these methods operate in the currency of `InputStream` and `OutputStream` objects, allowing graphs and |
| their related elements to be written to and read from files, byte arrays, etc. The `Graph` interface offers the `io` |
| method, which provides access to "reader/writer builder" objects that are pre-configured with serializers provided by |
| the `Graph`, as well as helper methods for the various I/O capabilities. Unless there are very advanced requirements |
| for the serialization process, it is always best to utilize the methods on the `Io` interface to construct |
| `GraphReader` and `GraphWriter` instances, as the implementation may provide some custom settings that would otherwise |
| have to be configured manually by the user to do the serialization. |
| |
| It is up to the implementations of the `GraphReader` and `GraphWriter` interfaces to choose the methods they |
| implement and the manner in which they work together. The only characteristic enforced and expected is that the write |
| methods should produce output that is compatible with the corresponding read method. For example, the output of |
| `writeVertices` should be readable as input to `readVertices` and the output of `writeProperty` should be readable as |
| input to `readProperty`. |
| |
| GraphML Reader/Writer |
| ~~~~~~~~~~~~~~~~~~~~~ |
| |
| image:gremlin-graphml.png[width=350,float=left] The link:http://graphml.graphdrawing.org/[GraphML] file format is a |
| common XML-based representation of a graph. It is widely supported by graph-related tools and libraries making it a |
| solid interchange format for TinkerPop. In other words, if the intent is to work with graph data in conjunction with |
| applications outside of TinkerPop, GraphML may be the best choice to do that. Common use cases might be: |
| |
| * Generate a graph using link:https://networkx.github.io/[NetworkX], export it with GraphML and import it to TinkerPop. |
| * Produce a subgraph and export it to GraphML to be consumed by and visualized in link:https://gephi.org/[Gephi]. |
| * Migrate the data of an entire graph to a different graph database not supported by TinkerPop. |
| |
| As GraphML is a specification for the serialization of an entire graph and not the individual elements of a graph, |
| methods that support input and output of single vertices, edges, etc. are not supported. |
| |
| WARNING: GraphML is a "lossy" format in that it only supports primitive values for properties and does not have |
| support for `Graph` variables. It will use `toString` to serialize property values outside of those primitives. |
| |
| WARNING: GraphML as a specification allows for `<edge>` and `<node>` elements to appear in any order. Most software |
| that writes GraphML (including TinkerPop's `GraphMLWriter`) writes `<node>` elements before `<edge>` elements. However, it |
| is important to note that `GraphMLReader` will read this data in order and order can matter. This is because TinkerPop |
| does not allow the vertex label to be changed after the vertex has been created. Therefore, if an `<edge>` element |
| comes before the `<node>`, the label on the vertex will be ignored. It is thus better to order `<node>` elements in the |
| GraphML to appear before all `<edge>` elements if vertex labels are important to the graph. |
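
When GraphML comes from a third-party tool, it may be worth verifying the element order before importing. The following is a minimal sketch using only the JDK's DOM parser; `GraphMLOrderCheck` is a hypothetical helper, not part of TinkerPop. It assumes unprefixed `graph`, `node` and `edge` element names, loads the whole document into memory, and is not hardened against untrusted XML.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical pre-import check; NOT part of TinkerPop.
final class GraphMLOrderCheck {
    // Returns true when no <node> element appears after an <edge> element within any <graph>.
    static boolean nodesPrecedeEdges(String graphml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(graphml.getBytes(StandardCharsets.UTF_8)));
            NodeList graphs = doc.getElementsByTagName("graph");
            for (int g = 0; g < graphs.getLength(); g++) {
                boolean edgeSeen = false;
                NodeList children = graphs.item(g).getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    Node child = children.item(i);
                    if (child.getNodeType() != Node.ELEMENT_NODE) continue; // skip text and comments
                    String name = child.getNodeName();
                    if ("edge".equals(name)) {
                        edgeSeen = true;
                    } else if ("node".equals(name) && edgeSeen) {
                        return false; // a node follows an edge
                    }
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse GraphML", e);
        }
    }
}
```

A file that fails this check could be reordered by the producing tool or by a separate preprocessing step before it is handed to `GraphMLReader`.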
| |
| The following code shows how to write a `Graph` instance to a file called `tinkerpop-modern.xml` and then how to read |
| that file back into a different instance: |
| |
| [source,java] |
| ---- |
| final Graph graph = TinkerFactory.createModern(); |
| graph.io(IoCore.graphml()).writeGraph("tinkerpop-modern.xml"); |
| final Graph newGraph = TinkerGraph.open(); |
| newGraph.io(IoCore.graphml()).readGraph("tinkerpop-modern.xml"); |
| ---- |
| |
| If a custom configuration is required, then have the `Graph` generate a `GraphReader` or `GraphWriter` "builder" instance: |
| |
| [source,java] |
| ---- |
| final Graph graph = TinkerFactory.createModern(); |
| try (final OutputStream os = new FileOutputStream("tinkerpop-modern.xml")) { |
| graph.io(IoCore.graphml()).writer().normalize(true).create().writeGraph(os, graph); |
| } |
| |
| final Graph newGraph = TinkerGraph.open(); |
| try (final InputStream stream = new FileInputStream("tinkerpop-modern.xml")) { |
| newGraph.io(IoCore.graphml()).reader().vertexIdKey("name").create().readGraph(stream, newGraph); |
| } |
| ---- |
| |
| [[graphson-reader-writer]] |
| GraphSON Reader/Writer |
| ~~~~~~~~~~~~~~~~~~~~~~ |
| |
| image:gremlin-graphson.png[width=350,float=left] GraphSON is a link:http://json.org/[JSON]-based format extended |
| from earlier versions of TinkerPop. It is important to note that TinkerPop3's GraphSON is not backwards compatible |
| with prior TinkerPop GraphSON versions. GraphSON has some support from graph-related applications outside of TinkerPop, |
| but it is generally best used in two cases: |
| |
| * A text format of the graph or its elements is desired (e.g. debugging, usage in source control, etc.) |
| * The graph or its elements need to be consumed by code that is not JVM-based (e.g. JavaScript, Python, .NET, etc.) |
| |
| GraphSON supports all of the `GraphReader` and `GraphWriter` interface methods and can therefore read or write an |
| entire `Graph`, vertices, arbitrary objects, etc. The following code shows how to write a `Graph` instance to a file |
| called `tinkerpop-modern.json` and then how to read that file back into a different instance: |
| |
| [source,java] |
| ---- |
| final Graph graph = TinkerFactory.createModern(); |
| graph.io(IoCore.graphson()).writeGraph("tinkerpop-modern.json"); |
| |
| final Graph newGraph = TinkerGraph.open(); |
| newGraph.io(IoCore.graphson()).readGraph("tinkerpop-modern.json"); |
| ---- |
| |
| If a custom configuration is required, then have the `Graph` generate a `GraphReader` or `GraphWriter` "builder" instance: |
| |
| [source,java] |
| ---- |
| final Graph graph = TinkerFactory.createModern(); |
| try (final OutputStream os = new FileOutputStream("tinkerpop-modern.json")) { |
| final GraphSONMapper mapper = graph.io(IoCore.graphson()).mapper().normalize(true).create(); |
| graph.io(IoCore.graphson()).writer().mapper(mapper).create().writeGraph(os, graph); |
| } |
| |
| final Graph newGraph = TinkerGraph.open(); |
| try (final InputStream stream = new FileInputStream("tinkerpop-modern.json")) { |
| newGraph.io(IoCore.graphson()).reader().vertexIdKey("name").create().readGraph(stream, newGraph); |
| } |
| ---- |
| |
| One of the important configuration options of the `GraphSONReader` and `GraphSONWriter` is the ability to embed type |
| information into the output. By embedding the types, it becomes possible to serialize a graph without losing type |
| information that might be important when being consumed by another source. The importance of this concept is |
| demonstrated in the following example where a single `Vertex` is written to GraphSON using the Gremlin Console: |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerFactory.createModern() |
| g = graph.traversal() |
| f = new FileOutputStream("vertex-1.json") |
| graph.io(graphson()).writer().create().writeVertex(f, g.V(1).next(), BOTH) |
| f.close() |
| ---- |
| |
| The following GraphSON example shows the output of `GraphSONWriter.writeVertex()` with associated edges: |
| |
| [source,json] |
| ---- |
| { |
| "id": 1, |
| "label": "person", |
| "outE": { |
| "created": [ |
| { |
| "id": 9, |
| "inV": 3, |
| "properties": { |
| "weight": 0.4 |
| } |
| } |
| ], |
| "knows": [ |
| { |
| "id": 7, |
| "inV": 2, |
| "properties": { |
| "weight": 0.5 |
| } |
| }, |
| { |
| "id": 8, |
| "inV": 4, |
| "properties": { |
| "weight": 1 |
| } |
| } |
| ] |
| }, |
| "properties": { |
| "name": [ |
| { |
| "id": 0, |
| "value": "marko" |
| } |
| ], |
| "age": [ |
| { |
| "id": 1, |
| "value": 29 |
| } |
| ] |
| } |
| } |
| ---- |
| |
| The vertex properly serializes to valid JSON but note that a consuming application will not automatically know how to |
| interpret the numeric values. In coercing those Java values to JSON, such information is lost. |
| |
| With a minor change to the construction of the `GraphSONWriter` the lossy nature of GraphSON can be avoided: |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerFactory.createModern() |
| g = graph.traversal() |
| f = new FileOutputStream("vertex-1.json") |
| mapper = graph.io(graphson()).mapper().embedTypes(true).create() |
| graph.io(graphson()).writer().mapper(mapper).create().writeVertex(f, g.V(1).next(), BOTH) |
| f.close() |
| ---- |
| |
| In the above code, the `embedTypes` option is set to `true`, and the output below shows the difference: |
| |
| [source,json] |
| ---- |
| { |
| "@class": "java.util.HashMap", |
| "id": 1, |
| "label": "person", |
| "outE": { |
| "@class": "java.util.HashMap", |
| "created": [ |
| "java.util.ArrayList", |
| [ |
| { |
| "@class": "java.util.HashMap", |
| "id": 9, |
| "inV": 3, |
| "properties": { |
| "@class": "java.util.HashMap", |
| "weight": 0.4 |
| } |
| } |
| ] |
| ], |
| "knows": [ |
| "java.util.ArrayList", |
| [ |
| { |
| "@class": "java.util.HashMap", |
| "id": 7, |
| "inV": 2, |
| "properties": { |
| "@class": "java.util.HashMap", |
| "weight": 0.5 |
| } |
| }, |
| { |
| "@class": "java.util.HashMap", |
| "id": 8, |
| "inV": 4, |
| "properties": { |
| "@class": "java.util.HashMap", |
| "weight": 1 |
| } |
| } |
| ] |
| ] |
| }, |
| "properties": { |
| "@class": "java.util.HashMap", |
| "name": [ |
| "java.util.ArrayList", |
| [ |
| { |
| "@class": "java.util.HashMap", |
| "id": [ |
| "java.lang.Long", |
| 0 |
| ], |
| "value": "marko" |
| } |
| ] |
| ], |
| "age": [ |
| "java.util.ArrayList", |
| [ |
| { |
| "@class": "java.util.HashMap", |
| "id": [ |
| "java.lang.Long", |
| 1 |
| ], |
| "value": 29 |
| } |
| ] |
| ] |
| } |
| } |
| ---- |
| |
| The ambiguity of components of the GraphSON is now removed by the `@class` property, which contains Java class |
| information for the data it is associated with. The `@class` property is used for all non-final types, with the |
| exception of a small number of "natural" types (String, Boolean, Integer, and Double) which can be correctly inferred |
| from JSON typing. While the output is more verbose, it comes with the security of not losing type information. While |
| non-JVM languages won't be able to consume this information automatically, at least there is a hint as to how the |
| values should be coerced back into the correct types in the target language. |
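
The difference between the two outputs can be reduced to a small stand-alone illustration. The sketch below is not GraphSON and does not use its serializers; `JsonTypeLossDemo` is a hypothetical class that simply renders numbers as text, once without and once with a class-name tag. Unlike the behavior described above, it tags every number, including `Integer`, purely to keep the example short.

```java
// Stand-alone illustration of type loss in plain JSON numbers; NOT GraphSON's serializer.
final class JsonTypeLossDemo {
    // Renders a number the way a plain JSON writer would: digits only, no Java type.
    static String plain(Number n) {
        return n.toString();
    }

    // Renders a number together with its Java class name, similar in spirit to embedded types.
    static String tagged(Number n) {
        return "[\"" + n.getClass().getName() + "\", " + n + "]";
    }

    public static void main(String[] args) {
        System.out.println(plain(1L));   // prints 1
        System.out.println(plain(1));    // prints 1, indistinguishable from the Long above
        System.out.println(tagged(1L));  // prints ["java.lang.Long", 1]
        System.out.println(tagged(1));   // prints ["java.lang.Integer", 1]
    }
}
```

A reader of the plain form has no way to tell whether `1` began life as a `Long` or an `Integer`, whereas the tagged form carries enough information to restore the original type.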
| |
| [[gryo-reader-writer]] |
| Gryo Reader/Writer |
| ~~~~~~~~~~~~~~~~~~ |
| |
| image:gremlin-kryo.png[width=400,float=left] link:https://github.com/EsotericSoftware/kryo[Kryo] is a popular |
| serialization package for the JVM. Gremlin-Kryo is a binary `Graph` serialization format for use on the JVM by JVM |
| languages. It is designed to be space efficient, non-lossy and is promoted as the standard format to use when working |
| with graph data inside of the TinkerPop stack. A list of common use cases is presented below: |
| |
| * Migration from one Gremlin Structure implementation to another (e.g. `TinkerGraph` to `Neo4jGraph`) |
| * Serialization of individual graph elements to be sent over the network to another JVM. |
| * Backups of in-memory graphs or subgraphs. |
| |
| WARNING: When migrating between Gremlin Structure implementations, Kryo may not lose data, but it is important to |
| consider the features of each `Graph` and whether or not the data types supported in one will be supported in the |
| other. Failure to do so may result in errors. |
| |
| Kryo supports all of the `GraphReader` and `GraphWriter` interface methods and can therefore read or write an entire |
| `Graph`, vertices, edges, etc. The following code shows how to write a `Graph` instance to a file called |
| `tinkerpop-modern.kryo` and then how to read that file back into a different instance: |
| |
| [source,java] |
| ---- |
| final Graph graph = TinkerFactory.createModern(); |
| graph.io(IoCore.gryo()).writeGraph("tinkerpop-modern.kryo"); |
| |
| final Graph newGraph = TinkerGraph.open(); |
| newGraph.io(IoCore.gryo()).readGraph("tinkerpop-modern.kryo"); |
| ---- |
| |
| If a custom configuration is required, then have the `Graph` generate a `GraphReader` or `GraphWriter` "builder" instance: |
| |
| [source,java] |
| ---- |
| final Graph graph = TinkerFactory.createModern(); |
| try (final OutputStream os = new FileOutputStream("tinkerpop-modern.kryo")) { |
| graph.io(IoCore.gryo()).writer().create().writeGraph(os, graph); |
| } |
| |
| final Graph newGraph = TinkerGraph.open(); |
| try (final InputStream stream = new FileInputStream("tinkerpop-modern.kryo")) { |
| newGraph.io(IoCore.gryo()).reader().vertexIdKey("name").create().readGraph(stream, newGraph); |
| } |
| ---- |
| |
| NOTE: The preferred extension for file names produced by Gryo is `.kryo`. |
| |
| TinkerPop2 Data Migration |
| ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| image:data-migration.png[width=300,float=right] For those using TinkerPop2, migrating to TinkerPop3 will mean a number |
| of programming changes, but may also require a migration of the data depending on the graph implementation. For |
| example, trying to open `TinkerGraph` data from TinkerPop2 with TinkerPop3 code will not work, however opening a |
| TinkerPop2 `Neo4jGraph` with a TinkerPop3 `Neo4jGraph` should work provided there aren't Neo4j version compatibility |
| mismatches preventing the read. |
| |
| If such a situation arises that a particular TinkerPop2 `Graph` cannot be read by TinkerPop3, a "legacy" data |
| migration approach exists. The migration involves writing the TinkerPop2 `Graph` to GraphSON, then reading it to |
| TinkerPop3 with the `LegacyGraphSONReader` (a limited implementation of the `GraphReader` interface). |
| |
| The following represents an example migration of the "classic" toy graph. In this example, the "classic" graph is |
| saved to GraphSON using TinkerPop2. |
| |
| [source,groovy] |
| ---- |
| gremlin> Gremlin.version() |
| ==>2.5.z |
| gremlin> graph = TinkerGraphFactory.createTinkerGraph() |
| ==>tinkergraph[vertices:6 edges:6] |
| gremlin> GraphSONWriter.outputGraph(graph,'/tmp/tp2.json',GraphSONMode.EXTENDED) |
| ==>null |
| ---- |
| |
| The above console session uses the `gremlin-groovy` distribution from TinkerPop2. It is important to generate the |
| `tp2.json` file using the `EXTENDED` mode as it will include data types when necessary which will help limit |
| "lossiness" on the TinkerPop3 side when imported. Once `tp2.json` is created, it can then be imported to a TinkerPop3 |
| `Graph`. |
| |
| [source,groovy] |
| ---- |
| gremlin> Gremlin.version() |
| ==>x.y.z |
| gremlin> graph = TinkerGraph.open() |
| ==>tinkergraph[vertices:0 edges:0] |
| gremlin> r = LegacyGraphSONReader.build().create() |
| ==>org.apache.tinkerpop.gremlin.structure.io.graphson.LegacyGraphSONReader@64337702 |
| gremlin> r.readGraph(new FileInputStream('/tmp/tp2.json'), graph) |
| ==>null |
| gremlin> g = graph.traversal(standard()) |
| ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] |
| gremlin> g.E() |
| ==>e[11][4-created->3] |
| ==>e[12][6-created->3] |
| ==>e[7][1-knows->2] |
| ==>e[8][1-knows->4] |
| ==>e[9][1-created->3] |
| ==>e[10][4-created->5] |
| ---- |
| |
| Namespace Conventions |
| --------------------- |
| |
| End users, <<implementations,graph system providers>>, <<graphcomputer,`GraphComputer`>> algorithm designers, |
| <<gremlin-plugins,GremlinPlugin>> creators, etc. all leverage properties on elements to store information. There are |
| a few conventions that should be respected when naming property keys to ensure that the keys chosen by these |
| stakeholders do not conflict. |
| |
| * End users are granted the _flat namespace_ (e.g. `name`, `age`, `location`) to key their properties and label their elements. |
| * Graph system providers are granted the _hidden namespace_ (e.g. `~metadata`) to key their properties and labels. |
| Data keyed as such is only accessible via the graph system implementation and no other stakeholders are granted read |
| or write access to data prefixed with "~" (see `Graph.Hidden`). Test coverage and exceptions exist to ensure that |
| graph systems respect this hard boundary. |
| * <<vertexprogram,`VertexProgram`>> and <<mapreduce,`MapReduce`>> developers should, like `GraphStrategy` developers, |
| leverage _qualified namespaces_ particular to their domain (e.g. `mydomain.myvertexprogram.computedata`). |
| * `GremlinPlugin` creators should prefix their plugin name with their domain (e.g. `mydomain.myplugin`). |
| |
| IMPORTANT: TinkerPop uses `tinkerpop.` and `gremlin.` as the prefixes for provided strategies, vertex programs, map |
| reduce implementations, and plugins. |
| |
| The only truly protected namespace is the _hidden namespace_ provided to graph systems. From there, it's up to |
| engineers to respect the namespacing conventions presented. |
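
Since most of these conventions are not enforced, a project may want its own lightweight check on property keys. The following is a minimal sketch of such a check, based only on the rules listed above; `KeyNamespaces` and its methods are hypothetical and are not part of TinkerPop (the hidden namespace itself is handled by `Graph.Hidden`). Treating any key that contains a dot as "qualified" is a simplifying assumption.

```java
// Hypothetical helper that classifies property keys by the conventions above; NOT part of TinkerPop.
final class KeyNamespaces {
    enum Kind { HIDDEN, QUALIFIED, FLAT }

    static Kind classify(String key) {
        if (key.startsWith("~")) return Kind.HIDDEN;   // reserved for graph system providers
        if (key.contains(".")) return Kind.QUALIFIED;  // e.g. mydomain.myvertexprogram.computedata
        return Kind.FLAT;                              // e.g. name, age, location
    }

    // True when a key uses one of the prefixes the text reserves for TinkerPop itself.
    static boolean usesReservedPrefix(String key) {
        return key.startsWith("tinkerpop.") || key.startsWith("gremlin.");
    }

    public static void main(String[] args) {
        System.out.println(classify("~metadata"));                           // prints HIDDEN
        System.out.println(classify("mydomain.myvertexprogram.computedata")); // prints QUALIFIED
        System.out.println(classify("name"));                                // prints FLAT
        System.out.println(usesReservedPrefix("gremlin.custom"));            // prints true
    }
}
```

A check like this could run in tests or code review tooling to flag user code that writes keys in the hidden namespace or under the reserved prefixes.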