blob: e80d87e38e05e901bade783fcef57d3b5955cbb8 [file] [log] [blame]
////
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
////
[[graph]]
= The Graph
image::gremlin-standing.png[width=125]
== Features
A `Feature` implementation describes the capabilities of a `Graph` instance. This interface is implemented by graph
system providers for two purposes:
. It tells users the capabilities of their `Graph` instance.
. It allows the features they do comply with to be tested against the Gremlin Test Suite - tests that do not comply are "ignored").
The following example in the Gremlin Console shows how to print all the features of a `Graph`:
[gremlin-groovy]
----
graph = TinkerGraph.open()
graph.features()
----
A common pattern for using features is to check their support prior to performing an operation:
[gremlin-groovy]
----
graph.features().graph().supportsTransactions()
graph.features().graph().supportsTransactions() ? g.tx().commit() : "no tx"
----
TIP: To ensure provider agnostic code, always check feature support prior to usage of a particular function. In that
way, the application can behave gracefully in case a particular implementation is provided at runtime that does not
support a function being accessed.
[[vertex-properties]]
== Vertex Properties
image:vertex-properties.png[width=215,float=left] TinkerPop3 introduces the concept of a `VertexProperty<V>`. All the
properties of a `Vertex` are a `VertexProperty`. A `VertexProperty` implements `Property` and as such, it has a
key/value pair. However, `VertexProperty` also implements `Element` and thus, can have a collection of key/value
pairs. Moreover, while an `Edge` can only have one property of key "name" (for example), a `Vertex` can have multiple
"name" properties. With the inclusion of vertex properties, two features are introduced which ultimately advance the
graph modelers toolkit:
. Multiple properties (*multi-properties*): a vertex property key can have multiple values. For example, a vertex can have
multiple "name" properties.
. Properties on properties (*meta-properties*): a vertex property can have properties (i.e. a vertex property can
have key/value data associated with it).
Possible use cases for meta-properties:
. *Permissions*: Vertex properties can have key/value ACL-type permission information associated with them.
. *Auditing*: When a vertex property is manipulated, it can have key/value information attached to it saying who the
creator, deletor, etc. are.
. *Provenance*: The "name" of a vertex can be declared by multiple users. For example, there may be multiple spellings
of a name from different sources.
A running example using vertex properties is provided below to demonstrate and explain the API.
[gremlin-groovy]
----
graph = TinkerGraph.open()
g = graph.traversal()
v = g.addV().property('name','marko').property('name','marko a. rodriguez').next()
g.V(v).properties('name').count() <1>
v.property(list, 'name', 'm. a. rodriguez') <2>
g.V(v).properties('name').count()
g.V(v).properties()
g.V(v).properties('name')
g.V(v).properties('name').hasValue('marko')
g.V(v).properties('name').hasValue('marko').property('acl','private') <3>
g.V(v).properties('name').hasValue('marko a. rodriguez')
g.V(v).properties('name').hasValue('marko a. rodriguez').property('acl','public')
g.V(v).properties('name').has('acl','public').value()
g.V(v).properties('name').has('acl','public').drop() <4>
g.V(v).properties('name').has('acl','public').value()
g.V(v).properties('name').has('acl','private').value()
g.V(v).properties()
g.V(v).properties().properties() <5>
g.V(v).properties().property('date',2014) <6>
g.V(v).properties().property('creator','stephen')
g.V(v).properties().properties()
g.V(v).properties('name').valueMap()
g.V(v).property('name','okram') <7>
g.V(v).properties('name')
g.V(v).values('name') <8>
----
<1> A vertex can have zero or more properties with the same key associated with it.
<2> If a property is added with a cardinality of `Cardinality.list`, an additional property with the provided key will be added.
<3> A vertex property can have standard key/value properties attached to it.
<4> Vertex property removal is identical to property removal.
<5> Gets the meta-properties of each vertex property.
<6> A vertex property can have any number of key/value properties attached to it.
<7> `property(...)` will remove all existing key'd properties before adding the new single property (see `VertexProperty.Cardinality`).
<8> If only the value of a property is needed, then `values()` can be used.
If the concept of vertex properties is difficult to grasp, then it may be best to think of vertex properties in terms
of "literal vertices." A vertex can have an edge to a "literal vertex" that has a single value key/value -- e.g.
"value=okram." The edge that points to that literal vertex has an edge-label of "name." The properties on the edge
represent the literal vertex's properties. The "literal vertex" can not have any other edges to it (only one from the
associated vertex).
[[the-crew-toy-graph]]
TIP: A toy graph demonstrating all of the new TinkerPop3 graph structure features is available at
`TinkerFactory.createTheCrew()` and `data/tinkerpop-crew*`. This graph demonstrates multi-properties and meta-properties.
.TinkerPop Crew
image::the-crew-graph.png[width=685]
[gremlin-groovy,theCrew]
----
g.V().as('a').
properties('location').as('b').
hasNot('endTime').as('c').
select('a','b','c').by('name').by(value).by('startTime') // determine the current location of each person
g.V().has('name','gremlin').inE('uses').
order().by('skill',asc).as('a').
outV().as('b').
select('a','b').by('skill').by('name') // rank the users of gremlin by their skill level
----
== Graph Variables
TinkerPop3 introduces the concept of `Graph.Variables`. Variables are key/value pairs associated with the graph
itself -- in essence, a `Map<String,Object>`. These variables are intended to store metadata about the graph. Example
use cases include:
* *Schema information*: What do the namespace prefixes resolve to and when was the schema last modified?
* *Global permissions*: What are the access rights for particular groups?
* *System user information*: Who are the admins of the system?
An example of graph variables in use is presented below:
[gremlin-groovy]
----
graph = TinkerGraph.open()
graph.variables()
graph.variables().set('systemAdmins',['stephen','peter','pavel'])
graph.variables().set('systemUsers',['matthias','marko','josh'])
graph.variables().keys()
graph.variables().get('systemUsers')
graph.variables().get('systemUsers').get()
graph.variables().remove('systemAdmins')
graph.variables().keys()
----
IMPORTANT: Graph variables are not intended to be subject to heavy, concurrent mutation nor to be used in complex
computations. The intention is to have a location to store data about the graph for administrative purposes.
[[transactions]]
== Graph Transactions
image:gremlin-coins.png[width=100,float=right] A link:http://en.wikipedia.org/wiki/Database_transaction[database transaction]
represents a unit of work to execute against the database. Transactions are controlled by an implementation of the
`Transaction` interface and that object can be obtained from the `Graph` interface using the `tx()` method. It is
important to note that the `Transaction` object does not represent a "transaction" itself. It merely exposes the
methods for working with transactions (e.g. committing, rolling back, etc).
Most `Graph` implementations that `supportsTransactions` will implement an "automatic" `ThreadLocal` transaction,
which means that when a read or write occurs after the `Graph` is instantiated, a transaction is automatically
started within that thread. There is no need to manually call a method to "create" or "start" a transaction. Simply
modify the graph as required and call `graph.tx().commit()` to apply changes or `graph.tx().rollback()` to undo them.
When the next read or write action occurs against the graph, a new transaction will be started within that current
thread of execution.
When using transactions in this fashion, especially in web application (e.g. HTTP server), it is important to ensure
that transactions do not leak from one request to the next. In other words, unless a client is somehow bound via
session to process every request on the same server thread, every request must be committed or rolled back at the end
of the request. By ensuring that the request encapsulates a transaction, it ensures that a future request processed
on a server thread is starting in a fresh transactional state and will not have access to the remains of one from an
earlier request. A good strategy is to rollback a transaction at the start of a request, so that if it so happens that
a transactional leak does occur between requests somehow, a fresh transaction is assured by the fresh request.
TIP: The `tx()` method is on the `Graph` interface, but it is also available on the `TraversalSource` spawned from a
`Graph`. Calls to `TraversalSource.tx()` are proxied through to the underlying `Graph` as a convenience.
WARNING: TinkerPop provides for basic transaction control, however, like many aspects of TinkerPop, it is up to the
graph system provider to choose the specific aspects of how their implementation will work and how it fits into the
TinkerPop stack. Be sure to understand the transaction semantics of the specific graph implementation that is being
utilized as it may present differing functionality than described here.
=== Configuring
Determining when a transaction starts is dependent upon the behavior assigned to the `Transaction`. It is up to the
`Graph` implementation to determine the default behavior and unless the implementation doesn't allow it, the behavior
itself can be altered via these `Transaction` methods:
[source,java]
----
public Transaction onReadWrite(Consumer<Transaction> consumer);
public Transaction onClose(Consumer<Transaction> consumer);
----
Providing a `Consumer` function to `onReadWrite` allows definition of how a transaction starts when a read or a write
occurs. `Transaction.READ_WRITE_BEHAVIOR` contains pre-defined `Consumer` functions to supply to the `onReadWrite`
method. It has two options:
* `AUTO` - automatic transactions where the transaction is started implicitly to the read or write operation
* `MANUAL` - manual transactions where it is up to the user to explicitly open a transaction, throwing an exception
if the transaction is not open
Providing a `Consumer` function to `onClose` allows configuration of how a transaction is handled when
`Transaction.close()` is called. `Transaction.CLOSE_BEHAVIOR` has several pre-defined options that can be supplied to
this method:
* `COMMIT` - automatically commit an open transaction
* `ROLLBACK` - automatically rollback an open transaction
* `MANUAL` - throw an exception if a transaction is open, forcing the user to explicitly close the transaction
IMPORTANT: As transactions are `ThreadLocal` in nature, so are the transaction configurations for `onReadWrite` and
`onClose`.
Once there is an understanding for how transactions are configured, most of the rest of the `Transaction` interface
is self-explanatory. Note that <<neo4j-gremlin,Neo4j-Gremlin>> is used for the examples to follow as TinkerGraph does
not support transactions.
[source,groovy]
----
gremlin> graph = Neo4jGraph.open('/tmp/neo4j')
==>neo4jgraph[EmbeddedGraphDatabase [/tmp/neo4j]]
gremlin> g = graph.traversal()
==>graphtraversalsource[neo4jgraph[community single [/tmp/neo4j]], standard]
gremlin> graph.features()
==>FEATURES
> GraphFeatures
>-- Transactions: true <1>
>-- Computer: false
>-- Persistence: true
...
gremlin> g.tx().onReadWrite(Transaction.READ_WRITE_BEHAVIOR.AUTO) <2>
==>org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph$Neo4jTransaction@1c067c0d
gremlin> g.addV("person").("name","stephen") <3>
==>v[0]
gremlin> g.tx().commit() <4>
==>null
gremlin> g.tx().onReadWrite(Transaction.READ_WRITE_BEHAVIOR.MANUAL) <5>
==>org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph$Neo4jTransaction@1c067c0d
gremlin> g.tx().isOpen()
==>false
gremlin> g.addV("person").("name","marko") <6>
Open a transaction before attempting to read/write the transaction
gremlin> g.tx().open() <7>
==>null
gremlin> g.addV("person").("name","marko") <8>
==>v[1]
gremlin> g.tx().commit()
==>null
----
<1> Check `features` to ensure that the graph supports transactions.
<2> By default, `Neo4jGraph` is configured with "automatic" transactions, so it is set here for demonstration purposes only.
<3> When the vertex is added, the transaction is automatically started. From this point, more mutations can be staged
or other read operations executed in the context of that open transaction.
<4> Calling `commit` finalizes the transaction.
<5> Change transaction behavior to require manual control.
<6> Adding a vertex now results in failure because the transaction was not explicitly opened.
<7> Explicitly open a transaction.
<8> Adding a vertex now succeeds as the transaction was manually opened.
NOTE: It may be important to consult the documentation of the `Graph` implementation you are using when it comes to the
specifics of how transactions will behave. TinkerPop allows some latitude in this area and implementations may not have
the exact same behaviors and link:https://en.wikipedia.org/wiki/ACID[ACID] guarantees.
=== Threaded Transactions
Most `Graph` implementations that support transactions do so in a `ThreadLocal` manner, where the current transaction
is bound to the current thread of execution. Consider the following example to demonstrate:
[source,java]
----
GraphTraversalSource g = graph.traversal();
g.addV("person").("name","stephen").iterate();
Thread t1 = new Thread(() -> {
g.addV("person").("name","josh").iterate();
});
Thread t2 = new Thread(() -> {
g.addV("person").("name","marko").iterate();
});
t1.start()
t2.start()
t1.join()
t2.join()
g.tx().commit();
----
The above code shows three vertices added to `graph` in three different threads: the current thread, `t1` and
`t2`. One might expect that by the time this body of code finished executing, that there would be three vertices
persisted to the `Graph`. However, given the `ThreadLocal` nature of transactions, there really were three separate
transactions created in that body of code (i.e. one for each thread of execution) and the only one committed was the
first call to `addV()` in the primary thread of execution. The other two calls to that method within `t1` and `t2`
were never committed and thus orphaned.
A `Graph` that `supportsThreadedTransactions` is one that allows for a `Graph` to operate outside of that constraint,
thus allowing multiple threads to operate within the same transaction. Therefore, if there was a need to have three
different threads operating within the same transaction, the above code could be re-written as follows:
[source,java]
----
Graph threaded = graph.tx().createThreadedTx();
GraphTraversalSource g = graph.traversal();
g.addV("person").("name","stephen").iterate();
Thread t1 = new Thread(() -> {
threaded.addV("person").("name","josh").iterate();
});
Thread t2 = new Thread(() -> {
threaded.addV("person").("name","marko").iterate();
});
t1.start()
t2.start()
t1.join()
t2.join()
g.tx().commit();
----
In the above case, the call to `graph.tx().createThreadedTx()` creates a new `Graph` instance that is unbound from the
`ThreadLocal` transaction, thus allowing each thread to operate on it in the same context. In this case, there would
be three separate vertices persisted to the `Graph`.
== Gremlin I/O
image:gremlin-io.png[width=250,float=right] The task of getting data in and out of `Graph` instances is the job of
the Gremlin I/O packages. Gremlin I/O provides two interfaces for reading and writing `Graph` instances: `GraphReader`
and `GraphWriter`. These interfaces expose methods that support:
* Reading and writing an entire `Graph`
* Reading and writing a `Traversal<Vertex>` as adjacency list format
* Reading and writing a single `Vertex` (with and without associated `Edge` objects)
* Reading and writing a single `Edge`
* Reading and writing a single `VertexProperty`
* Reading and writing a single `Property`
* Reading and writing an arbitrary `Object`
In all cases, these methods operate in the currency of `InputStream` and `OutputStream` objects, allowing graphs and
their related elements to be written to and read from files, byte arrays, etc. The `Graph` interface offers the `io`
method, which provides access to "reader/writer builder" objects that are pre-configured with serializers provided by
the `Graph`, as well as helper methods for the various I/O capabilities. Unless there are very advanced requirements
for the serialization process, it is always best to utilize the methods on the `Io` interface to construct
`GraphReader` and `GraphWriter` instances, as the implementation may provide some custom settings that would otherwise
have to be configured manually by the user to do the serialization.
It is up to the implementations of the `GraphReader` and `GraphWriter` interfaces to choose the methods they
implement and the manner in which they work together. The only characteristic enforced and expected is that the write
methods should produce output that is compatible with the corresponding read method. For example, the output of
`writeVertices` should be readable as input to `readVertices` and the output of `writeProperty` should be readable as
input to `readProperty`.
NOTE: Additional documentation for TinkerPop IO formats can be found in the link:https://tinkerpop.apache.org/docs/x.y.z/dev/io/[IO Reference].
=== GraphML Reader/Writer
image:gremlin-graphml.png[width=350,float=left] The link:http://graphml.graphdrawing.org/[GraphML] file format is a
common XML-based representation of a graph. It is widely supported by graph-related tools and libraries making it a
solid interchange format for TinkerPop. In other words, if the intent is to work with graph data in conjunction with
applications outside of TinkerPop, GraphML may be the best choice to do that. Common use cases might be:
* Generate a graph using link:https://networkx.github.io/[NetworkX], export it with GraphML and import it to TinkerPop.
* Produce a subgraph and export it to GraphML to be consumed by and visualized in link:https://gephi.org/[Gephi].
* Migrate the data of an entire graph to a different graph database not supported by TinkerPop.
As GraphML is a specification for the serialization of an entire graph and not the individual elements of a graph,
methods that support input and output of single vertices, edges, etc. are not supported.
WARNING: GraphML is a "lossy" format in that it only supports primitive values for properties and does not have
support for `Graph` variables. It will use `toString` to serialize property values outside of those primitives.
WARNING: GraphML as a specification allows for `<edge>` and `<node>` elements to appear in any order. Most software
that writes GraphML (including as TinkerPop's `GraphMLWriter`) write `<node>` elements before `<edge>` elements. However it
is important to note that `GraphMLReader` will read this data in order and order can matter. This is because TinkerPop
does not allow the vertex label to be changed after the vertex has been created. Therefore, if an `<edge>` element
comes before the `<node>`, the label on the vertex will be ignored. It is thus better to order `<node>` elements in the
GraphML to appear before all `<edge>` elements if vertex labels are important to the graph.
The following code shows how to write a `Graph` instance to file called `tinkerpop-modern.xml` and then how to read
that file back into a different instance:
[source,java]
----
Graph graph = TinkerFactory.createModern();
graph.io(IoCore.graphml()).writeGraph("tinkerpop-modern.xml");
Graph newGraph = TinkerGraph.open();
newGraph.io(IoCore.graphml()).readGraph("tinkerpop-modern.xml");
----
If a custom configuration is required, then have the `Graph` generate a `GraphReader` or `GraphWriter` "builder" instance:
[source,java]
----
Graph graph = TinkerFactory.createModern();
try (OutputStream os = new FileOutputStream("tinkerpop-modern.xml")) {
graph.io(IoCore.graphml()).writer().normalize(true).create().writeGraph(os, graph);
}
Graph newGraph = TinkerGraph.open();
try (InputStream stream = new FileInputStream("tinkerpop-modern.xml")) {
newGraph.io(IoCore.graphml()).reader().create().readGraph(stream, newGraph);
}
----
GraphML was a supported format in TinkerPop 2.x, but there were several issues that made it inconsistent with the
specification that were corrected for 3.x. As a result, attempting to read a GraphML file generated by 2.x with the
3.x `GraphMLReader` will result in error. To help with this problem, an XSLT file is provided as a resource in
`gremlin-core` which will transform 2.x GraphML to 3.x GraphML. It can be used as follows:
[source,java]
----
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.transform.stream.StreamResult;
InputStream stylesheet = Thread.currentThread().getContextClassLoader().getResourceAsStream("tp2-to-tp3-graphml.xslt");
File datafile = new File('/tmp/tp2-graphml.xml');
File outfile = new File('/tmp/tp3-graphml.xml');
TransformerFactory tFactory = TransformerFactory.newInstance();
StreamSource stylesource = new StreamSource(stylesheet);
Transformer transformer = tFactory.newTransformer(stylesource);
StreamSource source = new StreamSource(datafile);
StreamResult result = new StreamResult(new FileWriter(outfile));
transformer.transform(source, result);
----
[[graphson-reader-writer]]
=== GraphSON Reader/Writer
image:gremlin-graphson.png[width=350,float=left] GraphSON is a link:http://json.org/[JSON]-based format extended
from earlier versions of TinkerPop. It is important to note that TinkerPop3's GraphSON is not backwards compatible
with prior TinkerPop GraphSON versions. GraphSON has some support from graph-related application outside of TinkerPop,
but it is generally best used in two cases:
* A text format of the graph or its elements is desired (e.g. debugging, usage in source control, etc.)
* The graph or its elements need to be consumed by code that is not JVM-based (e.g. JavaScript, Python, .NET, etc.)
GraphSON supports all of the `GraphReader` and `GraphWriter` interface methods and can therefore read or write an
entire `Graph`, vertices, arbitrary objects, etc. The following code shows how to write a `Graph` instance to file
called `tinkerpop-modern.json` and then how to read that file back into a different instance:
[source,java]
----
Graph graph = TinkerFactory.createModern();
graph.io(graphson()).writeGraph("tinkerpop-modern.json");
Graph newGraph = TinkerGraph.open();
newGraph.io(graphson()).readGraph("tinkerpop-modern.json");
----
NOTE: Using `graphson()`, which is a static helper method of `IoCore`, will default to the most current version of GraphSON which is 3.0.
If a custom configuration is required, then have the `Graph` generate a `GraphReader` or `GraphWriter` "builder" instance:
[source,java]
----
Graph graph = TinkerFactory.createModern();
try (OutputStream os = new FileOutputStream("tinkerpop-modern.json")) {
GraphSONMapper mapper = graph.io(IoCore.graphson()).mapper().normalize(true).create()
graph.io(graphson()).writer().mapper(mapper).create().writeGraph(os, graph)
}
Graph newGraph = TinkerGraph.open();
try (InputStream stream = new FileInputStream("tinkerpop-modern.json")) {
newGraph.io(graphson()).reader().create().readGraph(stream, newGraph);
}
----
The following example shows how a single `Vertex` is written to GraphSON using the Gremlin Console:
[gremlin-groovy]
----
graph = TinkerFactory.createModern()
g = graph.traversal()
f = new ByteArrayOutputStream()
graph.io(graphson()).writer().create().writeVertex(f, g.V(1).next(), BOTH)
f.close()
----
The following GraphSON example shows the output of `GraphSONWriter.writeVertex()` with associated edges:
[source,json]
----
{
"id": {
"@type": "g:Int32",
"@value": 1
},
"label": "person",
"outE": {
"created": [{
"id": {
"@type": "g:Int32",
"@value": 9
},
"inV": {
"@type": "g:Int32",
"@value": 3
},
"properties": {
"weight": {
"@type": "g:Double",
"@value": 0.4
}
}
}],
"knows": [{
"id": {
"@type": "g:Int32",
"@value": 7
},
"inV": {
"@type": "g:Int32",
"@value": 2
},
"properties": {
"weight": {
"@type": "g:Double",
"@value": 0.5
}
}
}, {
"id": {
"@type": "g:Int32",
"@value": 8
},
"inV": {
"@type": "g:Int32",
"@value": 4
},
"properties": {
"weight": {
"@type": "g:Double",
"@value": 1.0
}
}
}]
},
"properties": {
"name": [{
"id": {
"@type": "g:Int64",
"@value": 0
},
"value": "marko"
}],
"age": [{
"id": {
"@type": "g:Int64",
"@value": 1
},
"value": {
"@type": "g:Int32",
"@value": 29
}
}]
}
}
----
GraphSON has several versions and each has differences that prevent complete compatibility with one another. While the
default version provided by `IoCore.graphson()` is recommended, it is possible to make changes to revert to an earlier
version. The following shows an example of how to use 1.0 (with type embedding):
[gremlin-groovy]
----
graph = TinkerFactory.createModern()
g = graph.traversal()
f = new ByteArrayOutputStream()
mapper = graph.io(GraphSONIo.build(GraphSONVersion.V1_0)).mapper().typeInfo(TypeInfo.PARTIAL_TYPES).create()
graph.io(GraphSONIo.build(GraphSONVersion.V1_0)).writer().mapper(mapper).create().writeVertex(f, g.V(1).next(), BOTH)
f.close()
----
NOTE: Additional documentation for GraphSON can be found in the link:https://tinkerpop.apache.org/docs/x.y.z/dev/io/#graphson[IO Reference].
IMPORTANT: When using the extended type system in Gremlin Server, support for these types when used in the context of
Gremlin Language Variants is dependent on the programming language, the driver and its serializers. These
implementations are only required to support the core types and not the extended ones.
Here's the same previous example of GraphSON 1.0, but with GraphSON 2.0:
[gremlin-groovy]
----
graph = TinkerFactory.createModern()
g = graph.traversal()
f = new ByteArrayOutputStream()
mapper = graph.io(graphson()).mapper().version(GraphSONVersion.V2_0).create()
graph.io(graphson()).writer().mapper(mapper).create().writeVertex(f, g.V(1).next(), BOTH)
f.close()
----
Creating a GraphSON 2.0 mapper is done by calling `.version(GraphSONVersion.V2_0)` on the mapper builder. Here's is the
example output from the code above:
[source,json]
----
{
"@type": "g:Vertex",
"@value": {
"id": {
"@type": "g:Int32",
"@value": 1
},
"label": "person",
"properties": {
"name": [{
"@type": "g:VertexProperty",
"@value": {
"id": {
"@type": "g:Int64",
"@value": 0
},
"value": "marko",
"label": "name"
}
}],
"uuid": [{
"@type": "g:VertexProperty",
"@value": {
"id": {
"@type": "g:Int64",
"@value": 12
},
"value": {
"@type": "g:UUID",
"@value": "829c7ddb-3831-4687-a872-e25201230cd3"
},
"label": "uuid"
}
}],
"age": [{
"@type": "g:VertexProperty",
"@value": {
"id": {
"@type": "g:Int64",
"@value": 1
},
"value": {
"@type": "g:Int32",
"@value": 29
},
"label": "age"
}
}]
}
}
}
----
Types can be disabled when creating a GraphSON 2.0 `Mapper` with:
[source,groovy]
----
graph.io(graphson()).mapper().
version(GraphSONVersion.V2_0).
typeInfo(GraphSONMapper.TypeInfo.NO_TYPES).create()
----
By disabling types, the JSON payload produced will lack the extra information that is written for types. Please note,
disabling types can be unsafe with regards to the written data in that types can be lost.
[[gryo-reader-writer]]
=== Gryo Reader/Writer
image:gremlin-kryo.png[width=400,float=left] link:https://github.com/EsotericSoftware/kryo[Kryo] is a popular
serialization package for the JVM. Gremlin-Kryo is a binary `Graph` serialization format for use on the JVM by JVM
languages. It is designed to be space efficient, non-lossy and is promoted as the standard format to use when working
with graph data inside of the TinkerPop stack. A list of common use cases is presented below:
* Migration from one Gremlin Structure implementation to another (e.g. `TinkerGraph` to `Neo4jGraph`)
* Serialization of individual graph elements to be sent over the network to another JVM.
* Backups of in-memory graphs or subgraphs.
WARNING: When migrating between Gremlin Structure implementations, Kryo may not lose data, but it is important to
consider the features of each `Graph` and whether or not the data types supported in one will be supported in the
other. Failure to do so, may result in errors.
Kryo supports all of the `GraphReader` and `GraphWriter` interface methods and can therefore read or write an entire
`Graph`, vertices, edges, etc. The following code shows how to write a `Graph` instance to file called
`tinkerpop-modern.kryo` and then how to read that file back into a different instance:
[source,java]
----
Graph graph = TinkerFactory.createModern();
graph.io(gryo()).writeGraph("tinkerpop-modern.kryo");
Graph newGraph = TinkerGraph.open();
newGraph.io(gryo()).readGraph("tinkerpop-modern.kryo");
----
NOTE: Using `gryo()`, which is a static helper method of `IoCore`, will default to the most current version of Gryo which is 3.0.
If a custom configuration is required, then have the `Graph` generate a `GraphReader` or `GraphWriter` "builder" instance:
[source,java]
----
Graph graph = TinkerFactory.createModern();
try (OutputStream os = new FileOutputStream("tinkerpop-modern.kryo")) {
graph.io(GryoIo.build(GryoVersion.V1_0)).writer().create().writeGraph(os, graph);
}
Graph newGraph = TinkerGraph.open();
try (InputStream stream = new FileInputStream("tinkerpop-modern.kryo")) {
newGraph.io(GryoIo.build(GryoVersion.V1_0)).reader().create().readGraph(stream, newGraph);
}
----
NOTE: The preferred extension for files names produced by Gryo is `.kryo`.
=== TinkerPop2 Data Migration
image:data-migration.png[width=300,float=right] For those using TinkerPop2, migrating to TinkerPop3 will mean a number
of programming changes, but may also require a migration of the data depending on the graph implementation. For
example, trying to open `TinkerGraph` data from TinkerPop2 with TinkerPop3 code will not work, however opening a
TinkerPop2 `Neo4jGraph` with a TinkerPop3 `Neo4jGraph` should work provided there aren't Neo4j version compatibility
mismatches preventing the read.
If such a situation arises that a particular TinkerPop2 `Graph` can not be read by TinkerPop3, a "legacy" data
migration approach exists. The migration involves writing the TinkerPop2 `Graph` to GraphSON, then reading it to
TinkerPop3 with the `LegacyGraphSONReader` (a limited implementation of the `GraphReader` interface).
The following represents an example migration of the "classic" toy graph. In this example, the "classic" graph is
saved to GraphSON using TinkerPop2.
[source,groovy]
----
gremlin> Gremlin.version()
==>2.5.z
gremlin> graph = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
gremlin> GraphSONWriter.outputGraph(graph,'/tmp/tp2.json',GraphSONMode.EXTENDED)
==>null
----
The above console session uses the `gremlin-groovy` distribution from TinkerPop2. It is important to generate the
`tp2.json` file using the `EXTENDED` mode as it will include data types when necessary which will help limit
"lossiness" on the TinkerPop3 side when imported. Once `tp2.json` is created, it can then be imported to a TinkerPop3
`Graph`.
[source,groovy]
----
gremlin> Gremlin.version()
==>x.y.z
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> r = LegacyGraphSONReader.build().create()
==>org.apache.tinkerpop.gremlin.structure.io.graphson.LegacyGraphSONReader@64337702
gremlin> r.readGraph(new FileInputStream('/tmp/tp2.json'), graph)
==>null
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.E()
==>e[11][4-created->3]
==>e[12][6-created->3]
==>e[7][1-knows->2]
==>e[8][1-knows->4]
==>e[9][1-created->3]
==>e[10][4-created->5]
----
== Namespace Conventions
End users, <<implementations,graph system providers>>, <<graphcomputer,`GraphComputer`>> algorithm designers,
<<gremlin-plugins,GremlinPlugin>> creators, etc. all leverage properties on elements to store information. There are
a few conventions that should be respected when naming property keys to ensure that conflicts between these
stakeholders do not conflict.
* End users are granted the _flat namespace_ (e.g. `name`, `age`, `location`) to key their properties and label their elements.
* Graph system providers are granted the _hidden namespace_ (e.g. `~metadata`) to key their properties and labels.
Data keyed as such is only accessible via the graph system implementation and no other stakeholders are granted read
nor write access to data prefixed with "~" (see `Graph.Hidden`). Test coverage and exceptions exist to ensure that
graph systems respect this hard boundary.
* <<vertexprogram,`VertexProgram`>> and <<mapreduce,`MapReduce`>> developers should leverage _qualified namespaces_
particular to their domain (e.g. `mydomain.myvertexprogram.computedata`).
* `GremlinPlugin` creators should prefix their plugin name with their domain (e.g. `mydomain.myplugin`).
IMPORTANT: TinkerPop uses `tinkerpop.` and `gremlin.` as the prefixes for provided strategies, vertex programs, map
reduce implementations, and plugins.
The only truly protected namespace is the _hidden namespace_ provided to graph systems. From there, it's up to
engineers to respect the namespacing conventions presented.