blob: c27c432293f839ed10dbd63daa843ff6e4acfc9c [file] [log] [blame]
////
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
////
[[graph]]
The Graph
=========
image::gremlin-standing.png[width=125]
Features
--------
A `Feature` implementation describes the capabilities of a `Graph` instance. This interface is implemented by vendors for two purposes:
. It tells users the capabilities of their `Graph` instance.
. It allows the features they do comply with to be tested against the Gremlin Test Suite - tests that do not comply are "ignored").
The following example in the Gremlin Console shows how to print all the features of a `Graph`:
[gremlin-groovy]
----
graph = TinkerGraph.open()
graph.features()
----
A common pattern for using features is to check their support prior to performing an operation:
[gremlin-groovy]
----
graph.features().graph().supportsTransactions()
graph.features().graph().supportsTransactions() ? g.tx().commit() : "no tx"
----
TIP: To ensure vendor agnostic code, always check feature support prior to usage of a particular function. In that way, the application can behave gracefully in case a particular implementation is provided at runtime that does not support a function being accessed.
WARNING: Assignments of a `GraphStrategy` can alter the base features of a `Graph` in dynamic ways, such that checks against a `Feature` may not always reflect the behavior exhibited when the `GraphStrategy` is in use.
[[vertex-properties]]
Vertex Properties
-----------------
image:vertex-properties.png[width=215,float=left] TinkerPop3 introduces the concept of a `VertexProperty<V>`. All the properties of a `Vertex` are a `VertexProperty`. A `VertexProperty` implements `Property` and as such, it has a key/value pair. However, `VertexProperty` also implements `Element` and thus, can have a collection of key/value pairs. Moreover, while an `Edge` can only have one property of key "name" (for example), a `Vertex` can have multiple "name" properties. With the inclusion of vertex properties, two features are introduced which ultimately advance the graph modelers toolkit:
. Multiple properties (*multi-properties*): a vertex property key can have multiple values (i.e. a vertex can have multiple "name" properties).
. Properties on properties (*meta-properties*): a vertex property can have properties (i.e. a vertex property can have key/value data associated with it).
A collection of use cases are itemized below:
* *Permissions*: Vertex properties can have key/value ACL-type permission information associated with them.
* *Auditing*: When a vertex property is manipulated, it can have key/value information attached to it saying who the creator, deletor, etc. are.
* *Provenance*: The "name" of a vertex can be declared by multiple users.
A running example using vertex properties is provided below to demonstrate and explain the API.
[gremlin-groovy]
----
graph = TinkerGraph.open()
g = graph.traversal(standard())
v = g.addV('name','marko','name','marko a. rodriguez').next()
g.V(v).properties().count()
g.V(v).properties('name').count() <1>
g.V(v).properties()
g.V(v).properties('name')
g.V(v).properties('name').hasValue('marko')
g.V(v).properties('name').hasValue('marko').property('acl','private') <2>
g.V(v).properties('name').hasValue('marko a. rodriguez')
g.V(v).properties('name').hasValue('marko a. rodriguez').property('acl','public')
g.V(v).properties('name').has('acl','public').value()
g.V(v).properties('name').has('acl','public').drop() <3>
g.V(v).properties('name').has('acl','public').value()
g.V(v).properties('name').has('acl','private').value()
g.V(v).properties()
g.V(v).properties().properties() <4>
g.V(v).properties().property('date',2014) <5>
g.V(v).properties().property('creator','stephen')
g.V(v).properties().properties()
g.V(v).properties('name').valueMap()
g.V(v).property('name','okram') <6>
g.V(v).properties('name')
g.V(v).values('name') <7>
----
<1> A vertex can have zero or more properties with the same key associated with it.
<2> A vertex property can have standard key/value properties attached to it.
<3> Vertex property removal is identical to property removal.
<4> It is property to get the properties of a vertex property.
<5> A vertex property can have any number of key/value properties attached to it.
<6> `property(...)` will remove all existing key'd properties before adding the new single property (see `VertexProperty.Cardinality`).
<7> If only the value of a property is needed, then `values()` can be used.
If the concept of vertex properties is difficult to grasp, then it may be best to think of vertex properties in terms of "literal vertices." A vertex can have an edge to a "literal vertex" that has a single value key/value -- e.g. "value=okram." The edge that points to that literal vertex has an edge-label of "name." The properties on the edge represent the literal vertex's properties. The "literal vertex" can not have any other edges to it (only one from the associated vertex).
[[the-crew-toy-graph]]
TIP: A toy graph demonstrating all of the new TinkerPop3 graph structure features is available at `TinkerFactory.createTheCrew()` and `data/tinkerpop-crew*`. This graph demonstrates multi-properties and meta-properties.
.TinkerPop Crew
image::the-crew-graph.png[width=685]
[gremlin-groovy,theCrew]
----
g.V().as('a').
properties('location').as('b').
hasNot('endTime').as('c').
select('a','b','c').by('name').by(value).by('startTime') // determine the current location of each person
g.V().has('name','gremlin').inE('uses').
order().by('skill',incr).as('a').
outV().as('b').
select('a','b').by('skill').by('name') // rank the users of gremlin by their skill level
----
Graph Variables
---------------
TinkerPop3 introduces the concept of `Graph.Variables`. Variables are key/value pairs associated with the graph itself -- in essence, a `Map<String,Object>`. These variables are intended to store metadata about the graph. Example use cases include:
* *Schema information*: What do the namespace prefixes resolve to and when was the schema last modified?
* *Global permissions*: What are the access rights for particular groups?
* *System user information*: Who are the admins of the system?
An example of graph variables in use is presented below:
[gremlin-groovy]
----
graph = TinkerGraph.open()
graph.variables()
graph.variables().set('systemAdmins',['stephen','peter','pavel'])
graph.variables().set('systemUsers',['matthias','marko','josh'])
graph.variables().keys()
graph.variables().get('systemUsers')
graph.variables().get('systemUsers').get()
graph.variables().remove('systemAdmins')
graph.variables().keys()
----
IMPORTANT: Graph variables are not intended to be subject to heavy, concurrent mutation nor to be used in complex computations. The intention is to have a location to store data about the graph for administrative purposes.
[[transactions]]
Graph Transactions
------------------
image:gremlin-coins.png[width=100,float=right] A link:http://en.wikipedia.org/wiki/Database_transaction[database transaction] represents a unit of work to execute against the database. Transactions are controlled by an implementation of the `Transaction` interface and that object can be obtained from the `Graph` interface using the `tx()` method. It is important to note that the `Transaction` object does not represent a "transaction" itself. It merely exposes the methods for working with transactions (e.g. committing, rolling back, etc).
Most `Graph` implementations that `supportsTransactions` will implement an "automatic" `ThreadLocal` transaction, which means that when a read or write occurs after the `Graph` is instantiated a transaction is automatically started within that thread. There is no need to manually call a method to "create" or "start" a transaction. Simple modify the graph as required and call `graph.tx().commit()` to apply changes or `graph.tx().rollback()` to undo them. When the next read or write action occurs against the graph, a new transaction will be started within that current thread of execution.
When using transactions in this fashion, especially in web application (e.g. REST server), it is important to ensure that transaction do not leak from one request to the next. In other words, unless a client is somehow bound via session to process every request on the same server thread, ever request must be committed or rolled back at the end of the request. By ensuring that the request encapsulates a transaction, it ensures that a future request processed on a server thread is starting in a fresh transactional state and will not have access to the remains of one from an earlier request. A good strategy is to rollback a transaction at the start of a request, so that if it so happens that a transactional leak does occur between requests somehow, a fresh transaction is assured by the fresh request.
TIP: The `tx()` method is on the `Graph` interface, but it is also available on the `TraversalSource` spawned from a `Graph`. Calls to `TraversalSource.tx()` are proxied through to the underlying `Graph` as a convenience.
Configuring
~~~~~~~~~~~
Determining when a transaction starts is dependent upon the behavior assigned to the `Transaction`. It is up to the `Graph` implementation to determine the default behavior and unless the implementation doesn't allow it, the behavior itself can be altered via these `Transaction` methods:
[source,java]
----
public Transaction onReadWrite(final Consumer<Transaction> consumer);
public Transaction onClose(final Consumer<Transaction> consumer);
----
Providing a `Consumer` function to `onReadWrite` allows definition of how a transaction starts when a read or a write occurs. `Transaction.READ_WRITE_BEHAVIOR` contains pre-defined `Consumer` functions to supply to the `onReadWrite` method. It has two options:
* `AUTO` - automatic transactions where the transaction is started implicitly to the read or write operation
* `MANUAL` - manual transactions where it is up to the user to explicitly open a transaction, throwing an exception if the transaction is not open
Providing a `Consumer` function to `onClose` allows configuration of how a transaction is handled when `Transaction.close()` is called. `Transaction.CLOSE_BEHAVIOR` has several pre-defined options that can be supplied to this method:
* `COMMIT` - automatically commit an open transaction
* `ROLLBACK` - automatically rollback an open transaction
* `MANUAL` - throw an exception if a transaction is open, forcing the user to explicitly close the transaction
Once there is an understanding for how transactions are configured, most of the rest of the `Transaction` interface is self-explanatory. Note that <<neo4j-gremlin,Neo4j-Gremlin>> is used for the examples to follow as TinkerGraph does not support transactions.
[source,groovy]
----
gremlin> graph = Neo4jGraph.open('/tmp/neo4j')
==>neo4jgraph[EmbeddedGraphDatabase [/tmp/neo4j]]
gremlin> graph.features()
==>FEATURES
> GraphFeatures
>-- Transactions: true <1>
>-- Computer: false
>-- Persistence: true
...
gremlin> graph.tx().onReadWrite(Transaction.READ_WRITE_BEHAVIOR.AUTO) <2>
==>org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph$Neo4jTransaction@1c067c0d
gremlin> graph.addVertex("name","stephen") <3>
==>v[0]
gremlin> graph.tx().commit() <4>
==>null
gremlin> graph.tx().onReadWrite(Transaction.READ_WRITE_BEHAVIOR.MANUAL) <5>
==>org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph$Neo4jTransaction@1c067c0d
gremlin> graph.tx().isOpen()
==>false
gremlin> graph.addVertex("name","marko") <6>
Open a transaction before attempting to read/write the transaction
gremlin> graph.tx().open() <7>
==>null
gremlin> graph.addVertex("name","marko") <8>
==>v[1]
gremlin> graph.tx().commit()
==>null
----
<1> Check `features` to ensure that the graph supports transactions.
<2> By default, `Neo4jGraph` is configured with "automatic" transactions, so it is set here for demonstration purposes only.
<3> When the vertex is added, the transaction is automatically started. From this point, more mutations can be staged or other read operations executed in the context of that open transaction.
<4> Calling `commit` finalizes the transaction.
<5> Change transaction behavior to require manual control.
<6> Adding a vertex now results in failure because the transaction was not explicitly opened.
<7> Explicitly open a transaction.
<8> Adding a vertex now succeeds as the transaction was manually opened.
NOTE: It may be important to consult the documentation of the `Graph` implementation when it comes to the specifics of how transactions will behave. TinkerPop allows some latitude in this area and implementations may not have the exact same behaviors and link:https://en.wikipedia.org/wiki/ACID[ACID] guarantees.
Retries
~~~~~~~
There are times when transactions fail. Failure may be indicative of some permanent condition, but other failures might simply require the transaction to be retried for possible future success. The `Transaction` object also exposes a method for executing automatic transaction retries:
[gremlin-groovy]
----
graph = Neo4jGraph.open('/tmp/neo4j')
graph.tx().submit {it.addVertex("name","josh")}.retry(10)
graph.tx().submit {it.addVertex("name","daniel")}.exponentialBackoff(10)
graph.close()
----
As shown above, the `submit` method takes a `Function<Graph, R>` which is the unit of work to execute and possibly retry on failure. The method returns a `Transaction.Workload` object which has a number of default methods for common retry strategies. It is also possible to supply a custom retry function if a default one does not suit the required purpose.
Threaded Transactions
~~~~~~~~~~~~~~~~~~~~~
Most `Graph` implementations that support transactions do so in a `ThreadLocal` manner, where the current transaction is bound to the current thread of execution. Consider the following example to demonstrate:
[source,java]
----
graph.addVertex("name","stephen");
Thread t1 = new Thread(() -> {
graph.addVertex("name","josh");
});
Thread t2 = new Thread(() -> {
graph.addVertex("name","marko");
});
t1.start()
t2.start()
t1.join()
t2.join()
graph.tx().commit();
----
The above code shows three vertices added to `graph` in three different threads: the current thread, `t1` and `t2`. One might expect that by the time this body of code finished executing, that there would be three vertices persisted to the `Graph`. However, given the `ThreadLocal` nature of transactions, there really were three separate transactions created in that body of code (i.e. one for each thread of execution) and the only one committed was the first call to `addVertex` in the primary thread of execution. The other two calls to that method within `t1` and `t2` were never committed and thus orphaned.
A `Graph` that `supportsThreadedTransactions` is one that allows for a `Graph` to operate outside of that constraint, thus allowing multiple threads to operate within the same transaction. Therefore, if there was a need to have three different threads operating within the same transaction, the above code could be re-written as follows:
[source,java]
----
Graph threaded = graph.tx().newThreadedTx();
threaded.addVertex("name","stephen");
Thread t1 = new Thread(() -> {
threaded.addVertex("name","josh");
});
Thread t2 = new Thread(() -> {
threaded.addVertex("name","marko");
});
t1.start()
t2.start()
t1.join()
t2.join()
threaded.tx().commit();
----
In the above case, the call to `graph.tx().newThreadedTx()` creates a new `Graph` instance that is unbound from the `ThreadLocal` transaction, thus allowing each thread to operate on it in the same context. In this case, there would be three separate vertices persisted to the `Graph`.
Gremlin I/O
-----------
image:gremlin-io.png[width=250,float=right] The task of getting data in and out of `Graph` instances is the job of the Gremlin I/O packages. Gremlin I/O provides two interfaces for reading and writing `Graph` instances: `GraphReader` and `GraphWriter`. These interfaces expose methods that support:
* Reading and writing an entire `Graph`
* Reading and writing a `Traversal<Vertex>` as adjacency list format
* Reading and writing a single `Vertex` (with and without associated `Edge` objects)
* Reading and writing a single `Edge`
* Reading and writing a single `VertexProperty`
* Reading and writing a single `Property`
* Reading and writing an arbitrary `Object`
In all cases, these methods operate in the currency of `InputStream` and `OutputStream` objects, allowing graphs and their related elements to be written to and read from files, byte arrays, etc. The `Graph` interface offers the `io` method, which provides access to "reader/writer builder" objects that are pre-configured with serializers provided by the `Graph`, as well as helper methods for the various I/O capabilities. Unless there are very advanced requirements for the serialization process, it is always best to utilize the methods on the `Io` interface to construct `GraphReader` and `GraphWriter` instances, as the implementation may provide some custom settings that would otherwise have to be configured manually by the user to do the serialization.
It is up to the implementations of the `GraphReader` and `GraphWriter` interfaces to choose the methods they implement and the manner in which they work together. The only semantics enforced and expected is that the write methods should produce output that is compatible with the corresponding read method (e.g. the output of `writeVertices` should be readable as input to `readVertices` and the output of `writeProperty` should be readable as input to `readProperty`).
GraphML Reader/Writer
~~~~~~~~~~~~~~~~~~~~~
image:gremlin-graphml.png[width=350,float=left] The link:http://graphml.graphdrawing.org/[GraphML] file format is a common XML-based representation of a graph. It is widely supported by graph-related tools and libraries making it a solid interchange format for TinkerPop. In other words, if the intent is to work with graph data in conjunction with applications outside of TinkerPop, GraphML may be the best choice to do that. Common use cases might be:
* Generate a graph using link:https://networkx.github.io/[NetworkX], export it with GraphML and import it to TinkerPop.
* Produce a subgraph and export it to GraphML to be consumed by and visualized in link:https://gephi.org/[Gephi].
* Migrate the data of an entire graph to a different graph database not supported by TinkerPop.
As GraphML is a specification for the serialization of an entire graph and not the individual elements of a graph, methods that support input and output of single vertices, edges, etc. are not supported.
CAUTION: GraphML is a "lossy" format in that it only supports primitive values for properties and does not have support for `Graph` variables. It will use `toString` to serialize property values outside of those primitives.
CAUTION: GraphML, as a specification, allows for `<edge>` and `<node>` elements to appear in any order. The `GraphMLReader` will support that, however, that capability comes with a limitation. TinkerPop does not allow the vertex label to be changed after the vertex has been created. Therefore, if an `<edge>` element comes before the `<node>` the label on the vertex will be ignored. It is thus better to order `<node>` elements in the GraphML to appear before all `<edge>` elements if vertex labels are important to the graph.
The following code shows how to write a `Graph` instance to file called `tinkerpop-modern.xml` and then how to read that file back into a different instance:
[source,java]
----
final Graph graph = TinkerFactory.createModern();
graph.io(IoCore.graphml()).writeGraph("tinkerpop-modern.xml");
final Graph newGraph = TinkerGraph.open();
newGraph.io(IoCore.graphml()).readGraph("tinkerpop-modern.xml");
----
If a custom configuration is required, then have the `Graph` generate a `GraphReader` or `GraphWriter` "builder" instance:
[source,java]
----
final Graph graph = TinkerFactory.createModern();
try (final OutputStream os = new FileOutputStream("tinkerpop-modern.xml")) {
graph.io(IoCore.graphml()).writer().normalize(true).create().writeGraph(os, graph);
}
final Graph newGraph = TinkerGraph.open();
try (final InputStream stream = new FileInputStream("tinkerpop-modern.xml")) {
newGraph.io(IoCore.graphml()).reader().vertexIdKey("name").create().readGraph(stream, newGraph);
}
----
[[graphson-reader-writer]]
GraphSON Reader/Writer
~~~~~~~~~~~~~~~~~~~~~~
image:gremlin-graphson.png[width=350,float=left] GraphSON is a link:http://json.org/[JSON]-based format extended from earlier versions of TinkerPop. It is important to note that TinkerPop3's GraphSON is not backwards compatible with prior TinkerPop GraphSON versions. GraphSON has some support from graph-related application outside of TinkerPop, but it is generally best used in two cases:
* A text format of the graph or its elements is desired (e.g. debugging, usage in source control, etc.)
* The graph or its elements need to be consumed by code that is not JVM-based (e.g. JavaScript, Python, .NET, etc.)
GraphSON supports all of the `GraphReader` and `GraphWriter` interface methods and can therefore read or write an entire `Graph`, vertices, arbitrary objects, etc. The following code shows how to write a `Graph` instance to file called `tinkerpop-modern.json` and then how to read that file back into a different instance:
[source,java]
----
final Graph graph = TinkerFactory.createModern();
graph.io(IoCore.graphson()).writeGraph("tinkerpop-modern.json");
final Graph newGraph = TinkerGraph.open();
newGraph.io(IoCore.graphson()).readGraph("tinkerpop-modern.json");
----
If a custom configuration is required, then have the `Graph` generate a `GraphReader` or `GraphWriter` "builder" instance:
[source,java]
----
final Graph graph = TinkerFactory.createModern();
try (final OutputStream os = new FileOutputStream("tinkerpop-modern.json")) {
final GraphSONMapper mapper = graph.io(IoCore.graphson()).mapper().normalize(true).create()
graph.io(IoCore.graphson()).writer().mapper(mapper).create().writeGraph(os, graph)
}
final Graph newGraph = TinkerGraph.open();
try (final InputStream stream = new FileInputStream("tinkerpop-modern.json")) {
newGraph.io(IoCore.graphson()).reader().vertexIdKey("name").create().readGraph(stream, newGraph);
}
----
One of the important configuration options of the `GraphSONReader` and `GraphSONWriter` is the ability to embed type information into the output. By embedding the types, it becomes possible to serialize a graph without losing type information that might be important when being consumed by another source. The importance of this concept is demonstrated in the following example where a single `Vertex` is written to GraphSON using the Gremlin Console:
[gremlin-groovy]
----
graph = TinkerFactory.createModern()
g = graph.traversal()
f = new FileOutputStream("vertex-1.json")
graph.io(graphson()).writer().create().writeVertex(f, g.V(1).next(), BOTH)
f.close()
----
The following GraphSON example shows the output of `GraphSonWriter.writeVertex()` with associated edges:
[source,json]
----
{
"id": 1,
"label": "person",
"outE": {
"created": [
{
"id": 9,
"inV": 3,
"properties": {
"weight": 0.4
}
}
],
"knows": [
{
"id": 7,
"inV": 2,
"properties": {
"weight": 0.5
}
},
{
"id": 8,
"inV": 4,
"properties": {
"weight": 1
}
}
]
},
"properties": {
"name": [
{
"id": 0,
"value": "marko"
}
],
"age": [
{
"id": 1,
"value": 29
}
]
}
}
----
The vertex properly serializes to valid JSON but note that a consuming application will not automatically know how to interpret the numeric values. In coercing those Java values to JSON, such information is lost.
With a minor change to the construction of the `GraphSONWriter` the lossy nature of GraphSON can be avoided:
[gremlin-groovy]
----
graph = TinkerFactory.createModern()
g = graph.traversal()
f = new FileOutputStream("vertex-1.json")
mapper = graph.io(graphson()).mapper().embedTypes(true).create()
graph.io(graphson()).writer().mapper(mapper).create().writeVertex(f, g.V(1).next(), BOTH)
f.close()
----
In the above code, the `embedTypes` option is set to `true` and the output below shows the difference in the output:
[source,json]
----
{
"@class": "java.util.HashMap",
"id": 1,
"label": "person",
"outE": {
"@class": "java.util.HashMap",
"created": [
"java.util.ArrayList",
[
{
"@class": "java.util.HashMap",
"id": 9,
"inV": 3,
"properties": {
"@class": "java.util.HashMap",
"weight": 0.4
}
}
]
],
"knows": [
"java.util.ArrayList",
[
{
"@class": "java.util.HashMap",
"id": 7,
"inV": 2,
"properties": {
"@class": "java.util.HashMap",
"weight": 0.5
}
},
{
"@class": "java.util.HashMap",
"id": 8,
"inV": 4,
"properties": {
"@class": "java.util.HashMap",
"weight": 1
}
}
]
]
},
"properties": {
"@class": "java.util.HashMap",
"name": [
"java.util.ArrayList",
[
{
"@class": "java.util.HashMap",
"id": [
"java.lang.Long",
0
],
"value": "marko"
}
]
],
"age": [
"java.util.ArrayList",
[
{
"@class": "java.util.HashMap",
"id": [
"java.lang.Long",
1
],
"value": 29
}
]
]
}
}
----
The ambiguity of components of the GraphSON is now removed by the `@class` property, which contains Java class information for the data it is associated with. The `@class` property is used for all non-final types, with the exception of a small number of "natural" types (String, Boolean, Integer, and Double) which can be correctly inferred from JSON typing. While the output is more verbose, it comes with the security of not losing type information. While non-JVM languages won't be able to consume this information automatically, at least there is a hint as to how the values should be coerced back into the correct types in the target language.
[[gryo-reader-writer]]
Gryo Reader/Writer
~~~~~~~~~~~~~~~~~~
image:gremlin-kryo.png[width=400,float=left] link:https://github.com/EsotericSoftware/kryo[Kryo] is a popular serialization package for the JVM. Gremlin-Kryo is a binary `Graph` serialization format for use on the JVM by JVM languages. It is designed to be space efficient, non-lossy and is promoted as the standard format to use when working with graph data inside of the TinkerPop stack. A list of common use cases is presented below:
* Migration from one Gremlin Structure implementation to another (e.g. `TinkerGraph` to `Neo4jGraph`)
* Serialization of individual graph elements to be sent over the network to another JVM.
* Backups of in-memory graphs or subgraphs.
CAUTION: When migrating between Gremlin Structure implementations, Kryo may not lose data, but it is important to consider the features of each `Graph` and whether or not the data types supported in one will be supported in the other. Failure to do so, may result in errors.
Kryo supports all of the `GraphReader` and `GraphWriter` interface methods and can therefore read or write an entire `Graph`, vertices, edges, etc. The following code shows how to write a `Graph` instance to file called `tinkerpop-modern.kryo` and then how to read that file back into a different instance:
[source,java]
----
final Graph graph = TinkerFactory.createModern();
graph.io(IoCore.gryo()).writeGraph("tinkerpop-modern.kryo");
final Graph newGraph = TinkerGraph.open();
newGraph.io(IoCore.gryo()).readGraph("tinkerpop-modern.kryo")'
----
If a custom configuration is required, then have the `Graph` generate a `GraphReader` or `GraphWriter` "builder" instance:
[source,java]
----
final Graph graph = TinkerFactory.createModern();
try (final OutputStream os = new FileOutputStream("tinkerpop-modern.kryo")) {
graph.io(IoCore.gryo()).writer().create().writeGraph(os, graph);
}
final Graph newGraph = TinkerGraph.open();
try (final InputStream stream = new FileInputStream("tinkerpop-modern.kryo")) {
newGraph.io(IoCore.gryo()).reader().vertexIdKey("name").create().readGraph(stream, newGraph);
}
----
NOTE: The preferred extension for files names produced by Gryo is `.kryo`.
TinkerPop2 Data Migration
~~~~~~~~~~~~~~~~~~~~~~~~~
image:data-migration.png[width=300,float=right] For those using TinkerPop2, migrating to TinkerPop3 will mean a number of programming changes, but may also require a migration of the data depending on the graph implementation. For example, trying to open `TinkerGraph` data from TinkerPop2 with TinkerPop3 code will not work, however opening a TinkerPop2 `Neo4jGraph` with a TinkerPop3 `Neo4jGraph` should work provided there aren't Neo4j version compatibility mismatches preventing the read.
If such a situation arises that a particular TinkerPop2 `Graph` can not be read by TinkerPop3, a "legacy" data migration approach exists. The migration involves writing the TinkerPop2 `Graph` to GraphSON, then reading it to TinkerPop3 with the `LegacyGraphSONReader` (a limited implementation of the `GraphReader` interface).
The following represents an example migration of the "classic" toy graph. In this example, the "classic" graph is saved to GraphSON using TinkerPop2.
[source,groovy]
----
gremlin> Gremlin.version()
==>2.5.z
gremlin> graph = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
gremlin> GraphSONWriter.outputGraph(graph,'/tmp/tp2.json',GraphSONMode.EXTENDED)
==>null
----
The above console session uses the `gremlin-groovy` distribution from TinkerPop2. It is important to generate the `tp2.json` file using the `EXTENDED` mode as it will include data types when necessary which will help limit "lossiness" on the TinkerPop3 side when imported. Once `tp2.json` is created, it can then be imported to a TinkerPop3 `Graph`.
[source,groovy]
----
gremlin> Gremlin.version()
==>x.y.z
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> r = LegacyGraphSONReader.build().create()
==>org.apache.tinkerpop.gremlin.structure.io.graphson.LegacyGraphSONReader@64337702
gremlin> r.readGraph(new FileInputStream('/tmp/tp2.json'), graph)
==>null
gremlin> g = graph.traversal(standard())
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.E()
==>e[11][4-created->3]
==>e[12][6-created->3]
==>e[7][1-knows->2]
==>e[8][1-knows->4]
==>e[9][1-created->3]
==>e[10][4-created->5]
----
Namespace Conventions
---------------------
End users, <<implementations,vendors>>, <<graphcomputer,`GraphComputer`>> algorithm designers, <<gremlin-plugins,GremlinPlugin>> creators, etc. all leverage properties on elements to store information. There are a few conventions that should be respected when naming property keys to ensure that conflicts between these stakeholders do not conflict.
* End users are granted the _flat namespace_ (e.g. `name`, `age`, `location`) to key their properties and label their elements.
* Vendors are granted the _hidden namespace_ (e.g. `~metadata`) to key their properties and labels. Data key'd as such is only accessible via the vendor implementation code and no other stakeholders are granted read nor write access to data prefixed with "~" (see `Graph.Hidden`). Test coverage and exceptions exist to ensure that vendors respect this hard boundary.
* <<vertexprogram,`VertexProgram`>> and <<mapreduce,`MapReduce`>> developers should, like `GraphStrategy` developers, leverage _qualified namespaces_ particular to their domain (e.g. `mydomain.myvertexprogram.computedata`).
* `GremlinPlugin` creators should prefix their plugin name with their domain (e.g. `mydomain.myplugin`).
IMPORTANT: TinkerPop uses `tinkerpop.` and `gremlin.` as the prefixes for provided strategies, vertex programs, map reduce implementations, and plugins.
The only truly protected namespace is the _hidden namespace_ provided to vendors. From there, its up to engineers to respect the namespacing conventions presented.