docs/src/reference/implementations-tinkergraph.asciidoc - tinkerpop - Git at Google

 ////
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 ////
 [[tinkergraph-gremlin]]
 == TinkerGraph-Gremlin

 [source,xml]
 ----
 <dependency>
    <groupId>org.apache.tinkerpop</groupId>
    <artifactId>tinkergraph-gremlin</artifactId>
    <version>x.y.z</version>
 </dependency>
 ----

 image:tinkerpop-character.png[width=100,float=left] TinkerGraph is a single machine, in-memory (with optional
 persistence), non-transactional graph engine that provides both OLTP and OLAP functionality. It is deployed with
 TinkerPop3 and serves as the reference implementation for other providers to study in order to understand the
 semantics of the various methods of the TinkerPop3 API. Its status as a reference implementation does not however imply
 that it is not suitable for production. TinkerGraph has many practical use cases in production applications and their
 development. Some examples of TinkerGraph use cases include:

 * Ad-hoc analysis of large immutable graphs that fit in memory.
 * Extract subgraphs, from larger graphs that don't fit in memory, into TinkerGraph for further analysis or other
 purposes.
 * Use TinkerGraph as a sandbox to develop and debug complex traversals by simulating data from a larger graph inside
 a TinkerGraph.

 Constructing a simple graph using TinkerGraph in Java8 is presented below:

 [source,java]
 ----
 Graph graph = TinkerGraph.open();
 GraphTraversalSource g = graph.traversal();
 Vertex marko = g.addV("person").property("name","marko").property("age",29).next();
 Vertex lop = g.addV("software").property("name","lop").property("lang","java").next();
 g.addE("created").from(marko).to(lop).property("weight",0.6d).iterate();
 ----

 The above Gremlin creates two vertices named "marko" and "lop" and connects them via a created-edge with a weight=0.6
 property. The addition of these two vertices and the edge between them could also be done in a single Gremlin statement
 as follows:

 [source,java]
 ----
 g.addV("person").property("name","marko").property("age",29).as("m").
   addV("software").property("name","lop").property("lang","java").as("l").
   addE("created").from("m").to("l").property("weight",0.6d).iterate();
 ----

 IMPORTANT: Pay attention to the fact that traversals end with `next()` or `iterate()`. These methods advance the
 objects in the traversal stream and without those methods, the traversal does nothing. Review the
 link:https://tinkerpop.apache.org/docs/x.y.z/tutorials/the-gremlin-console/#result-iteration[Result Iteration Section]
 of The Gremlin Console tutorial for more information.

 Next, the graph can be queried as such.

 [source,java]
 g.V().has("name","marko").out("created").values("name")

 The `g.V().has("name","marko")` part of the query can be executed in two ways.

  * A linear scan of all vertices filtering out those vertices that don't have the name "marko"
  * A `O(log(|V|))` index lookup for all vertices with the name "marko"

 Given the initial graph construction in the first code block, no index was defined and thus, a linear scan is executed.
 However, if the graph was constructed as such, then an index lookup would be used.

 [source,java]
 Graph g = TinkerGraph.open();
 g.createIndex("name",Vertex.class)

 The execution times for a vertex lookup by property is provided below for both no-index and indexed version of
 TinkerGraph over the Grateful Dead graph.

 [gremlin-groovy]
 ----
 graph = TinkerGraph.open()
 g = graph.traversal()
 graph.io(graphml()).readGraph('data/grateful-dead.xml')
 clock(1000) {g.V().has('name','Garcia').iterate()} <1>
 graph = TinkerGraph.open()
 g = graph.traversal()
 graph.createIndex('name',Vertex.class)
 graph.io(graphml()).readGraph('data/grateful-dead.xml')
 clock(1000){g.V().has('name','Garcia').iterate()} <2>
 ----

 <1> Determine the average runtime of 1000 vertex lookups when no `name`-index is defined.
 <2> Determine the average runtime of 1000 vertex lookups when a `name`-index is defined.

 IMPORTANT: Each graph system will have different mechanism by which indices and schemas are defined. TinkerPop3
 does not require any conformance in this area. In TinkerGraph, the only definitions are around indices. With other
 graph systems, property value types, indices, edge labels, etc. may be required to be defined _a priori_ to adding
 data to the graph.

 NOTE: TinkerGraph is distributed with Gremlin Server and is therefore automatically available to it for configuration.

 === Configuration

 TinkerGraph has several settings that can be provided on creation via `Configuration` object:

 [width="100%",cols="2,10",options="header"]
 |=========================================================
 |Property |Description
 |gremlin.graph |`org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph`
 |gremlin.tinkergraph.vertexIdManager |The `IdManager` implementation to use for vertices.
 |gremlin.tinkergraph.edgeIdManager |The `IdManager` implementation to use for edges.
 |gremlin.tinkergraph.vertexPropertyIdManager |The `IdManager` implementation to use for vertex properties.
 |gremlin.tinkergraph.defaultVertexPropertyCardinality |The default `VertexProperty.Cardinality` to use when `Vertex.property(k,v)` is called.
 |gremlin.tinkergraph.graphLocation |The path and file name for where TinkerGraph should persist the graph data. If a
 value is specified here, the `gremlin.tinkergraph.graphFormat` should also be specified.  If this value is not
 included (default), then the graph will stay in-memory and not be loaded/persisted to disk.
 |gremlin.tinkergraph.graphFormat |The format to use to serialize the graph which may be one of the following:
 `graphml`, `graphson`, `gryo`, or a fully qualified class name that implements Io.Builder interface (which allows for
 external third party graph reader/writer formats to be used for persistence).
 If a value is specified here, then the `gremlin.tinkergraph.graphLocation` should
 also be specified.  If this value is not included (default), then the graph will stay in-memory and not be
 loaded/persisted to disk.
 |=========================================================

 The `IdManager` settings above refer to how TinkerGraph will control identifiers for vertices, edges and vertex
 properties.  There are several options for each of these settings: `ANY`, `LONG`, `INTEGER`, `UUID`, or the fully
 qualified class name of an `IdManager` implementation on the classpath.  When not specified, the default values
 for all settings is `ANY`, meaning that the graph will work with any object on the JVM as the identifier and will
 generate new identifiers from `Long` when the identifier is not user supplied.  TinkerGraph will also expect the
 user to understand the types used for identifiers when querying, meaning that `g.V(1)` and `g.V(1L)` could return
 two different vertices.  `LONG`, `INTEGER` and `UUID` settings will try to coerce identifier values to the expected
 type as well as generate new identifiers with that specified type.

 If the TinkerGraph is configured for persistence with `gremlin.tinkergraph.graphLocation` and
 `gremlin.tinkergraph.graphFormat`, then the graph will be written to the specified location with the specified
 format when `Graph.close()` is called.  In addition, if these settings are present, TinkerGraph will attempt to
 load the graph from the specified location.

 IMPORTANT: If choosing `graphson` as the `gremlin.tinkergraph.graphFormat`, be sure to also establish the  various
 `IdManager` settings as well to ensure that identifiers are properly coerced to the appropriate types as GraphSON
 can lose the identifier's type during serialization (i.e. it will assume `Integer` when the default for TinkerGraph
 is `Long`, which could lead to load errors that result in a message like, "Vertex with id already exists").

 It is important to consider the data being imported to TinkerGraph with respect to `defaultVertexPropertyCardinality`
 setting.  For example, if a `.gryo` file is known to contain multi-property data, be sure to set the default
 cardinality to `list` or else the data will import as `single`.  Consider the following:

 [gremlin-groovy]
 ----
 graph = TinkerGraph.open()
 graph.io(gryo()).readGraph("data/tinkerpop-crew.kryo")
 g = graph.traversal()
 g.V().properties()
 conf = new BaseConfiguration()
 conf.setProperty("gremlin.tinkergraph.defaultVertexPropertyCardinality","list")
 graph = TinkerGraph.open(conf)
 graph.io(gryo()).readGraph("data/tinkerpop-crew.kryo")
 g = graph.traversal()
 g.V().properties()
 ----
	////
	Licensed to the Apache Software Foundation (ASF) under one or more
	contributor license agreements. See the NOTICE file distributed with
	this work for additional information regarding copyright ownership.
	The ASF licenses this file to You under the Apache License, Version 2.0
	(the "License"); you may not use this file except in compliance with
	the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software
	distributed under the License is distributed on an "AS IS" BASIS,
	WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	See the License for the specific language governing permissions and
	limitations under the License.
	////
	[[tinkergraph-gremlin]]
	== TinkerGraph-Gremlin

	[source,xml]
	----
	<dependency>
	<groupId>org.apache.tinkerpop</groupId>
	<artifactId>tinkergraph-gremlin</artifactId>
	<version>x.y.z</version>
	</dependency>
	----

	image:tinkerpop-character.png[width=100,float=left] TinkerGraph is a single machine, in-memory (with optional
	persistence), non-transactional graph engine that provides both OLTP and OLAP functionality. It is deployed with
	TinkerPop3 and serves as the reference implementation for other providers to study in order to understand the
	semantics of the various methods of the TinkerPop3 API. Its status as a reference implementation does not however imply
	that it is not suitable for production. TinkerGraph has many practical use cases in production applications and their
	development. Some examples of TinkerGraph use cases include:

	* Ad-hoc analysis of large immutable graphs that fit in memory.
	* Extract subgraphs, from larger graphs that don't fit in memory, into TinkerGraph for further analysis or other
	purposes.
	* Use TinkerGraph as a sandbox to develop and debug complex traversals by simulating data from a larger graph inside
	a TinkerGraph.

	Constructing a simple graph using TinkerGraph in Java8 is presented below:

	[source,java]
	----
	Graph graph = TinkerGraph.open();
	GraphTraversalSource g = graph.traversal();
	Vertex marko = g.addV("person").property("name","marko").property("age",29).next();
	Vertex lop = g.addV("software").property("name","lop").property("lang","java").next();
	g.addE("created").from(marko).to(lop).property("weight",0.6d).iterate();
	----

	The above Gremlin creates two vertices named "marko" and "lop" and connects them via a created-edge with a weight=0.6
	property. The addition of these two vertices and the edge between them could also be done in a single Gremlin statement
	as follows:

	[source,java]
	----
	g.addV("person").property("name","marko").property("age",29).as("m").
	addV("software").property("name","lop").property("lang","java").as("l").
	addE("created").from("m").to("l").property("weight",0.6d).iterate();
	----

	IMPORTANT: Pay attention to the fact that traversals end with `next()` or `iterate()`. These methods advance the
	objects in the traversal stream and without those methods, the traversal does nothing. Review the
	link:https://tinkerpop.apache.org/docs/x.y.z/tutorials/the-gremlin-console/#result-iteration[Result Iteration Section]
	of The Gremlin Console tutorial for more information.

	Next, the graph can be queried as such.

	[source,java]
	g.V().has("name","marko").out("created").values("name")

	The `g.V().has("name","marko")` part of the query can be executed in two ways.

	* A linear scan of all vertices filtering out those vertices that don't have the name "marko"
	* A `O(log(\|V\|))` index lookup for all vertices with the name "marko"

	Given the initial graph construction in the first code block, no index was defined and thus, a linear scan is executed.
	However, if the graph was constructed as such, then an index lookup would be used.

	[source,java]
	Graph g = TinkerGraph.open();
	g.createIndex("name",Vertex.class)

	The execution times for a vertex lookup by property is provided below for both no-index and indexed version of
	TinkerGraph over the Grateful Dead graph.

	[gremlin-groovy]
	----
	graph = TinkerGraph.open()
	g = graph.traversal()
	graph.io(graphml()).readGraph('data/grateful-dead.xml')
	clock(1000) {g.V().has('name','Garcia').iterate()} <1>
	graph = TinkerGraph.open()
	g = graph.traversal()
	graph.createIndex('name',Vertex.class)
	graph.io(graphml()).readGraph('data/grateful-dead.xml')
	clock(1000){g.V().has('name','Garcia').iterate()} <2>
	----

	<1> Determine the average runtime of 1000 vertex lookups when no `name`-index is defined.
	<2> Determine the average runtime of 1000 vertex lookups when a `name`-index is defined.

	IMPORTANT: Each graph system will have different mechanism by which indices and schemas are defined. TinkerPop3
	does not require any conformance in this area. In TinkerGraph, the only definitions are around indices. With other
	graph systems, property value types, indices, edge labels, etc. may be required to be defined _a priori_ to adding
	data to the graph.

	NOTE: TinkerGraph is distributed with Gremlin Server and is therefore automatically available to it for configuration.

	=== Configuration

	TinkerGraph has several settings that can be provided on creation via `Configuration` object:

	[width="100%",cols="2,10",options="header"]
	\|=========================================================
	\|Property \|Description
	\|gremlin.graph \|`org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph`
	\|gremlin.tinkergraph.vertexIdManager \|The `IdManager` implementation to use for vertices.
	\|gremlin.tinkergraph.edgeIdManager \|The `IdManager` implementation to use for edges.
	\|gremlin.tinkergraph.vertexPropertyIdManager \|The `IdManager` implementation to use for vertex properties.
	\|gremlin.tinkergraph.defaultVertexPropertyCardinality \|The default `VertexProperty.Cardinality` to use when `Vertex.property(k,v)` is called.
	\|gremlin.tinkergraph.graphLocation \|The path and file name for where TinkerGraph should persist the graph data. If a
	value is specified here, the `gremlin.tinkergraph.graphFormat` should also be specified. If this value is not
	included (default), then the graph will stay in-memory and not be loaded/persisted to disk.
	\|gremlin.tinkergraph.graphFormat \|The format to use to serialize the graph which may be one of the following:
	`graphml`, `graphson`, `gryo`, or a fully qualified class name that implements Io.Builder interface (which allows for
	external third party graph reader/writer formats to be used for persistence).
	If a value is specified here, then the `gremlin.tinkergraph.graphLocation` should
	also be specified. If this value is not included (default), then the graph will stay in-memory and not be
	loaded/persisted to disk.
	\|=========================================================

	The `IdManager` settings above refer to how TinkerGraph will control identifiers for vertices, edges and vertex
	properties. There are several options for each of these settings: `ANY`, `LONG`, `INTEGER`, `UUID`, or the fully
	qualified class name of an `IdManager` implementation on the classpath. When not specified, the default values
	for all settings is `ANY`, meaning that the graph will work with any object on the JVM as the identifier and will
	generate new identifiers from `Long` when the identifier is not user supplied. TinkerGraph will also expect the
	user to understand the types used for identifiers when querying, meaning that `g.V(1)` and `g.V(1L)` could return
	two different vertices. `LONG`, `INTEGER` and `UUID` settings will try to coerce identifier values to the expected
	type as well as generate new identifiers with that specified type.

	If the TinkerGraph is configured for persistence with `gremlin.tinkergraph.graphLocation` and
	`gremlin.tinkergraph.graphFormat`, then the graph will be written to the specified location with the specified
	format when `Graph.close()` is called. In addition, if these settings are present, TinkerGraph will attempt to
	load the graph from the specified location.

	IMPORTANT: If choosing `graphson` as the `gremlin.tinkergraph.graphFormat`, be sure to also establish the various
	`IdManager` settings as well to ensure that identifiers are properly coerced to the appropriate types as GraphSON
	can lose the identifier's type during serialization (i.e. it will assume `Integer` when the default for TinkerGraph
	is `Long`, which could lead to load errors that result in a message like, "Vertex with id already exists").

	It is important to consider the data being imported to TinkerGraph with respect to `defaultVertexPropertyCardinality`
	setting. For example, if a `.gryo` file is known to contain multi-property data, be sure to set the default
	cardinality to `list` or else the data will import as `single`. Consider the following:

	[gremlin-groovy]
	----
	graph = TinkerGraph.open()
	graph.io(gryo()).readGraph("data/tinkerpop-crew.kryo")
	g = graph.traversal()
	g.V().properties()
	conf = new BaseConfiguration()
	conf.setProperty("gremlin.tinkergraph.defaultVertexPropertyCardinality","list")
	graph = TinkerGraph.open(conf)
	graph.io(gryo()).readGraph("data/tinkerpop-crew.kryo")
	g = graph.traversal()
	g.V().properties()
	----