docs/src/dev/provider/index.asciidoc - tinkerpop - Git at Google

 ////
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 ////
 image::apache-tinkerpop-logo.png[width=500,link="https://tinkerpop.apache.org"]

 *x.y.z*

 :toc-position: left

 = Introduction

 image:tinkerpop-cityscape.png[]

 This document discusses Apache TinkerPop™ implementation details that are most useful to developers who implement
 TinkerPop interfaces and the Gremlin language. This document may also be helpful to Gremlin users who simply want a
 deeper understanding of how TinkerPop works and what the behavioral semantics of Gremlin are. The
 <<providers,Provider Section>> outlines the various integration and extension points that TinkerPop has while the
 <<gremlin-semantics,Gremlin Semantics Section>> documents the Gremlin language itself.

 Providers who rely on the TinkerPop execution engine generally receive the behaviors described in the Gremlin Semantics
 section for free, but those who develop their own engine or extend upon the certain features should refer to that
 section for the details required for a consistent Gremlin experience.

 [[providers]]
 = Provider Documentation

 TinkerPop exposes a set of interfaces, protocols, and tests that make it possible for third-parties to build
 libraries and systems that plug-in to the TinkerPop stack.  TinkerPop refers to those third-parties as "providers" and
 this documentation is designed to help providers understand what is involved in developing code on these lower levels
 of the TinkerPop API.

 This document attempts to address the needs of the different providers that have been identified:

 * Graph System Provider
 ** Graph Database Provider
 ** Graph Processor Provider
 * Graph Driver Provider
 * Graph Language Provider
 * Graph Plugin Provider

 [[graph-system-provider-requirements]]
 == Graph System Provider Requirements

 image:tinkerpop-enabled.png[width=140,float=left] At the core of TinkerPop 3.x is a Java API. The implementation of this
 core API and its validation via the `gremlin-test` suite is all that is required of a graph system provider wishing to
 provide a TinkerPop-enabled graph engine. Once a graph system has a valid implementation, then all the applications
 provided by TinkerPop (e.g. Gremlin Console, Gremlin Server, etc.) and 3rd-party developers (e.g. Gremlin-Scala,
 Gremlin-JS, etc.) will integrate properly. Finally, please feel free to use the logo on the left to promote your
 TinkerPop implementation.

 [[graph-structure-api]]
 === Graph Structure API

 The graph structure API of TinkerPop provides the interfaces necessary to create a TinkerPop enabled system and
 exposes the basic components of a property graph to include `Graph`, `Vertex`, `Edge`, `VertexProperty` and `Property`.
 The structure API can be used directly as follows:

 [source,java]
 ----
 Graph graph = TinkerGraph.open(); <1>
 Vertex marko = graph.addVertex(T.label, "person", T.id, 1, "name", "marko", "age", 29); <2>
 Vertex vadas = graph.addVertex(T.label, "person", T.id, 2, "name", "vadas", "age", 27);
 Vertex lop = graph.addVertex(T.label, "software", T.id, 3, "name", "lop", "lang", "java");
 Vertex josh = graph.addVertex(T.label, "person", T.id, 4, "name", "josh", "age", 32);
 Vertex ripple = graph.addVertex(T.label, "software", T.id, 5, "name", "ripple", "lang", "java");
 Vertex peter = graph.addVertex(T.label, "person", T.id, 6, "name", "peter", "age", 35);
 marko.addEdge("knows", vadas, T.id, 7, "weight", 0.5f); <3>
 marko.addEdge("knows", josh, T.id, 8, "weight", 1.0f);
 marko.addEdge("created", lop, T.id, 9, "weight", 0.4f);
 josh.addEdge("created", ripple, T.id, 10, "weight", 1.0f);
 josh.addEdge("created", lop, T.id, 11, "weight", 0.4f);
 peter.addEdge("created", lop, T.id, 12, "weight", 0.2f);
 ----

 <1> Create a new in-memory `TinkerGraph` and assign it to the variable `graph`.
 <2> Create a vertex along with a set of key/value pairs with `T.label` being the vertex label and `T.id` being the vertex id.
 <3> Create an edge along with a  set of key/value pairs with the edge label being specified as the first argument.

 In the above code all the vertices are created first and then their respective edges. There are two "accessor tokens":
 `T.id` and `T.label`. When any of these, along with a set of other key value pairs is provided to
 `Graph.addVertex(Object...)` or `Vertex.addEdge(String,Vertex,Object...)`, the respective element is created along
 with the provided key/value pair properties appended to it.

 Below is a sequence of basic graph mutation operations represented in Java:

 image:basic-mutation.png[width=240,float=right]
 [source,java]
 ----
 // create a new graph
 Graph graph = TinkerGraph.open();
 // add a software vertex with a name property
 Vertex gremlin = graph.addVertex(T.label, "software",
                              "name", "gremlin"); <1>
 // only one vertex should exist
 assert(IteratorUtils.count(graph.vertices()) == 1)
 // no edges should exist as none have been created
 assert(IteratorUtils.count(graph.edges()) == 0)
 // add a new property
 gremlin.property("created",2009) <2>
 // add a new software vertex to the graph
 Vertex blueprints = graph.addVertex(T.label, "software",
                                 "name", "blueprints"); <3>
 // connect gremlin to blueprints via a dependsOn-edge
 gremlin.addEdge("dependsOn",blueprints); <4>
 // now there are two vertices and one edge
 assert(IteratorUtils.count(graph.vertices()) == 2)
 assert(IteratorUtils.count(graph.edges()) == 1)
 // add a property to blueprints
 blueprints.property("created",2010) <5>
 // remove that property
 blueprints.property("created").remove() <6>
 // connect gremlin to blueprints via encapsulates
 gremlin.addEdge("encapsulates",blueprints) <7>
 assert(IteratorUtils.count(graph.vertices()) == 2)
 assert(IteratorUtils.count(graph.edges()) == 2)
 // removing a vertex removes all its incident edges as well
 blueprints.remove() <8>
 gremlin.remove() <9>
 // the graph is now empty
 assert(IteratorUtils.count(graph.vertices()) == 0)
 assert(IteratorUtils.count(graph.edges()) == 0)
 // tada!
 ----

 The above code samples are just examples of how the structure API can be used to access a graph. Those APIs are then
 used internally by the process API (i.e. Gremlin) to access any graph that implements those structure API interfaces
 to execute queries. Typically, the structure API methods are not used directly by end-users.

 === Implementing Gremlin-Core

 The classes that a graph system provider should focus on implementing are itemized below. It is a good idea to study
 the link:https://tinkerpop.apache.org/docs/x.y.z/reference/#tinkergraph-gremlin[TinkerGraph] (in-memory OLTP and OLAP
 in `tinkergraph-gremlin`), link:https://tinkerpop.apache.org/docs/x.y.z/reference/#neo4j-gremlin[Neo4jGraph]
 (OLTP w/ transactions in `neo4j-gremlin`) and/or
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#hadoop-gremlin[HadoopGraph] (OLAP in `hadoop-gremlin`)
 implementations for ideas and patterns.

 . Online Transactional Processing Graph Systems (*OLTP*)
  .. Structure API: `Graph`, `Element`, `Vertex`, `Edge`, `Property` and `Transaction` (if transactions are supported).
  .. Process API: `TraversalStrategy` instances for optimizing Gremlin traversals to the provider's graph system (i.e. `TinkerGraphStepStrategy`).
 . Online Analytics Processing Graph Systems (*OLAP*)
  .. Everything required of OLTP is required of OLAP (but not vice versa).
  .. GraphComputer API: `GraphComputer`, `Messenger`, `Memory`.

 Please consider the following implementation notes:

 * Use `StringHelper` to ensuring that the `toString()` representation of classes are consistent with other
 implementations.
 * Ensure that your implementation's `Features` (Graph, Vertex, etc.) are correct so that test cases handle particulars
 accordingly.
 * Use the numerous static method helper classes such as `ElementHelper`, `GraphComputerHelper`, `VertexProgramHelper`, etc.
 * There are a number of default methods on the provided interfaces that are semantically correct. However, if they are
 not efficient for the implementation, override them.
 * Implement the `structure/` package interfaces first and then, if desired, interfaces in the `process/` package
 interfaces.
 * `ComputerGraph` is a `Wrapper` system that ensure proper semantics during a GraphComputer computation.
 * The link:https://tinkerpop.apache.org/javadocs/x.y.z/core/[javadoc] is often a good resource in understanding
 expectations from both the user's perspective as well as the graph provider's perspective. Also consider examining
 the javadoc of TinkerGraph which is often well annotated and the interfaces and classes of the test suite itself.

 [[oltp-implementations]]
 ==== OLTP Implementations

 image:pipes-character-1.png[width=110,float=right] The most important interfaces to implement are in the `structure/`
 package. These include interfaces like `Graph`, `Vertex`, `Edge`, `Property`, `Transaction`, etc. The
 `StructureStandardSuite` will ensure that the semantics of the methods implemented are correct. Moreover, there are
 numerous `Exceptions` classes with static exceptions that should be thrown by the graph system so that all the
 exceptions and their messages are consistent amongst all TinkerPop implementations.

 The following bullets provide some tips to consider when implementing the structure interfaces:

 * `Graph`
 ** Be sure the `Graph` implementation is named as `XXXGraph` (e.g. TinkerGraph, Neo4jGraph, HadoopGraph, etc.).
 ** This implementation needs to be `GraphFactory` compatible which means that the implementation should have a static
 `Graph open(Configuration)` method where the `Configuration` is an Apache Commons class of that name. Alternatively, the
 `Graph` implementation can have the `GraphFactoryClass` annotation which specifies a class with that static
 `Graph open(Configuration)` method.
 * `VertexProperty`
 ** This interface is both a `Property` and an `Element` as `VertexProperty` is a first-class graph element in that it
 can have its own properties (i.e. meta-properties). Even if the implementation does not intend to support
 meta-properties, the `VertexProperty` needs to be implemented as an `Element`.

 [[olap-implementations]]
 ==== OLAP Implementations

 image:furnace-character-1.png[width=110,float=right] Implementing the OLAP interfaces may be a bit more complicated.
 Note that before OLAP interfaces are implemented, it is necessary for the OLTP interfaces to be, at minimal,
 implemented as specified in <<oltp-implementations,OLTP Implementations>>. A summary of each required interface
 implementation is presented below:

 . `GraphComputer`: A fluent builder for specifying an isolation level, a VertexProgram, and any number of MapReduce jobs to be submitted.
 . `Memory`: A global blackboard for ANDing, ORing, INCRing, and SETing values for specified keys.
 . `Messenger`: The system that collects and distributes messages being propagated by vertices executing the VertexProgram application.
 . `MapReduce.MapEmitter`: The system that collects key/value pairs being emitted by the MapReduce applications map-phase.
 . `MapReduce.ReduceEmitter`: The system that collects key/value pairs being emitted by the MapReduce applications combine- and reduce-phases.

 NOTE: The VertexProgram and MapReduce interfaces in the `process/computer/` package are not required by the graph
 system. Instead, these are interfaces to be implemented by application developers writing VertexPrograms and MapReduce jobs.

 IMPORTANT: TinkerPop provides two OLAP implementations:
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#tinkergraph-gremlin[TinkerGraphComputer] (TinkerGraph),
 and
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#sparkgraphcomputer[SparkGraphComputer] (Hadoop).
 Given the complexity of the OLAP system, it is good to study and copy many of the patterns used in these reference
 implementations.

 ===== Implementing GraphComputer

 image:furnace-character-3.png[width=150,float=right] The most complex method in GraphComputer is the `submit()`-method. The method must do the following:

 . Ensure the GraphComputer has not already been executed.
 . Ensure that at least there is a VertexProgram or 1 MapReduce job.
 . If there is a VertexProgram, validate that it can execute on the GraphComputer given the respectively defined features.
 . Create the Memory to be used for the computation.
 . Execute the VertexProgram.setup() method once and only once.
 . Execute the VertexProgram.execute() method for each vertex.
 . Execute the VertexProgram.terminate() method once and if true, repeat VertexProgram.execute().
 . When VertexProgram.terminate() returns true, move to MapReduce job execution.
 . MapReduce jobs are not required to be executed in any specified order.
 . For each Vertex, execute MapReduce.map(). Then (if defined) execute MapReduce.combine() and MapReduce.reduce().
 . Update Memory with runtime information.
 . Construct a new `ComputerResult` containing the compute Graph and Memory.

 ===== Implementing Memory

 image:gremlin-brain.png[width=175,float=left] The Memory object is initially defined by `VertexProgram.setup()`.
 The memory data is available in the first round of the `VertexProgram.execute()` method. Each Vertex, when executing
 the VertexProgram, can update the Memory in its round. However, the update is not seen by the other vertices until
 the next round. At the end of the first round, all the updates are aggregated and the new memory data is available
 on the second round. This process repeats until the VertexProgram terminates.

 ===== Implementing Messenger

 The Messenger object is similar to the Memory object in that a vertex can read and write to the Messenger. However,
 the data it reads are the messages sent to the vertex in the previous step and the data it writes are the messages
 that will be readable by the receiving vertices in the subsequent round.

 ===== Implementing MapReduce Emitters

 image:hadoop-logo-notext.png[width=150,float=left] The MapReduce framework in TinkerPop is similar to the model
 popularized by link:http://hadoop.apache.org[Hadoop]. The primary difference is that all Mappers process the vertices
 of the graph, not an arbitrary key/value pair. However, the vertices' edges can not be accessed -- only their
 properties. This greatly reduces the amount of data needed to be pushed through the MapReduce engine as any edge
 information required, can be computed in the VertexProgram.execute() method. Moreover, at this stage, vertices can
 not be mutated, only their token and property data read. A Gremlin OLAP system needs to provide implementations for
 to particular classes: `MapReduce.MapEmitter` and `MapReduce.ReduceEmitter`. TinkerGraph's implementation is provided
 below which demonstrates the simplicity of the algorithm (especially when the data is all within the same JVM).

 [source,java]
 ----
 public class TinkerMapEmitter<K, V> implements MapReduce.MapEmitter<K, V> {

     public Map<K, Queue<V>> reduceMap;
     public Queue<KeyValue<K, V>> mapQueue;
     private final boolean doReduce;

     public TinkerMapEmitter(final boolean doReduce) { <1>
         this.doReduce = doReduce;
         if (this.doReduce)
             this.reduceMap = new ConcurrentHashMap<>();
         else
             this.mapQueue = new ConcurrentLinkedQueue<>();
     }

     @Override
     public void emit(K key, V value) {
         if (this.doReduce)
             this.reduceMap.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(value); <2>
         else
             this.mapQueue.add(new KeyValue<>(key, value)); <3>
     }

     protected void complete(final MapReduce<K, V, ?, ?, ?> mapReduce) {
         if (!this.doReduce && mapReduce.getMapKeySort().isPresent()) { <4>
             final Comparator<K> comparator = mapReduce.getMapKeySort().get();
             final List<KeyValue<K, V>> list = new ArrayList<>(this.mapQueue);
             Collections.sort(list, Comparator.comparing(KeyValue::getKey, comparator));
             this.mapQueue.clear();
             this.mapQueue.addAll(list);
         } else if (mapReduce.getMapKeySort().isPresent()) {
             final Comparator<K> comparator = mapReduce.getMapKeySort().get();
             final List<Map.Entry<K, Queue<V>>> list = new ArrayList<>();
             list.addAll(this.reduceMap.entrySet());
             Collections.sort(list, Comparator.comparing(Map.Entry::getKey, comparator));
             this.reduceMap = new LinkedHashMap<>();
             list.forEach(entry -> this.reduceMap.put(entry.getKey(), entry.getValue()));
         }
     }
 }
 ----

 <1> If the MapReduce job has a reduce, then use one data structure (`reduceMap`), else use another (`mapList`). The
 difference being that a reduction requires a grouping by key and therefore, the `Map<K,Queue<V>>` definition. If no
 reduction/grouping is required, then a simple `Queue<KeyValue<K,V>>` can be leveraged.
 <2> If reduce is to follow, then increment the Map with a new value for the key. `MapHelper` is a TinkerPop class
 with static methods for adding data to a Map.
 <3> If no reduce is to follow, then simply append a KeyValue to the queue.
 <4> When the map phase is complete, any map-result sorting required can be executed at this point.

 [source,java]
 ----
 public class TinkerReduceEmitter<OK, OV> implements MapReduce.ReduceEmitter<OK, OV> {

     protected Queue<KeyValue<OK, OV>> reduceQueue = new ConcurrentLinkedQueue<>();

     @Override
     public void emit(final OK key, final OV value) {
         this.reduceQueue.add(new KeyValue<>(key, value));
     }

     protected void complete(final MapReduce<?, ?, OK, OV, ?> mapReduce) {
         if (mapReduce.getReduceKeySort().isPresent()) {
             final Comparator<OK> comparator = mapReduce.getReduceKeySort().get();
             final List<KeyValue<OK, OV>> list = new ArrayList<>(this.reduceQueue);
             Collections.sort(list, Comparator.comparing(KeyValue::getKey, comparator));
             this.reduceQueue.clear();
             this.reduceQueue.addAll(list);
         }
     }
 }
 ----

 The method `MapReduce.reduce()` is defined as:

 [source,java]
 public void reduce(final OK key, final Iterator<OV> values, final ReduceEmitter<OK, OV> emitter) { ... }

 In other words, for the TinkerGraph implementation, iterate through the entrySet of the `reduceMap` and call the
 `reduce()` method on each entry. The `reduce()` method can emit key/value pairs which are simply aggregated into a
 `Queue<KeyValue<OK,OV>>` in an analogous fashion to `TinkerMapEmitter` when no reduce is to follow. These two emitters
 are tied together in `TinkerGraphComputer.submit()`.

 [source,java]
 ----
 ...
 for (final MapReduce mapReduce : mapReducers) {
     if (mapReduce.doStage(MapReduce.Stage.MAP)) {
         final TinkerMapEmitter<?, ?> mapEmitter = new TinkerMapEmitter<>(mapReduce.doStage(MapReduce.Stage.REDUCE));
         final SynchronizedIterator<Vertex> vertices = new SynchronizedIterator<>(this.graph.vertices());
         workers.setMapReduce(mapReduce);
         workers.mapReduceWorkerStart(MapReduce.Stage.MAP);
         workers.executeMapReduce(workerMapReduce -> {
             while (true) {
                 final Vertex vertex = vertices.next();
                 if (null == vertex) return;
                 workerMapReduce.map(ComputerGraph.mapReduce(vertex), mapEmitter);
             }
         });
         workers.mapReduceWorkerEnd(MapReduce.Stage.MAP);

         // sort results if a map output sort is defined
         mapEmitter.complete(mapReduce);

         // no need to run combiners as this is single machine
         if (mapReduce.doStage(MapReduce.Stage.REDUCE)) {
             final TinkerReduceEmitter<?, ?> reduceEmitter = new TinkerReduceEmitter<>();
             final SynchronizedIterator<Map.Entry<?, Queue<?>>> keyValues = new SynchronizedIterator((Iterator) mapEmitter.reduceMap.entrySet().iterator());
             workers.mapReduceWorkerStart(MapReduce.Stage.REDUCE);
             workers.executeMapReduce(workerMapReduce -> {
                 while (true) {
                     final Map.Entry<?, Queue<?>> entry = keyValues.next();
                     if (null == entry) return;
                         workerMapReduce.reduce(entry.getKey(), entry.getValue().iterator(), reduceEmitter);
                     }
                 });
             workers.mapReduceWorkerEnd(MapReduce.Stage.REDUCE);
             reduceEmitter.complete(mapReduce); // sort results if a reduce output sort is defined
             mapReduce.addResultToMemory(this.memory, reduceEmitter.reduceQueue.iterator()); <1>
         } else {
             mapReduce.addResultToMemory(this.memory, mapEmitter.mapQueue.iterator()); <2>
         }
     }
 }
 ...
 ----

 <1> Note that the final results of the reducer are provided to the Memory as specified by the application developer's
 `MapReduce.addResultToMemory()` implementation.
 <2> If there is no reduce stage, the map-stage results are inserted into Memory as specified by the application
 developer's `MapReduce.addResultToMemory()` implementation.

 ==== Hadoop-Gremlin Usage

 Hadoop-Gremlin is centered around `InputFormats` and `OutputFormats`. If a 3rd-party graph system provider wishes to
 leverage Hadoop-Gremlin (and its respective `GraphComputer` engines), then they need to provide, at minimum, a
 Hadoop2 `InputFormat<NullWritable,VertexWritable>` for their graph system. If the provider wishes to persist computed
 results back to their graph system (and not just to HDFS via a `FileOutputFormat`), then a graph system specific
 `OutputFormat<NullWritable,VertexWritable>` must be developed as well.

 Conceptually, `HadoopGraph` is a wrapper around a `Configuration` object. There is no "data" in the `HadoopGraph` as
 the `InputFormat` specifies where and how to get the graph data at OLAP (and OLTP) runtime. Thus, `HadoopGraph` is a
 small object with little overhead. Graph system providers should realize `HadoopGraph` as the gateway to the OLAP
 features offered by Hadoop-Gremlin. For example, a graph system specific `Graph.compute(Class<? extends GraphComputer>
 graphComputerClass)`-method may look as follows:

 [source,java]
 ----
 public <C extends GraphComputer> C compute(final Class<C> graphComputerClass) throws IllegalArgumentException {
   try {
     if (AbstractHadoopGraphComputer.class.isAssignableFrom(graphComputerClass))
       return graphComputerClass.getConstructor(HadoopGraph.class).newInstance(this);
     else
       throw Graph.Exceptions.graphDoesNotSupportProvidedGraphComputer(graphComputerClass);
   } catch (final Exception e) {
     throw new IllegalArgumentException(e.getMessage(),e);
   }
 }
 ----

 Note that the configurations for Hadoop are assumed to be in the `Graph.configuration()` object. If this is not the
 case, then the `Configuration` provided to `HadoopGraph.open()` should be dynamically created within the
 `compute()`-method. It is in the provided configuration that `HadoopGraph` gets the various properties which
 determine how to read and write data to and from Hadoop. For instance, `gremlin.hadoop.graphReader` and
 `gremlin.hadoop.graphWriter`.

 ===== GraphFilterAware Interface

 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#graph-filter[Graph filters] by OLAP processors to only pull a subgraph of the full graph from the graph data source. For instance, the
 example below constructs a `GraphFilter` that will only pull the "knows"-graph amongst people into the `GraphComputer`
 for processing.

 [source,java]
 ----
 graph.compute().vertices(hasLabel("person")).edges(bothE("knows"))
 ----

 If the provider has a custom `InputRDD`, they can implement `GraphFilterAware` and that graph filter will be provided to their
 `InputRDD` at load time. For providers that use an `InputFormat`, state but the graph filter can be accessed from the configuration
 as such:

 [source,java]
 ----
 if (configuration.containsKey(Constants.GREMLIN_HADOOP_GRAPH_FILTER))
   this.graphFilter = VertexProgramHelper.deserialize(configuration, Constants.GREMLIN_HADOOP_GRAPH_FILTER);
 ----

 ===== PersistResultGraphAware Interface

 A graph system provider's `OutputFormat` should implement the `PersistResultGraphAware` interface which
 determines which persistence options are available to the user. For the standard file-based `OutputFormats` provided
 by Hadoop-Gremlin (e.g. link:++https://tinkerpop.apache.org/docs/x.y.z/reference/#gryo-io-format++[`GryoOutputFormat`], link:++https://tinkerpop.apache.org/docs/x.y.z/reference/#graphson-io-format++[`GraphSONOutputFormat`],
 and link:++https://tinkerpop.apache.org/docs/x.y.z/reference/#script-io-format++[`ScriptInputOutputFormat`]) `ResultGraph.ORIGINAL` is not supported as the original graph
 data files are not random access and are, in essence, immutable. Thus, these file-based `OutputFormats` only support
 `ResultGraph.NEW` which creates a copy of the data specified by the `Persist` enum.

 [[io-implementations]]
 ==== IO Implementations

 If a `Graph` requires custom serializers for IO to work properly, implement the `Graph.io` method.  A typical example
 of where a `Graph` would require such a custom serializers is if their identifier system uses non-primitive values,
 such as OrientDB's `Rid` class.  From basic serialization of a single `Vertex` all the way up the stack to Gremlin
 Server, the need to know how to handle these complex identifiers is an important requirement.

 The first step to implementing custom serializers is to first implement the `IoRegistry` interface and register the
 custom classes and serializers to it. Each `Io` implementation has different requirements for what it expects from the
 `IoRegistry`:

 * *GraphML* - No custom serializers expected/allowed.
 * *GraphSON* - Register a Jackson `SimpleModule`.  The `SimpleModule` encapsulates specific classes to be serialized,
 so it does not need to be registered to a specific class in the `IoRegistry` (use `null`).
 * *Gryo* - Expects registration of one of three objects:
 ** Register just the custom class with a `null` Kryo `Serializer` implementation - this class will use default "field-level" Kryo serialization.
 ** Register the custom class with a specific Kryo `Serializer' implementation.
 ** Register the custom class with a `Function<Kryo, Serializer>` for those cases where the Kryo `Serializer` requires the `Kryo` instance to get constructed.

 This implementation should provide a zero-arg constructor as the stack may require instantiation via reflection.
 Consider extending `AbstractIoRegistry` for convenience as follows:

 [source,java]
 ----
 public class MyGraphIoRegistry extends AbstractIoRegistry {
     public MyGraphIoRegistry() {
         register(GraphSONIo.class, null, new MyGraphSimpleModule());
         register(GryoIo.class, MyGraphIdClass.class, new MyGraphIdSerializer());
     }
 }
 ----

 In the `Graph.io` method, provide the `IoRegistry` object to the supplied `Builder` and call the `create` method to
 return that `Io` instance as follows:

 [source,java]
 ----
 public <I extends Io> I io(final Io.Builder<I> builder) {
     return (I) builder.graph(this).registry(myGraphIoRegistry).create();
 }}
 ----

 In this way, `Graph` implementations can pre-configure custom serializers for IO interactions and users will not need
 to know about those details. Following this pattern will ensure proper execution of the test suite as well as
 simplified usage for end-users.

 IMPORTANT: Proper implementation of IO is critical to successful `Graph` operations in Gremlin Server.  The Test Suite
 does have "serialization" tests that provide some assurance that an implementation is working properly, but those
 tests cannot make assertions against any specifics of a custom serializer.  It is the responsibility of the
 implementer to test the specifics of their custom serializers.

 TIP: Consider separating serializer code into its own module, if possible, so that clients that use the `Graph`
 implementation remotely don't need a full dependency on the entire `Graph` - just the IO components and related
 classes being serialized.

 There is an important implication to consider when the addition of a custom serializer. Presumably, the custom
 serializer was written for the JVM to be deployed with a `Graph` instance. For example, a graph may expose a
 geographical type like a `Point` or something similar. The library that contains `Point` assuming users expected to
 deserialize back to a `Point` would need to have the library with `Point` and the "`PointSerializer`" class available
 to them. In cases where that deployment approach is not desirable, it is possible to coerce a class like `Point` to
 a type that is already in the list of types supported in TinkerPop. For example, `Point` could be coerced one-way to
 `Map` of keys "x" and "y". Of course, on the client side, users would have to construct a `Map` for a `Point` which
 isn't quite as user-friendly.

 If doing a type coercion is not desired, then it is important to remember that writing a `Point` class and related
 serializer in Java is not sufficient for full support of Gremlin, as users of non-JVM Gremlin Language Variants (GLV)
 will not be able to consume them. Getting full support would mean writing similar classes for each GLV. While
 developing  those classes is not hard, it also means more code to support.

 ===== Supporting Gremlin-Python IO

 The serialization system of Gremlin-Python provides ways to add new types by creating serializers and deserializers in
 Python and registering them with the `RemoteConnection`.

 [source,python]
 ----
 class MyType(object):
   GRAPHSON_PREFIX = "providerx"
   GRAPHSON_BASE_TYPE = "MyType"
   GRAPHSON_TYPE = GraphSONUtil.formatType(GRAPHSON_PREFIX, GRAPHSON_BASE_TYPE)

   def __init__(self, x, y):
     self.x = x
     self.y = y

   @classmethod
   def objectify(cls, value, reader):
     return cls(value['x'], value['y'])

   @classmethod
   def dictify(cls, value, writer):
     return GraphSONUtil.typedValue(cls.GRAPHSON_BASE_TYPE,
                                   {'x': value.x, 'y': value.y},
                                   cls.GRAPHSON_PREFIX)

 graphson_reader = GraphSONReader({MyType.GRAPHSON_TYPE: MyType})
 graphson_writer = GraphSONWriter({MyType: MyType})

 connection = DriverRemoteConnection('ws://localhost:8182/gremlin', 'g',
                                      graphson_reader=graphson_reader,
                                      graphson_writer=graphson_writer)
 ----

 ===== Supporting Gremlin.Net IO

 The serialization system of Gremlin.Net provides ways to add new types by creating serializers and deserializers in
 any .NET language and registering them with the `GremlinClient`.

 [source,csharp]
 ----
 include::../../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Dev/Provider/IndexTests.cs[tags=myTypeSerialization]

 include::../../../../gremlin-dotnet/test/Gremlin.Net.IntegrationTest/Docs/Dev/Provider/IndexTests.cs[tags=supportingGremlinNetIO]
 ----

 [[remoteconnection-implementations]]
 ==== RemoteConnection Implementations

 A `RemoteConnection` is an interface that is important for usage on traversal sources configured using the
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#connecting-via-drivers[withRemote()] option. A `Traversal`
 that is generated from that source will apply a `RemoteStrategy` which will inject a `RemoteStep` to its end. That
 step will then send the `Bytecode` of the `Traversal` over the `RemoteConnection` to get the results that it will
 iterate.

 There is one method to implement on `RemoteConnection`:

 [source,java]
 public <E> CompletableFuture<RemoteTraversal<?, E>> submitAsync(final Bytecode bytecode) throws RemoteConnectionException;

 Note that it returns a `RemoteTraversal`. This interface should also be implemented and in most cases implementers can
 simply extend the `AbstractRemoteTraversal`.

 TinkerPop provides the `DriverRemoteConnection` as a useful and
 link:https://github.com/apache/tinkerpop/blob/x.y.z/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/remote[example implementation].
 `DriverRemoteConnection` serializes the `Traversal` as Gremlin bytecode and then submits it for remote processing on
 Gremlin Server. Gremlin Server rebinds the `Traversal` to a configured `Graph` instance and then iterates the results
 back as it would normally do.

 Implementing `RemoteConnection` is not something routinely done for those implementing `gremlin-core`. It is only
 something required if there is a need to exploit remote traversal submission. If a graph provider has a "graph server"
 similar to Gremlin Server that can accept bytecode-based requests on its own protocol, then that would be one example
 of a reason to implement this interface.

 [[bulk-import-export]]
 ==== Bulk Import Export

 When it comes to doing "bulk" operations, the diverse nature of the available graph databases and their specific
 capabilities, prevents TinkerPop from doing a good job of generalizing that capability well. TinkerPop thus maintains
 two positions on the concept of import and export:

 1. TinkerPop refers users to the bulk import/export facilities of specific graph providers as they tend to be more
 efficient and easier to use than the options TinkerPop has tried to generalize in the past.
 2. TinkerPop encourages graph providers to expose those capabilities via `g.io()` and the `IoStep` by way of a
 `TraversalStrategy`.

 That said, for graph providers that don't have a special bulk loading feature, they can either rely on the default
 OLTP (single-threaded) `GraphReader` and `GraphWriter` options that are embedded in `IoStep` or get a basic bulk loader
 from TinkerPop using the link:https://tinkerpop.apache.org/docs/x.y.z/reference/#clonevertexprogram[CloneVertexProgram].
 Simply provide a `InputFormat` and `OutputFormat` that can be referenced by a `HadoopGraph` instance as discussed
 in the link:https://tinkerpop.apache.org/docs/x.y.z/reference/#clonevertexprogram[Reference Documentation].

 [[validating-with-gremlin-test]]
 === Validating with Gremlin-Test

 image:gremlin-edumacated.png[width=225]

 [source,xml]
 ----
 <dependency>
   <groupId>org.apache.tinkerpop</groupId>
   <artifactId>gremlin-test</artifactId>
   <version>x.y.z</version>
 </dependency>
 ----

 Providers currently have two approaches to consider when validating their TinkerPop implementations. The first approach
 comes from the wholly JVM oriented original test suite which was developed in the early days of TinkerPop 3.x design
 and development. The second approach is available as of 3.6.0, is Gherkin-based and originates from the Gremlin
 Language Variant test suite which is language agnostic.

 The first approach is more complete and more opinionated as to how an implementation should behave and in many ways
 helpful in getting an implementation semantically correct from the ground up (i.e. first getting the `Graph` Structure
 API implemented well by getting the Structure Suite to pass which will almost inevitably ensure that the most of the
 Gremlin language oriented tests in the Process Suite pass early on). On the other hand, the fact that this test suite
 is rigorous also can make it harder to implement especially if your graph already exists and behaves in a certain
 fashion.

 The second approach only validates Gremlin semantics which is ultimately what users concern themselves with as that is
 the method by which they will interact with a provider's `Graph`. This test suite is less concerned with how a
 TinkerPop implementation does what it does, so long as it succeeds at processing Gremlin traversals. There is
 significant overlap between this test suite and the aforementioned Process Suite.

 At this time, it would be wise for providers to implement both approaches as the goal for TinkerPop is to move away
 from the rigors of the JVM Structure and Process Suites in favor of Gherkin. Over time, the Structure and Process
 Suites will be deprecated and removed.

 ==== JVM Test Suite

 The operational semantics of any OLTP or OLAP implementation are validated by `gremlin-test`. To implement these tests,
 provide test case implementations as shown below, where `XXX` below denotes the name of the graph implementation (e.g.
 TinkerGraph, Neo4jGraph, HadoopGraph, etc.).

 [source,java]
 ----
 // Structure API tests
 @RunWith(StructureStandardSuite.class)
 @GraphProviderClass(provider = XXXGraphProvider.class, graph = XXXGraph.class)
 public class XXXStructureStandardTest {}

 // Process API tests
 @RunWith(ProcessComputerSuite.class)
 @GraphProviderClass(provider = XXXGraphProvider.class, graph = XXXGraph.class)
 public class XXXProcessComputerTest {}

 @RunWith(ProcessStandardSuite.class)
 @GraphProviderClass(provider = XXXGraphProvider.class, graph = XXXGraph.class)
 public class XXXProcessStandardTest {}
 ----

 IMPORTANT: It is as important to look at "ignored" tests as it is to look at ones that fail.  The `gremlin-test`
 suite utilizes the `Feature` implementation exposed by the `Graph` to determine which tests to execute.  If a test
 utilizes features that are not supported by the graph, it will ignore them.  While that may be fine, implementers
 should validate that the ignored tests are appropriately bypassed and that there are no mistakes in their feature
 definitions.  Moreover, implementers should consider filling gaps in their own test suites, especially when
 IO-related tests are being ignored.

 TIP: If it is expensive to construct a new `Graph` instance, consider implementing `GraphProvider.getStaticFeatures()`
 which can help by caching a static feature set for instances produced by that `GraphProvider` and allow the test suite
 to avoid that construction cost if the test is ignored.

 The only test-class that requires any code investment is the `GraphProvider` implementation class. This class is a
 used by the test suite to construct `Graph` configurations and instances and provides information about the
 implementation itself.  In most cases, it is best to simply extend `AbstractGraphProvider` as it provides many
 default implementations of the `GraphProvider` interface.

 Finally, specify the test suites that will be supported by the `Graph` implementation using the `@Graph.OptIn`
 annotation.  See the `TinkerGraph` implementation below as an example:

 [source,java]
 ----
 @Graph.OptIn(Graph.OptIn.SUITE_STRUCTURE_STANDARD)
 @Graph.OptIn(Graph.OptIn.SUITE_PROCESS_STANDARD)
 @Graph.OptIn(Graph.OptIn.SUITE_PROCESS_COMPUTER)
 public class TinkerGraph implements Graph {
 ----

 Only include annotations for the suites the implementation will support.  Note that implementing the suite, but
 not specifying the appropriate annotation will prevent the suite from running (an obvious error message will appear
 in this case when running the mis-configured suite).

 There are times when there may be a specific test in the suite that the implementation cannot support (despite the
 features it implements) or should not otherwise be executed.  It is possible for implementers to "opt-out" of a test
 by using the `@Graph.OptOut` annotation. This annotation can be applied to either a `Graph` instance or a
 `GraphProvider` instance (the latter would typically be used for "opting out" for a particular `Graph` configuration
 that was under test). The following is an example of this annotation usage as taken from `HadoopGraph`:

 [source,java]
 ----
 @Graph.OptIn(Graph.OptIn.SUITE_PROCESS_STANDARD)
 @Graph.OptIn(Graph.OptIn.SUITE_PROCESS_COMPUTER)
 @Graph.OptOut(
         test = "org.apache.tinkerpop.gremlin.process.graph.step.map.MatchTest$Traversals",
         method = "g_V_matchXa_hasXname_GarciaX__a_inXwrittenByX_b__a_inXsungByX_bX",
         reason = "Hadoop-Gremlin is OLAP-oriented and for OLTP operations, linear-scan joins are required. This particular tests takes many minutes to execute.")
 @Graph.OptOut(
         test = "org.apache.tinkerpop.gremlin.process.graph.step.map.MatchTest$Traversals",
         method = "g_V_matchXa_inXsungByX_b__a_inXsungByX_c__b_outXwrittenByX_d__c_outXwrittenByX_e__d_hasXname_George_HarisonX__e_hasXname_Bob_MarleyXX",
         reason = "Hadoop-Gremlin is OLAP-oriented and for OLTP operations, linear-scan joins are required. This particular tests takes many minutes to execute.")
 @Graph.OptOut(
         test = "org.apache.tinkerpop.gremlin.process.computer.GraphComputerTest",
         method = "shouldNotAllowBadMemoryKeys",
         reason = "Hadoop does a hard kill on failure and stops threads which stops test cases. Exception handling semantics are correct though.")
 @Graph.OptOut(
         test = "org.apache.tinkerpop.gremlin.process.computer.GraphComputerTest",
         method = "shouldRequireRegisteringMemoryKeys",
         reason = "Hadoop does a hard kill on failure and stops threads which stops test cases. Exception handling semantics are correct though.")
 public class HadoopGraph implements Graph {
 ----

 The above examples show how to ignore individual tests.  It is also possible to:

 * Ignore an entire test case (i.e. all the methods within the test) by setting the `method` to "*".
 * Ignore a "base" test class such that test that extend from those classes will all be ignored.
 * Ignore a `GraphComputer` test based on the type of `GraphComputer` being used.  Specify the "computer" attribute on
 the `OptOut` (which is an array specification) which should have a value of the `GraphComputer` implementation class
 that should ignore that test. This attribute should be left empty for "standard" execution and by default all
 `GraphComputer` implementations will be included in the `OptOut` so if there are multiple implementations, explicitly
 specify the ones that should be excluded.

 Also note that some of the tests in the Gremlin Test Suite are parameterized tests and require an additional level of
 specificity to be properly ignored.  To ignore these types of tests, examine the name template of the parameterized
 tests.  It is defined by a Java annotation that looks like this:

 [source,java]
 @Parameterized.Parameters(name = "expect({0})")

 The annotation above shows that the name of each parameterized test will be prefixed with "expect" and have
 parentheses wrapped around the first parameter (at index 0) value supplied to each test.  This information can
 only be garnered by studying the test set up itself.  Once the pattern is determined and the specific unique name of
 the parameterized test is identified, add it to the `specific` property on the `OptOut` annotation in addition to
 the other arguments.

 These annotations help provide users a level of transparency into test suite compliance (via the
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#describe-graph[describeGraph()] utility function). It also
 allows implementers to have a lot of flexibility in terms of how they wish to support TinkerPop.  For example, maybe
 there is a single test case that prevents an implementer from claiming support of a `Feature`.  The implementer could
 choose to either not support the `Feature` or to support it but "opt-out" of the test with a "reason" as to why so
 that users understand the limitation.

 IMPORTANT: Before using `OptOut` be sure that the reason for using it is sound and it is more of a last resort.
 It is possible that a test from the suite doesn't properly represent the expectations of a feature, is too broad or
 narrow for the semantics it is trying to enforce or simply contains a bug.  Please consider raising issues in the
 developer mailing list with such concerns before assuming `OptOut` is the only answer.

 IMPORTANT: There are no tests that specifically validate complete compliance with Gremlin Server.  Generally speaking,
 a `Graph` that passes the full Test Suite, should be compliant with Gremlin Server.  The one area where problems can
 occur is in serialization.  Always ensure that IO is properly implemented, that custom serializers are tested fully
 and ultimately integration test the `Graph` with an actual Gremlin Server instance.

 WARNING: Configuring tests to run in parallel might result in errors that are difficult to debug as there is some
 shared state in test execution around graph configuration.  It is therefore recommended that parallelism be turned
 off for the test suite (the Maven SureFire Plugin is configured this way by default).  It may also be important to
 include this setting, `<reuseForks>false</reuseForks>`, in the SureFire configuration if tests are failing in an
 unexplainable way.

 WARNING: For graph implementations that require a schema, take note that TinkerPop tests were originally developed
 without too much concern for these types of graphs. While most tests utilize the standard toy graphs there are
 instances where tests will utilize their own independent schema that stands alone from all other tests. It may be
 necessary to create schemas specific to certain tests in those situations.

 TIP: When running the `gremlin-test` suite against your implementation, you may need to set `build.dir` as an
 environment variable, depending on your project layout. Some tests require this to find a writable directory for
 creating temporary files. The value is typically set to the project build directory. For example using the Maven
 SureFire Plugin, this is done via the configuration argLine with `-Dbuild.dir=${project.build.directory}`.

 ===== Checking Resource Leaks

 The TinkerPop query engine retrieves data by interfacing with the provider using iterators. These iterators (depending
 on the provider) may hold up resources in the underlying storage layer and hence, it is critical to close them after
 the query is finished.

 TinkerPop provides you with the ability to test for such resource leaks by checking for leaks when you run the
 Gremlin-Test suites against your implementation. To enable this leak detection, providers should increment the
 `StoreIteratorCounter` whenever a resource is opened and decrement it when it is closed. A reference implementation
 is provided with TinkerGraph as `TinkerGraphIterator.java`.

 Assertions for leak detection are enabled by default when running the test suite. They can be temporarily disabled
 by way of a system property - simply set `-DtestIteratorLeaks=false".

 [[gherkin-tests-suite]]
 ==== Gherkin Test Suite

 The Gherkin Test Suite is a language agnostic set of tests that verify Gremlin semantics. It provides a unified set of
 tests that validate many TinkerPop components internally. The tests themselves can be found in `gremlin-tests/features`
 (link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features[here]) with their syntax described in the
 link:https://tinkerpop.apache.org/docs/x.y.z/dev/developer/#gremlin-language-test-cases[TinkerPop Developer Documentation].

 TinkerPop provides some infrastructure, for JVM based graphs, to help make it easier for providers to implement these
 tests against their implementations. This infrastructure is built on `cucumber-java` which is a dependency of
 `gremlin-test`. There are two main components to implementing the tests:

 1. A `org.apache.tinkerpop.gremlin.features.World` implementation which is a class in `gremlin-test`.
 2. A JUnit test class that will act as the runner for the tests with the appropriate annotations

 TIP: It may be helpful to get familiar with link:https://cucumber.io/docs/installation/java/[Cucumber] before
 proceeding with an implementation.

 The `World` implementation provides context to the tests and allows providers to intercept test events that might be
 important to proper execution specific to their implementations. The most important part of implementing `World` is
 properly implementing the `GraphTraversalSource getGraphTraversalSource(GraphData)` method which provides to the test
 the `GraphTraversalSource` to execute the test against.

 The JUnit test class is really just the test runner. It is a simple class which must include some Cucumber annotations.
 The following is just an example as taken from TinkerGraph:

 [source,java]
 ----
 @RunWith(Cucumber.class)
 @CucumberOptions(
         tags = "not @RemoteOnly",
         glue = { "org.apache.tinkerpop.gremlin.features" },
         features = { "classpath:/org/apache/tinkerpop/gremlin/test/features" },
         plugin = {"progress", "junit:target/cucumber.xml",
         objectFactory = GuiceFactory.class})
 ----

 The `@CucumberOptions` that are used are mostly implementation specific, so it will be up to the provider to make some
 choices as to what is right for their environment. For TinkerGraph, it needed to ignore Gherkin tests with the
 `@RemoteOnly` tag (the full list of possible tags can be found link:https://tinkerpop.apache.org/docs/x.y.z/dev/developer/#gherkin-tags[here]),
 as will most providers. The "glue" will be the same for all test implementers as it refers to a package containing
 TinkerPop's test infrastructure in `gremlin-test` (unless of course, a provider needs to develop their own
 infrastructure for some reason). The "features" is the path to the actual Gherkin test files that should be made
 available locally. The files can be referenced on the classpath assuming `gremlin-test` is a dependency. The "plugin"
 defines a JUnit style output, which happens to be understood by Maven.

 The "objectFactory" is the last component. Cucumber relies on dependency injection to get a `World` implementation into
 the test infrastructure. Providers may choose from multiple available implementations, but TinkerPop chose to use
 Guice. To follow this approach include the following module:

 [source,xml]
 ----
 <dependency>
     <groupId>com.google.inject</groupId>
     <artifactId>guice</artifactId>
     <version>4.2.3</version>
     <scope>test</scope>
 </dependency>
 ----

 Following the Neo4jGraph implementation, there are two classes to construct:

 [source,java]
 ----
 public class ServiceModule extends AbstractModule {
     @Override
     protected void configure() {
         bind(World.class).to(Neo4jGraphWorld.class);
     }
 }

 public class WorldInjectorSource implements InjectorSource {
     @Override
     public Injector getInjector() {
         return Guice.createInjector(Stage.PRODUCTION, CucumberModules.createScenarioModule(), new ServiceModule());
     }
 }
 ----

 The key here is that the `Neo4jGraphWorld` implementation gets bound to `World` in the `ServiceModule` and there is
 a `WorldInjectorSource` that specifies the `ServiceModule` to Cucumber. As a final step, the provider's test resources
 needs a `cucumber.properties` file with an entry that specifies the `InjectorSource` so that Guice can find it. Here
 is the example taken from TinkerGraph where the `WorldInjectorSource` is inner class of `TinkerGraphFeatureTest`
 itself.

 [source,text]
 ----
 guice.injector-source=org.apache.tinkerpop.gremlin.neo4j.Neo4jGraphFeatureTest$WorldInjectorSource
 ----

 In the event that a single `World` configuration is insufficient, it may be necessary to develop a custom
 `ObjectFactory`. An easy way to do this is to create a class that extends from the `AbstractGuiceFactory` in
 `gremlin-test` and provide that class to the `@CucumberOptions`. This approach does rely on the `ServiceLoader` which
 means it will be important to include a `io.cucumber.core.backend.ObjectFactory` file in `META-INF/services` and an
 entry that registers the custom implementation. Please see the TinkerGraph test code for further information on this
 approach.

 If implementing the Gherkin tests, providers can choose to opt-in to the slimmed down version of the normal JVM process
 test suite to help alleviate test duplication between the two frameworks:

 [source,java]
 ----
 @Graph.OptIn(Graph.OptIn.SUITE_PROCESS_LIMITED_STANDARD)
 @Graph.OptIn(Graph.OptIn.SUITE_PROCESS_LIMITED_COMPUTER)
 ----

 === Accessibility via GremlinPlugin

 image:gremlin-plugin.png[width=100,float=left] The applications distributed with TinkerPop do not distribute with
 any graph system implementations besides TinkerGraph. If your implementation is stored in a Maven repository (e.g.
 Maven Central Repository), then it is best to provide a <<gremlin-plugins,`GremlinPlugin`>> implementation so the respective jars can be
 downloaded according and when required by the user. Neo4j's GremlinPlugin is provided below for reference.

 [source,java]
 ----
 include::{basedir}/neo4j-gremlin/src/main/java/org/apache/tinkerpop/gremlin/neo4j/jsr223/Neo4jGremlinPlugin.java[]
 ----

 With the above plugin implementations, users can now download respective binaries for Gremlin Console, Gremlin Server, etc.

 [source,groovy]
 gremlin> g = Neo4jGraph.open('/tmp/neo4j')
 No such property: Neo4jGraph for class: groovysh_evaluate
 Display stack trace? [yN]
 gremlin> :install org.apache.tinkerpop neo4j-gremlin x.y.z
 ==>loaded: [org.apache.tinkerpop, neo4j-gremlin, …]
 gremlin> :plugin use tinkerpop.neo4j
 ==>tinkerpop.neo4j activated
 gremlin> g = Neo4jGraph.open('/tmp/neo4j')
 ==>neo4jgraph[EmbeddedGraphDatabase [/tmp/neo4j]]

 === In-Depth Implementations

 image:gremlin-painting.png[width=200,float=right] The graph system implementation details presented thus far are
 minimum requirements necessary to yield a valid TinkerPop implementation. However, there are other areas that a
 graph system provider can tweak to provide an implementation more optimized for their underlying graph engine. Typical
 areas of focus include:

 * Traversal Strategies: A link:https://tinkerpop.apache.org/docs/x.y.z/reference/#traversalstrategy[TraversalStrategy]
 can be used to alter a traversal prior to its execution. A typical example is converting a pattern of
 `g.V().has('name','marko')` into a global index lookup for all vertices with name "marko". In this way, a `O(|V|)`
 lookup becomes an `O(log(|V|))`. Please review `TinkerGraphStepStrategy` for ideas.
 * Step Implementations: Every link:https://tinkerpop.apache.org/docs/x.y.z/reference/#graph-traversal-steps[step] is
 ultimately referenced by the `GraphTraversal` interface. It is possible to extend `GraphTraversal` to use a graph
 system specific step implementation. Note that while it is sometimes possible to develop custom step implementations
 by extending from a TinkerPop step (typically, `AddVertexStep` and other `Mutating` steps), it's important to
 consider that doing so introduces some greater risk for code breaks on upgrades as opposed to other areas of the code
 base. As steps are more internal features of TinkerPop, they might be subject to breaking API and behavioral changes
 that would be less likely to be accepted by more public facing interfaces.

 == Graph Driver Provider Requirements

 image::gremlin-server-protocol.png[width=325]

 One of the roles for link:https://tinkerpop.apache.org/docs/x.y.z/reference/#gremlin-server[Gremlin Server] is to
 provide a bridge from TinkerPop to non-JVM languages (e.g. Go, Python, etc.).  Developers can build language bindings
 (or driver) that provide a way to submit Gremlin scripts to Gremlin Server and get back results.  Given the
 extensible nature of Gremlin Server, it is difficult to provide an authoritative guide to developing a driver.
 It is however possible to describe the core communication protocol using the standard out-of-the-box configuration
 which should provide enough information to develop a driver for a specific language.

 image::gremlin-server-flow.png[width=300,float=right]

 Gremlin Server is distributed with a configuration that utilizes link:http://en.wikipedia.org/wiki/WebSocket[WebSocket]
 with a custom sub-protocol.  Under this configuration, Gremlin Server accepts requests containing a Gremlin script,
 evaluates that script and then streams back the results.  The notion of "streaming" is depicted in the diagram to the
 right.

 The diagram shows an incoming request to process the Gremlin script of `g.V()`.  Gremlin Server evaluates that script,
 getting an `Iterator` of vertices as a result, and steps through each `Vertex` within it.  The vertices are batched
 together given the `resultIterationBatchSize` configuration.  In this case, that value must be `2` given that each
 "response" contains two vertices.  Each response is serialized given the requested serializer type (JSON is likely
 best for non-JVM languages) and written back to the requesting client immediately.  Gremlin Server does not wait for
 the entire result to be iterated, before sending back a response.  It will send the responses as they are realized.

 This approach allows for the processing of large result sets without having to serialize the entire result into memory
 for the response.  It places a bit of a burden on the developer of the driver however, because it becomes necessary to
 provide a way to reconstruct the entire result on the client side from all of the individual responses that Gremlin
 Server returns for a single request.  Again, this description of Gremlin Server's "flow" is related to the
 out-of-the-box configuration.  It is quite possible to construct other flows, that might be more amenable to a
 particular language or style of processing.

 It is recommended but not required that a driver include a `User-Agent` header as part of any web socket
 handshake request to Gremlin Server. Gremlin Server uses the user agent in building usage metrics
 as well as debugging. The standard format for connection user agents is:

 `"[Application Name] [GLV Name].[Version] [Language Runtime Version] [OS].[Version] [CPU Architecture]"`
 For example:
 `"MyTestApplication Gremlin-Java.3.5.4 11.0.16.1 Mac_OS_X.12.6.1 aarch64"`

 To formulate a request to Gremlin Server, a `RequestMessage` needs to be constructed.  The `RequestMessage` is a
 generalized representation of a request that carries a set of "standard" values in addition to optional ones that are
 dependent on the operation being performed.  A `RequestMessage` has these fields:

 [width="100%",cols="3,10",options="header"]
 |=========================================================
 |Key |Description
 |requestId |A link:http://en.wikipedia.org/wiki/Globally_unique_identifier[UUID] representing the unique identification for the request.
 |op |The name of the "operation" to execute based on the available `OpProcessor` configured in the Gremlin Server.  To evaluate a script, use `eval`.
 |processor |The name of the `OpProcessor` to utilize. The default `OpProcessor` for evaluating scripts is unnamed and therefore script evaluation purposes, this value can be an empty string.
 |args |A `Map` of arbitrary parameters to pass to Gremlin Server.  The requirements for the contents of this `Map` are dependent on the `op` selected.
 |=========================================================

 This message can be serialized in any fashion that is supported by Gremlin Server.  New serialization methods can
 be plugged in by implementing a `ServiceLoader` enabled `MessageSerializer`, however Gremlin Server provides for
 JSON serialization by default which will be good enough for purposes of most developers building drivers.
 A `RequestMessage` to evaluate a script with variable bindings looks like this in JSON:

 [source,js]
 ----
 { "requestId":"1d6d02bd-8e56-421d-9438-3bd6d0079ff1",
   "op":"eval",
   "processor":"",
   "args":{"gremlin":"g.V(x).out()",
           "bindings":{"x":1},
           "language":"gremlin-groovy"}}
 ----

 The above JSON represents the "body" of the request to send to Gremlin Server. When sending this "body" over
 WebSocket, Gremlin Server can accept a packet frame using a "text" (1) or a "binary" (2) opcode. Using "text"
 is a bit more limited in that Gremlin Server will always process the body of that request as JSON.  Generally speaking
 "text" is just for testing purposes.

 The preferred method for sending requests to Gremlin Server is to use the "binary" opcode.  In this case, a "header"
 will need be sent in addition to to the "body".  The "header" basically consists of a "mime type" so that Gremlin
 Server knows how to deserialize the `RequestMessage`. So, the actual byte array sent to Gremlin Server would be
 formatted as follows:

 image::gremlin-server-request.png[]

 The first byte represents the length of the "mime type" string value that follows.  Given the default configuration of
 Gremlin Server, this value should be set to `application/json`.  The "payload" represents the JSON message above
 encoded as bytes.

 NOTE: Gremlin Server will only accept masked packets as it pertains to a WebSocket packet header construction.

 When Gremlin Server receives that request, it will decode it given the "mime type", pass it to the requested
 `OpProcessor` which will execute the `op` defined in the message.  In this case, it will evaluate the script
 `g.V(x).out()` using the `bindings` supplied in the `args` and stream back the results in a series of
 `ResponseMessages`.  A `ResponseMessage` looks like this:

 [width="100%",cols="3,10",options="header"]
 |=========================================================
 |Key |Description
 |requestId |The identifier of the `RequestMessage` that generated this `ResponseMessage`.
 |status | The `status` contains a `Map` of three keys: `code` which refers to a `ResultCode` that is somewhat analogous to an link:http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html[HTTP status code], `attributes` that represent a `Map` of protocol-level information, and `message` which is just a human-readable `String` usually associated with errors.
 |result | The `result` contains a `Map` of two keys: `data` which refers to the actual data returned from the server (the type of data is determined by the operation requested) and `meta` which is a `Map` of meta-data related to the response.
 |=========================================================

 In this case the `ResponseMessage` returned to the client would look something like this:

 [source,js]
 ----
 {"result":{"data":[{"id": 2,"label": "person","type": "vertex","properties": [
   {"id": 2, "value": "vadas", "label": "name"},
   {"id": 3, "value": 27, "label": "age"}]},
   ], "meta":{}},
  "requestId":"1d6d02bd-8e56-421d-9438-3bd6d0079ff1",
  "status":{"code":206,"attributes":{},"message":""}}
 ----

 Gremlin Server is capable of streaming results such that additional responses will arrive over the WebSocket connection until
 the iteration of the result on the server is complete.  Each successful incremental message will have a `ResultCode`
 of `206`. Termination of the stream will be marked by a final `200` status code.  Note that all messages without a
 `206` represent terminating conditions for a request.  The following table details the various status codes that
 Gremlin Server will send:

 [width="100%",cols="2,2,9",options="header"]
 |=========================================================
 |Code |Name |Description
 |200 |SUCCESS |The server successfully processed a request to completion - there are no messages remaining in this stream.
 |204 |NO CONTENT |The server processed the request but there is no result to return (e.g. an `Iterator` with no elements) - there are no messages remaining in this stream.
 |206 |PARTIAL CONTENT |The server successfully returned some content, but there is more in the stream to arrive - wait for a `SUCCESS` to signify the end of the stream.
 |401 |UNAUTHORIZED |The request attempted to access resources that the requesting user did not have access to.
 |403 |FORBIDDEN |The server could authenticate the request, but will not fulfill it.
 |407 |AUTHENTICATE |A challenge from the server for the client to authenticate its request.
 |497 |REQUEST ERROR SERIALIZATION |The request message contained an object that was not serializable.
 |498 |REQUEST ERROR MALFORMED REQUEST |The request message was not properly formatted which means it could not be parsed at all or the "op" code was not recognized such that Gremlin Server could properly route it for processing.  Check the message format and retry the request.
 |499 |REQUEST ERROR INVALID REQUEST ARGUMENTS |The request message was parseable, but the arguments supplied in the message were in conflict or incomplete. Check the message format and retry the request.
 |500 |SERVER ERROR |A general server error occurred that prevented the request from being processed.
 |595 |SERVER ERROR FAIL STEP | A server error that is produced when the `fail()` step is triggered. The returned exception will include information consistent with the `Failure` interface.
 |596 |SERVER ERROR TEMPORARY |A server error occurred, but it was temporary in nature and therefore the client is free to retry it's request as-is with the potential for success.
 |597 |SERVER ERROR EVALUATION |The script submitted for processing evaluated in the `ScriptEngine` with errors and could not be processed.  Check the script submitted for syntax errors or other problems and then resubmit.
 |598 |SERVER ERROR TIMEOUT |The server exceeded one of the timeout settings for the request and could therefore only partially responded or did not respond at all.
 |599 |SERVER ERROR SERIALIZATION |The server was not capable of serializing an object that was returned from the script supplied on the request. Either transform the object into something Gremlin Server can process within the script or install mapper serialization classes to Gremlin Server.
 |=========================================================

 NOTE: Please refer to the link:https://tinkerpop.apache.org/docs/x.y.z/dev/io[IO Reference Documentation] for more
 examples of `RequestMessage` and `ResponseMessage` instances.

 === OpProcessors Arguments

 The following sections define a non-exhaustive list of available operations and arguments for embedded `OpProcessors`
 (i.e. ones packaged with Gremlin Server).

 ==== Common

 All `OpProcessor` instances support these arguments.

 [width="100%",cols="2,2,9",options="header"]
 |=========================================================
 |Key |Type |Description
 |batchSize |Int |When the result is an iterator this value defines the number of iterations each `ResponseMessage` should contain - overrides the `resultIterationBatchSize` server setting.
 |=========================================================

 ==== Standard OpProcessor

 The "standard" `OpProcessor` handles requests for the primary function of Gremlin Server - executing Gremlin.
 Requests made to this `OpProcessor` are "sessionless" in the sense that a request must encapsulate the entirety
 of a transaction.  There is no state maintained between requests.  A transaction is started when the script is first
 evaluated and is committed when the script completes (or rolled back if an error occurred).

 [width="100%",cols="3,10a",options="header"]
 |=========================================================
 |Key |Description
 |processor |As this is the default `OpProcessor` this value can be set to an empty string.
 |op |[width="100%",cols="3,10",options="header"]
 !=========================================================
 !Key !Description
 !`authentication` !A request that contains the response to a server challenge for authentication.
 !`eval` !Evaluate a Gremlin script provided as a `String`.
 !=========================================================
 |=========================================================

 **`authentication` operation arguments**

 [width="100%",cols="2,2,9",options="header"]
 |=========================================================
 |Key |Type |Description
 |sasl |String | *Required* The response to the server authentication challenge.  This value is dependent on the SASL authentication mechanism required by the server and is Base64 encoded.
 |saslMechanism |String | The SASL mechanism: `PLAIN` or `GSSAPI`. Note that it is up to the server implementation to use or disregard this setting (default implementation in Gremlin Server ignores it).
 |=========================================================

 **`eval` operation arguments**

 [width="100%",cols="2,2,9",options="header"]
 |=========================================================
 |Key |Type |Description
 |gremlin |String | *Required* The Gremlin script to evaluate.
 |bindings |Map |A map of key/value pairs to apply as variables in the context of the Gremlin script.
 |language |String |The flavor of Gremlin used (e.g. `gremlin-groovy`).
 |aliases |Map |A map of key/value pairs that allow globally bound `Graph` and `TraversalSource` objects to
 be aliased to different variable names for purposes of the current request.  The value represents the name of the
 global variable and its key represents the new binding name as it will be referenced in the Gremlin query.  For
 example, if the Gremlin Server defines two `TraversalSource` instances named `g1` and `g2`, it would be possible
 to send an alias pair with key of "g" and value of "g2" and thus allow the script to refer to "g2" simply as "g".
 |evaluationTimeout |Long |An override for the server setting that determines the maximum time to wait for a script to execute on the server.
 |=========================================================

 ==== Session OpProcessor

 The "session" `OpProcessor` handles requests for the primary function of Gremlin Server - executing Gremlin. It is
 like the "standard" `OpProcessor`, but instead maintains state between sessions and allows the option to leave all
 transaction management up to the calling client.  It is important that clients that open sessions, commit or roll
 them back, however Gremlin Server will try to clean up such things when a session is killed that has been abandoned.
 It is important to consider that a session can only be maintained with a single machine.  In the event that multiple
 Gremlin Server are deployed, session state is not shared among them.

 [width="100%",cols="3,10a",options="header"]
 |=========================================================
 |Key |Description
 |processor |This value should be set to `session`
 |op |
 [cols="3,10",options="header"]
 !=========================================================
 !Key !Description
 !`authentication` !A request that contains the response to a server challenge for authentication.
 !`eval` !Evaluate a Gremlin script provided as a `String`.
 !`close` !Deprecated. Gremlin-Server will only return a `NO CONTENT` message.
 |=========================================================

 NOTE: The "close" message related to sessions was deprecated as of 3.3.11. Closing sessions now relies on closing the connections. The function to accept `close` message on the server was removed
 in 3.5.0, but has been added back as of 3.5.2. Servers wishing to be compatible with older versions of the driver need only send back a `NO_CONTENT` for
 this message (which is what Gremlin Server does as of 3.5.0). Drivers wishing to be compatible with servers prior to
 3.3.11 may continue to send the message on calls to `close()`, otherwise such code can be removed.

 **`authentication` operation arguments**

 [width="100%",cols="2,2,9",options="header"]
 |=========================================================
 |Key |Type |Description
 |saslMechanism |String | The SASL mechanism: `PLAIN` or `GSSAPI`. Note that it is up to the server implementation to use or disregard this setting (default implementation in Gremlin Server ignores it).
 |sasl |String | *Required* The response to the server authentication challenge.  This value is dependent on the SASL authentication mechanism required by the server and is Base64 encoded.
 |=========================================================

 **`eval` operation arguments**

 [width="100%",options="header"]
 |=========================================================
 |Key |Type |Description
 |gremlin |String | *Required* The Gremlin script to evaluate.
 |session |String | *Required* The session identifier for the current session - typically this value should be a UUID (the session will be created if it doesn't exist).
 |manageTransaction |Boolean |When set to `true` the transaction for the current request is auto-committed or rolled-back as are done with sessionless requests - defaulted to `false`.
 |bindings |Map |A map of key/value pairs to apply as variables in the context of the Gremlin script.
 |evaluationTimeout |Long |An override for the server setting that determines the maximum time to wait for a script to execute on the server.
 |language |String |The flavor of Gremlin used (e.g. `gremlin-groovy`)
 |aliases |Map |A map of key/value pairs that allow globally bound `Graph` and `TraversalSource` objects to
 be aliased to different variable names for purposes of the current request.  The value represents the name the
 global variable and its key represents the new binding name as it will be referenced in the Gremlin query.  For
 example, if the Gremlin Server defines two `TraversalSource` instances named `g1` and `g2`, it would be possible
 to send an alias pair with key of "g" and value of "g2" and thus allow the script to refer to "g2" simply as "g".
 |=========================================================

 **`close` operation arguments**
 [width="100%",cols="2,2,9",options="header"]
 |=========================================================
 |Key |Type |Description
 |session |String | *Required* The session identifier for the session to close.
 |force |Boolean | Determines if the session should be force closed when the client is closed. Force closing will not
 attempt to close open transactions from existing running jobs and leave it to the underlying graph to decided how to
 proceed with those orphaned transactions. Setting this to `true` tends to lead to faster close operation and release
 of resources which can be desirable if Gremlin Server has a long session timeout and a long script evaluation timeout
 as attempts to close long run jobs can occur more rapidly. If not provided, this value is `false`.
 |=========================================================

 ==== Traversal OpProcessor

 Both the Standard and Session OpProcessors allow for Gremlin scripts to be submitted to the server. The
 `TraversalOpProcessor` however allows Gremlin `Bytecode` to be submitted to the server. Supporting this `OpProcessor`
 makes it possible for a link:https://tinkerpop.apache.org/docs/x.y.z/reference/#gremlin-drivers-variants[Gremlin Language Variant]
 to submit a `Traversal` directly to Gremlin Server in the native language of the GLV without having to use a script in
 a different language.

 Unlike Standard and Session OpProcessors, the Traversal OpProcessor does not simply return the results of the
 `Traversal`. It instead returns `Traverser` objects which allows the client to take advantage of
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#barrier-step[bulking]. To describe this interaction more
 directly, the returned `Traverser` will represent some value from the `Traversal` result and the number of times it
 is represented in the full stream of results. So, if a `Traversal` happens to return the same vertex twenty times
 it won't return twenty instances of the same object. It will return one in `Traverser` with the `bulk` value set to
 twenty. Under this model, the amount of processing and network overhead can be reduced considerably.

 To demonstrate consider this example:

 [gremlin-groovy]
 ----
 cluster = Cluster.open()
 client = cluster.connect()
 aliased = client.alias("g")
 g = traversal().withEmbedded(org.apache.tinkerpop.gremlin.structure.util.empty.EmptyGraph.instance())     <1>
 rs = aliased.submit(g.V().both().barrier().both().barrier()).all().get()                    <2>
 aliased.submit(g.V().both().barrier().both().barrier().count()).all().get().get(0).getInt() <3>
 rs.collect{[value: it.getObject().get(), bulk: it.getObject().bulk()]}                      <4>
 ----

 <1> All commands through this step are just designed to demonstrate bulking with Gremlin Server and don't represent
 a real-world way that this feature would be used.
 <2> Submit a `Traversal` that happens to ensure that the server uses bulking. Note that a `Traverser` is returned
 and that there are only six results.
 <3> In actuality, however, if this same `Traversal` is iterated there are thirty results. Without bulking, the previous
 request would have sent back thirty traversers.
 <4> Note that the sum of the bulk of each `Traverser` is thirty.

 The full iteration of a `Traversal` is thus left to the client. It must interpret the bulk on the `Traverser` and
 unroll it to represent the actual number of times it exists when iterated. The unrolling is typically handled
 directly within TinkerPop's remote traversal implementations.

 [width="100%",cols="3,10a",options="header"]
 |=========================================================
 |Key |Description
 |processor |This value should be set to `traversal`
 |op |
 [cols="3,10",options="header"]
 !=========================================================
 !Key !Description
 !`authentication` !A request that contains the response to a server challenge for authentication.
 !`bytecode` !A request that contains the `Bytecode` representation of a `Traversal`.
 |=========================================================

 **`authentication` operation arguments**

 [width="100%",cols="2,2,9",options="header"]
 |=========================================================
 |Key |Type |Description
 |sasl |String | *Required* The response to the server authentication challenge.  This value is dependent on the SASL authentication mechanism required by the server and is Base64 encoded.
 |=========================================================

 **`bytecode` operation arguments**

 [width="100%",cols="2,2,9",options="header"]
 |=========================================================
 |Key |Type |Description
 |gremlin |String | *Required* The `Bytecode` representation of a `Traversal`.
 |aliases |Map | *Required* A map with a single key/value pair that refers to a globally bound `TraversalSource` object
 to be aliased to different variable names for purposes of the current request.  The value represents the name of the
 global variable and its key represents the new binding name as it will be referenced in the Gremlin query.  For
 example, if the Gremlin Server defines two `TraversalSource` instances named `g1`, it would be possible
 to send an alias pair with key of "g" and value of "g1" and thus allow the script to refer to "g1" simply as "g". Note
 that unlike users of `alias` in other contexts, in this case, the key can *only* be set to "g" and there can be only
 one key value pair present (since only one `Traversal` is being submitted, there is no sense to having more than a
 single alias).
 |=========================================================

 === Authentication and Authorization

 Gremlin Server supports link:https://en.wikipedia.org/wiki/Simple_Authentication_and_Security_Layer[SASL-based]
 authentication.  A SASL implementation provides a series of challenges and responses that a driver must comply with
 in order to authenticate.  Gremlin Server supports the "PLAIN" SASL mechanism, which is a cleartext
 password system, for all link:https://tinkerpop.apache.org/docs/x.y.z/reference/#gremlin-drivers-variants[Gremlin Language Variants].
 Other SASL mechanisms supported for selected clients are listed in the
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#security[security section of the Gremlin Server reference documentation].

 When authentication is enabled, an incoming request is intercepted before it is evaluated by the `ScriptEngine`.  The
 request is saved on the server and a `AUTHENTICATE` challenge response (status code `407`) is returned to the client.

 The client will detect the `AUTHENTICATE` and respond with an `authentication` for the `op` and an `arg` named `sasl`.
 In case of the "PLAIN" SASL mechanism the `arg` contains the password.  The password should be either, an encoded
 sequence of UTF-8 bytes, delimited by 0 (US-ASCII NUL), where the form is : `<NUL>username<NUL>password`, or a Base64
 encoded string of the former (which in this instance would be `AHVzZXJuYW1lAHBhc3N3b3Jk`).  Should Gremlin Server be
 able to authenticate with the provided credentials, the server will return the results of the original request as it
 normally does without authentication.  If it cannot authenticate given the challenge response from the client, it will
 return `UNAUTHORIZED` (status code `401`).

 NOTE: Gremlin Server does not support the "authorization identity" as described in link:https://tools.ietf.org/html/rfc4616[RFC4616].

 In addition to authenticating users at the start of a connection, Gremlin Server allows providers to authorize users on
 a per request basis. If
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#_configuring_2[a java class is configured] that implements the
 link:https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/server/authz/Authorizer.html[Authorizer interface],
 Gremlin Server passes each request to this `Authorizer`. The `Authorizer` can deny authorization for the request by
 throwing an exception and Gremlin Server returns `UNAUTHORIZED` (status code `401`) to the client. The `Authorizer`
 authorizes the request by returning the original request or the request with some additional constraints. Gremlin Server
 proceeds with the returned request and on its turn returns the result of the request to the client. More details on
 implementing authorization can be found in the
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#security[reference documentation for Gremlin Server security].

 NOTE: While Gremlin Server supports this authorization feature it is not a feature that TinkerPop requires of graph
 providers as part of the agreement between client and server.


 [[gremlin-plugins]]
 == Gremlin Plugins

 image:gremlin-plugin.png[width=125]

 Plugins provide a way to expand the features of a `GremlinScriptEngine`, which stands at that core of both Gremlin
 Console and Gremlin Server. Providers may wish to create plugins for a variety of reasons, but some common examples
 include:

 * Initialize the `GremlinScriptEngine` application with important classes so that the user doesn't need to type their
 own imports.
 * Place specific objects in the bindings of the `GremlinScriptEngine` for the convenience of the user.
 * Bootstrap the `GremlinScriptEngine` with custom functions so that they are ready for usage at startup.

 The first step to developing a plugin is to implement the link:https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/jsr223/GremlinPlugin.html[GremlinPlugin]
 interface:

 [source,java]
 ----
 include::{basedir}/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/jsr223/GremlinPlugin.java[]
 ----

 The most simple plugin and the one most commonly implemented will likely be one that just provides a list of classes
 for import.  This type of plugin is the easiest way for implementers of the TinkerPop Structure and Process APIs to
 make their implementations available to users.  The TinkerGraph implementation has just such a plugin:

 [source,java]
 ----
 include::{basedir}/tinkergraph-gremlin/src/main/java/org/apache/tinkerpop/gremlin/tinkergraph/jsr223/TinkerGraphGremlinPlugin.java[]
 ----

 This plugin extends from the abstract base class of link:https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/jsr223/AbstractGremlinPlugin.html[AbstractGremlinPlugin]
 which provides some default implementations of the `GremlinPlugin` methods. It simply allows those who extend from it
 to be able to just supply the name of the module and a list of link:https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/jsr223/Customizer.html[Customizer]
 instances to apply to the `GremlinScriptEngine`. In this case, the TinkerGraph plugin just needs an
 link:https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/jsr223/ImportCustomizer.html[ImportCustomizer]
 which describes the list of classes to import when the plugin is activated and applied to the `GremlinScriptEngine`.

 The `ImportCustomizer` is just one of several provided `Customizer` implementations that can be used in conjunction
 with plugin development:

 * link:https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/jsr223/BindingsCustomizer.html[BindingsCustomizer] - Inject a key/value pair into the global bindings of the `GremlinScriptEngine` instances
 * link:https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/jsr223/ImportCustomizer.html[ImportCustomizer] - Add imports to a `GremlinScriptEngine`
 * link:https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/jsr223/ScriptCustomizer.html[ScriptCustomizer] - Execute a script on a `GremlinScriptEngine` at startup

 Individual `GremlinScriptEngine` instances may have their own `Customizer` instances that can be used only with that
 engine - e.g. `gremlin-groovy` has some that are specific to controlling the Groovy compiler configuration. Developing
 a new `Customizer` implementation is not really possible without changes to TinkerPop, as the framework is not designed
 to respond to external ones. The base `Customizer` implementations listed above should cover most needs.

 A `GremlinPlugin` must support one of two instantiation models so that it can be instantiated from configuration files
 for use in various situations - e.g. Gremlin Server. The first option is to use a static initializer given a method
 with the following signature:

 [source,java]
 ----
 public static GremlinPlugin instance()
 ----

 The limitation with this approach is that it does not provide a way to supply any configuration to the plugin so it
 tends to only be useful for fairly simplistic plugins. The more advanced approach is to provide a "builder" given a
 method with the following signature:

 [source,java]
 ----
 public static Builder build()
 ----

 It doesn't really matter what kind of class is returned from `build` so long as it follows a "Builder" pattern, where
 methods on that object return an instance of itself, so that builder methods can be chained together prior to calling
 a final `create` method as follows:

 [source,java]
 ----
 public GremlinPlugin create()
 ----

 Please see the link:https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/jsr223/ImportGremlinPlugin.html[ImportGremlinPlugin]
 for an example of what implementing a `Builder` might look like in this context.

 Note that the plugin provides a unique name for the plugin which follows a namespaced pattern as _namespace_._plugin-name_
 (e.g. "tinkerpop.hadoop" - "tinkerpop" is the reserved namespace for TinkerPop maintained plugins).

 For plugins that will work with Gremlin Console, there is one other step to follow to ensure that the `GremlinPlugin`
 will work there.  The console loads `GremlinPlugin` instances via link:http://docs.oracle.com/javase/8/docs/api/java/util/ServiceLoader.html[ServiceLoader]
 and therefore need a resource file added to the jar file where the plugin exists.  Add a file called
 `org.apache.tinkerpop.gremlin.jsr223.GremlinPlugin` to `META-INF/services`.  In the case of the TinkerGraph
 plugin above, that file will have this line in it:

 [source,java]
 ----
 include::{basedir}/tinkergraph-gremlin/src/main/resources/META-INF/services/org.apache.tinkerpop.gremlin.jsr223.GremlinPlugin[]
 ----

 Once the plugin is packaged, there are two ways to test it out:

 . Copy the jar and its dependencies to the Gremlin Console path and start it. It is preferrable that the plugin is
 copied to the `/ext/_plugin_name_` directory.
 . Start Gremlin Console and try the `:install` command: `:install com.company my-plugin 1.0.0`.

 In either case, once one of these two approaches is taken, the jars and their dependencies are available to the
 Console.  The next step is to "activate" the plugin by doing `:plugin use my-plugin`, where "my-plugin" refers to the
 name of the plugin to activate.

 NOTE: When `:install` is used logging dependencies related to link:http://www.slf4j.org/[SLF4J] are filtered out so as
 not to introduce multiple logger bindings (which generates warning messages to the logs).

 Plugins can also tie into the `:remote` and `:submit` commands.  Recall that a `:remote` represents a different
 context within which Gremlin is executed, when issued with `:submit`.  It is encouraged to use this integration point
 when possible, as opposed to registering new commands that can otherwise follow the `:remote` and `:submit` pattern.
 To expose this integration point as part of a plugin, implement the `RemoteAcceptor` interface:

 TIP: Be good to the users of plugins and prevent dependency conflicts. Maintaining a conflict free plugin is most
 easily done by using the link:http://maven.apache.org/enforcer/maven-enforcer-plugin/[Maven Enforcer Plugin].

 TIP: Consider binding the plugin's minor version to the TinkerPop minor version so that it's easy for users to figure
 out plugin compatibility.  Otherwise, clearly document a compatibility matrix for the plugin somewhere that users can
 find it.

 [source,java]
 ----
 include::{basedir}/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/jsr223/console/RemoteAcceptor.java[]
 ----

 The `RemoteAcceptor` can be bound to a `GremlinPlugin` by adding a `ConsoleCustomizer` implementation to the list of
 `Customizer` instances that are returned from the `GremlinPlugin`. The `ConsoleCustomizer` will only be executed when
 in use with the Gremlin Console plugin host.  Simply instantiate and return a `RemoteAcceptor` in the
 `ConsoleCustomizer.getRemoteAcceptor(GremlinShellEnvironment)` method.  Generally speaking, each call to
 `getRemoteAcceptor(GremlinShellEnvironment)` should produce a new instance of a `RemoteAcceptor`.

 include::gremlin-semantics.asciidoc[]

 include::policies.asciidoc[]