| //// |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| //// |
| [[traversal]] |
| = The Traversal |
| |
| image::gremlin-running.png[width=125] |
| |
| At the most general level there is `Traversal<S,E>` which implements `Iterator<E>`, where the `S` stands for start and |
| the `E` stands for end. A traversal is composed of four primary components: |
| |
| . `Step<S,E>`: an individual function applied to `S` to yield `E`. Steps are chained within a traversal. |
| . `TraversalStrategy`: interceptor methods to alter the execution of the traversal (e.g. query re-writing). |
| . `TraversalSideEffects`: key/value pairs that can be used to store global information about the traversal. |
| . `Traverser<T>`: the object propagating through the `Traversal` currently representing an object of type `T`. |
| |
| The classic notion of a graph traversal is provided by `GraphTraversal<S,E>` which extends `Traversal<S,E>`. |
| `GraphTraversal` provides an interpretation of the graph data in terms of vertices, edges, etc. and thus, a graph |
| traversal link:http://en.wikipedia.org/wiki/Domain-specific_language[DSL]. |
| |
| IMPORTANT: The underlying `Step` implementations provided by TinkerPop should encompass most of the functionality |
| required by a DSL author. It is important that DSL authors leverage the provided steps as then the common optimization |
| and decoration strategies can reason on the underlying traversal sequence. If new steps are introduced, then common |
| traversal strategies may not function properly. |
| |
| [[graph-traversal-steps]] |
| == Graph Traversal Steps |
| |
| image::step-types.png[width=650] |
| |
| A `GraphTraversal<S,E>` is spawned from a `GraphTraversalSource`. It can also be spawned anonymously (i.e. empty) |
| via `__`. A graph traversal is composed of an ordered list of steps. All the steps provided by `GraphTraversal` |
| inherit from the more general forms diagrammed above. A list of all the steps (and their descriptions) is provided |
| in the TinkerPop link:https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html[GraphTraversal JavaDoc]. |
| The following subsections will demonstrate the GraphTraversal steps using the <<gremlin-console,Gremlin Console>>. |
| |
| IMPORTANT: The basics for starting a traversal are described in the <<the-graph-process,The Graph Process>> section as |
| well as in the link:https://tinkerpop.apache.org/docs/current/tutorials/getting-started/[Getting Started] tutorial. |
| |
| NOTE: To reduce the verbosity of the expression, it is good to |
| `+import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.*+`. This way, instead of doing `+__.inE()+` |
| for an anonymous traversal, it is possible to simply write `inE()`. Be aware of language-specific reserved keywords |
| when using anonymous traversals. For example, `in` and `as` are reserved keywords in Groovy, therefore you must use |
| the verbose syntax `+__.in()+` and `+__.as()+` to avoid collisions. |
| |
| [[general-steps]] |
| === General Steps |
| |
| There are five general steps, each having a traversal and a lambda representation. All of the more specific steps described later extend one of them. |
| |
| [width="100%",cols="10,12",options="header"] |
| |========================================================= |
| | Step| Description |
| | `map(Traversal<S, E>)` `map(Function<Traverser<S>, E>)` | map the traverser to some object of type `E` for the next step to process. |
| | `flatMap(Traversal<S, E>)` `flatMap(Function<Traverser<S>, Iterator<E>>)` | map the traverser to an iterator of `E` objects that are streamed to the next step. |
| | `filter(Traversal<?, ?>)` `filter(Predicate<Traverser<S>>)` | map the traverser to either true or false, where false will not pass the traverser to the next step. |
| | `sideEffect(Traversal<S, S>)` `sideEffect(Consumer<Traverser<S>>)` | perform some operation on the traverser and pass it to the next step. |
| | `branch(Traversal<S, M>)` `branch(Function<Traverser<S>,M>)` | split the traverser to all the traversals indexed by the `M` token. |
| |========================================================= |
| |
| WARNING: Lambda steps are presented for educational purposes as they represent the foundational constructs of the |
| Gremlin language. In practice, lambda steps should be avoided in favor of their traversal representations, and traversal |
| verification strategies exist to disallow their use unless explicitly "turned off." For more information on the problems |
| with lambdas, please read <<a-note-on-lambdas,A Note on Lambdas>>. |
| |
| The `Traverser<S>` object provides access to: |
| |
| . The current traversed `S` object -- `Traverser.get()`. |
| . The current path traversed by the traverser -- `Traverser.path()`. |
| .. A helper shorthand to get a particular path-history object -- `Traverser.path(String) == Traverser.path().get(String)`. |
| . The number of times the traverser has gone through the current loop -- `Traverser.loops()`. |
| . The number of objects represented by this traverser -- `Traverser.bulk()`. |
| . The local data structure associated with this traverser -- `Traverser.sack()`. |
| . The side-effects associated with the traversal -- `Traverser.sideEffects()`. |
| .. A helper shorthand to get a particular side-effect -- `Traverser.sideEffect(String) == Traverser.sideEffects().get(String)`. |
| |
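| As a rough illustration only, the five general steps can be modeled in plain Python as generator functions over a |
| stream of traversers. The `Traverser` dataclass and the function names below are hypothetical stand-ins rather than |
| TinkerPop's Java API, and the model ignores paths, sacks, and side-effect registries. In this sketch `branch_step` |
| maps each traverser to a single value, whereas a real `branch()` option is a traversal that may emit many results. |
| |
```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Traverser:
    """Hypothetical stand-in for Traverser<S>: a value plus a bulk count."""
    value: Any
    bulk: int = 1

    def get(self) -> Any:
        return self.value


def map_step(traversers, fn):
    # map: exactly one output object per incoming traverser.
    for t in traversers:
        yield Traverser(fn(t), t.bulk)


def flat_map_step(traversers, fn):
    # flatMap: zero or more output objects per incoming traverser.
    for t in traversers:
        for obj in fn(t):
            yield Traverser(obj, t.bulk)


def filter_step(traversers, predicate):
    # filter: pass the traverser on only when the predicate holds.
    for t in traversers:
        if predicate(t):
            yield t


def side_effect_step(traversers, consumer):
    # sideEffect: do something with the traverser, then pass it on unchanged.
    for t in traversers:
        consumer(t)
        yield t


def branch_step(traversers, pick, options, default):
    # branch: route each traverser to the option selected by its pick token.
    for t in traversers:
        yield Traverser(options.get(pick(t), default)(t), t.bulk)


# Assumed sample data: two person records with name and age.
people = [{"name": "marko", "age": 29}, {"name": "vadas", "age": 27}]
seen = []
stream = (Traverser(p) for p in people)
stream = filter_step(stream, lambda t: t.get()["age"] > 20)
stream = side_effect_step(stream, lambda t: seen.append(t.get()["name"]))
stream = branch_step(
    stream,
    pick=lambda t: t.get()["name"],
    options={"marko": lambda t: t.get()["age"]},
    default=lambda t: t.get()["name"],
)
results = [t.get() for t in stream]
```
| |
| Because every function is a generator, nothing runs until `results` is built, which loosely mirrors the lazy, |
| pull-based evaluation of a traversal. |
| |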
| image:map-lambda.png[width=150,float=right] |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).out().values('name') <1> |
| g.V(1).out().map {it.get().value('name')} <2> |
| g.V(1).out().map(values('name')) <3> |
| ---- |
| |
| <1> An outgoing traversal from vertex 1 to the name values of the adjacent vertices. |
| <2> The same operation, but using a lambda to access the name property values. |
| <3> Again the same operation, but using the traversal representation of `map()`. |
| |
| image:filter-lambda.png[width=160,float=right] |
| [gremlin-groovy,modern] |
| ---- |
| g.V().filter {it.get().label() == 'person'} <1> |
| g.V().filter(label().is('person')) <2> |
| g.V().hasLabel('person') <3> |
| ---- |
| |
| <1> A filter that only allows the vertex to pass if it has the "person" label. |
| <2> The same operation, but using the traversal representation of `filter()`. |
| <3> The more specific `has()`-step is implemented as a `filter()` with the respective predicate. |
| |
| image:side-effect-lambda.png[width=175,float=right] |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person').sideEffect(System.out.&println) <1> |
| g.V().sideEffect(outE().count().store("o")). |
| sideEffect(inE().count().store("i")).cap("o","i") <2> |
| ---- |
| |
| <1> Whatever enters `sideEffect()` is passed to the next step, but some intervening process can occur. |
| <2> Compute the out- and in-degree for each vertex. Both `sideEffect()` are fed with the same vertex. |
| |
| image:branch-lambda.png[width=180,float=right] |
| [gremlin-groovy,modern] |
| ---- |
| g.V().branch {it.get().value('name')}. |
| option('marko', values('age')). |
| option(none, values('name')) <1> |
| g.V().branch(values('name')). |
| option('marko', values('age')). |
| option(none, values('name')) <2> |
| g.V().choose(has('name','marko'), |
| values('age'), |
| values('name')) <3> |
| ---- |
| |
| <1> If the vertex is "marko", get his age, else get the name of the vertex. |
| <2> The same operation, but using the traversal representation of `branch()`. |
| <3> The more specific boolean-based `choose()`-step is implemented as a `branch()`. |
| |
| [[start-steps]] |
| === Start Steps |
| |
| Not all steps are capable of starting a `GraphTraversal`. Only those steps on the `GraphTraversalSource` can do that. |
| Many of the methods on `GraphTraversalSource` are actually for its configuration. From that configured object, it is |
| then possible to use start steps to spawn a `GraphTraversal`. |
| |
| Configuration methods can be identified by their names, which make use of "with" as a prefix: |
| |
| * `with()` - Adds arbitrary configuration options which can be used by graph providers as configuration options. |
| * `withBulk()` - This value is `true` by default allowing for normal <<barrier-step,bulking>> operations, but when set |
| to `false`, introduces a subtle change in that behavior as shown in examples in <<sack-step,sack()-step>>. |
| * `withComputer()` - Adds a `Computer` that will be used to process the traversal (<<sparkgraphcomputer,example>>). |
| * `withSack()` - Adds a "sack" that can be accessed by traversals spawned from this source (<<sack-step,example>>). |
| * `withSideEffect()` - Adds an arbitrary `Object` to traversals spawned from this source which can be accessed as a |
| side-effect given the supplied key (<<math-step,example>>). |
| * `withStrategies()` - Includes additional `TraversalStrategy` instances to be applied to any traversals spawned from |
| the configured source (<<readonlystrategy, example>>). |
| * `withPath()` |
| * `withoutStrategies()` - Removes a particular `TraversalStrategy` from those to be applied to traversals spawned from |
| the configured source. |
| |
| Spawn steps, which actually yield a traversal, typically match the names of existing steps: |
| |
| * `addE()` - Adds an `Edge` to start the traversal (<<addedge-step, example>>). |
| * `addV()` - Adds a `Vertex` to start the traversal (<<addvertex-step, example>>). |
| * `E()` - Reads edges from the graph to start the traversal (<<graph-step, example>>). |
| * `inject()` - Inserts arbitrary objects to start the traversal (<<inject-step, example>>). |
| * `V()` - Reads vertices from the graph to start the traversal (<<graph-step, example>>). |
| |
| [[terminal-steps]] |
| === Terminal Steps |
| |
| Typically, when a step is concatenated to a traversal, a traversal is returned. In this way, a traversal is built up |
| in a link:https://en.wikipedia.org/wiki/Fluent_interface[fluent], link:https://en.wikipedia.org/wiki/Monoid[monadic] fashion. |
| However, some steps do not return a traversal, but instead, execute the traversal and return a result. These steps are known |
| as terminal steps (*terminal*) and they are explained via the examples below. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().out('created').hasNext() <1> |
| g.V().out('created').next() <2> |
| g.V().out('created').next(2) <3> |
| g.V().out('nothing').tryNext() <4> |
| g.V().out('created').toList() <5> |
| g.V().out('created').toSet() <6> |
| g.V().out('created').toBulkSet() <7> |
| results = ['blah',3] |
| g.V().out('created').fill(results) <8> |
| g.addV('person').iterate() <9> |
| ---- |
| |
| <1> `hasNext()` determines whether there are available results (not supported in `gremlin-javascript`). |
| <2> `next()` will return the next result. |
| <3> `next(n)` will return the next `n` results in a list (not supported in `gremlin-javascript` or Gremlin.NET). |
| <4> `tryNext()` will return an `Optional` and thus, is a composite of `hasNext()`/`next()` (only supported for JVM languages). |
| <5> `toList()` will return all results in a list. |
| <6> `toSet()` will return all results in a set and thus, duplicates removed (not supported in `gremlin-javascript`). |
| <7> `toBulkSet()` will return all results in a weighted set and thus, duplicates preserved via weighting (only supported for JVM languages). |
| <8> `fill(collection)` will put all results in the provided collection and return the collection when complete (only supported for JVM languages). |
| <9> `iterate()` does not exactly fit the definition of a terminal step in that it doesn't return a result, but still |
| returns a traversal - it does however behave as a terminal step in that it iterates the traversal and generates side |
| effects without returning the actual result. |
| |
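| The behavior of the terminal steps listed above can be approximated with a small Python wrapper around an iterator. |
| `MiniTraversal` and its method names are hypothetical and exist only for this sketch; a `Counter` stands in for the |
| weighted set returned by `toBulkSet()`, and `tryNext()`, `fill()` and `promise()` are left out. |
| |
```python
from collections import Counter


class MiniTraversal:
    """Hypothetical model: a lazy result stream with Gremlin-like terminators."""

    _DONE = object()  # sentinel meaning "nothing buffered" or "exhausted"

    def __init__(self, results):
        self._it = iter(results)
        self._peeked = self._DONE

    def has_next(self):
        # Peek one element ahead and buffer it so it is not lost.
        if self._peeked is self._DONE:
            self._peeked = next(self._it, self._DONE)
        return self._peeked is not self._DONE

    def next(self, n=None):
        if n is not None:
            # next(n): up to n results in a list (fewer if the stream runs dry).
            out = []
            while len(out) < n and self.has_next():
                out.append(self.next())
            return out
        if not self.has_next():
            raise StopIteration
        value, self._peeked = self._peeked, self._DONE
        return value

    def to_list(self):
        out = []
        while self.has_next():
            out.append(self.next())
        return out

    def to_set(self):
        return set(self.to_list())

    def to_bulk_set(self):
        # Duplicates are kept as counts ("weights") rather than repeated entries.
        return Counter(self.to_list())

    def iterate(self):
        # Drain the stream for its side effects and hand back the traversal.
        self.to_list()
        return self


# Names stand in for the vertices reached via out('created') on the toy graph.
created = ["lop", "ripple", "lop", "lop"]
first_two = MiniTraversal(created).next(2)
as_set = MiniTraversal(created).to_set()
as_bulk = MiniTraversal(created).to_bulk_set()
```
| |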
| There is also the `promise()` terminator step, which can only be used with remote traversals to |
| <<connecting-gremlin-server,Gremlin Server>> or <<connecting-rgp,RGPs>>. It starts a promise to execute a function |
| on the current `Traversal` that will be completed in the future. |
| |
| Finally, <<explain-step,`explain()`>>-step is also a terminal step and is described in its own section. |
| |
| [[addedge-step]] |
| === AddEdge Step |
| |
| link:http://en.wikipedia.org/wiki/Automated_reasoning[Reasoning] is the process of making explicit what is implicit |
| in the data. What is explicit in a graph are the objects of the graph -- i.e. vertices and edges. What is implicit |
| in the graph is the traversal. In other words, traversals expose meaning where the meaning is determined by the |
| traversal definition. For example, take the concept of a "co-developer." Two people are co-developers if they have |
| worked on the same project together. This concept can be represented as a traversal and thus, the concept of |
| "co-developers" can be derived. Moreover, what was once implicit can be made explicit via the `addE()`-step |
| (*map*/*sideEffect*). |
| |
| image::addedge-step.png[width=450] |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).as('a').out('created').in('created').where(neq('a')). |
| addE('co-developer').from('a').property('year',2009) <1> |
| g.V(3,4,5).aggregate('x').has('name','josh').as('a'). |
| select('x').unfold().hasLabel('software').addE('createdBy').to('a') <2> |
| g.V().as('a').out('created').addE('createdBy').to('a').property('acl','public') <3> |
| g.V(1).as('a').out('knows'). |
| addE('livesNear').from('a').property('year',2009). |
| inV().inE('livesNear').values('year') <4> |
| g.V().match( |
| __.as('a').out('knows').as('b'), |
| __.as('a').out('created').as('c'), |
| __.as('b').out('created').as('c')). |
| addE('friendlyCollaborator').from('a').to('b'). |
| property(id,23).property('project',select('c').values('name')) <5> |
| g.E(23).valueMap() |
| marko = g.V().has('name','marko').next() |
| peter = g.V().has('name','peter').next() |
| g.V(marko).addE('knows').to(peter) <6> |
| g.addE('knows').from(marko).to(peter) <7> |
| ---- |
| |
| <1> Add a co-developer edge with a year-property between marko and his collaborators. |
| <2> Add incoming createdBy edges from the josh-vertex to the lop- and ripple-vertices. |
| <3> Add an inverse createdBy edge for all created edges. |
| <4> The newly created edge is a traversable object. |
| <5> Two arbitrary bindings in a traversal can be joined ``from()``->``to()``, where `id` can be provided for graphs that |
| support user-provided ids. |
| <6> Add an edge between marko and peter given the directed (detached) vertex references. |
| <7> Add the same kind of edge between marko and peter, but spawn the traversal directly from the `GraphTraversalSource` with `addE()`. |
| |
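| The "co-developer" reasoning above can also be traced outside of Gremlin. The following plain-Python sketch assumes a |
| hand-written copy of the `created` edges of the toy graph and uses a hypothetical `co_developers()` helper; it derives |
| the implicit relation and then materializes it as explicit edge tuples, which is the role `addE()` plays in a traversal. |
| |
```python
# Assumed in-memory copy of the "created" edges (creator, project).
created = [
    ("marko", "lop"),
    ("josh", "lop"),
    ("josh", "ripple"),
    ("peter", "lop"),
]


def co_developers(person, created_edges):
    """Return the people who created at least one project that `person` created."""
    projects = {proj for who, proj in created_edges if who == person}
    return sorted({who for who, proj in created_edges
                   if proj in projects and who != person})


# Make the implicit relation explicit: one (from, label, to) tuple per collaborator.
co_developer_edges = [("marko", "co-developer", other)
                      for other in co_developers("marko", created)]
```
| |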
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#addE-java.lang.String-++[`addE(String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#addE-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`addE(Traversal)`] |
| |
| [[addvertex-step]] |
| === AddVertex Step |
| |
| The `addV()`-step is used to add vertices to the graph (*map*/*sideEffect*). For every incoming object, a vertex is |
| created. Moreover, `GraphTraversalSource` maintains an `addV()` method. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.addV('person').property('name','stephen') |
| g.V().values('name') |
| g.V().outE('knows').addV().property('name','nothing') |
| g.V().has('name','nothing') |
| g.V().has('name','nothing').bothE() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#addV--++[`addV()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#addV-java.lang.String-++[`addV(String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#addV-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`addV(Traversal)`] |
| |
| [[addproperty-step]] |
| === AddProperty Step |
| |
| The `property()`-step is used to add properties to the elements of the graph (*sideEffect*). Unlike `addV()` and |
| `addE()`, `property()` is a full sideEffect step in that it does not return the property it created, but the element |
| that streamed into it. Moreover, if `property()` follows an `addV()` or `addE()`, then it is "folded" into the |
| previous step to enable vertex and edge creation with all its properties in one creation operation. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).property('country','usa') |
| g.V(1).property('city','santa fe').property('state','new mexico').valueMap() |
| g.V(1).property(list,'age',35) <1> |
| g.V(1).valueMap() |
| g.V(1).property('friendWeight',outE('knows').values('weight').sum(),'acl','private') <2> |
| g.V(1).properties('friendWeight').valueMap() <3> |
| ---- |
| |
| <1> For vertices, a cardinality can be provided for <<vertex-properties,vertex properties>>. |
| <2> It is possible to select the property value (as well as key) via a traversal. |
| <3> For vertices, the `property()`-step can add meta-properties. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#property-java.lang.Object-java.lang.Object-java.lang.Object...-++[`property(Object, Object, Object...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#property-org.apache.tinkerpop.gremlin.structure.VertexProperty.Cardinality-java.lang.Object-java.lang.Object-java.lang.Object...-++[`property(Cardinality, Object, Object, Object...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/structure/VertexProperty.Cardinality.html++[`Cardinality`] |
| |
| [[aggregate-step]] |
| [[store-step]] |
| === Aggregate Step |
| |
| image::aggregate-step.png[width=800] |
| |
| The `aggregate()`-step (*sideEffect*) is used to aggregate all the objects at a particular point of traversal into a |
| `Collection`. The step uses `Scope` to help determine the aggregating behavior. For `global` scope this means that |
| the step will use link:http://en.wikipedia.org/wiki/Eager_evaluation[eager evaluation] in that no objects continue on |
| until all previous objects have been fully aggregated. The eager evaluation model is crucial in situations |
| where everything at a particular point is required for future computation. When `aggregate()` is called without a |
| `Scope`, the scope defaults to `global`. An example is provided below. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).out('created') <1> |
| g.V(1).out('created').aggregate('x') <2> |
| g.V(1).out('created').aggregate(global, 'x') <3> |
| g.V(1).out('created').aggregate('x').in('created') <4> |
| g.V(1).out('created').aggregate('x').in('created').out('created') <5> |
| g.V(1).out('created').aggregate('x').in('created').out('created'). |
| where(without('x')).values('name') <6> |
| ---- |
| |
| <1> What has marko created? |
| <2> Aggregate all his creations. |
| <3> Identical to the previous line. |
| <4> Who are marko's collaborators? |
| <5> What have marko's collaborators created? |
| <6> What have marko's collaborators created that he hasn't created? |
| |
| In link:http://en.wikipedia.org/wiki/Recommender_system[recommendation systems], the above pattern is used: |
| |
| "What has userA liked? Who else has liked those things? What have they liked that userA hasn't already liked?" |
| |
| Finally, `aggregate()`-step can be modulated via `by()`-projection. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().out('knows').aggregate('x').cap('x') |
| g.V().out('knows').aggregate('x').by('name').cap('x') |
| ---- |
| |
| For `local` scope the aggregation will occur in a link:http://en.wikipedia.org/wiki/Lazy_evaluation[lazy] fashion. |
| |
| NOTE: Prior to 3.4.3, `local` (i.e. lazy) aggregation was handled by `store()`-step. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().aggregate(global, 'x').limit(1).cap('x') |
| g.V().aggregate(local, 'x').limit(1).cap('x') |
| g.withoutStrategies(EarlyLimitStrategy).V().aggregate(local,'x').limit(1).cap('x') |
| ---- |
| |
| It is important to note that `EarlyLimitStrategy` introduced in 3.3.5 alters the behavior of `aggregate(local)`. |
| Without that strategy (which is installed by default), there are two results in the `aggregate()` side-effect even |
| though the interval selection is for 1 object. Realize that when the second object is on its way to the `range()` |
| filter (i.e. `[0..1]`), it passes through `aggregate()` and is thus stored before it is filtered. |
| |
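| The difference between eager and lazy aggregation in front of a range filter can be reproduced with Python |
| generators. This is a sketch under stated assumptions: `aggregate_global`, `aggregate_local` and `range_filter` are |
| hypothetical names, and `range_filter` is modeled so that it only learns it is finished when the next object |
| arrives, which is the situation described in the paragraph above. |
| |
```python
def aggregate_global(source, store):
    # Eager: drain every upstream object into the store before emitting anything.
    buffered = list(source)
    store.extend(buffered)
    yield from buffered


def aggregate_local(source, store):
    # Lazy: record each object only at the moment it flows through.
    for obj in source:
        store.append(obj)
        yield obj


def range_filter(source, high):
    # Assumed model: the filter notices it is done only when another object arrives.
    passed = 0
    for obj in source:
        if passed >= high:
            return
        passed += 1
        yield obj


vertices = ["v1", "v2", "v3", "v4", "v5", "v6"]

eager_store = []
eager_out = list(range_filter(aggregate_global(iter(vertices), eager_store), 1))

lazy_store = []
lazy_out = list(range_filter(aggregate_local(iter(vertices), lazy_store), 1))
```
| |
| Both pipelines emit a single object, but the eager store ends up holding all six objects while the lazy store holds |
| two: the one that passed and the one that was pulled through before the filter stopped. |
| |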
| [gremlin-groovy,modern] |
| ---- |
| g.E().store('x').by('weight').cap('x') |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#aggregate-java.lang.String-++[`aggregate(String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#aggregate-org.apache.tinkerpop.gremlin.process.traversal.Scope,java.lang.String-++[`aggregate(Scope,String)`] |
| |
| [[and-step]] |
| === And Step |
| |
| The `and()`-step ensures that all provided traversals yield a result (*filter*). Please see <<or-step,`or()`>> for or-semantics. |
| |
| [NOTE, caption=Python] |
| ==== |
| The term `and` is a reserved word in Python, and therefore must be referred to in Gremlin with `and_()`. |
| ==== |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().and( |
| outE('knows'), |
| values('age').is(lt(30))). |
| values('name') |
| ---- |
| |
| The `and()`-step can take an arbitrary number of traversals. All traversals must produce at least one output for the |
| original traverser to pass to the next step. |
| |
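| As a minimal sketch of these semantics, the filter below keeps an object only when every child "traversal" (here |
| simply a function returning an iterator) yields at least one result. The `and_filter` helper and the vertex records |
| are assumptions made for illustration; only the person vertices of the toy graph are modeled. |
| |
```python
def and_filter(objects, *traversals):
    """Keep an object only if every child traversal yields at least one result."""
    for obj in objects:
        # any(...) stops at the first result, so each child is iterated minimally.
        if all(any(True for _ in traversal(obj)) for traversal in traversals):
            yield obj


# Assumed person records: name, age, and the targets of outgoing "knows" edges.
vertices = [
    {"name": "marko", "age": 29, "knows": ["vadas", "josh"]},
    {"name": "vadas", "age": 27, "knows": []},
    {"name": "josh", "age": 32, "knows": []},
    {"name": "peter", "age": 35, "knows": []},
]


def out_knows(vertex):
    # Counterpart of outE('knows'): one result per outgoing edge.
    return iter(vertex["knows"])


def age_below_30(vertex):
    # Counterpart of values('age').is(lt(30)): the age if it passes, else nothing.
    return iter([vertex["age"]] if vertex["age"] < 30 else [])


names = [v["name"] for v in and_filter(vertices, out_knows, age_below_30)]
```
| |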
| An link:http://en.wikipedia.org/wiki/Infix_notation[infix notation] can be used as well. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().where(outE('created').and().outE('knows')).values('name') |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#and-org.apache.tinkerpop.gremlin.process.traversal.Traversal...-++[`and(Traversal...)`] |
| |
| [[as-step]] |
| === As Step |
| |
| The `as()`-step is not a real step, but a "step modulator" similar to <<by-step,`by()`>> and <<option-step,`option()`>>. |
| With `as()`, it is possible to provide a label to the step that can later be accessed by steps and data structures |
| that make use of such labels -- e.g., <<select-step,`select()`>>, <<match-step,`match()`>>, and path. |
| |
| [NOTE, caption=Groovy] |
| ==== |
| The term `as` is a reserved word in Groovy and therefore, when used as part of an anonymous traversal, must be referred |
| to in Gremlin with the double underscore `__.as()`. |
| ==== |
| |
| [NOTE, caption=Python] |
| ==== |
| The term `as` is a reserved word in Python, and therefore must be referred to in Gremlin with `as_()`. |
| ==== |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').out('created').as('b').select('a','b') <1> |
| g.V().as('a').out('created').as('b').select('a','b').by('name') <2> |
| ---- |
| |
| <1> Select the objects labeled "a" and "b" from the path. |
| <2> Select the objects labeled "a" and "b" from the path and, for each object, project its name value. |
| |
| A step can have any number of labels associated with it. This is useful for referencing the same step multiple times in a future step. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('software').as('a','b','c'). |
| select('a','b','c'). |
| by('name'). |
| by('lang'). |
| by(__.in('created').values('name').fold()) |
| ---- |
| |
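| One way to picture step labels is as tags attached to positions in the traverser's path. The sketch below is a |
| simplified model, not TinkerPop's `Path` implementation: a path is a list of `(labels, object)` pairs, and the |
| hypothetical helpers `select()` and `select_by()` look up labeled objects and then apply one projection per label. |
| |
```python
def select(path, *labels):
    """Look up labeled objects in a path given as a list of (labels, object) pairs."""
    found = {}
    for step_labels, obj in path:
        for label in labels:
            if label in step_labels:
                found[label] = obj  # in this model, a later match overwrites an earlier one
    return found


def select_by(path, labels, projections):
    """Apply one projection per selected label, in order (one by() per label)."""
    selected = select(path, *labels)
    return {label: project(selected[label])
            for label, project in zip(labels, projections)}


# Assumed record for the "lop" software vertex, including its creators.
lop = {"name": "lop", "lang": "java", "creators": ["marko", "josh", "peter"]}

# A single step carrying three labels, as in as('a','b','c').
multi_label_path = [({"a", "b", "c"}, lop)]
projected = select_by(
    multi_label_path,
    ["a", "b", "c"],
    [lambda v: v["name"], lambda v: v["lang"], lambda v: v["creators"]],
)

# Two steps with one label each, as in as('a').out('created').as('b').
two_step_path = [({"a"}, "marko"), ({"b"}, "lop")]
pair = select(two_step_path, "a", "b")
```
| |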
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#as-java.lang.String-java.lang.String...-++[`as(String,String...)`] |
| |
| [[barrier-step]] |
| === Barrier Step |
| |
| The `barrier()`-step (*barrier*) turns the lazy traversal pipeline into a bulk-synchronous pipeline. This step is |
| useful in the following situations: |
| |
| * When everything prior to `barrier()` needs to be executed before moving on to the steps after the `barrier()` (i.e. ordering). |
| * When "stalling" the traversal may lead to a "bulking optimization" in traversals that repeatedly touch many of the same elements (i.e. optimizing). |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().sideEffect{println "first: ${it}"}.sideEffect{println "second: ${it}"}.iterate() |
| g.V().sideEffect{println "first: ${it}"}.barrier().sideEffect{println "second: ${it}"}.iterate() |
| ---- |
| |
| The theory behind a "bulking optimization" is simple. If there are one million traversers at vertex 1, then there is |
| no need to calculate one million `both()`-computations. Instead, represent those one million traversers as a single |
| traverser with a `Traverser.bulk()` equal to one million and execute `both()` once. A bulking optimization example is |
| made more salient on a larger graph. Therefore, the example below leverages the <<grateful-dead,Grateful Dead graph>>. |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerGraph.open() |
| g = traversal().withEmbedded(graph) |
| g.io('data/grateful-dead.xml').read().iterate() |
| g = traversal().withEmbedded(graph).withoutStrategies(LazyBarrierStrategy) <1> |
| clockWithResult(1){g.V().both().both().both().count().next()} <2> |
| clockWithResult(1){g.V().repeat(both()).times(3).count().next()} <3> |
| clockWithResult(1){g.V().both().barrier().both().barrier().both().barrier().count().next()} <4> |
| ---- |
| |
| <1> Explicitly remove `LazyBarrierStrategy` which yields a bulking optimization. |
| <2> A non-bulking traversal where each traverser is processed. |
| <3> Each traverser entering `repeat()` has its recursion bulked. |
| <4> A bulking traversal where implicit traversers are not processed. |
| |
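| The arithmetic behind the bulking optimization can be checked on a tiny example. The sketch below assumes a |
| three-vertex triangle graph and two hypothetical helpers: one expands every traverser individually, the other merges |
| traversers at the same vertex into a single entry with a bulk count (a `Counter` plays the role of `Traverser.bulk()`). |
| |
```python
from collections import Counter

# Assumed toy graph: a triangle, stored as an undirected adjacency map.
GRAPH = {1: [2, 3], 2: [1, 3], 3: [1, 2]}


def both_unbulked(traversers):
    """One both() computation per traverser, duplicates included."""
    out, computations = [], 0
    for vertex in traversers:
        computations += 1
        out.extend(GRAPH[vertex])
    return out, computations


def both_bulked(bulks):
    """One both() computation per distinct vertex; the bulk carries the count."""
    out, computations = Counter(), 0
    for vertex, bulk in bulks.items():
        computations += 1
        for neighbor in GRAPH[vertex]:
            out[neighbor] += bulk
    return out, computations


unbulked, unbulked_work = list(GRAPH), 0
bulked, bulked_work = Counter(GRAPH.keys()), 0
for _ in range(3):  # three hops, like both().both().both()
    unbulked, work = both_unbulked(unbulked)
    unbulked_work += work
    bulked, work = both_bulked(bulked)
    bulked_work += work

unbulked_count = len(unbulked)
bulked_count = sum(bulked.values())
```
| |
| Both variants arrive at the same final count, but the bulked variant performs a fixed number of expansions per hop |
| (one per distinct vertex) while the unbulked variant doubles its work with every hop on this graph. |
| |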
| If `barrier()` is provided an integer argument, then the barrier will only hold `n`-number of unique traversers in its |
| barrier before draining the aggregated traversers to the next step. This is useful in the aforementioned bulking |
| optimization scenario with the added benefit of reducing the risk of an out-of-memory exception. |
| |
| `LazyBarrierStrategy` inserts `barrier()`-steps into a traversal where appropriate in order to gain the |
| "bulking optimization." |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerGraph.open() |
| g = traversal().withEmbedded(graph) <1> |
| g.io('data/grateful-dead.xml').read().iterate() |
| clockWithResult(1){g.V().both().both().both().count().next()} |
| g.V().both().both().both().count().iterate().toString() <2> |
| ---- |
| |
| <1> `LazyBarrierStrategy` is a default strategy and thus, does not need to be explicitly activated. |
| <2> With `LazyBarrierStrategy` activated, `barrier()`-steps are automatically inserted where appropriate. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#barrier--++[`barrier()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#barrier-java.util.function.Consumer-++[`barrier(Consumer)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#barrier-int-++[`barrier(int)`] |
| |
| [[by-step]] |
| === By Step |
| |
| The `by()`-step is not an actual step, but instead is a "step-modulator" similar to <<as-step,`as()`>> and |
| <<option-step,`option()`>>. If a step is able to accept traversals, functions, comparators, etc. then `by()` is the |
| means by which they are added. The general pattern is `step().by()...by()`. Some steps can only accept one `by()` |
| while others can take an arbitrary number. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().group().by(bothE().count()) <1> |
| g.V().group().by(bothE().count()).by('name') <2> |
| g.V().group().by(bothE().count()).by(count()) <3> |
| ---- |
| |
| <1> `by(bothE().count())` will group the elements by their edge count (*traversal*). |
| <2> `by('name')` will process the grouped elements by their name (*element property projection*). |
| <3> `by(count())` will count the number of elements in each group (*traversal*). |
| |
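| The `step().by()...by()` pattern for `group()` can be imitated in Python with a key function and a value function. |
| This is only a sketch: the `group()` helper is hypothetical, the vertex records are hand-written with an assumed |
| `degree` field in place of `bothE().count()`, and the value function receives the whole list of grouped members. |
| |
```python
from collections import defaultdict


def group(objects, key_by, value_by=lambda members: members):
    """Bucket objects by key_by(obj), then post-process each bucket with value_by."""
    buckets = defaultdict(list)
    for obj in objects:
        buckets[key_by(obj)].append(obj)
    return {key: value_by(members) for key, members in buckets.items()}


# Assumed records: each vertex with its total number of incident edges.
vertices = [
    {"name": "marko", "degree": 3},
    {"name": "vadas", "degree": 1},
    {"name": "lop", "degree": 3},
    {"name": "josh", "degree": 3},
    {"name": "ripple", "degree": 1},
    {"name": "peter", "degree": 1},
]


def by_degree(vertex):
    return vertex["degree"]


# First modulator only: group the elements themselves by edge count.
grouped = group(vertices, by_degree)
# Second modulator as a projection: keep only the names in each group.
names = group(vertices, by_degree, lambda members: [m["name"] for m in members])
# Second modulator as a reduction: count the members of each group.
counts = group(vertices, by_degree, len)
```
| |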
| The following steps all support `by()`-modulation. Note that the semantics of such modulation should be understood |
| on a step-by-step level and thus, as discussed in their respective sections of the documentation. |
| |
| * <<dedup-step, `dedup()`>>: dedup on the results of a `by()`-modulation. |
| * <<cyclicpath-step, `cyclicPath()`>>: filter if the traverser's path is cyclic given `by()`-modulation. |
| * <<simplepath-step, `simplePath()`>>: filter if the traverser's path is simple given `by()`-modulation. |
| * <<sample-step, `sample()`>>: sample using the value returned by `by()`-modulation. |
| * <<where-step, `where()`>>: determine the predicate given the testing of the results of `by()`-modulation. |
| * <<groupcount-step,`groupCount()`>>: count those groups where the group keys are the result of `by()`-modulation. |
| * <<group-step, `group()`>>: create group keys and values according to `by()`-modulation. |
| * <<order-step, `order()`>>: order the objects by the results of a `by()`-modulation. |
| * <<path-step, `path()`>>: get the path of the traverser where each path element is `by()`-modulated. |
| * <<project-step, `project()`>>: project a map of results given various `by()`-modulations off the current object. |
| * <<select-step, `select()`>>: select path elements and transform them via `by()`-modulation. |
| * <<tree-step, `tree()`>>: get a tree of traverser objects where the objects have been `by()`-modulated. |
| * <<aggregate-step, `aggregate()`>>: aggregate all objects into a set but only store their `by()`-modulated values. |
| * <<store-step, `store()`>>: store all objects into a set but only store their `by()`-modulated values. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by--++[`by()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by-java.util.Comparator-++[`by(Comparator)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by-java.util.function.Function-java.util.Comparator-++[`by(Function,Comparator)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by-java.util.function.Function-++[`by(Function)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by-org.apache.tinkerpop.gremlin.process.traversal.Order-++[`by(Order)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by-java.lang.String-++[`by(String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by-java.lang.String-java.util.Comparator-++[`by(String,Comparator)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by-org.apache.tinkerpop.gremlin.structure.T-++[`by(T)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`by(Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#by-org.apache.tinkerpop.gremlin.process.traversal.Traversal-java.util.Comparator-++[`by(Traversal,Comparator)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/structure/T.html++[`T`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Order.html++[`Order`] |
| |
| |
| [[cap-step]] |
| === Cap Step |
| |
| The `cap()`-step (*barrier*) iterates the traversal up to itself and emits the sideEffect referenced by the provided |
| key. If multiple keys are provided, then a `Map<String,Object>` of sideEffects is emitted. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().groupCount('a').by(label).cap('a') <1> |
| g.V().groupCount('a').by(label).groupCount('b').by(outE().count()).cap('a','b') <2> |
| ---- |
| |
| <1> Group and count vertices by their label. Emit the side effect labeled 'a', which is the group count by label. |
| <2> Same as statement 1, but also emit the side effect labeled 'b' which groups vertices by the number of out edges. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#cap-java.lang.String-java.lang.String...-++[`cap(String,String...)`] |
| |
| [[choose-step]] |
| === Choose Step |
| |
| image::choose-step.png[width=700] |
| |
| The `choose()`-step (*branch*) routes the current traverser to a particular traversal branch option. With `choose()`, |
| it is possible to implement if/then/else-semantics as well as more complicated selections. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person'). |
| choose(values('age').is(lte(30)), |
| __.in(), |
| __.out()).values('name') <1> |
| g.V().hasLabel('person'). |
| choose(values('age')). |
| option(27, __.in()). |
| option(32, __.out()).values('name') <2> |
| ---- |
| |
| <1> If the traversal yields an element, then do `in`, else do `out` (i.e. true/false-based option selection). |
| <2> Use the result of the traversal as a key to the map of traversal options (i.e. value-based option selection). |
| |
| If the "false"-branch is not provided, then if/then-semantics are implemented. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().choose(hasLabel('person'), out('created')).values('name') <1> |
| g.V().choose(hasLabel('person'), out('created'), identity()).values('name') <2> |
| ---- |
| |
| <1> If the vertex is a person, emit the vertices they created, else emit the vertex. |
| <2> If/then/else with an `identity()` on the false-branch is equivalent to if/then with no false-branch. |
| |
| Note that `choose()` can have an arbitrary number of options and moreover, can take an anonymous traversal as its choice function. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person'). |
| choose(values('name')). |
| option('marko', values('age')). |
| option('josh', values('name')). |
| option('vadas', elementMap()). |
| option('peter', label()) |
| ---- |
| |
| The `choose()`-step can leverage the `Pick.none` option match. For anything that does not match a specified option, the `none`-option is taken. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person'). |
| choose(values('name')). |
| option('marko', values('age')). |
| option(none, values('name')) |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#choose-java.util.function.Function-++[`choose(Function)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#choose-java.util.function.Predicate-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`choose(Predicate,Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#choose-java.util.function.Predicate-org.apache.tinkerpop.gremlin.process.traversal.Traversal-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`choose(Predicate,Traversal,Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#choose-org.apache.tinkerpop.gremlin.process.traversal.Traversal-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`choose(Traversal,Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#choose-org.apache.tinkerpop.gremlin.process.traversal.Traversal-org.apache.tinkerpop.gremlin.process.traversal.Traversal-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`choose(Traversal,Traversal,Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#choose-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`choose(Traversal)`] |
| |
| [[coalesce-step]] |
| === Coalesce Step |
| |
| The `coalesce()`-step evaluates the provided traversals in order and returns the results of the first traversal that |
| emits at least one element. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).coalesce(outE('knows'), outE('created')).inV().path().by('name').by(label) |
| g.V(1).coalesce(outE('created'), outE('knows')).inV().path().by('name').by(label) |
| g.V(1).property('nickname', 'okram') |
| g.V().hasLabel('person').coalesce(values('nickname'), values('name')) |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#coalesce-org.apache.tinkerpop.gremlin.process.traversal.Traversal...-++[`coalesce(Traversal...)`] |
| |
| [[coin-step]] |
| === Coin Step |
| |
| To randomly filter out a traverser, use the `coin()`-step (*filter*). The provided double argument biases the "coin toss." |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().coin(0.5) |
| g.V().coin(0.0) |
| g.V().coin(1.0) |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#coin-double-++[`coin(double)`] |
| |
| [[connectedcomponent-step]] |
| === ConnectedComponent Step |
| |
| The `connectedComponent()` step performs a computation to identify link:https://en.wikipedia.org/wiki/Connected_component_(graph_theory)[Connected Component] |
| instances in a graph. When this step completes, the vertices will be labelled with a component identifier to denote |
| the component to which they are associated. |
| |
| IMPORTANT: The `connectedComponent()`-step is a `VertexComputing`-step and as such, can only be used against a graph |
| that supports `GraphComputer` (OLAP). |
| |
| [gremlin-groovy,modern] |
| ---- |
| g = traversal().withEmbedded(graph).withComputer() |
| g.V(). |
| connectedComponent(). |
| with(ConnectedComponent.propertyName, 'component'). |
| project('name','component'). |
| by('name'). |
| by('component') |
| g.V().hasLabel('person'). |
| connectedComponent(). |
| with(ConnectedComponent.propertyName, 'component'). |
| with(ConnectedComponent.edges, outE('knows')). |
| project('name','component'). |
| by('name'). |
| by('component') |
| ---- |
| |
| Note the use of the `with()` modulating step which provides configuration options to the algorithm. It takes |
| configuration keys from the `ConnectedComponent` class, which is automatically imported to the Gremlin Console. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#connectedComponent--++[`connectedComponent()`] |
| |
| [[constant-step]] |
| === Constant Step |
| |
| To specify a constant value for a traverser, use the `constant()`-step (*map*). This is often useful with conditional |
| steps like <<choose-step,`choose()`-step>> or <<coalesce-step,`coalesce()`-step>>. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().choose(hasLabel('person'), |
| values('name'), |
| constant('inhuman')) <1> |
| g.V().coalesce( |
| hasLabel('person').values('name'), |
| constant('inhuman')) <2> |
| ---- |
| |
| <1> Show the names of people, but show "inhuman" for other vertices. |
| <2> Same as statement 1 (unless there is a person vertex with no name). |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#constant-E2-++[`constant(Object)`] |
| |
| [[count-step]] |
| === Count Step |
| |
| image::count-step.png[width=195] |
| |
| The `count()`-step (*map*) counts the total number of represented traversers in the stream (i.e. the bulk count). |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().count() |
| g.V().hasLabel('person').count() |
| g.V().hasLabel('person').outE('created').count().path() <1> |
| g.V().hasLabel('person').outE('created').count().map {it.get() * 10}.path() <2> |
| ---- |
| |
| <1> `count()`-step is a <<a-note-on-barrier-steps,reducing barrier step>> meaning that all of the previous traversers are folded into a new traverser. |
| <2> The path of the traverser emanating from `count()` starts at `count()`. |
| |
| IMPORTANT: `count(local)` counts the current, local object (not the objects in the traversal stream). This works for |
| `Collection`- and `Map`-type objects. For any other object, a count of 1 is returned. |
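| |
| As a minimal sketch of the difference, assuming the "modern" toy graph used throughout this section, the |
| traversals below apply `count(local)` to a list, a map, and a plain value. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('name').fold().count(local) <1> |
| g.V().group().by(label).count(local) <2> |
| g.V().values('name').count(local) <3> |
| ---- |
| |
| <1> `fold()` emits a single list, so `count(local)` returns the size of that list. |
| <2> `group()` emits a single map, so `count(local)` returns the number of entries in it. |
| <3> Each name is neither a `Collection` nor a `Map`, so every traverser yields a count of 1. |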
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#count--++[`count()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#count-org.apache.tinkerpop.gremlin.process.traversal.Scope-++[`count(Scope)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[cyclicpath-step]] |
| === CyclicPath Step |
| |
| image::cyclicpath-step.png[width=400] |
| |
| Each traverser maintains its history through the traversal over the graph -- i.e. its <<path-data-structure,path>>. |
| If it is important that the traverser repeat its course, then the `cyclicPath()`-step should be used (*filter*). The step |
| analyzes the path of the traverser thus far and if there are no repeats, the traverser is filtered out over the |
| traversal computation. If non-cyclic behavior is desired, see <<simplepath-step,`simplePath()`>>. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).both().both() |
| g.V(1).both().both().cyclicPath() |
| g.V(1).both().both().cyclicPath().path() |
| g.V(1).as('a').out('created').as('b'). |
| in('created').as('c'). |
| cyclicPath(). |
| path() |
| g.V(1).as('a').out('created').as('b'). |
| in('created').as('c'). |
| cyclicPath().from('a').to('b'). |
| path() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#cyclicPath--++[`cyclicPath()`] |
| |
| [[dedup-step]] |
| === Dedup Step |
| |
| With `dedup()`-step (*filter*), repeatedly seen objects are removed from the traversal stream. Note that if a |
| traverser's bulk is greater than 1, then it is set to 1 before being emitted. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('lang') |
| g.V().values('lang').dedup() |
| g.V(1).repeat(bothE('created').dedup().otherV()).emit().path() <1> |
| ---- |
| |
| <1> Traverse all `created` edges, but don't touch any edge twice. |
| |
| If a by-step modulation is provided to `dedup()`, then the object is processed accordingly prior to determining if it |
| has been seen or not. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().elementMap('name') |
| g.V().dedup().by(label).values('name') |
| ---- |
| |
| Finally, if `dedup()` is provided an array of strings, then it will ensure that the de-duplication is not with respect |
| to the current traverser object, but to the path history of the traverser. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').out('created').as('b').in('created').as('c').select('a','b','c') |
| g.V().as('a').out('created').as('b').in('created').as('c').dedup('a','b').select('a','b','c') <1> |
| ---- |
| |
| <1> If the current `a` and `b` combination has been seen previously, then filter the traverser. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#dedup-org.apache.tinkerpop.gremlin.process.traversal.Scope-java.lang.String...-++[`dedup(Scope,String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#dedup-java.lang.String...-++[`dedup(String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[drop-step]] |
| === Drop Step |
| |
| The `drop()`-step (*filter*/*sideEffect*) is used to remove elements and properties from the graph. It |
| is a filter step because the traversal yields no outgoing objects. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().outE().drop() |
| g.E() |
| g.V().properties('name').drop() |
| g.V().elementMap() |
| g.V().drop() |
| g.V() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#drop--++[`drop()`] |
| |
| [[elementmap-step]] |
| === ElementMap Step |
| |
| The `elementMap()`-step yields a `Map` representation of the structure of an element. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().elementMap() |
| g.V().elementMap('age') |
| g.V().elementMap('age','blah') |
| g.E().elementMap() |
| ---- |
| |
| It is important to note that the map of a vertex assumes that cardinality for each key is `single` and if it is `list` |
| then only the first item encountered will be returned. As `single` is the more common cardinality for properties this |
| assumption should serve the greatest number of use cases. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V().elementMap() |
| g.V().has('name','marko').properties('location') |
| g.V().has('name','marko').properties('location').elementMap() |
| ---- |
| |
| IMPORTANT: The `elementMap()`-step does not return the vertex labels for incident vertices when using `GraphComputer` |
| as the `id` is the only available data to the star graph. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#elementMap-java.lang.String...-++[`elementMap(String...)`] |
| |
| [[emit-step]] |
| === Emit Step |
| |
| The `emit`-step is not an actual step, but is instead a step modulator for `<<repeat-step,repeat()>>` (find more |
| documentation on the `emit()` there). |
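| |
| A brief sketch of its effect, assuming the "modern" toy graph. The position of `emit()` relative to `repeat()` |
| determines whether the starting traverser is also emitted. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).repeat(out()).times(2).values('name') <1> |
| g.V(1).repeat(out()).times(2).emit().values('name') <2> |
| g.V(1).emit().repeat(out()).times(2).values('name') <3> |
| ---- |
| |
| <1> Without `emit()`, only the traversers that complete both iterations leave the `repeat()`. |
| <2> With `emit()` placed after `repeat()`, traversers are also emitted at the end of each iteration. |
| <3> With `emit()` placed before `repeat()`, the starting vertex is emitted as well. |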
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#emit--++[`emit()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#emit-java.util.function.Predicate-++[`emit(Predicate)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#emit-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`emit(Traversal)`] |
| |
| [[explain-step]] |
| === Explain Step |
| |
| The `explain()`-step (*terminal*) will return a `TraversalExplanation`. A traversal explanation details how the |
| traversal (prior to `explain()`) will be compiled given the registered <<traversalstrategy,traversal strategies>>. |
| A `TraversalExplanation` has a `toString()` representation with three columns. The first column is the |
| traversal strategy being applied. The second column is the traversal strategy category: [D]ecoration, [O]ptimization, |
| [P]rovider optimization, [F]inalization, and [V]erification. Finally, the third column is the state of the traversal |
| post strategy application. The final traversal is the resultant execution plan. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person').outE().identity().inV().count().is(gt(5)).explain() |
| ---- |
| |
| For traversal profiling information, please see <<profile-step,`profile()`>>-step. |
| |
| [[fold-step]] |
| === Fold Step |
| |
| There are situations when the traversal stream needs a "barrier" to aggregate all the objects and emit a computation |
| that is a function of the aggregate. The `fold()`-step (*map*) is one particular instance of this. Please see |
| <<unfold-step,`unfold()`>>-step for the inverse functionality. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).out('knows').values('name') |
| g.V(1).out('knows').values('name').fold() <1> |
| g.V(1).out('knows').values('name').fold().next().getClass() <2> |
| g.V(1).out('knows').values('name').fold(0) {a,b -> a + b.length()} <3> |
| g.V().values('age').fold(0) {a,b -> a + b} <4> |
| g.V().values('age').fold(0, sum) <5> |
| g.V().values('age').sum() <6> |
| ---- |
| |
| <1> A parameterless `fold()` will aggregate all the objects into a list and then emit the list. |
| <2> A verification of the type of list returned. |
| <3> `fold()` can be provided two arguments -- a seed value and a reduce bi-function ("vadas" is 5 characters + "josh" with 4 characters). |
| <4> What is the total age of the people in the graph? |
| <5> The same as before, but using a built-in bi-function. |
| <6> The same as before, but using the <<sum-step,`sum()`-step>>. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#fold--++[`fold()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#fold-E2-java.util.function.BiFunction-++[`fold(Object,BiFunction)`] |
| |
| [[from-step]] |
| === From Step |
| |
| The `from()`-step is not an actual step, but instead is a "step-modulator" similar to <<as-step,`as()`>> and |
| <<by-step,`by()`>>. If a step is able to accept traversals or strings then `from()` is the |
| means by which they are added. The general pattern is `step().from()`. See <<to-step,`to()`>>-step. |
| |
| The list of steps that support `from()`-modulation are: <<simplepath-step,`simplePath()`>>, <<cyclicpath-step,`cyclicPath()`>>, |
| <<path-step,`path()`>>, and <<addedge-step,`addE()`>>. |
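| |
| A small sketch of the `step().from()` pattern, assuming the "modern" toy graph. The step labels `a`, `b`, and `c` |
| are illustrative names only. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).as('a').out('knows').as('b').out('created').as('c').path().by('name') <1> |
| g.V(1).as('a').out('knows').as('b').out('created').as('c').path().from('b').by('name') <2> |
| g.V(1).as('a').out('knows').as('b').out('created').as('c').path().from('a').to('b').by('name') <3> |
| ---- |
| |
| <1> The full path of each traverser. |
| <2> Only the portion of the path starting at the step labeled `b`. |
| <3> Only the portion of the path between the steps labeled `a` and `b`. |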
| |
| [NOTE, caption=Javascript] |
| ==== |
| The term `from` is a reserved word in Javascript, and therefore must be referred to in Gremlin with `from_()`. |
| ==== |
| |
| [NOTE, caption=Python] |
| ==== |
| The term `from` is a reserved word in Python, and therefore must be referred to in Gremlin with `from_()`. |
| ==== |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#from-java.lang.String-++[`from(String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#from-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`from(Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#from-org.apache.tinkerpop.gremlin.structure.Vertex-++[`from(Vertex)`] |
| |
| [[graph-step]] |
| === Graph Step |
| |
| Graph steps are those that read vertices, `V()`, or edges, `E()`, from the graph. The `V()`-step is usually used to |
| start a `GraphTraversal`, but can also be used mid-traversal. The `E()`-step on the other hand can only be used as a |
| start step. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1) <1> |
| g.V().has('name', within('marko', 'vadas', 'josh')).as('person'). |
| V().has('name', within('lop', 'ripple')).addE('uses').from('person') <2> |
| g.E(11) <3> |
| g.E().hasLabel('knows').has('weight', gt(0.75)) |
| ---- |
| |
| <1> Find the vertex by its unique identifier (i.e. `T.id`) - not all graphs will use a numeric value for their identifier. |
| <2> An example where `V()` is used both as a start step and in the middle of a traversal. |
| <3> Find the edge by its unique identifier (i.e. `T.id`) - not all graphs will use a numeric value for their identifier. |
| |
| NOTE: Whether a mid-traversal `V()` uses an index or not depends on a) whether a suitable index exists and b) whether the |
| particular graph system provider implemented this functionality. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().has('name', within('marko', 'vadas', 'josh')).as('person'). |
| V().has('name', within('lop', 'ripple')).addE('uses').from('person').toString() <1> |
| g.V().has('name', within('marko', 'vadas', 'josh')).as('person'). |
| V().has('name', within('lop', 'ripple')).addE('uses').from('person').iterate().toString() <2> |
| ---- |
| |
| <1> Normally the `V()`-step will iterate over all vertices. However, graph strategies can fold ``HasContainer``'s into a `GraphStep` to allow index lookups. |
| <2> Whether the graph system provider supports mid-traversal `V()` index lookups or not can easily be determined by inspecting the `toString()` output of the iterated traversal. If `has` conditions were folded into the `V()`-step, an index - if one exists - will be used. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#V-java.lang.Object...-++[`V(Object...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#E-java.lang.Object...-++[`E(Object...)`] |
| |
| [[group-step]] |
| === Group Step |
| |
| As traversers propagate across a graph as defined by a traversal, sideEffect computations are sometimes required. |
| That is, the actual path taken or the current location of a traverser is not the ultimate output of the computation, |
| but some other representation of the traversal. The `group()`-step (*map*/*sideEffect*) is one such sideEffect that |
| organizes the objects according to some function of the object. Then, if required, that organization (a list) is |
| reduced. An example is provided below. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().group().by(label) <1> |
| g.V().group().by(label).by('name') <2> |
| g.V().group().by(label).by(count()) <3> |
| ---- |
| |
| <1> Group the vertices by their label. |
| <2> For each vertex in the group, get their name. |
| <3> For each grouping, what is its size? |
| |
| The two projection parameters available to `group()` via `by()` are: |
| |
| . Key-projection: What feature of the object to group on (a function that yields the map key)? |
| . Value-projection: What feature of the group to store in the key-list? |
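| |
| Both projections also apply when `group()` is used as a sideEffect. The sketch below, which assumes the "modern" |
| toy graph and an arbitrary side-effect key `a`, stores the grouping under that key and reads it back with |
| <<cap-step,`cap()`>>. It also shows a traversal used as the key-projection. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().group('a').by(label).by('name').cap('a') <1> |
| g.V().hasLabel('person').group().by(outE('created').count()).by('name') <2> |
| ---- |
| |
| <1> Group vertex names by label as a sideEffect and emit the resulting map. |
| <2> Group the names of people by the number of `created` edges they have. |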
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#group--++[`group()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#group-java.lang.String-++[`group(String)`] |
| |
| [[groupcount-step]] |
| === GroupCount Step |
| |
| When it is important to know how many times a particular object has been at a particular part of a traversal, |
| `groupCount()`-step (*map*/*sideEffect*) is used. |
| |
| "What is the distribution of ages in the graph?" |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person').values('age').groupCount() |
| g.V().hasLabel('person').groupCount().by('age') <1> |
| ---- |
| |
| <1> You can also supply a pre-group projection, where the provided <<by-step,`by()`>>-modulation determines what to |
| group the incoming object by. |
| |
| There is one person that is 32, one person that is 35, one person that is 27, and one person that is 29. |
| |
| "Iteratively walk the graph and count the number of times you see each vertex label." |
| |
| image::groupcount-step.png[width=420] |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().repeat(both().groupCount('m').by(label)).times(10).cap('m') |
| ---- |
| |
| The above is interesting in that it demonstrates the use of referencing the internal `Map<Object,Long>` of |
| `groupCount()` with a string variable. Given that `groupCount()` is a sideEffect-step, it simply passes the object |
| it received to its output. Internal to `groupCount()`, the object's count is incremented. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#groupCount--++[`groupCount()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#groupCount-java.lang.String-++[`groupCount(String)`] |
| |
| [[has-step]] |
| === Has Step |
| |
| image::has-step.png[width=670] |
| |
| It is possible to filter vertices, edges, and vertex properties based on their properties using `has()`-step |
| (*filter*). There are numerous variations on `has()` including: |
| |
| * `has(key,value)`: Remove the traverser if its element does not have the provided key/value property. |
| * `has(label, key, value)`: Remove the traverser if its element does not have the specified label and provided key/value property. |
| * `has(key,predicate)`: Remove the traverser if its element does not have a key value that satisfies the bi-predicate. For more information on predicates, please read <<a-note-on-predicates,A Note on Predicates>>. |
| * `hasLabel(labels...)`: Remove the traverser if its element does not have any of the labels. |
| * `hasId(ids...)`: Remove the traverser if its element does not have any of the ids. |
| * `hasKey(keys...)`: Remove the traverser if its property does not have any of the provided keys. |
| * `hasValue(values...)`: Remove the traverser if its property does not have any of the provided values. |
| * `has(key)`: Remove the traverser if its element does not have a value for the key. |
| * `hasNot(key)`: Remove the traverser if its element has a value for the key. |
| * `has(key, traversal)`: Remove the traverser if its object does not yield a result through the traversal off the property value. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person') |
| g.V().hasLabel('person','name','marko') |
| g.V().hasLabel('person').out().has('name',within('vadas','josh')) |
| g.V().hasLabel('person').out().has('name',within('vadas','josh')). |
| outE().hasLabel('created') |
| g.V().has('age',inside(20,30)).values('age') <1> |
| g.V().has('age',outside(20,30)).values('age') <2> |
| g.V().has('name',within('josh','marko')).elementMap() <3> |
| g.V().has('name',without('josh','marko')).elementMap() <4> |
| g.V().has('name',not(within('josh','marko'))).elementMap() <5> |
| g.V().properties().hasKey('age').value() <6> |
| g.V().hasNot('age').values('name') <7> |
| g.V().has('person','name', startingWith('m')) <8> |
| ---- |
| |
| <1> Find all vertices whose ages are between 20 (exclusive) and 30 (exclusive). In other words, the age must be greater than 20 and less than 30. |
| <2> Find all vertices whose ages are not between 20 (inclusive) and 30 (inclusive). In other words, the age must be less than 20 or greater than 30. |
| <3> Find all vertices whose names exactly match any name in the collection `[josh,marko]` and display all |
| the key/value pairs for those vertices. |
| <4> Find all vertices whose names are not in the collection `[josh,marko]` and display all the key/value pairs for those vertices. |
| <5> Same as the prior example, except that `not` is applied to `within` to yield `without`. |
| <6> Find all age-properties and emit their value. |
| <7> Find all vertices that do not have an age-property and emit their name. |
| <8> Find all "person" vertices that have a name property that starts with the letter "m". |
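| |
| A few further variations from the list above, sketched against the "modern" toy graph. The anonymous traversal is |
| written as `__.is()` here on the assumption that a bare `is()` could be resolved to Groovy's own `Object.is()` method. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().has('age').values('name') <1> |
| g.V().hasId(1,4).values('name') <2> |
| g.V().properties().hasValue('marko') <3> |
| g.V().has('age', __.is(gt(30))).values('name') <4> |
| ---- |
| |
| <1> `has(key)`: find all vertices that have an age-property and emit their name. |
| <2> `hasId(ids...)`: find the vertices with id 1 or 4 and emit their name. |
| <3> `hasValue(values...)`: find all vertex properties whose value is "marko". |
| <4> `has(key, traversal)`: find all vertices whose age passes the `is(gt(30))` traversal and emit their name. |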
| |
| TinkerPop does not support a regular expression predicate, although specific graph databases that leverage TinkerPop |
| may provide a partial match extension. |
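| |
| For simple partial matches, the string predicates of `TextP` may be sufficient. A short sketch, assuming the |
| "modern" toy graph: |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().has('name', containing('o')).values('name') <1> |
| g.V().has('name', endingWith('o')).values('name') <2> |
| g.V().has('name', notStartingWith('m')).values('name') <3> |
| ---- |
| |
| <1> Find all vertices whose name contains the letter "o". |
| <2> Find all vertices whose name ends with the letter "o". |
| <3> Find all vertices whose name does not start with the letter "m". |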
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#has-java.lang.String-++[`has(String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#has-java.lang.String-java.lang.Object-++[`has(String,Object)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#has-java.lang.String-org.apache.tinkerpop.gremlin.process.traversal.P-++[`has(String,P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#has-java.lang.String-java.lang.String-java.lang.Object-++[`has(String,String,Object)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#has-java.lang.String-java.lang.String-org.apache.tinkerpop.gremlin.process.traversal.P-++[`has(String,String,P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#has-java.lang.String-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`has(String,Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#has-org.apache.tinkerpop.gremlin.structure.T-java.lang.Object-++[`has(T,Object)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#has-org.apache.tinkerpop.gremlin.structure.T-org.apache.tinkerpop.gremlin.process.traversal.P-++[`has(T,P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#has-org.apache.tinkerpop.gremlin.structure.T-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`has(T,Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#hasId-java.lang.Object-java.lang.Object...-++[`hasId(Object,Object...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#hasId-org.apache.tinkerpop.gremlin.process.traversal.P-++[`hasId(P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#hasKey-org.apache.tinkerpop.gremlin.process.traversal.P-++[`hasKey(P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#hasKey-java.lang.String-java.lang.String...-++[`hasKey(String,String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#hasLabel-org.apache.tinkerpop.gremlin.process.traversal.P-++[`hasLabel(P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#hasLabel-java.lang.String-java.lang.String...-++[`hasLabel(String,String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#hasNot-java.lang.String-++[`hasNot(String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#hasValue-java.lang.Object-java.lang.Object...-++[`hasValue(Object,Object...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#hasValue-org.apache.tinkerpop.gremlin.process.traversal.P-++[`hasValue(P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/P.html++[`P`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/TextP.html++[`TextP`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/structure/T.html++[`T`] |
| |
| [[id-step]] |
| === Id Step |
| |
| The `id()`-step (*map*) takes an `Element` and extracts its identifier from it. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().id() |
| g.V(1).out().id().is(2) |
| g.V(1).outE().id() |
| g.V(1).properties().id() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#id--++[`id()`] |
| |
| [[identity-step]] |
| === Identity Step |
| |
| The `identity()`-step (*map*) is an link:https://en.wikipedia.org/wiki/Identity_function[identity function] which maps |
| the current object to itself. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().identity() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#identity--++[`identity()`] |
| |
| [[index-step]] |
| === Index Step |
| |
| The `index()`-step (*map*) indexes each element in the current collection. If the current traverser's value is not a collection, then it's treated as a single-item collection. There are two indexers |
| available, which can be chosen using the `with()` modulator. The list indexer (default) creates a list for each collection item, with the first item being the original element and the second element |
| being the index. The map indexer creates a linked hash map in which the index represents the key and the original item is used as the value. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel("software").index() <1> |
| g.V().hasLabel("software").values("name").fold(). |
| order(Scope.local). |
| index(). |
| unfold(). |
| order(). |
| by(__.tail(Scope.local, 1)) <2> |
| g.V().hasLabel("software").values("name").fold(). |
| order(Scope.local). |
| index(). |
| with(WithOptions.indexer, WithOptions.list). |
| unfold(). |
| order(). |
| by(__.tail(Scope.local, 1)) <3> |
| g.V().hasLabel("person").values("name").fold(). |
| order(Scope.local). |
| index(). |
| with(WithOptions.indexer, WithOptions.map) <4> |
| ---- |
| |
| <1> Indexing non-collection items results in multiple indexed single-item collections. |
| <2> Index all software names in their alphabetical order. |
| <3> Same as statement 2, but with an explicitly specified list indexer. |
| <4> Index all person names in their alphabetical order and store the result in an ordered map. |
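| |
| The difference between the two indexers can be modelled outside of Gremlin. The following is a minimal plain-Java |
| sketch, not TinkerPop's implementation: the class `IndexerSketch` and its methods `listIndex` and `mapIndex` are |
| hypothetical names introduced only for illustration. |
| |
```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the two indexers described above; not TinkerPop code.
final class IndexerSketch {

    // List indexer: each item becomes a two-element list [item, index].
    static <T> List<List<Object>> listIndex(List<T> items) {
        List<List<Object>> indexed = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            indexed.add(List.<Object>of(items.get(i), i));
        }
        return indexed;
    }

    // Map indexer: an insertion-ordered map from index to item.
    static <T> Map<Integer, T> mapIndex(List<T> items) {
        Map<Integer, T> indexed = new LinkedHashMap<>();
        for (int i = 0; i < items.size(); i++) {
            indexed.put(i, items.get(i));
        }
        return indexed;
    }

    public static void main(String[] args) {
        List<String> names = List.of("lop", "ripple");
        System.out.println(listIndex(names)); // [[lop, 0], [ripple, 1]]
        System.out.println(mapIndex(names));  // {0=lop, 1=ripple}
    }
}
```
| |
| The list form keeps each item next to its position, which is why the examples above can `unfold()` the result and |
| order by `tail(Scope.local, 1)`. The map form is keyed by position instead. |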
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#index--++[`index()`] |
| |
| [[inject-step]] |
| === Inject Step |
| |
| image::inject-step.png[width=800] |
| |
| The concept of "injectable steps" makes it possible to insert objects arbitrarily into a traversal stream. The |
| `inject()`-step (*sideEffect*) provides this capability and a few examples are provided below. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(4).out().values('name').inject('daniel') |
| g.V(4).out().values('name').inject('daniel').map {it.get().length()} |
| g.V(4).out().values('name').inject('daniel').map {it.get().length()}.path() |
| ---- |
| |
| In the last example above, note that the path starting with `daniel` is only of length 2. This is because the |
| `daniel` string was inserted half-way in the traversal. Finally, a typical use case is provided below -- when the |
| start of the traversal is not a graph object. |
| |
| [gremlin-groovy,modern] |
| ---- |
| inject(1,2) |
| inject(1,2).map {it.get() + 1} |
| inject(1,2).map {it.get() + 1}.map {g.V(it.get()).next()}.values('name') |
| ---- |
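| |
| The path observation made earlier (the path starting with `daniel` has length 2) can be reproduced with a small |
| plain-Java model of a traverser that records its history. The `Traverser` class below is a hypothetical stand-in, |
| not TinkerPop's `Traverser` interface, and vertices are represented only by their string form. |
| |
```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of path tracking; not TinkerPop code.
final class InjectPathSketch {

    // A traverser carries its current value plus the history of values it has visited.
    static final class Traverser {
        final Object value;
        final List<Object> path;

        Traverser(Object value, List<Object> path) {
            this.value = value;
            this.path = path;
        }

        // Entering the stream: the path starts with the value itself.
        static Traverser start(Object value) {
            return new Traverser(value, List.of(value));
        }

        // Passing through a step: the new value is appended to the path.
        Traverser move(Object next) {
            List<Object> extended = new ArrayList<>(path);
            extended.add(next);
            return new Traverser(next, extended);
        }
    }

    public static void main(String[] args) {
        // g.V(4).out().values('name'): three path entries exist before the map step.
        Traverser fromGraph = Traverser.start("v[4]").move("v[5]").move("ripple");
        // inject('daniel'): the string enters the stream without that history.
        Traverser injected = Traverser.start("daniel");

        for (Traverser t : List.of(fromGraph, injected)) {
            Traverser mapped = t.move(((String) t.value).length());
            System.out.println(mapped.path + " has length " + mapped.path.size());
        }
    }
}
```
| |
| In this model the injected traverser has no history before the point of injection, so its path holds only the |
| injected value and the result of the `map` step. |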
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#inject-E...-++[`inject(E...)`] |
| |
| anchor:_gremlin_i_o[] |
| [[io-step]] |
| === IO Step |
| |
| image:gremlin-io.png[width=250,float=left] The task of importing and exporting the data of `Graph` instances is the |
| job of the `io()`-step. By default, TinkerPop supports three formats for importing and exporting graph data: |
| <<graphml,GraphML>>, <<graphson,GraphSON>>, and <<gryo,Gryo>>. |
| |
| NOTE: Additional documentation for TinkerPop IO formats can be found in the link:https://tinkerpop.apache.org/docs/x.y.z/dev/io/[IO Reference]. |
| |
| By itself the `io()`-step merely configures the kind of importing and exporting that is going |
| to occur and it is the follow-on call to the `read()` or `write()` step that determines which of those actions will |
| execute. Therefore, a typical usage of the `io()`-step would look like this: |
| |
| [source,java] |
| ---- |
| g.io(someInputFile).read().iterate() |
| g.io(someOutputFile).write().iterate() |
| ---- |
| |
| IMPORTANT: The commands above are still traversals and therefore require iteration to be executed, hence the use of |
| `iterate()` as a termination step. |
| |
| By default, the `io()`-step will try to detect the right file format using the file name extension. To gain greater |
| control of the format use the `with()` step modulator to provide further information to `io()`. For example: |
| |
| [source,java] |
| ---- |
| g.io(someInputFile). |
| with(IO.reader, IO.graphson). |
| read().iterate() |
| g.io(someOutputFile). |
| with(IO.writer,IO.graphml). |
| write().iterate() |
| ---- |
| |
| The `IO` class is a helper for the `io()`-step that provides expressions that can be used to help configure it |
| and in this case it allows direct specification of the "reader" or "writer" to use. The "reader" actually refers to |
| a `GraphReader` implementation and the "writer" refers to a `GraphWriter` implementation. The implementations of |
| those interfaces provided by default are the standard TinkerPop implementations. |
| |
| That default is an important point to consider for users. The default TinkerPop implementations are not designed with |
| massive, complex, parallel bulk loading in mind. They are designed to do single-threaded, OLTP-style loading of data |
| in the most generic way possible so as to accommodate the greatest number of graph databases out there. As such, from |
| a reading perspective, they work best for small datasets (or perhaps medium datasets where memory is plentiful and |
| time is not critical) that are loading to an empty graph - incremental loading is not supported. The story from the |
| writing perspective is not that different in that there are no parallel operations in play; however, streaming the output |
| to disk requires a single pass of the data without high memory requirements for larger datasets. |
| |
| In general, TinkerPop recommends that users examine the native bulk import/export tools of the graph implementation |
| that they choose. Those tools will often outperform the `io()`-step and perhaps be easier to use with a greater |
| feature set. That said, graph providers do have the option to optimize `io()` to back it with their own |
| import/export utilities and therefore the default behavior provided by TinkerPop described above might be overridden |
| by the graph. |
| |
| An excellent example of this lies in <<hadoop-gremlin,HadoopGraph>> with <<sparkgraphcomputer,SparkGraphComputer>> |
| which replaces the default single-threaded implementation with a more advanced OLAP style bulk import/export |
| functionality internally using <<clonevertexprogram,CloneVertexProgram>>. With this model, graphs of arbitrary size |
| can be imported/exported assuming that there is a Hadoop `InputFormat` or `OutputFormat` to support it. |
| |
| IMPORTANT: Remote Gremlin Console users or Gremlin Language Variant (GLV) users (e.g. gremlin-python) who utilize |
| the `io()`-step should recall that their `read()` or `write()` operation will occur on the server and not locally |
| and therefore the file specified for import/export must be something accessible by the server. |
| |
| GraphSON and Gryo formats are extensible allowing users and graph providers to extend supported serialization options. |
| These extensions are exposed through `IoRegistry` implementations. To apply an `IoRegistry` use the `with()` option |
| and the `IO.registry` key, where the value is either an actual `IoRegistry` instance or the fully qualified class |
| name of one. |
| |
| [source,java] |
| ---- |
| g.io(someInputFile). |
| with(IO.reader, IO.gryo). |
| with(IO.registry, TinkerIoRegistryV3d0.instance()). |
| read().iterate() |
| g.io(someOutputFile). |
| with(IO.writer,IO.graphson). |
| with(IO.registry, "org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0") |
| write().iterate() |
| ---- |
| |
| GLVs will obviously always be forced to use the latter form as they can't explicitly create an instance of an |
| `IoRegistry` to pass to the server (nor are `IoRegistry` instances necessarily serializable). |
| |
| The version of the formats (e.g. GraphSON 2.0 or 3.0) utilized by `io()` is determined entirely by the `IO.reader` and |
| `IO.writer` configurations or their defaults. The defaults will always be the latest version for the current release |
| of TinkerPop. It is also possible for graph providers to override these defaults, so consult the documentation of the |
| underlying graph database in use for any details on that. |
| |
| For more advanced configuration of `GraphReader` and `GraphWriter` operations (e.g. normalized output for GraphSON, |
| disabling class registrations for Gryo, etc.), construct the appropriate `GraphReader` or `GraphWriter` using |
| the `build()` method on their implementations and pass the resulting instance directly to the `IO.reader` or |
| `IO.writer` option. These are JVM-based operations and thus not available to GLVs as portable features. |
| |
| anchor:_graphml_reader_writer[] |
| [[graphml]] |
| ==== GraphML |
| |
| image:gremlin-graphml.png[width=350,float=left] The link:http://graphml.graphdrawing.org/[GraphML] file format is a |
| common XML-based representation of a graph. It is widely supported by graph-related tools and libraries making it a |
| solid interchange format for TinkerPop. In other words, if the intent is to work with graph data in conjunction with |
| applications outside of TinkerPop, GraphML may be the best choice to do that. Common use cases might be: |
| |
| * Generate a graph using link:https://networkx.github.io/[NetworkX], export it with GraphML and import it to TinkerPop. |
| * Produce a subgraph and export it to GraphML to be consumed by and visualized in link:https://gephi.org/[Gephi]. |
| * Migrate the data of an entire graph to a different graph database not supported by TinkerPop. |
| |
| WARNING: GraphML is a "lossy" format in that it only supports primitive values for properties and does not have |
| support for `Graph` variables. It will use `toString` to serialize property values outside of those primitives. |
| |
| WARNING: GraphML as a specification allows for `<edge>` and `<node>` elements to appear in any order. Most software |
| that writes GraphML (including TinkerPop's `GraphMLWriter`) writes `<node>` elements before `<edge>` elements. However, it |
| is important to note that `GraphMLReader` will read this data in order and order can matter. This is because TinkerPop |
| does not allow the vertex label to be changed after the vertex has been created. Therefore, if an `<edge>` element |
| comes before the `<node>`, the label on the vertex will be ignored. It is thus better to order `<node>` elements in the |
| GraphML to appear before all `<edge>` elements if vertex labels are important to the graph. |
| |
| [source,java] |
| ---- |
| g.io("graph.xml").read().iterate() |
| g.io("graph.xml").write().iterate() |
| ---- |
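| |
| The ordering warning above can be illustrated with a plain-Java model of a reader that processes elements strictly |
| in file order. This is a sketch under stated assumptions and not the logic of `GraphMLReader`: the class name, the |
| array-based element encoding, and the use of `vertex` as the label given to a vertex that is first seen through an |
| edge are all assumptions made for illustration. |
| |
```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of order-sensitive reading; not TinkerPop code.
final class GraphMLOrderSketch {

    // Assumed label for a vertex that is created before its <node> element is read.
    static final String DEFAULT_LABEL = "vertex";

    // Each element is {"node", id, label} or {"edge", outId, inId}.
    // Returns the label each vertex ends up with.
    static Map<String, String> readLabels(List<String[]> elements) {
        Map<String, String> labels = new LinkedHashMap<>();
        for (String[] e : elements) {
            if (e[0].equals("node")) {
                // A label cannot change once the vertex exists, so an existing entry wins.
                labels.putIfAbsent(e[1], e[2]);
            } else {
                // An edge needs both endpoints, so missing vertices are created now.
                labels.putIfAbsent(e[1], DEFAULT_LABEL);
                labels.putIfAbsent(e[2], DEFAULT_LABEL);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        String[] marko = {"node", "1", "person"};
        String[] lop = {"node", "3", "software"};
        String[] created = {"edge", "1", "3"};

        // Nodes first: both labels survive.
        System.out.println(readLabels(List.of(marko, lop, created))); // {1=person, 3=software}
        // Edge first: both vertices already exist when their <node> elements are read.
        System.out.println(readLabels(List.of(created, marko, lop))); // {1=vertex, 3=vertex}
    }
}
```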
| |
| NOTE: If using GraphML generated from TinkerPop 2.x, read more about its incompatibilities in the |
| link:https://tinkerpop.apache.org/docs/x.y.z/upgrade/#graphml-format[Upgrade Documentation]. |
| |
| anchor:graphson-reader-writer[] |
| [[graphson]] |
| ==== GraphSON |
| |
| image:gremlin-graphson.png[width=350,float=left] GraphSON is a link:http://json.org/[JSON]-based format extended |
| from earlier versions of TinkerPop. It is important to note that TinkerPop's GraphSON is not backwards compatible |
| with prior TinkerPop GraphSON versions. GraphSON has some support from graph-related applications outside of TinkerPop, |
| but it is generally best used in two cases: |
| |
| * A text format of the graph or its elements is desired (e.g. debugging, usage in source control, etc.) |
| * The graph or its elements need to be consumed by code that is not JVM-based (e.g. JavaScript, Python, .NET, etc.) |
| |
| [source,java] |
| ---- |
| g.io("graph.json").read().iterate() |
| g.io("graph.json").write().iterate() |
| ---- |
| |
| NOTE: Additional documentation for GraphSON can be found in the link:https://tinkerpop.apache.org/docs/x.y.z/dev/io/#graphson[IO Reference]. |
| |
| anchor:gryo-reader-writer[] |
| [[gryo]] |
| ==== Gryo |
| |
| image:gremlin-kryo.png[width=400,float=left] link:https://github.com/EsotericSoftware/kryo[Kryo] is a popular |
| serialization package for the JVM. Gremlin-Kryo is a binary `Graph` serialization format for use on the JVM by JVM |
| languages. It is designed to be space efficient, non-lossy and is promoted as the standard format to use when working |
| with graph data inside of the TinkerPop stack. A list of common use cases is presented below: |
| |
| * Migration from one Gremlin Structure implementation to another (e.g. `TinkerGraph` to `Neo4jGraph`) |
| * Serialization of individual graph elements to be sent over the network to another JVM. |
| * Backups of in-memory graphs or subgraphs. |
| |
| WARNING: When migrating between Gremlin Structure implementations, Kryo may not lose data, but it is important to |
| consider the features of each `Graph` and whether or not the data types supported in one will be supported in the |
| other. Failure to do so may result in errors. |
| |
| [source,java] |
| ---- |
| g.io("graph.kryo").read().iterate() |
| g.io("graph.kryo").write().iterate() |
| ---- |
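| |
| The `read()` and `write()` examples in the format sections above give no explicit reader or writer and so rely on |
| `io()` detecting the format from the file name extension. A minimal plain-Java sketch of such a lookup is shown |
| below. The helper `detectFormat` is hypothetical, the mapping is taken only from the file names used in this |
| section's examples, and throwing on an unknown extension is an assumption of the sketch. |
| |
```java
// Hypothetical helper that models extension-based format detection; not TinkerPop code.
final class FormatDetectionSketch {

    // Returns a format name based on the extensions used in this section's examples.
    static String detectFormat(String fileName) {
        if (fileName.endsWith(".xml")) {
            return "graphml";
        }
        if (fileName.endsWith(".json")) {
            return "graphson";
        }
        if (fileName.endsWith(".kryo")) {
            return "gryo";
        }
        // Assumption: without a recognizable extension the format has to be given explicitly.
        throw new IllegalArgumentException("Cannot detect the format of " + fileName);
    }

    public static void main(String[] args) {
        System.out.println(detectFormat("graph.xml"));  // graphml
        System.out.println(detectFormat("graph.json")); // graphson
        System.out.println(detectFormat("graph.kryo")); // gryo
    }
}
```
| |
| When a file name does not follow such a convention, the `with(IO.reader, ...)` and `with(IO.writer, ...)` options |
| described earlier state the format explicitly. |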
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversalSource.html#io-java.lang.String-++[`io(String)`] |
| |
| [[is-step]] |
| === Is Step |
| |
| It is possible to filter scalar values using `is()`-step (*filter*). |
| |
| [NOTE, caption=Python] |
| ==== |
| The term `is` is a reserved word in Python, and therefore must be referred to in Gremlin with `is_()`. |
| ==== |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('age').is(32) |
| g.V().values('age').is(lte(30)) |
| g.V().values('age').is(inside(30, 40)) |
| g.V().where(__.in('created').count().is(1)).values('name') <1> |
| g.V().where(__.in('created').count().is(gte(2))).values('name') <2> |
| g.V().where(__.in('created').values('age'). |
| mean().is(inside(30d, 35d))).values('name') <3> |
| ---- |
| |
| <1> Find projects having exactly one contributor. |
| <2> Find projects having two or more contributors. |
| <3> Find projects whose contributors' average age is between 30 and 35. |
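| |
| The predicates used with `is()` behave like ordinary boolean functions over the incoming value. The plain-Java |
| sketch below models `is(32)`, `is(lte(30))` and `is(inside(30, 40))` against the four `age` values of the modern toy |
| graph. The class and method names are hypothetical and only mirror the names of the `P` predicates. Note that |
| `inside` excludes both of its bounds. |
| |
```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Plain-Java model of is() with a few predicates; not TinkerPop code.
final class IsPredicateSketch {

    static Predicate<Integer> eq(int value) {
        return x -> x == value;
    }

    static Predicate<Integer> lte(int value) {
        return x -> x <= value;
    }

    // Strictly between low and high: neither bound matches.
    static Predicate<Integer> inside(int low, int high) {
        return x -> x > low && x < high;
    }

    // is(): keep only the values that satisfy the predicate.
    static List<Integer> is(List<Integer> values, Predicate<Integer> predicate) {
        return values.stream().filter(predicate).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ages = List.of(29, 27, 32, 35);
        System.out.println(is(ages, eq(32)));         // [32]
        System.out.println(is(ages, lte(30)));        // [29, 27]
        System.out.println(is(ages, inside(30, 40))); // [32, 35]
    }
}
```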
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#is-java.lang.Object-++[`is(Object)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#is-org.apache.tinkerpop.gremlin.process.traversal.P-++[`is(P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/P.html++[`P`] |
| |
| [[key-step]] |
| === Key Step |
| |
| The `key()`-step (*map*) takes a `Property` and extracts the key from it. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V(1).properties().key() |
| g.V(1).properties().properties().key() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#key--++[`key()`] |
| |
| [[label-step]] |
| === Label Step |
| |
| The `label()`-step (*map*) takes an `Element` and extracts its label from it. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().label() |
| g.V(1).outE().label() |
| g.V(1).properties().label() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#label--++[`label()`] |
| |
| [[limit-step]] |
| === Limit Step |
| |
| The `limit()`-step is analogous to <<range-step,`range()`-step>> save that the lower end range is set to 0. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().limit(2) |
| g.V().range(0, 2) |
| ---- |
| |
| The `limit()`-step can also be applied with `Scope.local`, in which case it operates on the incoming collection. |
| The examples below use the <<the-crew-toy-graph,The Crew>> toy data set. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V().valueMap().select('location').limit(local,2) <1> |
| g.V().valueMap().limit(local, 1) <2> |
| ---- |
| |
| <1> `List<String>` for each vertex containing the first two locations. |
| <2> `Map<String, Object>` for each vertex, but containing only the first property value. |
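| |
| The relationship between `limit()` and `range()`, and the effect of `Scope.local`, can be sketched with Java |
| streams. This is an illustration only: `RangeSketch`, `range` and `limit` are hypothetical helpers, the sketch |
| assumes `0 <= low <= high`, and the location values are placeholders rather than data from the toy graph. |
| |
```java
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java model of range() and limit(); not TinkerPop code.
final class RangeSketch {

    // range(low, high): skip the first `low` items, then keep up to `high - low` items.
    static <T> List<T> range(List<T> items, long low, long high) {
        return items.stream().skip(low).limit(high - low).collect(Collectors.toList());
    }

    // limit(n) is range(0, n).
    static <T> List<T> limit(List<T> items, long n) {
        return range(items, 0, n);
    }

    public static void main(String[] args) {
        List<String> names = List.of("marko", "vadas", "lop", "josh");
        System.out.println(limit(names, 2));    // [marko, vadas]
        System.out.println(range(names, 0, 2)); // [marko, vadas]

        // Local scope: the limit is applied inside each incoming collection.
        List<List<String>> locationsPerVertex = List.of(
                List.of("x1", "x2", "x3"),
                List.of("y1"));
        for (List<String> locations : locationsPerVertex) {
            System.out.println(limit(locations, 2)); // [x1, x2] then [y1]
        }
    }
}
```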
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#limit-long-++[`limit(long)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#limit-org.apache.tinkerpop.gremlin.process.traversal.Scope-long-++[`limit(Scope,long)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[local-step]] |
| === Local Step |
| |
| image::local-step.png[width=450] |
| |
| A `GraphTraversal` operates on a continuous stream of objects. In many situations, it is important to operate on a |
| single element within that stream. To do such object-local traversal computations, `local()`-step exists (*branch*). |
| Note that the examples below use the <<the-crew-toy-graph,The Crew>> toy data set. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V().as('person'). |
| properties('location').order().by('startTime',asc).limit(2).value().as('location'). |
| select('person','location').by('name').by() <1> |
| g.V().as('person'). |
| local(properties('location').order().by('startTime',asc).limit(2)).value().as('location'). |
| select('person','location').by('name').by() <2> |
| ---- |
| |
| <1> Get the first two people and their respective location according to the most historic location start time. |
| <2> For every person, get their two most historic locations. |
| |
| The two traversals above look nearly identical save the inclusion of `local()` which wraps a section of the traversal |
| in an object-local traversal. As such, the `order().by()` and the `limit()` refer to a particular object, not to the |
| stream as a whole. |
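| |
| The same distinction can be sketched in plain Java. In the model below, ordering and limiting the whole stream |
| corresponds to the first traversal, while restarting the order and limit for every person corresponds to the |
| traversal that uses `local()`. The `Stay` class, the method names and the sample data are hypothetical and are not |
| taken from the toy graph. |
| |
```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java model of stream-wide versus object-local processing; not TinkerPop code.
final class LocalScopeSketch {

    // One location property of one person, with the time the stay began.
    static final class Stay {
        final String person;
        final String location;
        final int startTime;

        Stay(String person, String location, int startTime) {
            this.person = person;
            this.location = location;
            this.startTime = startTime;
        }

        @Override
        public String toString() {
            return person + "@" + location;
        }
    }

    // Without local(): order and limit apply to the stream as a whole.
    static List<Stay> firstTwoOverall(List<Stay> stays) {
        return stays.stream()
                .sorted(Comparator.comparingInt((Stay s) -> s.startTime))
                .limit(2)
                .collect(Collectors.toList());
    }

    // With local(): order and limit restart for every person.
    static Map<String, List<Stay>> firstTwoPerPerson(List<Stay> stays) {
        Map<String, List<Stay>> byPerson = stays.stream()
                .collect(Collectors.groupingBy((Stay s) -> s.person, LinkedHashMap::new, Collectors.toList()));
        byPerson.replaceAll((person, own) -> firstTwoOverall(own));
        return byPerson;
    }

    public static void main(String[] args) {
        List<Stay> stays = List.of(
                new Stay("alice", "oslo", 2010),
                new Stay("alice", "paris", 2001),
                new Stay("alice", "rome", 2005),
                new Stay("bob", "lima", 1999),
                new Stay("bob", "quito", 2000));
        System.out.println(firstTwoOverall(stays));   // [bob@lima, bob@quito]
        System.out.println(firstTwoPerPerson(stays)); // {alice=[alice@paris, alice@rome], bob=[bob@lima, bob@quito]}
    }
}
```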
| |
| Local Step is quite similar in functionality to <<general-steps,Flat Map Step>>, with which it is often confused. |
| `local()` propagates the traverser through the internal traversal as is without splitting/cloning it. Thus, it's |
| a "global traversal" with local processing. Its use is subtle and primarily finds application in compilation |
| optimizations (i.e. when writing `TraversalStrategy` implementations). As another example consider: |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().both().barrier().flatMap(groupCount().by("name")) |
| g.V().both().barrier().local(groupCount().by("name")) |
| ---- |
| |
| WARNING: The anonymous traversal of `local()` processes the current object "locally." In OLAP, where the atomic unit |
| of computing is the vertex and its local "star graph," it is important that the anonymous traversal does not leave |
| the confines of the vertex's star graph. In other words, it can not traverse to an adjacent vertex's properties or edges. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#local-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`local(Traversal)`] |
| |
| [[loops-step]] |
| === Loops Step |
| |
| The `loops()`-step (*map*) extracts the number of times the `Traverser` has gone through the current loop. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().emit(__.has("name", "marko").or().loops().is(2)).repeat(__.out()).values("name") |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#loops--++[`loops()`] |
| |
| link:++https://tinkerpop.apache.org/docs/x.y.z/recipes/#looping++[`Looping Recipes`] |
| |
| [[match-step]] |
| === Match Step |
| |
| The `match()`-step (*map*) provides a more link:http://en.wikipedia.org/wiki/Declarative_programming[declarative] |
| form of graph querying based on the notion of link:http://en.wikipedia.org/wiki/Pattern_matching[pattern matching]. |
| With `match()`, the user provides a collection of "traversal fragments," called patterns, that have variables defined |
| that must hold true throughout the duration of the `match()`. When a traverser is in `match()`, a registered |
| `MatchAlgorithm` analyzes the current state of the traverser (i.e. its history based on its |
| <<path-data-structure,path data>>), the runtime statistics of the traversal patterns, and returns a traversal-pattern |
| that the traverser should try next. The default `MatchAlgorithm` provided is called `CountMatchAlgorithm` and it |
| dynamically revises the pattern execution plan by sorting the patterns according to their filtering capabilities |
| (i.e. largest set reduction patterns execute first). For very large graphs, where the developer is uncertain of the |
| statistics of the graph (e.g. how many `knows`-edges vs. `worksFor`-edges exist in the graph), it is advantageous to |
| use `match()`, as an optimal plan will be determined automatically. Furthermore, some queries are much easier to |
| express via `match()` than with single-path traversals. |
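| |
| The idea of revising the plan from runtime statistics can be sketched in plain Java. The model below is not the code |
| of `CountMatchAlgorithm`: the class names, the `observe` and `plan` methods, and the use of a survived/entered ratio |
| as the measure of filtering capability are assumptions made for illustration. |
| |
```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java model of statistics-driven pattern ordering; not TinkerPop code.
final class PatternOrderingSketch {

    static final class PatternStats {
        final String name;
        long starts; // traversers that entered the pattern
        long ends;   // traversers that came out of the pattern

        PatternStats(String name) {
            this.name = name;
        }

        // Accumulates one batch of runtime observations.
        void observe(long entered, long survived) {
            starts += entered;
            ends += survived;
        }

        // Lower ratio means a stronger filter. A pattern never tried counts as neutral (1.0).
        double survivalRatio() {
            return starts == 0 ? 1.0 : (double) ends / starts;
        }
    }

    // Returns the pattern names with the largest set reduction first.
    static List<String> plan(List<PatternStats> patterns) {
        List<PatternStats> sorted = new ArrayList<>(patterns);
        sorted.sort(Comparator.comparingDouble(PatternStats::survivalRatio));
        List<String> names = new ArrayList<>();
        for (PatternStats p : sorted) {
            names.add(p.name);
        }
        return names;
    }

    public static void main(String[] args) {
        PatternStats knows = new PatternStats("out('knows')");
        knows.observe(100, 300); // expands the set of traversers
        PatternStats named = new PatternStats("has('name','lop')");
        named.observe(100, 1);   // reduces the set of traversers
        System.out.println(plan(List.of(knows, named))); // [has('name','lop'), out('knows')]
    }
}
```
| |
| Because the statistics are gathered while the traversal runs, the order in such a model can change as more |
| traversers are observed. |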
| |
| "Who created a project named 'lop' that was also created by someone who is 29 years old? Return the two creators." |
| |
| image::match-step.png[width=500] |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().match( |
| __.as('a').out('created').as('b'), |
| __.as('b').has('name', 'lop'), |
| __.as('b').in('created').as('c'), |
| __.as('c').has('age', 29)). |
| select('a','c').by('name') |
| ---- |
| |
| Note that the above can also be more concisely written as below which demonstrates that standard inner-traversals can |
| be arbitrarily defined. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().match( |
| __.as('a').out('created').has('name', 'lop').as('b'), |
| __.as('b').in('created').has('age', 29).as('c')). |
| select('a','c').by('name') |
| ---- |
| |
| In order to improve readability, `as()`-steps can be given meaningful labels which better reflect your domain. The |
| previous query can thus be written in a more expressive way as shown below. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().match( |
| __.as('creators').out('created').has('name', 'lop').as('projects'), <1> |
| __.as('projects').in('created').has('age', 29).as('cocreators')). <2> |
| select('creators','cocreators').by('name') <3> |
| ---- |
| |
| <1> Find vertices that created something and match them as 'creators', then find out what they created which is |
| named 'lop' and match these vertices as 'projects'. |
| <2> Using these 'projects' vertices, find out their creators aged 29 and remember these as 'cocreators'. |
| <3> Return the name of both 'creators' and 'cocreators'. |
| |
| [[grateful-dead]] |
| .Grateful Dead |
| image::grateful-dead-schema.png[width=475] |
| |
| `MatchStep` brings functionality similar to link:http://en.wikipedia.org/wiki/SPARQL[SPARQL] to Gremlin. Like SPARQL, |
| MatchStep conjoins a set of patterns applied to a graph. For example, the following traversal finds exactly those |
| songs which Jerry Garcia has both sung and written (using the Grateful Dead graph distributed in the `data/` directory): |
| |
| [gremlin-groovy] |
| ---- |
| g = traversal().withEmbedded(graph) |
| g.io('data/grateful-dead.xml').read().iterate() |
| g.V().match( |
| __.as('a').has('name', 'Garcia'), |
| __.as('a').in('writtenBy').as('b'), |
| __.as('a').in('sungBy').as('b')). |
| select('b').values('name') |
| ---- |
| |
| Among the features which differentiate `match()` from SPARQL are: |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().match( |
| __.as('a').out('created').has('name','lop').as('b'), <1> |
| __.as('b').in('created').has('age', 29).as('c'), |
| __.as('c').repeat(out()).times(2)). <2> |
| select('c').out('knows').dedup().values('name') <3> |
| ---- |
| |
| <1> *Patterns of arbitrary complexity*: `match()` is not restricted to triple patterns or property paths. |
| <2> *Recursion support*: `match()` supports the branch-based steps within a pattern, including `repeat()`. |
| <3> *Imperative/declarative hybrid*: Before and after a `match()`, it is possible to leverage classic Gremlin traversals. |
| |
| To extend point #3, it is possible to support going from imperative, to declarative, to imperative, ad infinitum. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().match( |
| __.as('a').out('knows').as('b'), |
| __.as('b').out('created').has('name','lop')). |
| select('b').out('created'). |
| match( |
| __.as('x').in('created').as('y'), |
| __.as('y').out('knows').as('z')). |
| select('z').values('name') |
| ---- |
| |
| IMPORTANT: The `match()`-step is stateless. The variable bindings of the traversal patterns are stored in the path |
| history of the traverser. As such, the variables used over all `match()`-steps within a traversal are globally unique. |
| A benefit of this is that subsequent `where()`, `select()`, `match()`, etc. steps can leverage the same variables in |
| their analysis. |
| |
| Like all other steps in Gremlin, `match()` is a function and thus, `match()` within `match()` is a natural consequence |
| of Gremlin's functional foundation (i.e. recursive matching). |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().match( |
| __.as('a').out('knows').as('b'), |
| __.as('b').out('created').has('name','lop'), |
| __.as('b').match( |
| __.as('b').out('created').as('c'), |
| __.as('c').has('name','ripple')). |
| select('c').as('c')). |
| select('a','c').by('name') |
| ---- |
| |
| If a step-labeled traversal precedes the `match()`-step and the traverser entering the `match()` is destined to bind |
| to a particular variable, then the previous step should be labeled accordingly. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').out('knows').as('b'). |
| match( |
| __.as('b').out('created').as('c'), |
| __.not(__.as('c').in('created').as('a'))). |
| select('a','b','c').by('name') |
| ---- |
| |
| There are three types of `match()` traversal patterns. |
| |
| . `as('a')...as('b')`: both the start and end of the traversal have a declared variable. |
| . `as('a')...`: only the start of the traversal has a declared variable. |
| . `...`: there are no declared variables. |
| |
| If a variable is at the start of a traversal pattern it *must* exist as a label in the path history of the traverser |
| else the traverser can not go down that path. If a variable is at the end of a traversal pattern then if the variable |
| exists in the path history of the traverser, the traverser's current location *must* match (i.e. equal) its historic |
| location at that same label. However, if the variable does not exist in the path history of the traverser, then the |
| current location is labeled as the variable and thus, becomes a bound variable for subsequent traversal patterns. If a |
| traversal pattern does not have an end label, then the traverser must simply "survive" the pattern (i.e. not be |
| filtered) to continue to the next pattern. If a traversal pattern does not have a start label, then the traverser |
| can go down that path at any point, but will only go down that pattern once as a traversal pattern is executed once |
| and only once for the history of the traverser. Typically, traversal patterns that do not have a start and end label |
| are used in conjunction with `and()`, `or()`, and `where()`. Once the traverser has "survived" all the patterns (or at |
| least one for `or()`), `match()`-step analyzes the traverser's path history and emits a `Map<String,Object>` of the |
| variable bindings to the next step in the traversal. |
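| |
| The start-label and end-label rules above can be summarized in a small plain-Java model in which the path history is |
| reduced to a map of variable bindings. This is a sketch of the rules as stated in the text, not the implementation of |
| `MatchStep`; the class and method names are hypothetical. |
| |
```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Plain-Java model of the match() binding rules; not TinkerPop code.
final class MatchBindingSketch {

    // Start rule: a pattern with a start label can only be tried once that label is bound.
    static boolean canStart(Map<String, Object> bindings, String startLabel) {
        return bindings.containsKey(startLabel);
    }

    // End rule: returns the (possibly extended) bindings if the traverser survives,
    // or an empty Optional if it is filtered out.
    static Optional<Map<String, Object>> bindEnd(Map<String, Object> bindings, String endLabel, Object current) {
        if (bindings.containsKey(endLabel)) {
            // Already bound: the current location must equal the historic one.
            return bindings.get(endLabel).equals(current) ? Optional.of(bindings) : Optional.empty();
        }
        // Not yet bound: the current location becomes the binding.
        Map<String, Object> extended = new HashMap<>(bindings);
        extended.put(endLabel, current);
        return Optional.of(extended);
    }

    public static void main(String[] args) {
        Map<String, Object> bindings = new HashMap<>();
        bindings.put("a", "marko");

        System.out.println(canStart(bindings, "a")); // true
        System.out.println(canStart(bindings, "b")); // false

        Map<String, Object> withB = bindEnd(bindings, "b", "lop").get();
        System.out.println(withB.get("b"));                            // lop
        System.out.println(bindEnd(withB, "b", "lop").isPresent());    // true
        System.out.println(bindEnd(withB, "b", "ripple").isPresent()); // false
    }
}
```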
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').out().as('b'). <1> |
| match( <2> |
| __.as('a').out().count().as('c'), <3> |
| __.not(__.as('a').in().as('b')), <4> |
| or( <5> |
| __.as('a').out('knows').as('b'), |
| __.as('b').in().count().as('c').and().as('c').is(gt(2)))). <6> |
| dedup('a','c'). <7> |
| select('a','b','c').by('name').by('name').by() <8> |
| ---- |
| |
| <1> A standard, step-labeled traversal can come prior to `match()`. |
| <2> If the traverser's path prior to entering `match()` has requisite label values, then those historic values are bound. |
| <3> It is possible to use <<a-note-on-barrier-steps,barrier steps>> though they are computed locally to the pattern (as one would expect). |
| <4> It is possible to `not()` a pattern. |
| <5> It is possible to nest `and()`- and `or()`-steps for conjunction matching. |
| <6> Both infix and prefix conjunction notations are supported. |
| <7> It is possible to "distinct" the specified label combination. |
| <8> The bound values are of different types -- vertex ("a"), vertex ("b"), long ("c"). |
| |
| [[using-where-with-match]] |
| ==== Using Where with Match |
| |
| Match is typically used in conjunction with both `select()` (demonstrated previously) and `where()` (presented here). |
| A `where()`-step allows the user to further constrain the result set provided by `match()`. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().match( |
| __.as('a').out('created').as('b'), |
| __.as('b').in('created').as('c')). |
| where('a', neq('c')). |
| select('a','c').by('name') |
| ---- |
| |
| The `where()`-step can take either a `P`-predicate (example above) or a `Traversal` (example below). Using |
| `MatchPredicateStrategy`, `where()`-clauses are automatically folded into `match()` and thus, subject to the query |
| optimizer within `match()`-step. |
| |
| [gremlin-groovy,modern] |
| ---- |
| traversal = g.V().match( |
| __.as('a').has(label,'person'), <1> |
| __.as('a').out('created').as('b'), |
| __.as('b').in('created').as('c')). |
| where(__.as('a').out('knows').as('c')). <2> |
| select('a','c').by('name'); null <3> |
| traversal.toString() <4> |
| traversal <5> <6> |
| traversal.toString() <7> |
| ---- |
| |
| <1> Any `has()`-step traversal patterns that start with the match-key are pulled out of `match()` to enable the graph |
| system to leverage the filter for index lookups. |
| <2> A `where()`-step with a traversal containing variable bindings declared in `match()`. |
| <3> A useful trick to ensure that the traversal is not iterated by Gremlin Console. |
| <4> The string representation of the traversal prior to its strategies being applied. |
| <5> The Gremlin Console will automatically iterate anything that is an iterator or is iterable. |
| <6> Both marko and josh are co-developers and marko knows josh. |
| <7> The string representation of the traversal after the strategies have been applied (and thus, `where()` is folded into `match()`). |
| |
| IMPORTANT: A `where()`-step is a filter and thus, variables within a `where()` clause are not globally bound to the |
| path of the traverser in `match()`. As such, `where()`-steps in `match()` are used for filtering, not binding. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#match-org.apache.tinkerpop.gremlin.process.traversal.Traversal...-++[`match(Traversal...)`] |
| |
| [[math-step]] |
| === Math Step |
| |
| The `math()`-step (*map*) enables scientific calculator functionality within Gremlin. This step deviates from the common |
| function composition and nesting formalisms to provide an easy to read string-based math processor. Variables within the |
| equation map to scopes in Gremlin -- e.g. path labels, side-effects, or incoming map keys. This step supports |
| `by()`-modulation where the `by()`-modulators are applied in the order in which the variables are first referenced |
| within the equation. Note that the reserved variable `_` refers to the current numeric traverser object incoming to the |
| `math()`-step. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').out('knows').as('b').math('a + b').by('age') |
| g.V().as('a').out('created').as('b'). |
| math('b + a'). |
| by(both().count().math('_ + 100')). |
| by('age') |
| g.withSideEffect('x',10).V().values('age').math('_ / x') |
| g.withSack(1).V(1).repeat(sack(sum).by(constant(1))).times(10).emit().sack().math('sin _') |
| ---- |
| |
| The operators supported by the calculator include: `*`, `+`, `-`, `/`, `^`, and `%`. Furthermore, the following built-in |
| functions are provided: |
| |
| * `abs`: absolute value |
| * `acos`: arc cosine |
| * `asin`: arc sine |
| * `atan`: arc tangent |
| * `cbrt`: cube root |
| * `ceil`: nearest upper integer |
| * `cos`: cosine |
| * `cosh`: hyperbolic cosine |
| * `exp`: Euler's number raised to the power (`e^x`) |
| * `floor`: nearest lower integer |
| * `log`: natural logarithm (base e) |
| * `log10`: logarithm (base 10) |
| * `log2`: logarithm (base 2) |
| * `sin`: sine |
| * `sinh`: hyperbolic sine |
| * `sqrt`: square root |
| * `tan`: tangent |
| * `tanh`: hyperbolic tangent |
| * `signum`: signum function |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#math-java.lang.String-++[`math(String)`] |
| |
| [[max-step]] |
| === Max Step |
| |
| The `max()`-step (*map*) operates on a stream of comparable objects and determines which object is last (i.e. greatest) according to their natural order. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('age').max() |
| g.V().repeat(both()).times(3).values('age').max() |
| g.V().values('name').max() |
| ---- |
| |
| IMPORTANT: `max(local)` determines the max of the current, local object (not the objects in the traversal stream). |
| This works for `Collection` and `Comparable`-type objects. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#max--++[`max()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#max-org.apache.tinkerpop.gremlin.process.traversal.Scope-++[`max(Scope)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[mean-step]] |
| === Mean Step |
| |
| The `mean()`-step (*map*) operates on a stream of numbers and determines the average of those numbers. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('age').mean() |
| g.V().repeat(both()).times(3).values('age').mean() <1> |
| g.V().repeat(both()).times(3).values('age').dedup().mean() |
| ---- |
| |
| <1> Realize that traversers are being bulked by `repeat()`. There may be more of a particular number than another, |
| thus altering the average. |
| |
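The skew that bulking introduces into an average can be sketched in plain Java. This is a minimal model, not TinkerPop code: the class `MeanSketch`, its method names, and the value-to-bulk map (hypothetical values `20` with bulk 3 and `40` with bulk 1) are assumptions made for illustration. The first method weights every value by its bulk, as a mean over the full stream would; the second ignores the bulk, as a mean taken after `dedup()` would.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of averaging bulked values; illustrative only, not the TinkerPop API.
class MeanSketch {
    // Mean over the whole stream: each value counts as many times as its bulk.
    static double bulkedMean(Map<Integer, Long> valueToBulk) {
        long total = 0;
        long represented = 0;
        for (Map.Entry<Integer, Long> entry : valueToBulk.entrySet()) {
            total += entry.getKey() * entry.getValue();
            represented += entry.getValue();
        }
        return (double) total / represented;
    }

    // Mean after de-duplication: each distinct value counts exactly once.
    static double dedupedMean(Map<Integer, Long> valueToBulk) {
        return valueToBulk.keySet().stream().mapToInt(Integer::intValue).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        Map<Integer, Long> valueToBulk = new LinkedHashMap<>();
        valueToBulk.put(20, 3L); // hypothetical value seen three times
        valueToBulk.put(40, 1L); // hypothetical value seen once
        System.out.println(bulkedMean(valueToBulk));  // 25.0
        System.out.println(dedupedMean(valueToBulk)); // 30.0
    }
}
```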
| IMPORTANT: `mean(local)` determines the mean of the current, local object (not the objects in the traversal stream). |
| This works for `Collection` and `Number`-type objects. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#mean--++[`mean()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#mean-org.apache.tinkerpop.gremlin.process.traversal.Scope-++[`mean(Scope)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[min-step]] |
| === Min Step |
| |
| The `min()`-step (*map*) operates on a stream of comparable objects and determines which object is first (i.e. least) according to their natural order. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('age').min() |
| g.V().repeat(both()).times(3).values('age').min() |
| g.V().values('name').min() |
| ---- |
| |
| IMPORTANT: `min(local)` determines the min of the current, local object (not the objects in the traversal stream). |
| This works for `Collection` and `Comparable`-type objects. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#min--++[`min()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#min-org.apache.tinkerpop.gremlin.process.traversal.Scope-++[`min(Scope)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[none-step]] |
| === None Step |
| |
| The `none()`-step (*filter*) filters all objects from a traversal stream. It is especially useful for traversals |
| that are executed remotely where returning results is not useful and the traversal is only meant to generate |
| side-effects. Choosing not to return results saves on serialization and network costs as the objects are filtered on |
| the remote end and not returned to the client side. Typically, this step does not need to be used directly and is |
| quietly used by the `iterate()` terminal step which appends `none()` to the traversal before actually cycling through |
| results. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/current/core/org/apache/tinkerpop/gremlin/process/traversal/Traversal.html#none--++[`none()`], |
| link:++https://tinkerpop.apache.org/javadocs/current/core/org/apache/tinkerpop/gremlin/process/traversal/Traversal.html#iterate--++[`iterate()`] |
| |
| [[not-step]] |
| === Not Step |
| |
| The `not()`-step (*filter*) removes objects from the traversal stream when the traversal provided as an argument |
| returns an object. |
| |
| [NOTE, caption=Groovy] |
| ==== |
| The term `not` is a reserved word in Groovy and therefore, when used as part of an anonymous traversal, must be referred |
| to in Gremlin with the double underscore `__.not()`. |
| ==== |
| |
| [NOTE, caption=Python] |
| ==== |
| The term `not` is a reserved word in Python, and therefore must be referred to in Gremlin with `not_()`. |
| ==== |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().not(hasLabel('person')).elementMap() |
| g.V().hasLabel('person'). |
| not(out('created').count().is(gt(1))).values('name') <1> |
| ---- |
| |
| <1> josh created two projects and vadas none |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#not-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`not(Traversal)`] |
| |
| [[option-step]] |
| === Option Step |
| |
| An option to a <<general-steps,`branch()`>> or <<choose-step,`choose()`>>. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#option-M-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`option(Object,Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#option-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`option(Traversal)`] |
| |
| [[optional-step]] |
| === Optional Step |
| |
| The `optional()`-step (*branch/flatMap*) returns the result of the specified traversal if it yields a result; otherwise, it returns |
| the calling element, i.e. the `identity()`. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(2).optional(out('knows')) <1> |
| g.V(2).optional(__.in('knows')) <2> |
| ---- |
| |
| <1> vadas does not have an outgoing knows-edge so vadas is returned. |
| <2> vadas does have an incoming knows-edge so marko is returned. |
| |
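The "result of the traversal, else the calling element" rule can be modelled in a few lines of plain Java. This is only a sketch under stated assumptions: `OptionalSketch` and its `optional` method are hypothetical names, the child traversal is represented as a function from an element to a list of results, and the adjacency maps hand-copy the knows-edges of the "modern" toy graph.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of optional(); illustrative only, not the TinkerPop API.
class OptionalSketch {
    // Returns the child traversal's results, or the calling element if there are none.
    static <T> List<T> optional(T current, Function<T, List<T>> traversal) {
        List<T> results = traversal.apply(current);
        return results.isEmpty() ? List.of(current) : results;
    }

    public static void main(String[] args) {
        // knows-edges of the "modern" toy graph, written out as adjacency maps
        Map<String, List<String>> outKnows = Map.of("marko", List.of("vadas", "josh"));
        Map<String, List<String>> inKnows = Map.of("vadas", List.of("marko"), "josh", List.of("marko"));

        // vadas has no outgoing knows-edge, so vadas itself is returned
        System.out.println(optional("vadas", v -> outKnows.getOrDefault(v, List.of()))); // [vadas]
        // vadas has an incoming knows-edge from marko, so marko is returned
        System.out.println(optional("vadas", v -> inKnows.getOrDefault(v, List.of())));  // [marko]
    }
}
```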
| `optional` is particularly useful for lifting entire graphs when used in conjunction with `path` or `tree`. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person').optional(out('knows').optional(out('created'))).path() <1> |
| ---- |
| |
| <1> Returns the paths of everybody followed by who they know followed by what they created. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#optional-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`optional(Traversal)`] |
| |
| [[or-step]] |
| === Or Step |
| |
| The `or()`-step ensures that at least one of the provided traversals yields a result (*filter*). Please see |
| <<and-step,`and()`>> for and-semantics. |
| |
| [NOTE, caption=Python] |
| ==== |
| The term `or` is a reserved word in Python, and therefore must be referred to in Gremlin with `or_()`. |
| ==== |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().or( |
| __.outE('created'), |
| __.inE('created').count().is(gt(1))). |
| values('name') |
| ---- |
| |
| The `or()`-step can take an arbitrary number of traversals. At least one of the traversals must produce at least one |
| output for the original traverser to pass to the next step. |
| |
| An link:http://en.wikipedia.org/wiki/Infix_notation[infix notation] can be used as well. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().where(outE('created').or().outE('knows')).values('name') |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#or-org.apache.tinkerpop.gremlin.process.traversal.Traversal...-++[`or(Traversal...)`] |
| |
| [[order-step]] |
| === Order Step |
| |
| When the objects of the traversal stream need to be sorted, `order()`-step (*map*) can be leveraged. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('name').order() |
| g.V().values('name').order().by(desc) |
| g.V().hasLabel('person').order().by('age', asc).values('name') |
| ---- |
| |
| One of the most traversed objects in a traversal is an `Element`. An element can have properties associated with it |
| (i.e. key/value pairs). In many situations, it is desirable to sort an element traversal stream according to a |
| comparison of their properties. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('name') |
| g.V().order().by('name',asc).values('name') |
| g.V().order().by('name',desc).values('name') |
| ---- |
| |
| The `order()`-step allows the user to provide an arbitrary number of comparators for primary, secondary, etc. sorting. |
| In the example below, the primary ordering is based on the outgoing created-edge count. The secondary ordering is |
| based on the age of the person. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person').order().by(outE('created').count(), asc). |
| by('age', asc).values('name') |
| g.V().hasLabel('person').order().by(outE('created').count(), asc). |
| by('age', desc).values('name') |
| ---- |
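
Primary and secondary ordering behaves like a chain of comparators. The following plain-Java sketch illustrates that idea with `Comparator.thenComparing`; it is not how `order()` is implemented. The `OrderSketch` class, its `Person` type, and the helper names are assumptions made for illustration, and the ages and created-edge counts are copied by hand from the "modern" toy graph.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java model of order().by(...).by(...); illustrative only, not the TinkerPop API.
class OrderSketch {
    // Hypothetical stand-in for a person vertex of the "modern" toy graph.
    static final class Person {
        final String name;
        final int age;
        final int created; // number of outgoing created-edges

        Person(String name, int age, int created) {
            this.name = name;
            this.age = age;
            this.created = created;
        }
    }

    static final Comparator<Person> BY_CREATED = Comparator.comparingInt(p -> p.created);
    static final Comparator<Person> BY_AGE = Comparator.comparingInt(p -> p.age);

    // Sorts a copy of the input and returns the names, like a trailing values('name').
    static List<String> names(List<Person> people, Comparator<Person> order) {
        List<Person> sorted = new ArrayList<>(people);
        sorted.sort(order);
        List<String> names = new ArrayList<>();
        for (Person p : sorted) names.add(p.name);
        return names;
    }

    static List<Person> modernPeople() {
        return List.of(new Person("marko", 29, 1), new Person("vadas", 27, 0),
                       new Person("josh", 32, 2), new Person("peter", 35, 1));
    }

    public static void main(String[] args) {
        List<Person> people = modernPeople();
        // Primary: created-count ascending; secondary: age ascending.
        System.out.println(names(people, BY_CREATED.thenComparing(BY_AGE)));            // [vadas, marko, peter, josh]
        // Same primary ordering, but ties are broken by age descending.
        System.out.println(names(people, BY_CREATED.thenComparing(BY_AGE.reversed()))); // [vadas, peter, marko, josh]
    }
}
```

The secondary comparator only matters for marko and peter, who tie on the primary key with one created-edge each.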
| |
| Randomizing the order of the traversers at a particular point in the traversal is possible with `Order.shuffle`. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person').order().by(shuffle) |
| g.V().hasLabel('person').order().by(shuffle) |
| ---- |
| |
| It is possible to use `order(local)` to order the current local object and not the entire traversal stream. This works for |
| `Collection`- and `Map`-type objects. For any other object, the object is returned unchanged. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('age').fold().order(local).by(desc) <1> |
| g.V().values('age').order(local).by(desc) <2> |
| g.V().groupCount().by(inE().count()).order(local).by(values, desc) <3> |
| g.V().groupCount().by(inE().count()).order(local).by(keys, asc) <4> |
| ---- |
| |
| <1> The ages are gathered into a list and then that list is sorted in decreasing order. |
| <2> The ages are not gathered, so `order(local)` is "ordering" single integers and thus does nothing. |
| <3> The `groupCount()` map is ordered by its values in decreasing order. |
| <4> The `groupCount()` map is ordered by its keys in increasing order. |
| |
| NOTE: The `values` and `keys` enums are from `Column` which is used to select "columns" from a `Map`, `Map.Entry`, or `Path`. |
| |
| NOTE: Prior to version 3.3.4, ordering was defined by `Order.incr` for ascending order and `Order.decr` for descending |
| order. That approach is now deprecated with the preferred method shown in the examples which uses the more common |
| forms for query languages in `Order.asc` and `Order.desc`. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#order--++[`order()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#order-org.apache.tinkerpop.gremlin.process.traversal.Scope-++[`order(Scope)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Order.html++[`Order`] |
| |
| [[pagerank-step]] |
| === PageRank Step |
| |
| The `pageRank()`-step (*map*/*sideEffect*) calculates link:http://en.wikipedia.org/wiki/PageRank[PageRank] using <<pagerankvertexprogram,`PageRankVertexProgram`>>. |
| |
| IMPORTANT: The `pageRank()`-step is a `VertexComputing`-step and as such, can only be used against a graph that supports `GraphComputer` (OLAP). |
| |
| [gremlin-groovy,modern] |
| ---- |
| g = traversal().withEmbedded(graph).withComputer() |
| g.V().pageRank().by('pageRank').values('pageRank') |
| g.V().hasLabel('person'). |
| pageRank(). |
| with(PageRank.edges, __.outE('knows')). |
| with(PageRank.propertyName, 'friendRank'). |
| order().by('friendRank',desc). |
| elementMap('name','friendRank') |
| ---- |
| |
| Note the use of the `with()` modulating step which provides configuration options to the algorithm. It takes |
| configuration keys from the `PageRank` class, which is automatically imported to the Gremlin Console. |
| |
| The <<explain-step,`explain()`>>-step can be used to understand how the traversal is compiled into multiple `GraphComputer` jobs. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g = traversal().withEmbedded(graph).withComputer() |
| g.V().hasLabel('person'). |
| pageRank(). |
| with(PageRank.edges, __.outE('knows')). |
| with(PageRank.propertyName, 'friendRank'). |
| order().by('friendRank',desc). |
| elementMap('name','friendRank').explain() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#pageRank--++[`pageRank()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#pageRank-double-++[`pageRank(double)`] |
| |
| [[path-step]] |
| === Path Step |
| |
| A traverser is transformed as it moves through a series of steps within a traversal. The history of the traverser is |
| realized by examining its path with `path()`-step (*map*). |
| |
| image::path-step.png[width=650] |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().out().out().values('name') |
| g.V().out().out().values('name').path() |
| ---- |
| |
| If edges are required in the path, then be sure to traverse those edges explicitly. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().outE().inV().outE().inV().path() |
| ---- |
| |
| It is possible to post-process the elements of the path in a round-robin fashion via `by()`. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().out().out().path().by('name').by('age') |
| ---- |
| |
| Finally, because of `by()`-based post-processing, nothing prevents triggering yet another traversal. In the traversal |
| below, for each element of the path traversed thus far, if it is a person (as determined by its `person` label), |
| then get all of their creations, else if it is a creation, get all the people that created it. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().out().out().path().by( |
| choose(hasLabel('person'), |
| out('created').values('name'), |
| __.in('created').values('name')).fold()) |
| ---- |
| |
| It's possible to limit the path using the <<to-step,`to()`>> or <<from-step,`from()`>> step modulators. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().has('person','name','vadas').as('e'). |
| in('knows'). |
| out('knows').where(neq('e')). |
| path().by('name') <1> |
| g.V().has('person','name','vadas').as('e'). |
| in('knows').as('m'). |
| out('knows').where(neq('e')). |
| path().to('m').by('name') <2> |
| g.V().has('person','name','vadas').as('e'). |
| in('knows').as('m'). |
| out('knows').where(neq('e')). |
| path().from('m').by('name') <3> |
| ---- |
| |
| <1> Obtain the full path from vadas to josh. |
| <2> Save the middle node, marko, and use the `to()` modulator to show only the path from vadas to marko. |
| <3> Use the `from()` modulator to show only the path from marko to josh. |
| |
| WARNING: Generating path information is expensive as the history of the traverser is stored into a Java list. With |
| numerous traversers, there are numerous lists. Moreover, in an OLAP <<graphcomputer,`GraphComputer`>> environment |
| this becomes exceedingly prohibitive as there are traversers emanating from all vertices in the graph in parallel. |
| In OLAP there are optimizations provided for traverser populations, but when paths are calculated (and each traverser |
| is unique due to its history), then these optimizations are no longer possible. |
| |
| [[path-data-structure]] |
| ==== Path Data Structure |
| |
| The `Path` data structure is an ordered list of objects, where each object is associated with a `Set<String>` of |
| labels. An example is presented below to demonstrate both the `Path` API as well as how a traversal yields labeled paths. |
| |
| image::path-data-structure.png[width=350] |
| |
| [gremlin-groovy,modern] |
| ---- |
| path = g.V(1).as('a').has('name').as('b'). |
| out('knows').out('created').as('c'). |
| has('name','ripple').values('name').as('d'). |
| identity().as('e').path().next() |
| path.size() |
| path.objects() |
| path.labels() |
| path.a |
| path.b |
| path.c |
| path.d == path.e |
| ---- |
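
The "ordered list of objects, each with a set of labels" shape can also be sketched as a small plain-Java class. This is a simplified model and not TinkerPop's `Path` interface: `PathSketch`, `extend`, and `get` are hypothetical names, objects are represented as strings, and the label lookup assumes each label occurs only once in the path.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of a labeled path; illustrative only, not TinkerPop's Path interface.
class PathSketch {
    private final List<Object> objects = new ArrayList<>();
    private final List<Set<String>> labels = new ArrayList<>();

    // Appends an object together with the (possibly empty) set of labels attached to it.
    PathSketch extend(Object object, String... stepLabels) {
        objects.add(object);
        labels.add(new LinkedHashSet<>(List.of(stepLabels)));
        return this;
    }

    int size() { return objects.size(); }
    List<Object> objects() { return objects; }
    List<Set<String>> labels() { return labels; }

    // Returns the object carrying the label (assumes a label is used only once).
    Object get(String label) {
        for (int i = 0; i < objects.size(); i++)
            if (labels.get(i).contains(label)) return objects.get(i);
        throw new IllegalArgumentException("No such label: " + label);
    }

    public static void main(String[] args) {
        // Steps that do not move the traverser add labels to the same position,
        // so 'a'/'b' share one object, as do 'd'/'e'.
        PathSketch path = new PathSketch()
                .extend("v[1]", "a", "b")
                .extend("v[4]")
                .extend("v[5]", "c")
                .extend("ripple", "d", "e");
        System.out.println(path.size());                          // 4
        System.out.println(path.objects());                       // [v[1], v[4], v[5], ripple]
        System.out.println(path.labels());                        // [[a, b], [], [c], [d, e]]
        System.out.println(path.get("d").equals(path.get("e")));  // true
    }
}
```

Keeping labels in a list parallel to the objects is one simple way to let several labels point at a single position.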
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#path--++[`path()`] |
| |
| [[peerpressure-step]] |
| === PeerPressure Step |
| |
| The `peerPressure()`-step (*map*/*sideEffect*) clusters vertices using <<peerpressurevertexprogram,`PeerPressureVertexProgram`>>. |
| |
| IMPORTANT: The `peerPressure()`-step is a `VertexComputing`-step and as such, can only be used against a graph that supports `GraphComputer` (OLAP). |
| |
| [gremlin-groovy,modern] |
| ---- |
| g = traversal().withEmbedded(graph).withComputer() |
| g.V().peerPressure().by('cluster').values('cluster') |
| g.V().hasLabel('person'). |
| peerPressure(). |
| with(PeerPressure.propertyName, 'cluster'). |
| group(). |
| by('cluster'). |
| by('name') |
| ---- |
| |
| Note the use of the `with()` modulating step which provides configuration options to the algorithm. It takes |
| configuration keys from the `PeerPressure` class, which is automatically imported to the Gremlin Console. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#peerPressure--++[`peerPressure()`] |
| |
| [[profile-step]] |
| === Profile Step |
| |
| The `profile()`-step (*sideEffect*) exists to allow developers to profile their traversals to determine statistical |
| information like step runtime, counts, etc. |
| |
| WARNING: Profiling a Traversal will impede the Traversal's performance. This overhead is mostly excluded from the |
| profile results, but durations are not exact. Thus, durations are best considered in relation to each other. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().out('created').repeat(both()).times(3).hasLabel('person').values('age').sum().profile() |
| ---- |
| |
| The `profile()`-step generates a `TraversalMetrics` sideEffect object that contains the following information: |
| |
| * `Step`: A step within the traversal being profiled. |
| * `Count`: The number of _represented_ traversers that passed through the step. |
| * `Traversers`: The number of traversers that passed through the step. |
| * `Time (ms)`: The total time the step was actively executing its behavior. |
| * `% Dur`: The percentage of total time spent in the step. |
| |
| image:gremlin-exercise.png[width=120,float=left] It is important to understand the difference between "Count" |
| and "Traversers". Traversers can be merged and as such, when two traversers are "the same" they may be aggregated |
| into a single traverser. That new traverser has a `Traverser.bulk()` that is the sum of the two merged traverser |
| bulks. On the other hand, the `Count` represents the sum of all `Traverser.bulk()` results and thus, expresses the |
| number of "represented" (not enumerated) traversers. `Traversers` will always be less than or equal to `Count`. |
| |
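The relationship between `Count` and `Traversers` can be sketched in plain Java. This is a minimal model under stated assumptions, not TinkerPop's traverser implementation: `BulkSketch` and its methods are hypothetical names, and two traversers are treated as "the same" simply when they hold equal objects.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of traverser bulking; illustrative only, not the TinkerPop API.
class BulkSketch {
    // Merges "the same" traversers (here: equal objects) and sums their bulks.
    static <T> Map<T, Long> merge(List<T> arrivals) {
        Map<T, Long> bulks = new LinkedHashMap<>();
        for (T object : arrivals) bulks.merge(object, 1L, Long::sum);
        return bulks;
    }

    // "Traversers": the number of merged traverser instances.
    static <T> long traversers(Map<T, Long> bulks) {
        return bulks.size();
    }

    // "Count": the sum of all bulks, i.e. the number of represented traversers.
    static <T> long count(Map<T, Long> bulks) {
        return bulks.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Long> bulks = merge(List.of("marko", "josh", "marko", "marko", "peter"));
        System.out.println(bulks);             // {marko=3, josh=1, peter=1}
        System.out.println(traversers(bulks)); // 3
        System.out.println(count(bulks));      // 5
    }
}
```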
| For traversal compilation information, please see <<explain-step,`explain()`>>-step. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#profile--++[`profile()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#profile-java.lang.String-++[`profile(String)`] |
| |
| [[project-step]] |
| === Project Step |
| |
| The `project()`-step (*map*) projects the current object into a `Map<String,Object>` keyed by provided labels. It is similar |
| to <<select-step,`select()`>>-step, save that instead of retrieving and modulating historic traverser state, it modulates |
| the current state of the traverser. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().has('name','marko'). |
| project('id', 'name', 'out', 'in'). |
| by(id). |
| by('name'). |
| by(outE().count()). |
| by(inE().count()) |
| g.V().has('name','marko'). |
| project('name', 'friendsNames'). |
| by('name'). |
| by(out('knows').values('name').fold()) |
| g.V().out('created'). |
| project('a','b'). |
| by('name'). |
| by(__.in('created').count()). |
| order().by(select('b'),desc). |
| select('a') |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#project-java.lang.String-java.lang.String...-++[`project(String,String...)`] |
| |
| [[program-step]] |
| === Program Step |
| |
| The `program()`-step (*map*/*sideEffect*) is the "lambda" step for `GraphComputer` jobs. The step takes a |
| <<vertexprogram,`VertexProgram`>> as an argument and will process the incoming graph accordingly. Thus, the user |
| can create their own `VertexProgram` and have it execute within a traversal. The configuration provided to the |
| vertex program includes: |
| |
| * `gremlin.vertexProgramStep.rootTraversal` is a serialization of a `PureTraversal` form of the root traversal. |
| * `gremlin.vertexProgramStep.stepId` is the step string id of the `program()`-step being executed. |
| |
| The user supplied `VertexProgram` can leverage that information accordingly within their vertex program. Example uses |
| are provided below. |
| |
| WARNING: Developing a `VertexProgram` is for expert users. Moreover, developing one that can be used effectively within |
| a traversal requires yet more expertise. This information is intended for advanced users with a deep understanding of the |
| mechanics of Gremlin OLAP (<<graphcomputer,`GraphComputer`>>). |
| |
| [source,java] |
| ---- |
| private TraverserSet<Object> haltedTraversers; |
| |
| public void loadState(Graph graph, Configuration configuration) { |
| VertexProgram.super.loadState(graph, configuration); |
| this.traversal = PureTraversal.loadState(configuration, VertexProgramStep.ROOT_TRAVERSAL, graph); |
| this.programStep = new TraversalMatrix<>(this.traversal.get()).getStepById(configuration.getString(ProgramVertexProgramStep.STEP_ID)); |
| // if the traversal sideEffects will be used in the computation, add them as memory compute keys |
| this.memoryComputeKeys.addAll(MemoryTraversalSideEffects.getMemoryComputeKeys(this.traversal.get())); |
| // if master-traversal traversers may be propagated, create a memory compute key |
| this.memoryComputeKeys.add(MemoryComputeKey.of(TraversalVertexProgram.HALTED_TRAVERSERS, Operator.addAll, false, false)); |
| // returns an empty traverser set if there are no halted traversers |
| this.haltedTraversers = TraversalVertexProgram.loadHaltedTraversers(configuration); |
| } |
| |
| public void storeState(Configuration configuration) { |
| VertexProgram.super.storeState(configuration); |
| // if halted traversers is null or empty, it does nothing |
| TraversalVertexProgram.storeHaltedTraversers(configuration, this.haltedTraversers); |
| } |
| |
| public void setup(Memory memory) { |
| if(!this.haltedTraversers.isEmpty()) { |
| // do what you like with the halted master traversal traversers |
| } |
| // once used, no need to keep that information around (master) |
| this.haltedTraversers = null; |
| } |
| |
| public void execute(Vertex vertex, Messenger messenger, Memory memory) { |
| // once used, no need to keep that information around (workers) |
| if(null != this.haltedTraversers) |
| this.haltedTraversers = null; |
| if(vertex.property(TraversalVertexProgram.HALTED_TRAVERSERS).isPresent()) { |
| // haltedTraversers in execute() represent worker-traversal traversers |
| // for example, from a traversal of the form g.V().out().program(...) |
| TraverserSet<Object> haltedTraversers = vertex.value(TraversalVertexProgram.HALTED_TRAVERSERS); |
| // create a new halted traverser set that can be used by the next OLAP job in the chain |
| // these are worker-traversers that are distributed throughout the graph |
| TraverserSet<Object> newHaltedTraversers = new TraverserSet<>(); |
| haltedTraversers.forEach(traverser -> { |
| newHaltedTraversers.add(traverser.split(traverser.get().toString(), this.programStep)); |
| }); |
| vertex.property(VertexProperty.Cardinality.single, TraversalVertexProgram.HALTED_TRAVERSERS, newHaltedTraversers); |
| // it is possible to create master-traversers that are localized to the master traversal (this is how results are ultimately delivered back to the user) |
| memory.add(TraversalVertexProgram.HALTED_TRAVERSERS, |
| new TraverserSet<>(this.traversal.get().getTraverserGenerator().generate("an example", this.programStep, 1L))); |
| } |
| } |
| |
| public boolean terminate(Memory memory) { |
| // the master-traversal will have halted traversers |
| assert memory.exists(TraversalVertexProgram.HALTED_TRAVERSERS); |
| TraverserSet<String> haltedTraversers = memory.get(TraversalVertexProgram.HALTED_TRAVERSERS); |
| // it will only have the traversers sent to the master traversal via memory |
| assert haltedTraversers.stream().map(Traverser::get).filter(s -> s.equals("an example")).findAny().isPresent(); |
| // it will not contain the worker traversers distributed throughout the vertices |
| assert !haltedTraversers.stream().map(Traverser::get).filter(s -> !s.equals("an example")).findAny().isPresent(); |
| return true; |
| } |
| ---- |
| |
| NOTE: The test case `ProgramTest` in `gremlin-test` has an example vertex program called `TestProgram` that demonstrates |
| all the various ways in which traversal and traverser information is propagated within a vertex program and ultimately |
| usable by other vertex programs (including `TraversalVertexProgram`) down the line in an OLAP compute chain. |
| |
| Finally, an example is provided using `PageRankVertexProgram` which doesn't use <<pagerank-step,`pageRank()`>>-step. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g = traversal().withEmbedded(graph).withComputer() |
| g.V().hasLabel('person'). |
| program(PageRankVertexProgram.build().property('rank').create(graph)). |
| order().by('rank', asc). |
| elementMap('name', 'rank') |
| ---- |
| |
| [[properties-step]] |
| === Properties Step |
| |
| The `properties()`-step (*map*) extracts properties from an `Element` in the traversal stream. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V(1).properties() |
| g.V(1).properties('location').valueMap() |
| g.V(1).properties('location').has('endTime').valueMap() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#properties-java.lang.String...-++[`properties(String...)`] |
| |
| [[propertymap-step]] |
| === PropertyMap Step |
| |
| The `propertyMap()`-step yields a `Map` representation of the properties of an element. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().propertyMap() |
| g.V().propertyMap('age') |
| g.V().propertyMap('age','blah') |
| g.E().propertyMap() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#propertyMap-java.lang.String...-++[`propertyMap(String...)`] |
| |
| [[range-step]] |
| === Range Step |
| |
| As traversers propagate through the traversal, it is possible to only allow a certain number of them to pass through |
| with `range()`-step (*filter*). While the low end of the range has not been reached, traversers are iterated but not |
| emitted. When within the low (inclusive) and high (exclusive) range, traversers are emitted. When above the high range, |
| the traversal breaks out of iteration. Finally, a high range of `-1` emits all remaining traversers once the low end |
| has been reached. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().range(0,3) |
| g.V().range(1,3) |
| g.V().range(1, -1) |
| g.V().repeat(both()).times(1000000).emit().range(6,10) |
| ---- |
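
The low/high rules described above can be modelled in a few lines of plain Java. This is only a sketch of the semantics: `RangeModel` and its `range` method are hypothetical names used for illustration, are not part of TinkerPop, and the actual step implementation may differ.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-alone model of range(low, high); not TinkerPop code.
final class RangeModel {
    static <T> List<T> range(Iterator<T> source, long low, long high) {
        List<T> emitted = new ArrayList<>();
        long index = 0;
        // A high value of -1 means "no upper bound"; otherwise stop as soon as
        // the high end is reached, without draining the rest of the source.
        while ((high == -1 || index < high) && source.hasNext()) {
            T next = source.next();
            // Objects below the low end are consumed but not emitted.
            if (index >= low) {
                emitted.add(next);
            }
            index++;
        }
        return emitted;
    }
}
```

Because the loop stops at the high end, the model also terminates on an unbounded source, which mirrors why the `repeat(both()).times(1000000)` example above can return after only a few results.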
| |
| The `range()`-step can also be applied with `Scope.local`, in which case it operates on the incoming collection. |
| For example, it is possible to produce a `Map<String, String>` for each traversed path, but containing only the second |
| property value (the "b" step). |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').out().as('b').in().as('c').select('a','b','c').by('name').range(local,1,2) |
| ---- |
| |
| The next example uses the <<the-crew-toy-graph,The Crew>> toy data set. It produces a `List<String>` containing the |
| second and third location for each vertex. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V().valueMap().select('location').range(local, 1, 3) |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#range-long-long-++[`range(long,long)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#range-org.apache.tinkerpop.gremlin.process.traversal.Scope-long-long-++[`range(Scope,long,long)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[read-step]] |
| === Read Step |
| |
| The `read()`-step is not really a "step" but a step modulator in that it modifies the functionality of the `io()`-step. |
| More specifically, it tells the `io()`-step that it is expected to use its configuration to read data from some |
| location. Please see the <<io-step,documentation>> for `io()`-step for more complete details on usage. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#read--++[`read()`] |
| |
| [[repeat-step]] |
| === Repeat Step |
| |
| image::gremlin-fade.png[width=350] |
| |
| The `repeat()`-step (*branch*) is used for looping over a traversal given some break predicate. Below are some |
| examples of `repeat()`-step in action. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).repeat(out()).times(2).path().by('name') <1> |
| g.V().until(has('name','ripple')). |
| repeat(out()).path().by('name') <2> |
| ---- |
| |
| <1> do-while semantics stating to do `out()` 2 times. |
| <2> while-do semantics stating to break if the traverser is at a vertex named "ripple". |
| |
| IMPORTANT: There are two modulators for `repeat()`: `until()` and `emit()`. If `until()` comes after `repeat()` it is |
| do/while looping. If `until()` comes before `repeat()` it is while/do looping. If `emit()` is placed after `repeat()`, |
| it is evaluated on the traversers leaving the repeat-traversal. If `emit()` is placed before `repeat()`, it is |
| evaluated on the traversers prior to entering the repeat-traversal. |
| |
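The difference between the two `until()` placements can be modelled for a single, non-branching traverser in plain Java. This is a sketch under that simplifying assumption: `RepeatModel` and its method names are hypothetical, and `emit()` and traverser splitting are deliberately left out.

```java
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical model of until() placement for one traverser; not TinkerPop code.
final class RepeatModel {
    // repeat(body).until(stop): do/while, so the body runs at least once
    // before the stop condition is tested.
    static <T> T repeatThenUntil(T start, UnaryOperator<T> body, Predicate<T> stop) {
        T current = start;
        do {
            current = body.apply(current);
        } while (!stop.test(current));
        return current;
    }

    // until(stop).repeat(body): while/do, so the stop condition is tested
    // first and the body may never run.
    static <T> T untilThenRepeat(T start, UnaryOperator<T> body, Predicate<T> stop) {
        T current = start;
        while (!stop.test(current)) {
            current = body.apply(current);
        }
        return current;
    }
}
```
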
| The `repeat()`-step also supports an "emit predicate", where the predicate for an empty argument `emit()` is |
| `true` (i.e. `emit() == emit{true}`). With `emit()`, the traverser is split in two: one copy exits the code |
| block while the other continues back within the code block (assuming the `until()` condition has not yet been met). |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).repeat(out()).times(2).emit().path().by('name') <1> |
| g.V(1).emit().repeat(out()).times(2).path().by('name') <2> |
| ---- |
| |
| <1> The `emit()` comes after `repeat()` and thus, emission happens after the `repeat()` traversal is executed. Thus, |
| no one-vertex paths exist. |
| <2> The `emit()` comes before `repeat()` and thus, emission happens prior to the `repeat()` traversal being executed. |
| Thus, one-vertex paths exist. |
| |
| The `emit()`-modulator can take an arbitrary predicate. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).repeat(out()).times(2).emit(has('lang')).path().by('name') |
| ---- |
| |
| image::repeat-step.png[width=500] |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).repeat(out()).times(2).emit().path().by('name') |
| ---- |
| |
| The first time through the `repeat()`, the vertices lop, vadas, and josh are seen. Given that `loops==1`, the |
| traverser repeats. However, because the emit-predicate is declared true, those vertices are emitted. The next time through |
| `repeat()`, the vertices traversed are ripple and lop (Josh's created projects, as lop and vadas have no out edges). |
| Given that `loops==2`, the `times(2)` limit is reached, so the loop ends and ripple and lop are emitted. |
| Therefore, the traverser has seen the vertices: lop, vadas, josh, ripple, and lop. |
| |
| `repeat()`-steps may be nested inside each other or inside the `emit()` or `until()` predicates. They can also be named by passing a string as the first parameter to `repeat()`. The loop counter of a named repeat step can be accessed within the looped context with `loops(loopName)`, where `loopName` is the name set when creating the `repeat()`-step. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1). |
| repeat(out("knows")). |
| until(repeat(out("created")).emit(has("name", "lop"))) <1> |
| g.V(6). |
| repeat('a', both('created').simplePath()). |
| emit(repeat('b', both('knows')). |
| until(loops('b').as('b').where(loops('a').as('b'))). |
| hasId(2)).dedup() <2> |
| ---- |
| |
| <1> Starting from vertex 1, keep taking outgoing 'knows' edges until reaching a vertex that created 'lop'. |
| <2> Starting from vertex 6, keep taking created edges in either direction until the vertex is the same distance from vertex 2 over knows edges as it is from vertex 6 over created edges. |
| |
| Finally, note that both `emit()` and `until()` can take a traversal and, in such situations, the predicate is |
| determined by `traversal.hasNext()`. A few examples are provided below. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).repeat(out()).until(hasLabel('software')).path().by('name') <1> |
| g.V(1).emit(hasLabel('person')).repeat(out()).path().by('name') <2> |
| g.V(1).repeat(out()).until(outE().count().is(0)).path().by('name') <3> |
| ---- |
| |
| <1> Starting from vertex 1, keep taking outgoing edges until a software vertex is reached. |
| <2> Starting from vertex 1, and in an infinite loop, emit the vertex if it is a person and then traverse the outgoing edges. |
| <3> Starting from vertex 1, keep taking outgoing edges until a vertex is reached that has no more outgoing edges. |
| |
| WARNING: The anonymous traversals of `emit()` and `until()` (not `repeat()`) process their current objects "locally." |
| In OLAP, where the atomic unit of computing is the vertex and its local "star graph," it is important that the |
| anonymous traversals do not leave the confines of the vertex's star graph. In other words, they cannot traverse to |
| an adjacent vertex's properties or edges. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#repeat-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`repeat(Traversal)`] |
| |
| link:++https://tinkerpop.apache.org/docs/x.y.z/recipes/#looping++[`Looping Recipes`] |
| |
| [[sack-step]] |
| === Sack Step |
| |
| image:gremlin-sacks-running.png[width=175,float=right] A traverser can contain a local data structure called a "sack". |
| The `sack()`-step is used to read and write sacks (*sideEffect* or *map*). Each sack of each traverser is created |
| when using `GraphTraversalSource.withSack(initialValueSupplier,splitOperator?,mergeOperator?)`. |
| |
| * *Initial value supplier*: A `Supplier` providing the initial value of each traverser's sack. |
| * *Split operator*: A `UnaryOperator` that clones the traverser's sack when the traverser splits. If no split operator |
| is provided, then `UnaryOperator.identity()` is assumed. |
| * *Merge operator*: A `BinaryOperator` that unites two traversers' sacks when they are merged. If no merge operator is |
| provided, then traversers with sacks cannot be merged. |
| |
| Two trivial examples are presented below to demonstrate the *initial value supplier*. In the first example below, a |
| traverser is created at each vertex in the graph (`g.V()`), with a 1.0 sack (`withSack(1.0f)`), and then the sack |
| value is accessed (`sack()`). In the second example, a random float supplier is used to generate sack values. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.withSack(1.0f).V().sack() |
| rand = new Random() |
| g.withSack {rand.nextFloat()}.V().sack() |
| ---- |
| |
| A more complicated initial value supplier example is presented below where the sack values are used in a running |
| computation and then emitted at the end of the traversal. When an edge is traversed, the edge weight is multiplied |
| by the sack value (`sack(mult).by('weight')`). Note that the <<by-step,`by()`>>-modulator can be any arbitrary traversal. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.withSack(1.0f).V().repeat(outE().sack(mult).by('weight').inV()).times(2) |
| g.withSack(1.0f).V().repeat(outE().sack(mult).by('weight').inV()).times(2).sack() |
| g.withSack(1.0f).V().repeat(outE().sack(mult).by('weight').inV()).times(2).path(). |
| by().by('weight') |
| ---- |
| |
| image:gremlin-sacks-standing.png[width=100,float=left] When complex objects are used (i.e. non-primitives), then a |
| *split operator* should be defined to ensure that each traverser gets a clone of its parent's sack. The first example |
| does not use a split operator and as such, the same map is propagated to all traversers (a global data structure). The |
| second example, demonstrates how `Map.clone()` ensures that each traverser's sack contains a unique, local sack. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.withSack {[:]}.V().out().out(). |
| sack {m,v -> m[v.value('name')] = v.value('lang'); m}.sack() // BAD: single map |
| g.withSack {[:]}{it.clone()}.V().out().out(). |
| sack {m,v -> m[v.value('name')] = v.value('lang'); m}.sack() // GOOD: cloned map |
| ---- |
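
The effect of the split operator can be reproduced without a graph. The following plain-Java sketch is an illustration only: `SackSplitModel`, its `split` method, and the `"visited"` value are hypothetical and stand in for traversers that each write one entry into their sack after splitting from a common parent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical model of sack splitting; not TinkerPop code.
final class SackSplitModel {
    // Creates one child sack per name via the split operator, then lets each
    // child record its own name in the sack it was given.
    static List<Map<String, String>> split(Map<String, String> parent,
                                           UnaryOperator<Map<String, String>> splitOperator,
                                           List<String> names) {
        List<Map<String, String>> sacks = new ArrayList<>();
        for (String name : names) {
            Map<String, String> sack = splitOperator.apply(parent);
            sack.put(name, "visited");
            sacks.add(sack);
        }
        return sacks;
    }

    public static void main(String[] args) {
        List<String> names = List.of("ripple", "lop");
        // identity(): every child shares the parent's map, so writes leak across children.
        System.out.println(split(new HashMap<>(), UnaryOperator.identity(), names));
        // copy on split: every child owns an independent map with a single entry.
        System.out.println(split(new HashMap<>(), m -> new HashMap<>(m), names));
    }
}
```

With `UnaryOperator.identity()` both children end up holding the same two-entry map, while the copying operator leaves each child with only its own entry.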
| |
| NOTE: For primitives (i.e. integers, longs, floats, etc.), a split operator is not required as primitives are |
| encoded in the memory address of the sack, not as a reference to an object. |
| |
| If a *merge operator* is not provided, then traversers with sacks cannot be bulked. However, in many situations, |
| merging the sacks of two traversers at the same location is algorithmically sound and good to provide so as to gain |
| the bulking optimization. In the examples below, the binary merge operator is `Operator.sum`. Thus, when two traversers |
| merge, their respective sacks are added together. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.withSack(1.0d).V(1).out('knows').in('knows') <1> |
| g.withSack(1.0d).V(1).out('knows').in('knows').sack() <2> |
| g.withSack(1.0d, sum).V(1).out('knows').in('knows').sack() <3> |
| g.withSack(1.0d).V(1).local(outE('knows').barrier(normSack).inV()).in('knows').barrier() <4> |
| g.withSack(1.0d).V(1).local(outE('knows').barrier(normSack).inV()).in('knows').barrier().sack() <5> |
| g.withSack(1.0d,sum).V(1).local(outE('knows').barrier(normSack).inV()).in('knows').barrier().sack() <6> |
| g.withBulk(false).withSack(1.0f,sum).V(1).local(outE('knows').barrier(normSack).inV()).in('knows').barrier().sack() <7> |
| g.withBulk(false).withSack(1.0f).V(1).local(outE('knows').barrier(normSack).inV()).in('knows').barrier().sack() <8> |
| ---- |
| |
| <1> We find vertex 1 twice because he knows two other people |
| <2> Without a merge operation the sack values are 1.0. |
| <3> When specifying `sum` as the merge operation, the sack values are 2.0 because of bulking |
| <4> Like 1, but using barrier internally |
| <5> The `local(...barrier(normSack)...)` ensures that all traversers leaving vertex 1 have an evenly distributed amount of the initial 1.0 "energy" (50-50), i.e. the sack is 0.5 on each result |
| <6> Like 3, but using `sum` as merge operator leads to the expected 1.0 |
| <7> There is now a single traverser with bulk of 2 and sack of 1.0 and thus, setting `withBulk(false)` yields the expected 1.0 |
| <8> Like 7, but without the `sum` operator |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#sack--++[`sack()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#sack-java.util.function.BiFunction-++[`sack(BiFunction)`] |
| |
| [[sample-step]] |
| === Sample Step |
| |
| The `sample()`-step is useful for sampling some number of traversers from earlier in the traversal. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().outE().sample(1).values('weight') |
| g.V().outE().sample(1).by('weight').values('weight') |
| g.V().outE().sample(2).by('weight').values('weight') |
| ---- |
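
When `sample()` is modulated with `by('weight')`, selection is biased by the weight values. One common way to draw a single item with probability proportional to its weight is sketched below in plain Java. This illustrates the general technique only: `WeightedSampleModel` and `sampleOne` are hypothetical names, and the algorithm TinkerPop actually uses may differ.

```java
import java.util.Random;

// Hypothetical model of weight-proportional sampling; not TinkerPop code.
final class WeightedSampleModel {
    // Returns the index of one weight, chosen with probability weight / sum(weights).
    static int sampleOne(double[] weights, Random random) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        // Pick a point in [0, total) and find the weight interval that contains it.
        double target = random.nextDouble() * total;
        double cumulative = 0.0;
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            if (target < cumulative) {
                return i;
            }
        }
        return weights.length - 1; // guards against floating-point round-off
    }
}
```

An item with weight `0.0` is never selected by this scheme, and an item whose weight is twice that of another is selected about twice as often over many draws.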
| |
| One of the more interesting use cases for `sample()` is when it is used in conjunction with <<local-step,`local()`>>. |
| The combination of the two steps supports the execution of link:http://en.wikipedia.org/wiki/Random_walk[random walks]. |
| In the example below, the traversal starts at vertex 1 and selects one edge to traverse based on a probability |
| distribution generated by the weights of the edges. The output is always a single path because, by selecting a single |
| edge, the traverser never splits and continues down a single path in the graph. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).repeat(local( |
| bothE().sample(1).by('weight').otherV() |
| )).times(5) |
| g.V(1).repeat(local( |
| bothE().sample(1).by('weight').otherV() |
| )).times(5).path() |
| g.V(1).repeat(local( |
| bothE().sample(1).by('weight').otherV() |
| )).times(10).path() |
| ---- |
| |
| As a clarification, note that in the above example `local()` is not strictly required as it only does the random walk |
| over a single vertex, but note what happens without it if multiple vertices are traversed: |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().repeat(bothE().sample(1).by('weight').otherV()).times(5).path() |
| ---- |
| |
| The use of `local()` ensures that the traversal over `bothE()` occurs once per vertex traverser that passes through, |
| thus allowing one random walk per vertex. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().repeat(local(bothE().sample(1).by('weight').otherV())).times(5).path() |
| ---- |
| |
| So, while not strictly required, it is likely better to be explicit with the use of `local()` so that the proper intent |
| of the traversal is expressed. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#sample-int-++[`sample(int)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#sample-org.apache.tinkerpop.gremlin.process.traversal.Scope-int-++[`sample(Scope,int)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[select-step]] |
| === Select Step |
| |
| link:http://en.wikipedia.org/wiki/Functional_programming[Functional languages] make use of function composition and |
| lazy evaluation to create complex computations from primitive operations. This is exactly what `Traversal` does. One |
| of the differentiating aspects of Gremlin's data flow approach to graph processing is that the flow need not always go |
| "forward," but in fact, can go back to a previously seen area of computation. Examples include <<path-step,`path()`>> |
| as well as the `select()`-step (*map*). There are two general ways to use `select()`-step. |
| |
| . Select labeled steps within a path (as defined by `as()` in a traversal). |
| . Select objects out of a `Map<String,Object>` flow (i.e. a sub-map). |
| |
| The first use case is demonstrated via example below. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').out().as('b').out().as('c') // no select |
| g.V().as('a').out().as('b').out().as('c').select('a','b','c') |
| g.V().as('a').out().as('b').out().as('c').select('a','b') |
| g.V().as('a').out().as('b').out().as('c').select('a','b').by('name') |
| g.V().as('a').out().as('b').out().as('c').select('a') <1> |
| ---- |
| |
| <1> If the selection is one step, no map is returned. |
| |
| When there is only one label selected, then a single object is returned. This is useful for stepping back in a |
| computation and easily moving forward again on the object reverted to. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().out().out() |
| g.V().out().out().path() |
| g.V().as('x').out().out().select('x') |
| g.V().out().as('x').out().select('x') |
| g.V().out().out().as('x').select('x') // pointless |
| ---- |
| |
| NOTE: When executing a traversal with `select()` on a standard traversal engine (i.e. OLTP), `select()` will do its |
| best to avoid calculating the path history and instead, will rely on a global data structure for storing the currently |
| selected object. As such, if only a subset of the path walked is required, `select()` should be used over the more |
| resource intensive <<path-step,`path()`>>-step. |
| |
| When the set of keys or values (i.e. columns) of a path or map are needed, use `select(keys)` and `select(values)`, |
| respectively. This is especially useful when one is only interested in the top N elements in a `groupCount()` |
| ranking. |
| |
| [gremlin-groovy] |
| ---- |
| g = traversal().withEmbedded(graph) |
| g.io('data/grateful-dead.xml').read().iterate() |
| g.V().hasLabel('song').out('followedBy').groupCount().by('name'). |
| order(local).by(values,desc).limit(local, 5) |
| g.V().hasLabel('song').out('followedBy').groupCount().by('name'). |
| order(local).by(values,desc).limit(local, 5).select(keys) |
| g.V().hasLabel('song').out('followedBy').groupCount().by('name'). |
| order(local).by(values,desc).limit(local, 5).select(keys).unfold() |
| ---- |
| |
| Similarly, for extracting the values from a path or map. |
| |
| [gremlin-groovy] |
| ---- |
| g = traversal().withEmbedded(graph) |
| g.io('data/grateful-dead.xml').read().iterate() |
| g.V().hasLabel('song').out('sungBy').groupCount().by('name') <1> |
| g.V().hasLabel('song').out('sungBy').groupCount().by('name').select(values) <2> |
| g.V().hasLabel('song').out('sungBy').groupCount().by('name').select(values).unfold(). |
| groupCount().order(local).by(values,desc).limit(local, 5) <3> |
| ---- |
| |
| <1> Which artist sang how many songs? |
| <2> Get an anonymized set of song repertoire sizes. |
| <3> What are the 5 most common song repertoire sizes? |
| |
| WARNING: Note that `by()`-modulation is not supported with `select(keys)` and `select(values)`. |
| |
| There is also an option to supply a `Pop` operation to `select()` to manipulate `List` objects in the `Traverser`: |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).as("a").repeat(out().as("a")).times(2).select(first, "a") |
| g.V(1).as("a").repeat(out().as("a")).times(2).select(last, "a") |
| g.V(1).as("a").repeat(out().as("a")).times(2).select(all, "a") |
| ---- |
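
The three `Pop` options used above can be pictured as operations on the list of objects that share one step label. The plain-Java sketch below is illustrative only: `PopModel` and its local `Pop` enum are hypothetical stand-ins that merely mirror the names `first`, `last` and `all`.

```java
import java.util.List;

// Hypothetical model of Pop-based selection; not TinkerPop code.
final class PopModel {
    enum Pop { first, last, all }

    // `labeled` holds, in path order, every object tagged with the same label.
    static Object select(Pop pop, List<String> labeled) {
        switch (pop) {
            case first:
                return labeled.get(0);                  // earliest labeled object
            case last:
                return labeled.get(labeled.size() - 1); // most recent labeled object
            default:
                return labeled;                         // all: the whole list
        }
    }
}
```

For a path in which the label `a` was applied to marko, josh and ripple in that order, the model returns marko for `first`, ripple for `last`, and the full three-element list for `all`.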
| |
| In addition to the previously shown examples, where `select()` was used to select an element based on a static key, `select()` can also accept a traversal |
| that emits a key. |
| |
| WARNING: Since the key used by `select(<traversal>)` cannot be determined at compile time, the `TraversalSelectStep` enables full path tracking. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.withSideEffect("alias", ["marko":"okram"]).V(). <1> |
| values("name").sack(assign). <2> |
| optional(select("alias").select(sack())) <3> |
| ---- |
| |
| <1> Inject a name alias map and start the traversal from all vertices. |
| <2> Select all `name` values and store them as the current traverser's sack value. |
| <3> Optionally select the alias for the current name from the injected map. |
| |
| [[using-where-with-select]] |
| ==== Using Where with Select |
| |
| Like <<match-step,`match()`>>-step, it is possible to use `where()` with `select()`, as `where()` is a filter that processes |
| `Map<String,Object>` streams. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').out('created').in('created').as('b').select('a','b').by('name') <1> |
| g.V().as('a').out('created').in('created').as('b'). |
| select('a','b').by('name').where('a',neq('b')) <2> |
| g.V().as('a').out('created').in('created').as('b'). |
| select('a','b'). <3> |
| where('a',neq('b')). |
| where(__.as('a').out('knows').as('b')). |
| select('a','b').by('name') |
| ---- |
| |
| <1> A standard `select()` that generates a `Map<String,Object>` of variable bindings in the path (i.e. `a` and `b`) |
| for the sake of a running example. |
| <2> The `select().by('name')` projects each binding vertex to their name property value and `where()` operates to |
| ensure respective `a` and `b` strings are not the same. |
| <3> The first `select()` projects a vertex binding set. A binding is filtered if `a` vertex equals `b` vertex. A |
| binding is filtered if `a` doesn't know `b`. The second and final `select()` projects the name of the vertices. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#select-org.apache.tinkerpop.gremlin.structure.Column-++[`select(Column)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#select-org.apache.tinkerpop.gremlin.process.traversal.Pop-java.lang.String-++[`select(Pop,String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#select-java.lang.String-++[`select(String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#select-java.lang.String-java.lang.String-java.lang.String...-++[`select(String,String,String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#select-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`select(Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#select-org.apache.tinkerpop.gremlin.process.traversal.Pop-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`select(Pop,Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/structure/Column.html++[`Column`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Pop.html++[`Pop`] |
| |
| [[shortestpath-step]] |
| === ShortestPath Step |
| |
| The `shortestPath()`-step provides an easy way to find shortest non-cyclic paths in a graph. It is configurable |
| using the `with()`-modulator with the options given below. |
| |
| IMPORTANT: The `shortestPath()`-step is a `VertexComputing`-step and as such, can only be used against a graph |
| that supports `GraphComputer` (OLAP). |
| |
| [width="100%",cols="3,3,15,5",options="header"] |
| |========================================================= |
| | Key | Type | Description | Default |
| | `target` | `Traversal` | Sets a filter traversal for the end vertices (e.g. `+__.has('name','marko')+`). | all vertices (`+__.identity()+`) |
| | `edges` | `Traversal` or `Direction` | Sets a `Traversal` that emits the edges to traverse from the current vertex or the `Direction` to traverse during the shortest path discovery. | `Direction.BOTH` |
| | `distance` | `Traversal` or `String` | Sets the `Traversal` that calculates the distance for the current edge or the name of an edge property to use for the distance calculations. | `+__.constant(1)+` |
| | `maxDistance` | `Number` | Sets the distance limit for all shortest paths. | none |
| | `includeEdges` | `Boolean` | Whether to include edges in the result or not. | `false` |
| |========================================================= |
| |
| [gremlin-groovy,modern] |
| ---- |
| g = g.withComputer() |
| g.V().shortestPath() <1> |
| g.V().has('person','name','marko').shortestPath() <2> |
| g.V().shortestPath().with(ShortestPath.target, __.has('name','peter')) <3> |
| g.V().shortestPath(). |
| with(ShortestPath.edges, Direction.IN). |
| with(ShortestPath.target, __.has('name','josh')) <4> |
| g.V().has('person','name','marko'). |
| shortestPath(). |
| with(ShortestPath.target, __.has('name','josh')) <5> |
| g.V().has('person','name','marko'). |
| shortestPath(). |
| with(ShortestPath.target, __.has('name','josh')). |
| with(ShortestPath.distance, 'weight') <6> |
| g.V().has('person','name','marko'). |
| shortestPath(). |
| with(ShortestPath.target, __.has('name','josh')). |
| with(ShortestPath.includeEdges, true) <7> |
| ---- |
| |
| <1> Find all shortest paths. |
| <2> Find all shortest paths from `marko`. |
| <3> Find all shortest paths to `peter`. |
| <4> Find all in-directed paths to `josh`. |
| <5> Find all shortest paths from `marko` to `josh`. |
| <6> Find all shortest paths from `marko` to `josh` using a custom distance property. |
| <7> Find all shortest paths from `marko` to `josh` and include edges in the result. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.inject(g.withComputer().V().shortestPath(). |
| with(ShortestPath.distance, 'weight'). |
| with(ShortestPath.includeEdges, true). |
| with(ShortestPath.maxDistance, 1).toList().toArray()). |
| map(unfold().values('name','weight').fold()) <1> |
| ---- |
| |
| <1> Find all shortest paths using a custom distance property and limit the distance to 1. Inject the result into an OLTP `GraphTraversal` in order to be able to select properties from all elements in all paths. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#shortestPath--++[`shortestPath()`] |
| |
| [[simplepath-step]] |
| === SimplePath Step |
| |
| image::simplepath-step.png[width=400] |
| |
| When it is important that a traverser not repeat its path through the graph, `simplePath()`-step should be used |
| (*filter*). The <<path-data-structure,path>> information of the traverser is analyzed and if the path has repeated |
| objects in it, the traverser is filtered. If cyclic behavior is desired, see <<cyclicpath-step,`cyclicPath()`>>. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).both().both() |
| g.V(1).both().both().simplePath() |
| g.V(1).both().both().simplePath().path() |
| g.V().out().as('a').out().as('b').out().as('c'). |
| simplePath().by(label). |
| path() |
| g.V().out().as('a').out().as('b').out().as('c'). |
| simplePath(). |
| by(label). |
| from('b'). |
| to('c'). |
| path(). |
| by('name') |
| ---- |
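
The filter condition itself (a path passes only if no object occurs in it twice) is simple enough to state in plain Java. The sketch below is illustrative: `SimplePathModel` and `isSimple` are hypothetical names, and path elements are represented as plain strings rather than graph elements.

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical model of the simplePath() condition; not TinkerPop code.
final class SimplePathModel {
    // True when every object in the path is distinct, i.e. the traverser
    // would pass a simplePath() filter in this model.
    static boolean isSimple(List<?> path) {
        return new HashSet<Object>(path).size() == path.size();
    }
}
```

In this model a path such as marko, lop, marko is rejected because marko appears twice, while marko, lop, josh is accepted.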
| |
| By using the `from()` and `to()` modulators traversers can ensure that only certain sections of the path are acyclic. |
| |
| [gremlin-groovy] |
| ---- |
| g.addV().property(id, 'A').as('a'). |
| addV().property(id, 'B').as('b'). |
| addV().property(id, 'C').as('c'). |
| addV().property(id, 'D').as('d'). |
| addE('link').from('a').to('b'). |
| addE('link').from('b').to('c'). |
| addE('link').from('c').to('d').iterate() |
| g.V('A').repeat(both().simplePath()).times(3).path() <1> |
| g.V('D').repeat(both().simplePath()).times(3).path() <2> |
| g.V('A').as('a'). |
| repeat(both().simplePath().from('a')).times(3).as('b'). |
| repeat(both().simplePath().from('b')).times(3).path() <3> |
| ---- |
| |
| <1> Traverse all acyclic 3-hop paths starting from vertex `A` |
| <2> Traverse all acyclic 3-hop paths starting from vertex `D` |
| <3> Traverse all acyclic 3-hop paths starting from vertex `A` and from there again all 3-hop paths. The second path may |
| cross the vertices from the first path. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#simplePath--++[`simplePath()`] |
| |
| [[skip-step]] |
| === Skip Step |
| |
| The `skip()`-step is analogous to <<range-step,`range()`-step>> except that the high end of the range is set to -1. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('age').order() |
| g.V().values('age').order().skip(2) |
| g.V().values('age').order().range(2, -1) |
| ---- |
| |
| The `skip()`-step can also be applied with `Scope.local`, in which case it operates on the incoming collection. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person').filter(outE('created')).as('p'). <1> |
| map(out('created').values('name').fold()). |
| project('person','primary','other'). |
| by(select('p').by('name')). |
| by(limit(local, 1)). <2> |
| by(skip(local, 1)) <3> |
| ---- |
| |
| <1> For each person who created something... |
| <2> ...select the first project (random order) as `primary` and... |
| <3> ...select all other projects as `other`. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#skip-long-++[`skip(long)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#skip-org.apache.tinkerpop.gremlin.process.traversal.Scope-long-++[`skip(Scope,long)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[subgraph-step]] |
| === Subgraph Step |
| |
| image::subgraph-logo.png[width=380] |
| |
| Extracting a portion of a graph from a larger one for analysis, visualization or other purposes is a fairly common |
| use case for graph analysts and developers. The `subgraph()`-step (*sideEffect*) provides a way to produce an |
| link:http://mathworld.wolfram.com/Edge-InducedSubgraph.html[edge-induced subgraph] from virtually any traversal. |
| The following example demonstrates how to produce the "knows" subgraph: |
| |
| [gremlin-groovy,modern] |
| ---- |
| subGraph = g.E().hasLabel('knows').subgraph('subGraph').cap('subGraph').next() <1> |
| sg = traversal().withEmbedded(subGraph) |
| sg.E() <2> |
| ---- |
| |
| <1> As this function produces "edge-induced" subgraphs, `subgraph()` must be called at edge steps. |
| <2> The subgraph contains only "knows" edges. |
| |
| A more common subgraphing use case is to get all of the graph structure surrounding a single vertex: |
| |
| [gremlin-groovy,modern] |
| ---- |
| subGraph = g.V(3).repeat(__.inE().subgraph('subGraph').outV()).times(3).cap('subGraph').next() <1> |
| sg = traversal().withEmbedded(subGraph) |
| sg.E() |
| ---- |
| |
| <1> Starting at vertex `3`, traverse 3 steps away on in-edges, outputting all of that into the subgraph. |
| |
| There can be multiple `subgraph()` calls within the same traversal, each operating against either the same graph |
| (i.e. same side-effect key) or different graphs (i.e. different side-effect keys). |
| |
| [gremlin-groovy,modern] |
| ---- |
| t = g.V().outE('knows').subgraph('knowsG').inV().outE('created').subgraph('createdG'). |
| inV().inE('created').subgraph('createdG').iterate() |
| traversal().withEmbedded(t.sideEffects.get('knowsG')).E() |
| traversal().withEmbedded(t.sideEffects.get('createdG')).E() |
| ---- |
| |
| TinkerGraph is the ideal (and default) `Graph` into which a subgraph is extracted as it's fast, in-memory, and supports |
| user-supplied identifiers which can be any Java object. It is this last feature that needs some focus as many |
| TinkerPop-enabled graphs have complex identifier types and TinkerGraph's ability to consume those makes it a perfect |
| host for an incoming subgraph. However, care needs to be taken when using the elements of the TinkerGraph subgraph. |
| The original graph's identifiers may be preserved, but the elements of the graph are now TinkerGraph objects like |
| `TinkerVertex` and `TinkerEdge`. As a result, they cannot be used directly in Gremlin running against the original |
| graph. For example, the following traversal would likely return an error: |
| |
| [source,text] |
| ---- |
| Vertex v = sg.V().has('name','marko').next(); <1> |
| List<Vertex> vertices = g.V(v).out().toList(); <2> |
| ---- |
| |
| <1> Here "sg" is a reference to a TinkerGraph subgraph and "v" is a `TinkerVertex`. |
| <2> The `g.V(v)` has the potential to fail as "g" is the original `Graph` instance and not a TinkerGraph - it could |
| reject the `TinkerVertex` instance as it will not recognize it. |
| |
| It is safer to wrap the `TinkerVertex` in a `ReferenceVertex` or simply reference the `id()` as follows: |
| |
| [source,text] |
| ---- |
| Vertex v = sg.V().has('name','marko').next(); |
| List<Vertex> vertices = g.V(v.id()).out().toList(); |
| |
| // OR |
| |
| Vertex v = new ReferenceVertex(sg.V().has('name','marko').next()); |
| List<Vertex> vertices = g.V(v).out().toList(); |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#subgraph-java.lang.String-++[`subgraph(String)`] |
| |
| [[sum-step]] |
| === Sum Step |
| |
| The `sum()`-step (*map*) operates on a stream of numbers and sums the numbers together to yield a result. Note that |
| the current traverser number is multiplied by the traverser bulk to determine how many such numbers are being |
| represented. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('age').sum() |
| g.V().repeat(both()).times(3).values('age').sum() |
| ---- |
| |
| IMPORTANT: `sum(local)` determines the sum of the current, local object (not the objects in the traversal stream). |
| This works for `Collection`-type objects. |
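
The role of bulk in `sum()` can be sketched in plain Java, without the TinkerPop API. The class `BulkSumSketch` and its nested `Bulked` type are hypothetical names used only for illustration. The sketch assumes a traverser can be modelled as a number plus a bulk and does not claim to mirror TinkerPop's internal implementation.

```java
import java.util.List;

// Plain-Java sketch (not TinkerPop code): a traverser is modelled as a value plus a bulk,
// where the bulk is the number of identical traversers that were merged into one.
final class BulkSumSketch {

    // Hypothetical stand-in for a bulked traverser.
    static final class Bulked {
        final long value;
        final long bulk;

        Bulked(long value, long bulk) {
            this.value = value;
            this.bulk = bulk;
        }
    }

    // Each value contributes `bulk` times to the total.
    static long sum(List<Bulked> traversers) {
        long total = 0L;
        for (Bulked t : traversers) {
            total += t.value * t.bulk;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two traversers holding 29 were merged into one with bulk 2, next to a single 27.
        List<Bulked> stream = List.of(new Bulked(29, 2), new Bulked(27, 1));
        System.out.println(sum(stream)); // 29*2 + 27*1 = 85
    }
}
```

Multiplying by the bulk gives the same total as summing every traverser individually, which is why bulking does not change the result of `sum()`.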
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#sum--++[`sum()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#sum-org.apache.tinkerpop.gremlin.process.traversal.Scope-++[`sum(Scope)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[tail-step]] |
| === Tail Step |
| |
| image::tail-step.png[width=530] |
| |
| The `tail()`-step is analogous to <<limit-step,`limit()`>>-step, except that it emits the last `n`-objects instead of |
| the first `n`-objects. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().values('name').order() |
| g.V().values('name').order().tail() <1> |
| g.V().values('name').order().tail(1) <2> |
| g.V().values('name').order().tail(3) <3> |
| ---- |
| |
| <1> Last name (alphabetically). |
| <2> Same as statement 1. |
| <3> Last three names. |
| |
| The `tail()`-step can also be applied with `Scope.local`, in which case it operates on the incoming collection. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').out().as('a').out().as('a').select('a').by(tail(local)).values('name') <1> |
| g.V().as('a').out().as('a').out().as('a').select('a').by(unfold().values('name').fold()).tail(local) <2> |
| g.V().as('a').out().as('a').out().as('a').select('a').by(unfold().values('name').fold()).tail(local, 2) <3> |
| g.V().elementMap().tail(local) <4> |
| ---- |
| |
| <1> Only the most recent name from the "a" step (`List<Vertex>` becomes `Vertex`). |
| <2> Same result as statement 1 (`List<String>` becomes `String`). |
| <3> `List<String>` for each path containing the last two names from the 'a' step. |
| <4> `Map<String, Object>` for each vertex, but containing only the last property value. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#tail--++[`tail()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#tail-long-++[`tail(long)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#tail-org.apache.tinkerpop.gremlin.process.traversal.Scope-++[`tail(Scope)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#tail-org.apache.tinkerpop.gremlin.process.traversal.Scope-long-++[`tail(Scope,long)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/Scope.html++[`Scope`] |
| |
| [[timelimit-step]] |
| === TimeLimit Step |
| |
| In many situations, a graph traversal is not so much about getting an exact answer as it is about getting a relative ranking. |
| A classic example is link:http://en.wikipedia.org/wiki/Recommender_system[recommendation]. What is desired is a |
| relative ranking of vertices, not their absolute rank. Next, it may be desirable to have the traversal execute for |
| no more than 2 milliseconds. In such situations, `timeLimit()`-step (*filter*) can be used. |
| |
| image::timelimit-step.png[width=400] |
| |
| NOTE: The method `clock(int runs, Closure code)` is a utility preloaded in the <<gremlin-console,Gremlin Console>> |
| that can be used to time execution of a body of code. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().repeat(both().groupCount('m')).times(16).cap('m').order(local).by(values,desc).next() |
| clock(1) {g.V().repeat(both().groupCount('m')).times(16).cap('m').order(local).by(values,desc).next()} |
| g.V().repeat(timeLimit(2).both().groupCount('m')).times(16).cap('m').order(local).by(values,desc).next() |
| clock(1) {g.V().repeat(timeLimit(2).both().groupCount('m')).times(16).cap('m').order(local).by(values,desc).next()} |
| ---- |
| |
| In essence, the relative order is respected, even though the number of traversers at each vertex is not. The primary |
| benefit is that the calculation is guaranteed to complete at the specified time limit (in milliseconds). Finally, |
| note that the internal clock of `timeLimit()`-step starts when the first traverser enters it. When the time limit is |
| reached, any `next()` evaluation of the step will yield a `NoSuchElementException` and any `hasNext()` evaluation will |
| yield `false`. |
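
The timing behavior described above can be sketched as a plain-Java iterator wrapper. This is only an illustration under stated assumptions and not the TinkerPop implementation. The class name `TimeLimitSketch` and the injected clock are hypothetical, and the sketch starts its clock on the first `hasNext()` or `next()` call as an approximation of "when the first traverser enters".

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.LongSupplier;

// Plain-Java sketch (not TinkerPop code) of an iterator that stops emitting after a time limit.
final class TimeLimitSketch<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final long limitMillis;
    private final LongSupplier clock;   // injected so the behavior can be checked deterministically
    private long startMillis = -1L;     // -1 means the clock has not started yet

    TimeLimitSketch(Iterator<T> source, long limitMillis, LongSupplier clock) {
        this.source = source;
        this.limitMillis = limitMillis;
        this.clock = clock;
    }

    // Starts the clock on first use and reports whether the limit has been reached.
    private boolean timedOut() {
        long now = clock.getAsLong();
        if (startMillis < 0) {
            startMillis = now;
        }
        return now - startMillis >= limitMillis;
    }

    @Override
    public boolean hasNext() {
        return !timedOut() && source.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return source.next();
    }
}
```

In ordinary use the clock argument would be `System::currentTimeMillis`; a fake clock is only useful for testing the cut-off.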
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#timeLimit-long-++[`timeLimit(long)`] |
| |
| [[to-step]] |
| === To Step |
| |
| The `to()`-step is not an actual step, but instead is a "step-modulator" similar to <<as-step,`as()`>> and |
| <<by-step,`by()`>>. If a step is able to accept traversals or strings then `to()` is the |
| means by which they are added. The general pattern is `step().to()`. See <<from-step,`from()`>>-step. |
| |
| The steps that support `to()`-modulation are: <<simplepath-step,`simplePath()`>>, <<cyclicpath-step,`cyclicPath()`>>, |
| <<path-step,`path()`>>, and <<addedge-step,`addE()`>>. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#to-org.apache.tinkerpop.gremlin.structure.Direction-java.lang.String...-++[`to(Direction,String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#to-java.lang.String-++[`to(String)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#to-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`to(Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#to-org.apache.tinkerpop.gremlin.structure.Vertex-++[`to(Vertex)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#toE-org.apache.tinkerpop.gremlin.structure.Direction-java.lang.String...-++[`toE(Direction,String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#toV-org.apache.tinkerpop.gremlin.structure.Direction-++[`toV(Direction)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/structure/Direction.html++[`Direction`] |
| |
| [[tree-step]] |
| === Tree Step |
| |
| From any one element (i.e. vertex or edge), the emanating paths from that element can be aggregated to form a |
| link:http://en.wikipedia.org/wiki/Tree_(data_structure)[tree]. Gremlin provides `tree()`-step (*sideEffect*) for |
| this situation. |
| |
| image::tree-step.png[width=450] |
| |
| [gremlin-groovy,modern] |
| ---- |
| tree = g.V().out().out().tree().next() |
| ---- |
| |
| It is important to see how the paths of all the emanating traversers are united to form the tree. |
| |
| image::tree-step2.png[width=500] |
| |
| The resultant tree data structure can then be manipulated (see `Tree` JavaDoc). |
| |
| [gremlin-groovy,modern] |
| ---- |
| tree = g.V().out().out().tree().by('name').next() |
| tree['marko'] |
| tree['marko']['josh'] |
| tree.getObjectsAtDepth(3) |
| ---- |
| |
| Note that when using `by()`-modulation, tree nodes are combined based on projection uniqueness, not on the |
| uniqueness of the original objects being projected. For instance: |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().has('name','josh').out('created').values('name').tree() <1> |
| g.V().has('name','josh').out('created').values('name'). |
| tree().by('name').by(label).by() <2> |
| ---- |
| |
| <1> When the `tree()` is created, vertex 3 and 5 are unique and thus, form unique branches in the tree structure. |
| <2> When the `tree()` is `by()`-modulated by `label`, then vertex 3 and 5 are both "software" and thus are merged to a single node in the tree. |
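
How projected paths merge into tree nodes can be sketched in plain Java. This is a minimal illustration, not the TinkerPop `Tree` class. `TreeSketch` and `Node` are hypothetical names, and paths are assumed to be given as lists of already-projected strings.

```java
import java.util.LinkedHashMap;
import java.util.List;

// Plain-Java sketch (not TinkerPop code): paths are merged into a nested map,
// so equal keys at the same position in the tree share a single node.
final class TreeSketch {

    // A node maps each child key to its own subtree.
    static final class Node extends LinkedHashMap<String, Node> {
        private static final long serialVersionUID = 1L;
    }

    static Node toTree(List<List<String>> paths) {
        Node root = new Node();
        for (List<String> path : paths) {
            Node node = root;
            for (String key : path) {
                // Reuse the existing branch when the projected key was seen before.
                node = node.computeIfAbsent(key, k -> new Node());
            }
        }
        return root;
    }

    public static void main(String[] args) {
        // Projected by name: two distinct leaves under "josh".
        System.out.println(toTree(List.of(List.of("josh", "lop"), List.of("josh", "ripple"))));
        // Projected by label: both leaves collapse into one "software" node.
        System.out.println(toTree(List.of(List.of("person", "software"), List.of("person", "software"))));
    }
}
```

Because the map is keyed on the projected value, two different vertices with the same projection cannot be told apart once they are in the tree.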
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#tree--++[`tree()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#tree-java.lang.String-++[`tree(String)`] |
| |
| [[unfold-step]] |
| === Unfold Step |
| |
| If the object reaching `unfold()` (*flatMap*) is an iterator, iterable, or map, then it is unrolled into a linear |
| form. If not, then the object is simply emitted. Please see <<fold-step,`fold()`>> step for the inverse behavior. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).out().fold().inject('gremlin',[1.23,2.34]) |
| g.V(1).out().fold().inject('gremlin',[1.23,2.34]).unfold() |
| ---- |
| |
| Note that `unfold()` does not recursively unroll iterators. Instead, `repeat()` can be used for recursive unrolling. |
| |
| [gremlin-groovy,modern] |
| ---- |
| inject(1,[2,3,[4,5,[6]]]) |
| inject(1,[2,3,[4,5,[6]]]).unfold() |
| inject(1,[2,3,[4,5,[6]]]).repeat(unfold()).until(count(local).is(1)).unfold() |
| ---- |
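
The difference between a single unroll and a repeated unroll can be sketched in plain Java. This is an illustration only: `UnfoldSketch` is a hypothetical name, the sketch handles `List` values but not iterators or maps, and it works on a whole list at once instead of traverser by traverser as `repeat(unfold())` does.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch (not TinkerPop code) of one-level versus repeated unrolling.
final class UnfoldSketch {

    // Unrolls exactly one level: lists are expanded, everything else passes through.
    static List<Object> unfoldOnce(List<?> stream) {
        List<Object> out = new ArrayList<>();
        for (Object o : stream) {
            if (o instanceof List) {
                out.addAll((List<?>) o);
            } else {
                out.add(o);
            }
        }
        return out;
    }

    // Repeats the one-level unroll until no nested list remains.
    static List<Object> unfoldFully(List<?> stream) {
        List<Object> current = new ArrayList<>(stream);
        while (current.stream().anyMatch(o -> o instanceof List)) {
            current = unfoldOnce(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<Object> input = List.of(1, List.of(2, 3, List.of(4, 5, List.of(6))));
        System.out.println(unfoldOnce(input));  // the inner [4, 5, [6]] is still nested
        System.out.println(unfoldFully(input)); // all six numbers, flat
    }
}
```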
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#unfold--++[`unfold()`] |
| |
| [[union-step]] |
| === Union Step |
| |
| image::union-step.png[width=650] |
| |
| The `union()`-step (*branch*) supports the merging of the results of an arbitrary number of traversals. When a |
| traverser reaches a `union()`-step, it is copied to each of its internal traversals. The traversers emitted from `union()` |
| are the outputs of the respective internal traversals. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(4).union( |
| __.in().values('age'), |
| out().values('lang')) |
| g.V(4).union( |
| __.in().values('age'), |
| out().values('lang')).path() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#union-org.apache.tinkerpop.gremlin.process.traversal.Traversal...-++[`union(Traversal...)`] |
| |
| [[until-step]] |
| === Until Step |
| |
| The `until()`-step is not an actual step, but is instead a step modulator for <<repeat-step,`repeat()`>> (find more |
| documentation on `until()` there). |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#until-java.util.function.Predicate-++[`until(Predicate)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#until-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`until(Traversal)`] |
| |
| [[value-step]] |
| === Value Step |
| |
| The `value()`-step (*map*) takes a `Property` and extracts the value from it. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V(1).properties().value() |
| g.V(1).properties().properties().value() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#value--++[`value()`] |
| |
| [[valuemap-step]] |
| === ValueMap Step |
| |
| The `valueMap()`-step yields a `Map` representation of the properties of an element. |
| |
| IMPORTANT: This step is the precursor to the <<elementmap-step,elementMap()-step>>. Users should typically |
| choose `elementMap()` unless they utilize multi-properties. `elementMap()` effectively mimics the functionality of |
| `valueMap(true).by(unfold())` as a single step. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().valueMap() |
| g.V().valueMap('age') |
| g.V().valueMap('age','blah') |
| g.E().valueMap() |
| ---- |
| |
| It is important to note that the map of a vertex maintains a list of values for each key. The map of an edge or |
| vertex-property represents a single property (not a list). The reason is that vertices in TinkerPop leverage |
| <<vertex-properties,vertex properties>> which support multiple values per key. Using the <<the-crew-toy-graph, |
| "The Crew">> toy graph, the point is made explicit. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V().valueMap() |
| g.V().has('name','marko').properties('location') |
| g.V().has('name','marko').properties('location').valueMap() |
| ---- |
| |
| To turn the lists of values into single items, the `by()` modulator can be used as shown below. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V().valueMap().by(unfold()) |
| g.V().valueMap('name','location').by().by(unfold()) |
| ---- |
| |
| If the `id`, `label`, `key`, and `value` of the `Element` are desired, then the `with()` modulator can be used to |
| trigger their insertion into the returned map. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V().hasLabel('person').valueMap().with(WithOptions.tokens) |
| g.V().hasLabel('person').valueMap('name').with(WithOptions.tokens, WithOptions.labels) |
| g.V().hasLabel('person').properties('location').valueMap().with(WithOptions.tokens, WithOptions.values) |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#valueMap-java.lang.String...-++[`valueMap(String...)`] |
| |
| [[values-step]] |
| === Values Step |
| |
| The `values()`-step (*map*) extracts the values of properties from an `Element` in the traversal stream. |
| |
| [gremlin-groovy,theCrew] |
| ---- |
| g.V(1).values() |
| g.V(1).values('location') |
| g.V(1).properties('location').values() |
| ---- |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#values-java.lang.String...-++[`values(String...)`] |
| |
| [[vertex-steps]] |
| === Vertex Steps |
| |
| image::vertex-steps.png[width=350] |
| |
| The vertex steps (*flatMap*) are fundamental to the Gremlin language. Via these steps, it's possible to "move" on the |
| graph -- i.e. traverse. |
| |
| * `out(string...)`: Move to the outgoing adjacent vertices given the edge labels. |
| * `in(string...)`: Move to the incoming adjacent vertices given the edge labels. |
| * `both(string...)`: Move to both the incoming and outgoing adjacent vertices given the edge labels. |
| * `outE(string...)`: Move to the outgoing incident edges given the edge labels. |
| * `inE(string...)`: Move to the incoming incident edges given the edge labels. |
| * `bothE(string...)`: Move to both the incoming and outgoing incident edges given the edge labels. |
| * `outV()`: Move to the outgoing vertex. |
| * `inV()`: Move to the incoming vertex. |
| * `bothV()`: Move to both vertices. |
| * `otherV()`: Move to the vertex that was not the vertex that was moved from. |
| |
| [NOTE, caption=Groovy] |
| ==== |
| The term `in` is a reserved word in Groovy and therefore, when used as part of an anonymous traversal, must be referred |
| to in Gremlin with the double underscore `__.in()`. |
| ==== |
| |
| [NOTE, caption=Javascript] |
| ==== |
| The term `in` is a reserved word in Javascript, and therefore must be referred to in Gremlin with `in_()`. |
| ==== |
| |
| [NOTE, caption=Python] |
| ==== |
| The term `in` is a reserved word in Python, and therefore must be referred to in Gremlin with `in_()`. |
| ==== |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(4) |
| g.V(4).outE() <1> |
| g.V(4).inE('knows') <2> |
| g.V(4).inE('created') <3> |
| g.V(4).bothE('knows','created','blah') |
| g.V(4).bothE('knows','created','blah').otherV() |
| g.V(4).both('knows','created','blah') |
| g.V(4).outE().inV() <4> |
| g.V(4).out() <5> |
| g.V(4).inE().outV() |
| g.V(4).inE().bothV() |
| ---- |
| |
| <1> All outgoing edges. |
| <2> All incoming knows-edges. |
| <3> All incoming created-edges. |
| <4> Moving forward touching edges and vertices. |
| <5> Moving forward only touching vertices. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#both-java.lang.String...-++[`both(String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#bothE-java.lang.String...-++[`bothE(String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#bothV--++[`bothV()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#in-java.lang.String...-++[`in(String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#inE-java.lang.String...-++[`inE(String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#inV--++[`inV()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#otherV--++[`otherV()`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#out-java.lang.String...-++[`out(String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#outE-java.lang.String...-++[`outE(String...)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#outV--++[`outV()`] |
| |
| [[where-step]] |
| === Where Step |
| |
| The `where()`-step filters the current object based on either the object itself (`Scope.local`) or the path history |
| of the object (`Scope.global`) (*filter*). This step is typically used in conjunction with either |
| <<match-step,`match()`>>-step or <<select-step,`select()`>>-step, but can be used in isolation. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V(1).as('a').out('created').in('created').where(neq('a')) <1> |
| g.withSideEffect('a',['josh','peter']).V(1).out('created').in('created').values('name').where(within('a')) <2> |
| g.V(1).out('created').in('created').where(out('created').count().is(gt(1))).values('name') <3> |
| ---- |
| |
| <1> Who are marko's collaborators, where marko can not be his own collaborator? (predicate) |
| <2> Of the co-creators of marko, only keep those whose name is josh or peter. (using a sideEffect) |
| <3> Which of marko's collaborators have worked on more than 1 project? (using a traversal) |
| |
| IMPORTANT: Please see <<using-where-with-match,`match().where()`>> and <<using-where-with-select,`select().where()`>> |
| for how `where()` can be used in conjunction with `Map<String,Object>` projecting steps -- i.e. `Scope.local`. |
| |
| A few more examples of filtering an arbitrary object based on an anonymous traversal are provided below. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().where(out('created')).values('name') <1> |
| g.V().out('knows').where(out('created')).values('name') <2> |
| g.V().where(out('created').count().is(gte(2))).values('name') <3> |
| g.V().where(out('knows').where(out('created'))).values('name') <4> |
| g.V().where(__.not(out('created'))).where(__.in('knows')).values('name') <5> |
| g.V().where(__.not(out('created')).and().in('knows')).values('name') <6> |
| g.V().as('a').out('knows').as('b'). |
| where('a',gt('b')). |
| by('age'). |
| select('a','b'). |
| by('name') <7> |
| g.V().as('a').out('knows').as('b'). |
| where('a',gt('b').or(eq('b'))). |
| by('age'). |
| by('age'). |
| by(__.in('knows').values('age')). |
| select('a','b'). |
| by('name') <8> |
| ---- |
| |
| <1> What are the names of the people who have created a project? |
| <2> What are the names of the people that are known by someone and have created a project? |
| <3> What are the names of the people who have created two or more projects? |
| <4> What are the names of the people who know someone that has created a project? (This only works in OLTP -- see the `WARNING` below) |
| <5> What are the names of the people who have not created anything, but are known by someone? |
| <6> The concatenation of `where()`-steps is the same as a single `where()`-step with an and'd clause. |
| <7> Marko knows josh and vadas but is only older than vadas. |
| <8> Marko is younger than josh, but josh knows someone equal in age to marko (which is marko). |
| |
| WARNING: The anonymous traversal of `where()` processes the current object "locally". In OLAP, where the atomic unit |
| of computing is the vertex and its local "star graph," it is important that the anonymous traversal does not leave |
| the confines of the vertex's star graph. In other words, it cannot traverse to an adjacent vertex's properties or |
| edges. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#where-org.apache.tinkerpop.gremlin.process.traversal.P-++[`where(P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#where-java.lang.String-org.apache.tinkerpop.gremlin.process.traversal.P-++[`where(String,P)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#where-org.apache.tinkerpop.gremlin.process.traversal.Traversal-++[`where(Traversal)`], |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/process/traversal/P.html++[`P`] |
| |
| [[with-step]] |
| === With Step |
| |
| The `with()`-step is not an actual step, but is instead a "step modulator" which modifies the behavior of the step |
| prior to it. The `with()`-step provides additional "configuration" information to steps that implement the `Configuring` |
| interface. Steps that allow for this type of modulation will explicitly state so in their documentation. |
| |
| [NOTE, caption=Javascript] |
| ==== |
| The term `with` is a reserved word in Javascript, and therefore must be referred to in Gremlin with `with_()`. |
| ==== |
| |
| [NOTE, caption=Python] |
| ==== |
| The term `with` is a reserved word in Python, and therefore must be referred to in Gremlin with `with_()`. |
| ==== |
| |
| [[write-step]] |
| === Write Step |
| |
| The `write()`-step is not really a "step" but a step modulator in that it modifies the functionality of the `io()`-step. |
| More specifically, it tells the `io()`-step that it is expected to use its configuration to write data to some |
| location. Please see the <<io-step,documentation>> for `io()`-step for more complete details on usage. |
| |
| *Additional References* |
| |
| link:++https://tinkerpop.apache.org/javadocs/x.y.z/full/org/apache/tinkerpop/gremlin/process/traversal/dsl/graph/GraphTraversal.html#write--++[`write()`] |
| |
| [[a-note-on-predicates]] |
| == A Note on Predicates |
| |
| A `P` is a predicate of the form `Function<Object,Boolean>`. That is, given some object, return true or false. As of |
| the release of TinkerPop 3.4.0, Gremlin also supports simple text predicates, which only work on `String` values. The `TextP` |
| text predicates extend the `P` predicates, but are specialized in that they are of the form `Function<String,Boolean>`. |
| The provided predicates are outlined in the table below and are used in various steps such as <<has-step,`has()`>>-step, |
| <<where-step,`where()`>>-step, <<is-step,`is()`>>-step, etc. |
| |
| [width="100%",cols="3,15",options="header"] |
| |========================================================= |
| | Predicate | Description |
| | `P.eq(object)` | Is the incoming object equal to the provided object? |
| | `P.neq(object)` | Is the incoming object not equal to the provided object? |
| | `P.lt(number)` | Is the incoming number less than the provided number? |
| | `P.lte(number)` | Is the incoming number less than or equal to the provided number? |
| | `P.gt(number)` | Is the incoming number greater than the provided number? |
| | `P.gte(number)` | Is the incoming number greater than or equal to the provided number? |
| | `P.inside(number,number)` | Is the incoming number greater than the first provided number and less than the second? |
| | `P.outside(number,number)` | Is the incoming number less than the first provided number or greater than the second? |
| | `P.between(number,number)` | Is the incoming number greater than or equal to the first provided number and less than the second? |
| | `P.within(objects...)` | Is the incoming object in the array of provided objects? |
| | `P.without(objects...)` | Is the incoming object not in the array of the provided objects? |
| | `TextP.startingWith(string)` | Does the incoming `String` start with the provided `String`? |
| | `TextP.endingWith(string)` | Does the incoming `String` end with the provided `String`? |
| | `TextP.containing(string)` | Does the incoming `String` contain the provided `String`? |
| | `TextP.notStartingWith(string)` | Does the incoming `String` not start with the provided `String`? |
| | `TextP.notEndingWith(string)` | Does the incoming `String` not end with the provided `String`? |
| | `TextP.notContaining(string)` | Does the incoming `String` not contain the provided `String`? |
| |========================================================= |
| |
| [gremlin-groovy] |
| ---- |
| eq(2) |
| not(neq(2)) <1> |
| not(within('a','b','c')) |
| not(within('a','b','c')).test('d') <2> |
| not(within('a','b','c')).test('a') |
| within(1,2,3).and(not(eq(2))).test(3) <3> |
| inside(1,4).or(eq(5)).test(3) <4> |
| inside(1,4).or(eq(5)).test(5) |
| between(1,2) <5> |
| not(between(1,2)) |
| ---- |
| |
| <1> The `not()` of a `P`-predicate is another `P`-predicate. |
| <2> `P`-predicates are arguments to various steps which internally `test()` the incoming value. |
| <3> `P`-predicates can be and'd together. |
| <4> `P`-predicates can be or'd together. |
| <5> `between()` is expressed as an `and()` of two `P`-predicates. `and()` is itself a `P`-predicate and thus, a `P`-predicate can be composed of multiple `P`-predicates. |
| |
| TIP: To reduce the verbosity of predicate expressions, it is good to |
| `import static org.apache.tinkerpop.gremlin.process.traversal.P.*`. |
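
The boundary behavior of the range predicates in the table above is easy to misread, so a plain-Java sketch may help. It uses `java.util.function.Predicate` and not TinkerPop's `P`. The class `RangePredicateSketch` is a hypothetical name, and the methods only restate the inclusive and exclusive bounds given in the table.

```java
import java.util.function.Predicate;

// Plain-Java sketch (not TinkerPop code) of the range semantics listed in the table.
final class RangePredicateSketch {

    // first < x < second: both bounds are exclusive.
    static Predicate<Integer> inside(int first, int second) {
        return x -> x > first && x < second;
    }

    // x < first or x > second: the bounds themselves are not outside.
    static Predicate<Integer> outside(int first, int second) {
        return x -> x < first || x > second;
    }

    // first <= x < second: lower bound inclusive, upper bound exclusive.
    static Predicate<Integer> between(int first, int second) {
        return x -> x >= first && x < second;
    }

    public static void main(String[] args) {
        System.out.println(inside(1, 4).test(1));   // false, the lower bound is excluded
        System.out.println(between(1, 4).test(1));  // true, the lower bound is included
        System.out.println(between(1, 4).test(4));  // false, the upper bound is excluded
        System.out.println(outside(1, 4).test(4));  // false, 4 is not greater than 4
    }
}
```

As with `P`, these predicates compose through `and()`, `or()` and `negate()` on the standard `Predicate` interface.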
| |
| Finally, note that <<where-step,`where()`>>-step takes a `P<String>`. The provided string value refers to a variable |
| binding, not to the explicit string value. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().as('a').both().both().as('b').count() |
| g.V().as('a').both().both().as('b').where('a',neq('b')).count() |
| ---- |
| |
| NOTE: It is possible for graph system providers and users to extend `P` and provide new predicates. For instance, a |
| `regex(pattern)` could be a graph system specific `P`. |
| |
| [[a-note-on-barrier-steps]] |
| == A Note on Barrier Steps |
| |
| image:barrier.png[width=165,float=right] Gremlin is primarily a |
| link:http://en.wikipedia.org/wiki/Lazy_evaluation[lazy], stream processing language. This means that Gremlin fully |
| processes (to the best of its abilities) any traversers currently in the traversal pipeline before getting more data |
| from the start/head of the traversal. However, there are numerous situations in which a completely lazy computation |
| is not possible (or impractical). When a computation is not lazy, a "barrier step" exists. There are three types of |
| barriers: |
| |
| . `CollectingBarrierStep`: All of the traversers prior to the step are put into a collection and then processed in |
| some way (e.g. ordered) prior to the collection being "drained" one-by-one to the next step. Examples |
| include: <<order-step,`order()`>>, <<sample-step,`sample()`>>, <<aggregate-step,`aggregate()`>>, <<barrier-step,`barrier()`>>. |
| . `ReducingBarrierStep`: All of the traversers prior to the step are processed by a reduce function and once all the |
| previous traversers are processed, a single "reduced value" traverser is emitted to the next step. Note that the path |
| history leading up to a reducing barrier step is destroyed given its many-to-one nature. Examples include: |
| <<fold-step,`fold()`>>, <<count-step,`count()`>>, <<sum-step,`sum()`>>, <<max-step,`max()`>>, <<min-step,`min()`>>. |
| . `SupplyingBarrierStep`: All of the traversers prior to the step are iterated (no processing) and then some provided |
| supplier yields a single traverser to continue to the next step. Examples include: <<cap-step,`cap()`>>. |
| |
| In Gremlin OLAP (see <<traversalvertexprogram,`TraversalVertexProgram`>>), a barrier is introduced at the end of |
| every <<vertex-steps,adjacent vertex step>>. This means that the traversal does its best to compute as much as |
| possible at the current, local vertex. What it can't compute without referencing an adjacent vertex is aggregated |
| into a barrier collection. When there are no more traversers at the local vertex, the barriered traversers are the |
| messages that are propagated to remote vertices for further processing. |
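| |
| The contrast between lazy processing and a collecting barrier can be sketched outside of Gremlin. The example below is |
| only an analogy and does not use TinkerPop: it assumes Java's `java.util.stream` API, where `sorted()` stands in for a |
| `CollectingBarrierStep`. The names `BarrierSketch`, `lazyTrace` and `barrierTrace` are hypothetical. |
| |
```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Analogy only: plain Java streams, not TinkerPop steps.
class BarrierSketch {

    // Lazy pipeline: each element passes through every stage
    // before the next element is pulled from the source.
    static List<String> lazyTrace() {
        List<String> trace = new ArrayList<>();
        Stream.of(3, 1, 2)
              .peek(x -> trace.add("in:" + x))
              .map(x -> x * 10)
              .forEachOrdered(x -> trace.add("out:" + x));
        return trace;
    }

    // Barrier pipeline: sorted() has to collect every upstream element
    // before it can emit the first one downstream.
    static List<String> barrierTrace() {
        List<String> trace = new ArrayList<>();
        Stream.of(3, 1, 2)
              .peek(x -> trace.add("in:" + x))
              .sorted()
              .forEachOrdered(x -> trace.add("out:" + x));
        return trace;
    }
}
```
| |
| In `lazyTrace()` the "in" and "out" entries alternate, while in `barrierTrace()` all "in" entries are recorded before |
| the first "out" entry, which is the draining behavior described for collecting barriers. |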
| |
| [[a-note-on-scopes]] |
| == A Note on Scopes |
| |
| The `Scope` enum has two constants: `Scope.local` and `Scope.global`. Scope determines whether the step being scoped |
| operates with respect to the current object at that step (`local`) or to the entire stream of objects up to that |
| step (`global`). |
| |
| [NOTE, caption=Python] |
| ==== |
| The term `global` is a reserved word in Python, and therefore a `Scope` using that term must be referred to as `global_`. |
| ==== |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().has('name','marko').out('knows').count() <1> |
| g.V().has('name','marko').out('knows').fold().count() <2> |
| g.V().has('name','marko').out('knows').fold().count(local) <3> |
| g.V().has('name','marko').out('knows').fold().count(global) <4> |
| ---- |
| |
| <1> Marko knows 2 people. |
| <2> A list of Marko's friends is created and thus, one object is counted (the single list). |
| <3> A list of Marko's friends is created and a `local`-count yields the number of objects in that list. |
| <4> `count(global)` is the same as `count()` as the default behavior for most scoped steps is `global`. |
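| |
| The distinction can also be sketched with plain Java collections. This is an illustration under stated assumptions, |
| not TinkerPop code: the stream of traversers is modeled as a `List`, the class `ScopeSketch` and its methods are |
| hypothetical, and treating a non-collection object as a count of one is an assumption of the sketch. |
| |
```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Illustration only: a List stands in for the stream of objects.
class ScopeSketch {

    // Analogue of count() / count(global): one result for the whole stream.
    static long countGlobal(List<?> stream) {
        return stream.size();
    }

    // Analogue of count(local): one result per object in the stream,
    // computed by counting inside that object when it is a collection.
    // Assumption: anything that is not a collection counts as 1.
    static List<Long> countLocal(List<?> stream) {
        List<Long> counts = new ArrayList<>();
        for (Object o : stream) {
            counts.add(o instanceof Collection ? (long) ((Collection<?>) o).size() : 1L);
        }
        return counts;
    }
}
```
| |
| With a stream holding the single folded list of two friends, `countGlobal` sees one object while `countLocal` |
| reports the two objects inside it, mirroring callouts 2 and 3. |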
| |
| The steps that support scoping are: |
| |
| * <<count-step,`count()`>>: count the local collection or global stream. |
| * <<dedup-step, `dedup()`>>: dedup the local collection or global stream. |
| * <<max-step, `max()`>>: get the max value in the local collection or global stream. |
| * <<mean-step, `mean()`>>: get the mean value in the local collection or global stream. |
| * <<min-step, `min()`>>: get the min value in the local collection or global stream. |
| * <<order-step,`order()`>>: order the objects in the local collection or global stream. |
| * <<range-step, `range()`>>: clip the local collection or global stream. |
| * <<limit-step, `limit()`>>: clip the local collection or global stream. |
| * <<sample-step, `sample()`>>: sample objects from the local collection or global stream. |
| * <<tail-step, `tail()`>>: get the tail of the objects in the local collection or global stream. |
| |
| A few more examples of the use of `Scope` are provided below: |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().both().group().by(label).select('software').dedup(local) |
| g.V().groupCount().by(label).select(values).min(local) |
| g.V().groupCount().by(label).order(local).by(values,desc) |
| g.V().fold().sample(local,2) |
| ---- |
| |
| Finally, note that <<local-step,`local()`>>-step is a "hard-scoped step" that transforms any internal traversal into a |
| locally-scoped operation. A contrived example is provided below: |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().fold().local(unfold().count()) |
| g.V().fold().count(local) |
| ---- |
| |
| [[a-note-on-lambdas]] |
| == A Note on Lambdas |
| |
| image:lambda.png[width=150,float=right] A link:http://en.wikipedia.org/wiki/Anonymous_function[lambda] is a function |
| that can be referenced by software and thus, passed around like any other piece of data. In Gremlin, lambdas make it |
| possible to generalize the behavior of a step such that custom steps can be created (on-the-fly) by the user. However, |
| it is advised to avoid using lambdas if possible. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().filter{it.get().value('name') == 'marko'}. |
| flatMap{it.get().vertices(OUT,'created')}. |
| map {it.get().value('name')} <1> |
| g.V().has('name','marko').out('created').values('name') <2> |
| ---- |
| |
| <1> A lambda-rich Gremlin traversal which should and can be avoided. (*bad*) |
| <2> The same traversal (result), but without using lambdas. (*good*) |
| |
| Gremlin attempts to provide the user a comprehensive collection of steps in the hopes that the user will never need to |
| leverage a lambda in practice. It is advised that users leverage a lambda only if there is no |
| corresponding lambda-less step that encompasses the desired functionality. The reason is that lambdas cannot be |
| optimized by Gremlin's compiler strategies as they cannot be programmatically inspected (see |
| <<traversalstrategy,traversal strategies>>). It is also not currently possible to send a natively written lambda for |
| remote execution to Gremlin-Server or a driver that supports remote execution. |
| |
| In many situations where a lambda could be used, either a corresponding step exists or a traversal can be provided in |
| its place. A `TraversalLambda` behaves like a typical lambda, but it can be optimized and it yields fewer objects than |
| the corresponding pure-lambda form. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().out().out().path().by {it.value('name')}. |
| by {it.value('name')}. |
| by {g.V(it).in('created').values('name').fold().next()} <1> |
| g.V().out().out().path().by('name'). |
| by('name'). |
| by(__.in('created').values('name').fold()) <2> |
| ---- |
| |
| <1> The length-3 paths have each of their objects transformed by a lambda. (*bad*) |
| <2> The length-3 paths have their objects transformed by a lambda-less step and a traversal lambda. (*good*) |
| |
| [[traversalstrategy]] |
| == TraversalStrategy |
| |
| image:traversal-strategy.png[width=125,float=right] A `TraversalStrategy` analyzes a `Traversal` and, if the traversal |
| meets its criteria, can mutate it accordingly. Traversal strategies are executed at compile-time and form the foundation |
| of the Gremlin traversal machine's compiler. There are 5 categories of strategies which are itemized below: |
| |
| * There is an application-level feature that can be embedded into the traversal logic (*decoration*). |
| * There is a more efficient way to express the traversal at the TinkerPop level (*optimization*). |
| * There is a more efficient way to express the traversal at the graph system/language/driver level (*provider optimization*). |
| * There are some final adjustments/cleanups/analyses required before executing the traversal (*finalization*). |
| * There are certain traversals that are not legal for the application or traversal engine (*verification*). |
| |
| NOTE: The <<explain-step,`explain()`>>-step shows the user how each registered strategy mutates the traversal. |
| |
| A simple `OptimizationStrategy` is the `IdentityRemovalStrategy`. |
| |
| [source,java] |
| ---- |
| public final class IdentityRemovalStrategy extends AbstractTraversalStrategy<TraversalStrategy.OptimizationStrategy> implements TraversalStrategy.OptimizationStrategy { |
| |
| private static final IdentityRemovalStrategy INSTANCE = new IdentityRemovalStrategy(); |
| |
| private IdentityRemovalStrategy() { |
| } |
| |
| @Override |
| public void apply(Traversal.Admin<?, ?> traversal) { |
| if (traversal.getSteps().size() <= 1) |
| return; |
| |
| for (IdentityStep<?> identityStep : TraversalHelper.getStepsOfClass(IdentityStep.class, traversal)) { |
| if (identityStep.getLabels().isEmpty() || !(identityStep.getPreviousStep() instanceof EmptyStep)) { |
| TraversalHelper.copyLabels(identityStep, identityStep.getPreviousStep(), false); |
| traversal.removeStep(identityStep); |
| } |
| } |
| } |
| |
| public static IdentityRemovalStrategy instance() { |
| return INSTANCE; |
| } |
| } |
| ---- |
| |
| This strategy simply removes any `IdentityStep` steps in the `Traversal`, as `aStep().identity().identity().bStep()` |
| is equivalent to `aStep().bStep()`. A traversal strategy that requires other strategies to execute before or |
| after it can define the following two methods in `TraversalStrategy` (the defaults being an |
| empty set). If the `TraversalStrategy` is in a particular traversal category (i.e. decoration, optimization, |
| provider-optimization, finalization, or verification), then priors and posts are only possible within the respective category. |
| |
| [source,java] |
| public Set<Class<? extends S>> applyPrior(); |
| public Set<Class<? extends S>> applyPost(); |
| |
| IMPORTANT: Strategies are sorted within their category and the categories are then executed in |
| the following order: decoration, optimization, provider optimization, finalization, and verification. If a designed strategy |
| does not fit cleanly into these categories, then it can implement `TraversalStrategy` and its priors and posts can reference |
| strategies within any category. However, such generalizations are strongly discouraged. |
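| |
| How prior and post constraints can determine an execution order within one category is sketched below. This is a |
| minimal illustration that assumes a standard topological sort (Kahn's algorithm); it is not TinkerPop's actual sorting |
| code, and `StrategyOrderSketch`, `Node` and the strategy names used with it are hypothetical. |
| |
```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: names stand in for strategy classes.
class StrategyOrderSketch {

    // Hypothetical stand-in for a strategy and its constraints.
    static final class Node {
        final String name;
        final Set<String> priors; // names that must run before this one
        final Set<String> posts;  // names that must run after this one

        Node(String name, Set<String> priors, Set<String> posts) {
            this.name = name;
            this.priors = priors;
            this.posts = posts;
        }
    }

    // Returns the names in an order that honors every prior/post constraint;
    // throws if the constraints contradict each other (a cycle).
    static List<String> order(List<Node> nodes) {
        Map<String, Set<String>> runsBefore = new LinkedHashMap<>();
        Map<String, Integer> blockers = new LinkedHashMap<>();
        for (Node n : nodes) {
            runsBefore.put(n.name, new LinkedHashSet<>());
            blockers.put(n.name, 0);
        }
        for (Node n : nodes) {
            for (String p : n.priors) addEdge(runsBefore, blockers, p, n.name);
            for (String p : n.posts) addEdge(runsBefore, blockers, n.name, p);
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : blockers.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> ordered = new ArrayList<>();
        while (!ready.isEmpty()) {
            String current = ready.poll();
            ordered.add(current);
            for (String next : runsBefore.get(current)) {
                if (blockers.merge(next, -1, Integer::sum) == 0) ready.add(next);
            }
        }
        if (ordered.size() != nodes.size()) {
            throw new IllegalStateException("Cyclic prior/post constraints");
        }
        return ordered;
    }

    // Records "from runs before to"; constraints naming unknown strategies are ignored.
    private static void addEdge(Map<String, Set<String>> runsBefore, Map<String, Integer> blockers,
                                String from, String to) {
        if (runsBefore.containsKey(from) && runsBefore.containsKey(to) && runsBefore.get(from).add(to)) {
            blockers.merge(to, 1, Integer::sum);
        }
    }
}
```
| |
| For example, if `B` lists `A` as a prior and `C` lists `B` as a post, both `A` and `C` are placed ahead of `B`. |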
| |
| An example of a `ProviderOptimizationStrategy` is provided below. |
| |
| [source,groovy] |
| g.V().has('name','marko') |
| |
| The expression above can be executed in `O(|V|)` fashion in <<tinkergraph-gremlin,TinkerGraph>> when no index is |
| defined for "name" and in `O(log(|V|))` fashion when one is. |
| |
| [source,java] |
| ---- |
| public final class TinkerGraphStepStrategy extends AbstractTraversalStrategy<TraversalStrategy.ProviderOptimizationStrategy> implements TraversalStrategy.ProviderOptimizationStrategy { |
| |
| private static final TinkerGraphStepStrategy INSTANCE = new TinkerGraphStepStrategy(); |
| |
| private TinkerGraphStepStrategy() { |
| } |
| |
| @Override |
| public void apply(Traversal.Admin<?, ?> traversal) { |
| if (TraversalHelper.onGraphComputer(traversal)) |
| return; |
| |
| for (GraphStep originalGraphStep : TraversalHelper.getStepsOfClass(GraphStep.class, traversal)) { |
| TinkerGraphStep<?, ?> tinkerGraphStep = new TinkerGraphStep<>(originalGraphStep); |
| TraversalHelper.replaceStep(originalGraphStep, tinkerGraphStep, traversal); |
| Step<?, ?> currentStep = tinkerGraphStep.getNextStep(); |
| while (currentStep instanceof HasStep || currentStep instanceof NoOpBarrierStep) { |
| if (currentStep instanceof HasStep) { |
| for (HasContainer hasContainer : ((HasContainerHolder) currentStep).getHasContainers()) { |
| if (!GraphStep.processHasContainerIds(tinkerGraphStep, hasContainer)) |
| tinkerGraphStep.addHasContainer(hasContainer); |
| } |
| TraversalHelper.copyLabels(currentStep, currentStep.getPreviousStep(), false); |
| traversal.removeStep(currentStep); |
| } |
| currentStep = currentStep.getNextStep(); |
| } |
| } |
| } |
| |
| public static TinkerGraphStepStrategy instance() { |
| return INSTANCE; |
| } |
| } |
| ---- |
| |
| The traversal is redefined by simply taking a chain of `has()`-steps after `g.V()` (`TinkerGraphStep`) and providing |
| their `HasContainers` to `TinkerGraphStep`. Then it is up to `TinkerGraphStep` to determine if an appropriate index exists. |
| Given that the strategy uses non-TinkerPop provided steps, it should go into the `ProviderOptimizationStrategy` category |
| to ensure the added step does not interfere with the assumptions of the `OptimizationStrategy` strategies. |
| |
| [gremlin-groovy,modern] |
| ---- |
| t = g.V().has('name','marko'); null |
| t.toString() |
| t.iterate(); null |
| t.toString() |
| ---- |
| |
| WARNING: The reason that `OptimizationStrategy` and `ProviderOptimizationStrategy` are two different categories is |
| that optimization strategies should only rewrite the traversal using TinkerPop steps. This ensures that the |
| optimizations executed at the end of the optimization strategy round are TinkerPop compliant. From there, provider |
| optimizations can analyze the traversal and rewrite the traversal as desired using graph system specific steps (e.g. |
| replacing `GraphStep.HasStep...HasStep` with `TinkerGraphStep`). If provider optimizations use graph system specific |
| steps and implement `OptimizationStrategy`, then other TinkerPop optimizations may fail to optimize the traversal or |
| misunderstand the graph system specific step behaviors (e.g. `ProviderVertexStep extends VertexStep`) and yield |
| incorrect semantics. |
| |
| Finally, here is a complicated traversal that has various components that are optimized by the default TinkerPop strategies. |
| |
| [gremlin-groovy,modern] |
| ---- |
| g.V().hasLabel('person'). <1> |
| and(has('name'), <2> |
| has('name','marko'), |
| filter(has('age',gt(20)))). <3> |
| match(__.as('a').has('age',lt(32)), <4> |
| __.as('a').repeat(outE().inV()).times(2).as('b')). <5> |
| where('a',neq('b')). <6> |
| where(__.as('b').both().count().is(gt(1))). <7> |
| select('b'). <8> |
| groupCount(). |
| by(out().count()). <9> |
| explain() |
| ---- |
| |
| <1> `TinkerGraphStepStrategy` pulls in `has()`-step predicates for global, graph-centric index lookups. |
| <2> `FilterRankStrategy` sorts filter steps by their time/space execution costs. |
| <3> `InlineFilterStrategy` de-nests filters to increase the likelihood of filter concatenation and aggregation. |
| <4> `InlineFilterStrategy` pulls out named predicates from `match()`-step to more easily allow provider strategies to use indices. |
| <5> `RepeatUnrollStrategy` will unroll loops and `IncidentToAdjacentStrategy` will turn `outE().inV()`-patterns into `out()`. |
| <6> `MatchPredicateStrategy` will pull in `where()`-steps so that they can be subjected to `match()`-step's runtime query optimizer. |
| <7> `CountStrategy` will limit the traversal to only the number of traversers required for the `count().is(x)`-check. |
| <8> `PathRetractionStrategy` will remove paths from the traversers and increase the likelihood of bulking as path data is not required after `select('b')`. |
| <9> `AdjacentToIncidentStrategy` will turn `out()` into `outE()` to increase data access locality. |
| |
| A collection of `DecorationStrategy` implementations is provided with TinkerPop and is generally useful to |
| end-users. The following sub-sections detail these strategies: |
| |
| === ElementIdStrategy |
| |
| `ElementIdStrategy` provides control over element identifiers. Some Graph implementations, such as TinkerGraph, |
| allow specification of custom identifiers when creating elements: |
| |
| [gremlin-groovy] |
| ---- |
| g = traversal().withEmbedded(TinkerGraph.open()) |
| v = g.addV().property(id,'42a').next() |
| g.V('42a') |
| ---- |
| |
| Other `Graph` implementations, such as Neo4j, generate element identifiers automatically and do not allow them to be |
| assigned. As a helper, `ElementIdStrategy` can be used to make identifier assignment possible by using vertex and edge |
| indices under the hood. |
| |
| [gremlin-groovy] |
| ---- |
| graph = Neo4jGraph.open('/tmp/neo4j') |
| strategy = ElementIdStrategy.build().create() |
| g = traversal().withEmbedded(graph).withStrategies(strategy) |
| g.addV().property(id, '42a').id() |
| ---- |
| |
| IMPORTANT: The key that is used to store the assigned identifier should be indexed in the underlying graph |
| database. If it is not indexed, then lookups for the elements that use these identifiers will perform a linear scan. |
| |
| === EventStrategy |
| |
| The purpose of the `EventStrategy` is to raise events to one or more `MutationListener` objects as changes to the |
| underlying `Graph` occur within a `Traversal`. Such a strategy is useful for logging changes, triggering certain |
| actions based on change, or any application that needs notification of some mutating operation during a `Traversal`. |
| If the transaction is rolled back, the event queue is reset. |
| |
| The following events are raised to the `MutationListener`: |
| |
| * New vertex |
| * New edge |
| * Vertex property changed |
| * Edge property changed |
| * Vertex property removed |
| * Edge property removed |
| * Vertex removed |
| * Edge removed |
| |
| To start processing events from a `Traversal` first implement the `MutationListener` interface. An example of this |
| implementation is the `ConsoleMutationListener` which writes output to the console for each event. The following |
| console session displays the basic usage: |
| |
| [gremlin-groovy] |
| ---- |
| import org.apache.tinkerpop.gremlin.process.traversal.step.util.event.* |
| graph = TinkerFactory.createModern() |
| l = new ConsoleMutationListener(graph) |
| strategy = EventStrategy.build().addListener(l).create() |
| g = traversal().withEmbedded(graph).withStrategies(strategy) |
| g.addV().property('name','stephen') |
| g.V().has('name','stephen'). |
| property(list, 'location', 'centreville', 'startTime', 1990, 'endTime', 2000). |
| property(list, 'location', 'dulles', 'startTime', 2000, 'endTime', 2006). |
| property(list, 'location', 'purcellville', 'startTime', 2006) |
| g.V().has('name','stephen'). |
| property(set, 'location', 'purcellville', 'startTime', 2006, 'endTime', 2019) |
| g.E().drop() |
| ---- |
| |
| By default, the `EventStrategy` is configured with an `EventQueue` that raises events as they occur within execution |
| of a `Step`. As such, the final line of Gremlin execution that drops all edges shows a bit of an inconsistent count, |
| where the removed edge count is accounted for after the event is raised. The strategy can also be configured with a |
| `TransactionalEventQueue` that captures the changes within a transaction and does not allow them to fire until the |
| transaction is committed. |
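| |
| The difference between the two queue behaviors can be sketched as follows. This is a simplified model rather than the |
| `EventStrategy` implementation: events are plain strings, and the classes `EventQueueSketch`, `Immediate` and |
| `Transactional` are hypothetical names chosen for the illustration. |
| |
```java
import java.util.ArrayList;
import java.util.List;

// Simplified model: strings stand in for mutation events.
class EventQueueSketch {

    // Models the default behavior: an event fires as soon as it is added.
    static final class Immediate {
        final List<String> fired = new ArrayList<>();

        void add(String event) {
            fired.add(event);
        }
    }

    // Models the transactional behavior: events are held back until
    // commit(), and rollback() resets the queue without firing anything.
    static final class Transactional {
        final List<String> fired = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        void add(String event) {
            pending.add(event);
        }

        void commit() {
            fired.addAll(pending);
            pending.clear();
        }

        void rollback() {
            pending.clear();
        }
    }
}
```
| |
| In this model a listener attached to `Transactional` never sees events from a rolled-back transaction, whereas a |
| listener attached to `Immediate` sees every event at the moment it is raised. |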
| |
| WARNING: `EventStrategy` is not meant for usage in tracking global mutations across separate processes. In other |
| words, a mutation in one JVM process is not raised as an event in a different JVM process. In addition, events are |
| not raised when mutations occur outside of the `Traversal` context. |
| |
| Another default configuration for `EventStrategy` revolves around the concept of "detachment". Graph elements are |
| detached from the graph as copies when passed to referring mutation events. Therefore, when adding a new `Vertex` in |
| TinkerGraph, the event will not contain a `TinkerVertex` but will instead include a `DetachedVertex`. This behavior |
| can be modified with the `detach()` method on the `EventStrategy.Builder` which accepts the following inputs: `null` |
| meaning no detachment and the return of the original element, `DetachedFactory` which is the same as the default |
| behavior, and `ReferenceFactory` which will return "reference" elements only with no properties. |
| |
| IMPORTANT: If setting the `detach()` configuration to `null`, be aware that transactional graphs will likely create a |
| new transaction immediately following the `commit()` that raises the events. The graph elements raised in the events |
| may also not behave as "snapshots" at the time of their creation as they are "live" references to actual database |
| elements. |
| |
| === PartitionStrategy |
| |
| image::partition-graph.png[width=325] |
| |
| `PartitionStrategy` partitions the vertices and edges of a graph into `String` named partitions (i.e. buckets, |
| subgraphs, etc.). The idea behind `PartitionStrategy` is presented in the image above where each element is in a |
| single partition (represented by its color). Partitions can be read from, written to, and linked/joined by edges |
| that span one or two partitions (e.g. a tail vertex in one partition and a head vertex in another). |
| |
| There are three primary configurations in `PartitionStrategy`: |
| |
| . Partition Key - The property key that denotes a String value representing a partition. |
| . Write Partition - A `String` denoting what partition all future written elements will be in. |
| . Read Partitions - A `Set<String>` of partitions that can be read from. |
| |
| The best way to understand `PartitionStrategy` is via example. |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerFactory.createModern() |
| strategyA = PartitionStrategy.build().partitionKey("_partition").writePartition("a").readPartitions("a").create() |
| strategyB = PartitionStrategy.build().partitionKey("_partition").writePartition("b").readPartitions("b").create() |
| gA = traversal().withEmbedded(graph).withStrategies(strategyA) |
| gA.addV() // this vertex has a property of {_partition:"a"} |
| gB = traversal().withEmbedded(graph).withStrategies(strategyB) |
| gB.addV() // this vertex has a property of {_partition:"b"} |
| gA.V() |
| gB.V() |
| ---- |
| |
| Partitions may also extend to `VertexProperty` elements if the `Graph` can support meta-properties and if the |
| `includeMetaProperties` value is set to `true` when the `PartitionStrategy` is built. The `partitionKey` will be |
| stored in the meta-properties of the `VertexProperty` and blind the traversal to those properties. Please note that |
| the `VertexProperty` will only be hidden by way of the `Traversal` itself. For example, calling `Vertex.property(k)` |
| bypasses the context of the `PartitionStrategy` and will thus allow all properties to be accessed. |
| |
| By writing elements to particular partitions and then restricting read partitions, the developer is able to create |
| multiple graphs within a single address space. Moreover, by supporting references between partitions, it is possible |
| to merge those multiple graphs (i.e. join partitions). |
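| |
| The write-then-filter idea behind the three configurations can be sketched with plain Java collections. This is a |
| conceptual model only and not how `PartitionStrategy` is implemented: elements are modeled as property maps in a |
| shared list, and the class `PartitionSketch` with its `add` and `read` methods is hypothetical. |
| |
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual model: each instance is one "view" over a shared element store.
class PartitionSketch {
    private final List<Map<String, Object>> store; // shared by every view
    private final String partitionKey;
    private final String writePartition;
    private final Set<String> readPartitions;

    PartitionSketch(List<Map<String, Object>> store, String partitionKey,
                    String writePartition, Set<String> readPartitions) {
        this.store = store;
        this.partitionKey = partitionKey;
        this.writePartition = writePartition;
        this.readPartitions = readPartitions;
    }

    // Write: stamp the element with the write partition before storing it.
    void add(Map<String, Object> properties) {
        Map<String, Object> element = new HashMap<>(properties);
        element.put(partitionKey, writePartition);
        store.add(element);
    }

    // Read: only elements whose partition value is among the read
    // partitions are visible; untagged elements are hidden.
    List<Map<String, Object>> read() {
        List<Map<String, Object>> visible = new ArrayList<>();
        for (Map<String, Object> element : store) {
            Object partition = element.get(partitionKey);
            if (partition != null && readPartitions.contains(partition)) {
                visible.add(element);
            }
        }
        return visible;
    }
}
```
| |
| Two views over the same store with read partitions "a" and "b" each see only their own writes, while a view whose |
| read partitions contain both values sees the joined result. |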
| |
| [[readonlystrategy]] |
| === ReadOnlyStrategy |
| |
| `ReadOnlyStrategy` is largely self-explanatory. A `Traversal` that has this strategy applied will throw an |
| `IllegalStateException` if the `Traversal` has any mutating steps within it. |
| |
| [source,text] |
| ---- |
| gremlin> readonly = g.withStrategies(ReadOnlyStrategy.instance()) |
| ==>graphtraversalsource[tinkergraph[vertices:5 edges:7], standard] |
| gremlin> readonly.addV('person') |
| The provided traversal has a mutating step and thus is not read only: AddVertexStartStep({label=[person]}) |
| Type ':help' or ':h' for help. |
| Display stack trace? [yN] |
| ---- |
| |
| === SubgraphStrategy |
| |
| `SubgraphStrategy` is similar to `PartitionStrategy` in that it constrains a `Traversal` to certain vertices, edges, |
| and vertex properties as determined by a `Traversal`-based criterion defined individually for each. |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerFactory.createTheCrew() |
| g = traversal().withEmbedded(graph) |
| g.V().as('a').values('location').as('b'). <1> |
| select('a','b').by('name').by() |
| g = g.withStrategies(SubgraphStrategy.build().vertexProperties(hasNot('endTime')).create()) <2> |
| g.V().as('a').values('location').as('b'). <3> |
| select('a','b').by('name').by() |
| g.V().as('a').values('location').as('b'). |
| select('a','b').by('name').by().explain() |
| ---- |
| |
| <1> Get all vertices and their vertex property locations. |
| <2> Create a `SubgraphStrategy` where vertex properties must not have an `endTime`-property (thus, the current location). |
| <3> Get all vertices and their current vertex property locations. |
| |
| IMPORTANT: This strategy is implemented such that the vertices attached to an `Edge` must both satisfy the vertex criterion |
| (if present) in order for the `Edge` to be considered a part of the subgraph. |
| |
| The example below uses all three filters: vertex, edge, and vertex property. Person vertices must have lived in more than three places, |
| edges must be labeled "develops," and vertex properties must be the person's current location or a non-location property. |
| |
| [gremlin-groovy] |
| ---- |
| graph = TinkerFactory.createTheCrew() |
| g = traversal().withEmbedded(graph).withStrategies(SubgraphStrategy.build(). |
| vertices(or(hasNot('location'),properties('location').count().is(gt(3)))). |
| edges(hasLabel('develops')). |
| vertexProperties(or(hasLabel(neq('location')),hasNot('endTime'))).create()) |
| g.V().elementMap() |
| g.E().elementMap() |
| g.V().outE().inV(). |
| path(). |
| by('name'). |
| by(). |
| by('name') |
| ---- |
| |
| [[dsl]] |
| == Domain Specific Languages |
| |
| Gremlin is a link:http://en.wikipedia.org/wiki/Domain-specific_language[domain specific language] (DSL) for traversing |
| graphs. It operates in the language of vertices, edges and properties. Typically, applications built with Gremlin are |
| not of the graph domain, but instead model their domain within a graph. For example, the |
| link:https://tinkerpop.apache.org/docs/current/images/tinkerpop-modern.png["modern" toy graph] models |
| software and person domain objects with the relationships between them (i.e. a person "knows" another person and a |
| person "created" software). |
| |
| An analyst who wanted to find out if "marko" knows "josh" could write the following Gremlin: |
| |
| [source,java] |
| ---- |
| g.V().hasLabel('person').has('name','marko'). |
| out('knows').hasLabel('person').has('name','josh').hasNext() |
| ---- |
| |
| While this method achieves the desired answer, it requires the analyst to traverse the graph in the domain language |
| of the graph rather than the domain language of the social network. A more natural way for the analyst to write this |
| traversal might be: |
| |
| [source,java] |
| ---- |
| g.persons('marko').knows('josh').hasNext() |
| ---- |
| |
| In the statement above, the traversal is written in the language of the domain, abstracting away the underlying |
| graph structure from the query. The two traversal results are equivalent and, indeed, the "Social DSL" produces |
| the same set of traversal steps as the "Graph DSL" thus producing equivalent strategy application and performance |
| runtimes. |
| |
| To further the example of the Social DSL consider the following: |
| |
| [source,java] |
| ---- |
| // Graph DSL - find the number of persons who created at least 2 projects |
| g.V().hasLabel('person'). |
| where(outE("created").count().is(P.gte(2))).count() |
| |
| // Social DSL - find the number of persons who created at least 2 projects |
| social.persons().where(createdAtLeast(2)).count() |
| |
| // Graph DSL - determine the age of the youngest friend "marko" has |
| g.V().hasLabel('person').has('name','marko'). |
| out("knows").hasLabel("person").values("age").min() |
| |
| // Social DSL - determine the age of the youngest friend "marko" has |
| social.persons("marko").youngestFriendsAge() |
| ---- |
| |
| Learn more about how to implement these DSLs in the <<gremlin-drivers-variants,Gremlin Language Variants>> section |
| specific to the programming language of interest. |
| |
| [[translators]] |
| == Translators |
| |
| image::gremlin-translator.png[width=1024] |
| |
| There are times when it is helpful to translate Gremlin from one programming language to another. Perhaps a large Gremlin |
| example is found on StackOverflow written in Java, but the programming language the developer has chosen is Python. |
| Fortunately, TinkerPop has developed `Translator` infrastructure that will convert Gremlin from one programming |
| language syntax to another. |
| |
| The functionality relevant to most users is actually a sub-function of `Translator` infrastructure and is more |
| specifically a `ScriptTranslator` which takes Gremlin `Bytecode` of a traversal and generates a `String` representation |
| of that `Bytecode` in the programming language syntax that the `ScriptTranslator` instance supports. The translation |
| therefore allows Gremlin to be converted from the host programming language of the `Translator` to another. |
| |
| The following translators are available, where the first column identifies the host programming language and the |
| remaining columns represent the languages that Gremlin can be generated in: |
| |
| [width="100%",cols="<,^,^,^,^,^",options="header"] |
| |========================================================= |
| | |Java |Groovy |Javascript |.NET |Python |
| |*Java* | |X | | |X |
| |*Groovy* | |X | | |X |
| |*Javascript* | |X | | | |
| |*.NET* | | | | | |
| |*Python* | | | | | |
| |========================================================= |
| |
| Each programming language has its own API for translation, but the pattern is quite similar from one to the next: |
| |
| WARNING: While `Translator` implementations have been around for some time, they are still in their early stages from |
| an interface perspective. API changes may occur in the near future. |
| |
| [source,java,tab] |
| ---- |
| // gremlin-core module |
| import org.apache.tinkerpop.gremlin.process.traversal.translator.*; |
| GraphTraversalSource g = ...; |
| Traversal<Vertex,Integer> t = g.V().has("person","name","marko"). |
| where(in("knows")). |
| values("age"). |
| map(Lambda.function("it.get() + 1")); |
| |
| Translator.ScriptTranslator groovyTranslator = GroovyTranslator.of("g"); |
| System.out.println(groovyTranslator.translate(t)); |
| // OUTPUT: g.V().has("person","name","marko").where(__.in("knows")).values("age").map({it.get() + 1}) |
| |
| Translator.ScriptTranslator pythonTranslator = PythonTranslator.of("g"); |
| System.out.println(pythonTranslator.translate(t)); |
| // OUTPUT: g.V().has('person','name','marko').where(__.in_('knows')).age.map(lambda: "it.get() + 1") |
| ---- |
| [source,javascript] |
| ---- |
| const g = ...; |
| const t = g.V().has("person","name","marko"). |
| where(in_("knows")). |
| values("age"); |
| |
| // groovy |
| const translator = new gremlin.process.Translator('g'); |
| console.log(translator.translate(t)); |
| // OUTPUT: g.V().has('person','name','marko').where(__.in('knows')).values('age') |
| ---- |
| |