| //// |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| //// |
| [[gremlin-semantics]] |
| = Gremlin Semantics |
| |
| image:tinkerpop-meeting-room.png[] |
| |
| This section provides details on Gremlin language operational semantics. Describing these semantics and reinforcing |
| them with tests in the Gremlin test suite makes it easier for providers to implement the language and for the |
| TinkerPop Community to have better consistency in their user experiences. While the general Gremlin test suite offers |
| an integrated approach to testing Gremlin queries, the `@StepClassSemantics` oriented tests found |
| link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features[here] are especially focused to the |
| definitions found in this section. References to the location of specific tests can be found throughout the |
| sub-sections below. |
| |
| == Types |
| |
| The TinkerPop query execution runtime is aligned with Java primitives and handles the following primitive types: |
| |
| * Boolean |
| * Integer |
| ** Byte (int8) |
| ** Short (int16) |
| ** Integer (int32) |
| ** Long (int64) |
| ** BigInteger |
| * Decimal |
| ** Float (32-bit) (including +/-Infinity and NaN) |
| ** Double (64-bit) (including +/-Infinity and NaN) |
| ** BigDecimal |
| * String |
| * UUID |
| * Date |
| * `nulltype` |
| ** Has only one value in its type space - the "undefined" value `null` |
| |
| NOTE: TinkerPop has a bit of a JVM-centric view of types as it was developed within that ecosystem. |
| |
| Graph providers may not support all of these types depending on the architecture and implementation. Therefore |
| TinkerPop must provide a way for Graph providers to override the behavior while it has its own default behavior. Also |
| when some types are not supported Graph providers needs to map unsupported types into supported types internally. This |
| mapping can be done in either information-preserving manner or non-preserving manner. Graph providers must tell which |
| mapping they support through `Graph.Features` as well as which types they support. |
| |
| * Which primitive types are supported |
| ** Boolean, Integer, Float, String, UUID and Date |
| ** TinkerPop by default supports all of them |
| * Which integer types are supported |
| ** TinkerPop by default supports int8 (Byte), int16 (Short), int32 (Integer), int64 (Long) and BigInteger in Java |
| * Which float types are supported |
| ** TinkerPop by default supports all as float, double, and BigDecimal in Java |
| |
| In addition to these, there are composite types as follows: |
| |
| * Graph Element |
| ** Vertex |
| ** Edge |
| ** VertexProperty |
| * Property (edge properties and meta properties) |
| ** (Key, Value) pair |
| ** Key is String, Value is any of the primitive types defined above |
| * Path |
| * List |
| * Map |
| * Map.Entry |
| * Set |
| |
| === Numeric Type Promotion |
| |
| TinkerPop performs type promotion a.k.a type casting for Numbers. Numbers are Byte, Short, Integer, Long, Float, |
| Double, BigInteger, and BigDecimal. In general, numbers are compared using semantic equivalence, without regard for |
| their specific type, e.g. `1 == 1.0`. |
| |
| Numeric types are promoted as follows: |
| |
| * First determine whether to use floating point or not. If any numbers in the comparison are floating point then we |
| convert all of them to floating point. |
| * Next determine the maximum bit size of the numerics being compared. |
| * If any floating point are present: |
| ** If the maximum bit size is 32 (up to Integer/Float), we compare as Float |
| ** If the maximum bit size is 64 (up to Long/Double), we compare as Double |
| ** Otherwise we compare as BigDecimal |
| * If no floating point are present: |
| ** If the maximum bit size is 8 we compare as Byte |
| ** If the maximum bit size is 16 we compare as Short |
| ** If the maximum bit size is 32 we compare as Integer |
| ** If the maximum bit size is 64 we compare as Long |
| ** Otherwise we compare as BigInteger |
| |
| BigDecimal and BigInteger may not be supported depending on the language and Storage, therefore the behavior of type |
| casting for these two types can vary depending on a Graph provider. |
| |
| [[gremlin-semantics-concepts]] |
| == Comparability, Equality, Orderability, and Equivalence |
| |
| This section of the document attempts to more clearly define the semantics for different types of value comparison |
| within the Gremlin language and reference implementation. There are four concepts related to value comparison: |
| |
| **Equality** |
| |
| Equality semantics is used by the equality operators (`P.eq/neq`) and contains operators derived from them |
| (`P.within/without`). It is also used for implicit `P.eq` comparisons, for example `g.V().has("age", 25)` - equality |
| semantics are used to look up vertices by `age` when considering the value. |
| |
| **Comparability** |
| |
| Comparability semantics is used by the compare operators (`P.lt/lte/gt/gte`) and operators derived from them |
| (`P.inside/outside/between`) and defines the semantics of how to compare two values. |
| |
| **Orderability** |
| |
| Orderability semantics defines how two values are compared in the context of an `order()` operation. These semantics |
| have important differences from Comparability. |
| |
| **Equivalence** |
| |
| Equivalence semantics are slightly different from Equality and are used for operations such as `dedup()` and `group()`. |
| Key differences include handling of numeric types and NaN. |
| |
| Both Equality and Equivalence can be understood as complete, i.e. the result of equality and equivalence checks is |
| always either `TRUE` or `FALSE` (in particular, it never returns `nulltype` or throws an exception). Similarly, |
| Orderability can be also understood as complete - any two values can be compared without error for ordering purposes. |
| Comparability semantics are not complete with respect to binary boolean semantics, and as such, Gremlin introduces a |
| ternary boolean semantics for Comparability that includes a third boolean state - `ERROR`, with its own well-defined |
| semantics. |
| |
| === Ternary Boolean Logics |
| |
| When evaluating boolean value expressions, we sometimes encounter situations that cannot be proved as either `TRUE` or |
| `FALSE`. Common `ERROR` cases are Comparability against `NaN`, cross-type Comparability (e.g. `String` vs `Numeric`), or |
| other invalid arguments to other boolean value expressions. |
| |
| Rather than throwing an exception and halting the traversal, we extend normal binary boolean logics and introduce a |
| third boolean option - `ERROR`. How this `ERROR` result is handled is Graph provider dependent. For the reference |
| implementation, `ERROR` is an internal representation only and will not be propagated back to the client as an |
| exception - it will eventually hit a binary reduction operation and be reduced to `FALSE` (thus quietly filters the |
| solution that produced the `ERROR`). Before that happens though, it will be treated as its own boolean value with its |
| own semantics that can be used in other boolean value expressions, such as Connective predicates (`P.and/or`) and |
| negation (`P.not`). |
| |
| ==== Ternary Boolean Semantics for `AND` |
| |
| |=== |
| |A|B|AND|Intuition |
| |
| | `TRUE` | `TRUE` | `TRUE` | |
| | `TRUE` | `FALSE` | `FALSE` | |
| | `TRUE` | `ERROR` | `ERROR` | `TRUE && X == X` |
| | `FALSE` | `TRUE` | `FALSE` | |
| | `FALSE` | `FALSE` | `FALSE` | |
| | `FALSE` | `ERROR` | `FALSE` | `FALSE && X == FALSE` |
| | `ERROR` | `TRUE` | `ERROR` | `X && TRUE == X` |
| | `ERROR` | `FALSE` | `FALSE` | `X && FALSE == FALSE` |
| | `ERROR` | `ERROR` | `ERROR` | `X && X == X` |
| |=== |
| |
| ==== Ternary Boolean Semantics for `OR` |
| |
| |=== |
| |A|B|OR|Intuition |
| |
| | `TRUE` | `TRUE` | `TRUE` | |
| | `TRUE` | `FALSE` | `TRUE` | |
| | `TRUE` | `ERROR` | `TRUE` | `TRUE \|\| X == TRUE` |
| | `FALSE` | `TRUE` | `TRUE` | |
| | `FALSE` | `FALSE` | `FALSE` | |
| | `FALSE` | `ERROR` | `ERROR` | `FALSE \|\| X == X` |
| | `ERROR` | `TRUE` | `TRUE` | `X \|\| TRUE == TRUE` |
| | `ERROR` | `FALSE` | `ERROR` | `X \|\| FALSE == X` |
| | `ERROR` | `ERROR` | `ERROR` | `X \|\| X == X` |
| |=== |
| |
| ==== Ternary Boolean Semantics for `NOT` |
| |
| The `NOT` predicate inverts `TRUE` and `FALSE`, respectively, but maintains `ERROR` values. The key idea is that, for an |
| `ERROR` value, we can neither prove nor disprove the expression, and hence stick with `ERROR`. |
| |=== |
| |Argument | Result |
| |
| |`TRUE` | `FALSE` |
| |`FALSE` | `TRUE` |
| |`ERROR` | `ERROR` |
| |=== |
| |
| [[gremlin-semantics-equality-comparability]] |
| === Equality and Comparability |
| |
| <<Equality,Equality>> and <<Comparability,Comparability>> can be understood to be semantically aligned with one another. |
| As mentioned above, <<Equality,Equality>> is used for `P.eq/neq` (and derived predicates) and |
| <<Comparability,Comparability>> is used for `P.lt/lte/gt/gte` (and derived predicates). |
| If we define Comparability using a `compare()` function over `A` and `B` as follows: |
| |
| [source,text] |
| ---- |
| If (A, B) are Comparable per Gremlin semantics, then: |
| For A < B, Comparability.compare(A, B) < 0 |
| For A > B, Comparability.compare(A, B) > 0 |
| For A == B, Comparability.compare(A, B) == 0 |
| If (A, B) not Comparable, then: |
| Comparability.compare(A, B) => ERROR |
| ---- |
| |
| Then we can define Equality using an `equals()` function over `A` and `B` that acts as a strict binary |
| reduction of `Comparability.compare(A, B) == 0`: |
| |
| [source,text] |
| ---- |
| For any (A, B): |
| Comparability.compare(A, B) == 0 implies Equality.equals(A, B) == TRUE |
| Comparability.compare(A, B) <> 0 implies Equality.equals(A, B) == FALSE |
| Comparability.compare(A, B) => ERROR implies Equality.equals(A, B) == FALSE |
| ---- |
| |
| The following table illustrates how <<Equality,Equality>> and <<Comparability,Comparability>> operate under various |
| classes of comparison: |
| |
| |=== |
| |Class|Arguments|<<Comparability,Comparability>>|<<Equality,Equality>> |
| |
| |Comparisons Involving `NaN`|`(NaN,X)` + |
| |
| where `X` = any value, including `NaN`|`ERROR` + |
| |
| Comparing `NaN` to anything (including itself) cannot be evaluated.|`FALSE` |
| |
| |Comparisons Involving `null`|`(null,null)`|`compare() == 0`|`TRUE` |
| ||`(null, X)`|`ERROR` + |
| |
| Since `nulltype` is its own type, this falls under the umbrella of cross-type comparisons. |`FALSE` |
| |Comparisons within the same type family (i.e. String vs. String, Number vs. Number, etc.)|`(X, Y)` + |
| |
| where `X` and `Y` of same type|Result of `compare()` depends on type semantics, defined below.|`TRUE` iff `compare() == 0` |
| |Comparisons across types (i.e. String vs. Number)|`(X, Y)` + |
| |
| where `X` and `Y` of different type|`ERROR`|`FALSE` |
| |
| |=== |
| |
| ==== Equality and Comparability Semantics by Type |
| |
| For <<Equality,Equality>> and <<Comparability,Comparability>> evaluation of values within the same type family, we |
| define the semantics per type family as follows. |
| |
| ===== Number |
| |
| Numbers are compared using type promotion, described above. As such, `1 == 1.0`. |
| |
| Edge cases: |
| |
| * `-0.0 == 0.0 == +0.0` |
| * `+INF == +INF`, `-INF == -INF`, `-INF != +INF` |
| ** `Float.±Infinity` and `Double.±Infinity` adhere to the same type promotion rules. |
| * As described above `NaN` is not <<Equality,Equal>> and not <<Comparability,Comparable>> to any Number (including itself). |
| |
| See: link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features/semantics/Equality.feature[Equality Tests - Scenarios prefixed with "Primitives_Number_"] |
| |
| ===== nulltype |
| |
| As described in the table above, `null == null`, but is not <<Equality,Equal>> and not <<Comparability,Comparable>> to |
| any non-`null` value. |
| |
| ===== Boolean |
| |
| For Booleans, `TRUE == TRUE`, `FALSE == FALSE`, `TRUE != FALSE`, and `FALSE < TRUE`. |
| |
| ===== String |
| |
| We assume the common lexicographical order over unicode strings. `A` and `B` are compared lexicographically, and |
| `A == B` if `A` and `B` are lexicographically equal. |
| |
| ===== UUID |
| |
| UUID is evaluated based on its String representation. However, `UUID("b46d37e9-755c-477e-9ab6-44aabea51d50")` and the |
| String `"b46d37e9-755c-477e-9ab6-44aabea51d50"` are not <<Equality,Equal>> and not <<Comparability,Comparable>>. |
| |
| ===== Date |
| |
| Dates are evaluated based on the numerical comparison of Unix Epoch time. |
| |
| ===== Graph Element (Vertex / Edge / VertexProperty) |
| |
| If they are the same type of Element, these are compared by the value of their `T.id` according to the semantics for |
| the particular primitive type used for ids (implementation-specific). Elements of different types are |
| not <<Equality,Equal>> and not <<Comparability,Comparable>>. |
| |
| ===== Property |
| |
| Properties are compared first by key (String semantics), then by value, according to the semantics for the particular |
| primitive type of the value. Properties with values in different type families are |
| not <<Equality,Equal>> and not <<Comparability,Comparable>>. |
| |
| ===== List |
| |
| Lists are compared pairwise, element-by-element, in their natural list order. For each element, if the pairs are |
| <<Equality and Comparability,Equal>>, we simply move on to the next element pair until we encounter a pair whose |
| `Comparability.compare()` value is non-zero (`-1`, `1`, or `ERROR`), and we return that value. Lists can be evaluated |
| for <<Equality and Comparability,Equality and Comparability>> even if they contain multiple types of elements, so long |
| as their elements are pairwise comparable per <<Equality and Comparability,Equality/Comparability>> semantics. During |
| this element by element comparison, if iteration `A` exhausts its elements before iteration `B` then `A < B`, and |
| vice-versa. |
| |
| Empty lists are equal to other empty lists and less than non-empty lists. |
| |
| |=== |
| |`A`|`B`|`compare(A,B)`|`P`|Reason |
| |
| |`[]`|`[]`|`0`|`P.eq`|empty lists are equal |
| |`[]`|`[1]`|`-1`|`P.lt`|empty < non-empty |
| |`[1]`|`[]`|`1`|`P.gt`|non-empty > empty |
| |`[1,2,3]`|`[1,2,3]`|`0`|`P.eq`|pairwise equal |
| |`[1,2,3]`|`[1,2,4]`|`-1`|`P.lt`|pairwise equal until last element: `3 < 4` |
| |`[1,2,3]`|`[1,2,3,4]`|`-1`|`P.lt`|`A` exhausts first |
| |`[1,2,3,4]`|`[1,2,3]`|`1`|`P.gt`|`B` exhausts first |
| |`[1,2]`|`[1.0,2.0]`|`0`|`P.eq`|type promotion |
| |`[1,"a"]`|`[1,"b"]`|`-1`|`P.lt`|pairwise <<Comparability,Comparable>> and `"a" < "b"` |
| |`[1]`|`["a"]`|`ERROR`|`P.neq`|cross-type comparison |
| |=== |
| |
| ===== Path |
| |
| <<Equality and Comparability,Equality and Comparability>> semantics for `Paths` are similar to those for `Lists`, described above (though |
| `Paths` and `Lists` are still of different types and thus not <<Equality,Equal>> and not <<Comparability,Comparable>>). |
| |
| ===== Set |
| |
| `Sets` are compared pairwise, element-by-element, in the same way as `Lists`, but they are compared in sorted order |
| using <<Orderability,Orderability>> semantics to sort (described further below). We use <<Orderability,Orderability>> |
| semantics for ordering so that `Sets` containing multiple element types can be properly sorted before being compared. |
| |
| For example: |
| |
| |=== |
| |`A`|`B`|`compare(A,B)`|`P`|Reason |
| |
| |`{1, 2}`|`{2, 1}`|`0`|`P.eq`|sort before compare |
| |`{1, "foo"}`|`{"foo", 1}`|`0`|`P.eq`|we use <<Orderability,Orderability>> semantics to sort across types |
| |=== |
| |
| Sets do introduce a bit of semantic stickiness, in that on the one hand they do respect type promotion semantics for |
| Equality and Comparability: |
| |
| [source,text] |
| ---- |
| {1, 2} == {1.0, 2.0} |
| ---- |
| |
| But on the other hand they also allow two elements that would be equal (and thus duplicates) according to type |
| promotion: |
| |
| [source,text] |
| ---- |
| {1, 1.0, 2} is a valid set and != {1, 2} |
| ---- |
| |
| We allow some "wiggle-room" in the implementation for providers to decide how to handle this logical inconsistency. The |
| reference implementation allows for semantically equivalent numerics to appear in a set (e.g `{1, 1.0}`), while at the |
| same time evaluating the same semantically equivalent numerics as equal during pairwise comparison across sets (e.g. |
| `{1,2} == {1.0,2.0}`). |
| |
| ===== Map |
| |
| 'Map' semantics can be thought of as similar to `Set` semantics for the entry set the comprises the `Map`. So again, |
| we compare pairwise, entry-by-entry, in the same way as `Lists`, and again, we first sort the entries using |
| <<Orderability,Orderability>> semantics. Map entries are compared first by key, then by value using the |
| <<Equality and Comparability Semantics by Type,Equality and Comparability>> semantics that apply to the specific type |
| of key and value. |
| |
| Maps semantics have the same logical inconsistency as set semantics, because of type promotion. Again, we leave room |
| for providers to decide how to handle this in their implementation. The reference implementation allows for semantically |
| equivalent keys to appear in a map (e.g. `1` and `1.0` can both be keys in the same map), but when comparing maps we |
| treat pairwise entries with semantically equivalent keys as the same. |
| |
| [[gremlin-semantics-orderability]] |
| === Orderability |
| |
| <<Equality and Comparability,Equality and Comparability>> were described in depth in the sections above, and their |
| semantics map to the `P` predicates. <<Comparability,Comparability>> in particular is limited to |
| comparison of values within the same type family. Comparability is complete within a given type (except for `NaN`, |
| which results in `ERROR` for any comparison), but returns `ERROR` for comparisons across types (e.g., an integer cannot |
| be compared to a string). |
| |
| Orderability semantics are very similar to Comparability for the most part, except that Orderability will never result |
| in `ERROR` for comparison of any two values - even if two values are incomparable according to Comparability semantics |
| we will still be able to determine their respective order. This allows for a total order across all Gremlin values. In |
| the reference implementation, any step using `Order.asc` or `Order.desc` (e.g. OrderGlobalStep, OrderLocalStep) will |
| follow these semantics. |
| |
| To achieve this globally complete order, we need to address any cases in Comparability that produce a comparison |
| `ERROR`, we must define a global order across type families, and we must provide semantics for ordering "unknown" |
| values (for cases of in-process JVM implementations, like the TinkerGraph). |
| |
| We define the type space, and the global order across the type space as follows: |
| |
| [source,text] |
| ---- |
| 1. nulltype |
| 2. Boolean |
| 3. Number |
| 4. Date |
| 5. String |
| 6. Vertex |
| 7. Edge |
| 8. VertexProperty |
| 9. Property |
| 10. Path |
| 11. Set |
| 12. List |
| 13. Map |
| 14. Unknown |
| ---- |
| |
| Values in different type spaces will be ordered according to their priority (e.g. all Numbers < all Strings). |
| |
| Within a given type space, Orderability determines if two values are ordered at the same position or one value is |
| positioned before or after the another. When the position is identical, which value comes first (in other words, |
| whether it should perform stable sort) depends on graph providers' implementation. |
| |
| To allow for this total ordering, we must also address the cases in <<Equality and Comparability,Comparability>> that |
| produce an comparison `ERROR`: |
| |
| |=== |
| |`ERROR` Scenario|Comparability|Orderability |
| |
| |Comparison against `NaN`|`NaN` not comparable to anything, including itself.|`NaN` appears after `+Infinity` in the numeric type space. |
| |Comparison across types|Cannot compare values of different types. This includes the `nulltype`.|Subject to a total |
| type ordering where every value of type A appears before or after every value of Type B per the priorty list above. |
| |=== |
| |
| ==== Key differences from Comparability |
| |
| One key difference to note is that we use Orderability semantics to compare values within containers (`List`, `Set`, |
| `Path`, `Map`, `Property`) rather than using Comparability semantics (i.e. Orderability all the way down). |
| |
| ===== Numeric Ordering |
| |
| Same as Comparability, except `NaN` is equivalent to `NaN` and is greater than all other Numbers, including `+Infinity`. |
| Additionally, because of type promotion (`1` == `1.0`), numbers of the same value but of different numeric types will |
| not have a stable sort order (`1` can appear either before or after `1.0`). |
| |
| ===== Property |
| |
| Same as Comparability, except Orderability semantics are used for the property value. |
| |
| ===== Iterables (Path, List, Set, Map) |
| |
| Same as Comparability, except Orderability semantics apply for the pairwise element-by-element comparisons. |
| |
| ===== Unknown Types |
| |
| For Orderability semantics, we allow for the possibility of "unknown" types. If the "unknown" arguments are of the same |
| type, we use `java.lang.Object#equals()` and `java.lang.Comparable` (if implemented) to determine their natural order. |
| If the unknown arguments are of different types or do not define a natural order, we order first by Class, |
| then by `Object.toString()`. |
| |
| [[gremlin-semantics-equivalence]] |
| === Equivalence |
| |
| Equivalence defines how TinkerPop deals with two values to be grouped or de-duplicated. Specifically it is necessary |
| for the `dedup()` and `group()` steps in Gremlin. |
| |
| For example: |
| |
| [source,text] |
| ---- |
| // deduplication needs equivalence over two property values |
| gremlin> g.V().dedup().by("name") |
| // grouping by equivalence over two property values |
| gremlin> g.V().group().by("age") |
| ---- |
| |
| Like Equality, Equivalence checks always return `true` or `false`, never `nulltype` or `error`, nor do they produce |
| exceptions. For the most part Equivalence and Equality are the same, with the following key differences: |
| |
| * Equivalence ignores type promotion semantics, i.e. two values of different types (e.g. 2^^int vs. 2.0^^float) are |
| always considered to be non-equivalent. |
| * `NaN` Equivalence is the reverse of Equality: `NaN` is equivalent to `NaN` and not |
| Equivalent to any other Number. |
| |
| === Further Reference |
| |
| ==== Mapping for P |
| |
| The following table maps the notions proposed above to the various `P` operators: |
| |
| [%header] |
| |================ |
| |Predicate|Concept |
| |P.eq |Equality |
| |P.neq |Equality |
| |P.within |Equality |
| |P.without|Equality |
| |P.lt |Comparability |
| |P.gt |Comparability |
| |P.lte |Equality, Comparability |
| |P.gte |Equality, Comparability |
| |P.inside |Comparability |
| |P.outside|Comparability |
| |P.between|Equality, Comparability |
| |================ |
| |
| ==== See Also |
| |
| link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features/semantics/Equality.feature[Equality Tests], |
| link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features/semantics/Comparability.feature[Comparability Tests] |
| |
| == Steps |
| |
| While TinkerPop has a full test suite for validating functionality of Gremlin, tests alone aren't always exhaustive or |
| fully demonstrative of Gremlin step semantics. It is also hard to simply read the tests to understand exactly how a |
| step is meant to behave. This section discusses the semantics for individual steps to help users and providers |
| understand implementation expectations. |
| |
| [[call-step]] |
| === call() |
| |
| *Description:* Provides support for provider-specific service calls. |
| |
| *Syntax:* `call()` | `call(String, Map)` | `call(String, Traversal)` | `call(String, Map, Traversal)` |
| |
| [width="100%",options="header"] |
| |========================================================= |
| |Start Step |Mid Step |Modulated |Domain |Range |
| |Y |Y |`with()` |`any` |`any` |
| |========================================================= |
| |
| *Arguments:* |
| |
| * `String` - The name of the service call. |
| * `Map` - A collection of static parameters relevant to the particular service call. Keys and values can be any |
| type currently supported by the Gremlin type system. |
| * `Traversal` - A traversal used to dynamically build at query time a collection of parameters relevant to the service |
| call. |
| |
| *Modulation:* |
| |
| * `with(key, value)` - Sets an additional static parameter relevant to the service call. Key and value can be any |
| type currently supported by the Gremlin type system. |
| * `with(key, Traversal)` - Sets an additional dynamic parameter relevant to the service call. Key can be any |
| type currently supported by the Gremlin type system. |
| |
| How static and dynamic parameters are merged is a detail left to the provider implementation. The reference implementation |
| (`CallStep`) uses effectively a "left to right" merge of the parameters - it starts with the static parameter `Map` |
| argument, then merges in the parameters from the dynamic `Traversal` argument, then merges in each `with` modulation |
| one by one in the order they appear. |
| |
| Service calls in the reference implementation can be specified as `Start` (start of traversal), `Streaming` |
| (mid-traversal flat map step), and `Barrier` (mid-traversal barrier step). Furthermore, the `Barrier` type can be |
| all-at-once or with a maximum chunk size. A single service can support more than one of these modes, and if it does, |
| must provide semantics for how to configure the mode at query time via parameters. |
| |
| Providers using the reference implementation to support service call with need to provide a `ServiceFactory` for each |
| named service that can create `Service` instances for execution during traversal. The `ServiceFactory` is a singleton |
| that is registered with the `ServiceRegistry` located on the provider `Graph`. The `Service` instance is local to |
| each traversal, although providers can choose to re-use instances across traversals provided there is no state. |
| |
| *Considerations:* |
| |
| Providers using the reference implementation can return `Traverser` output or raw value output - the `CallStep` will |
| handle either case appropriately. In the case of a `Streaming` service, where there is exactly one input to each |
| call, the reference implementation can preserve `Path` information by splitting the input `Traverser` when receiving |
| raw output from the call. In the case of `Barrier` however, it is the responsiblity of the `Service` to preserve |
| `Path` information by producing its own `Traversers` as output, since the `CallStep` cannot match input and ouput |
| across a barrier. The ability to split input `Traversers` and generate output is provided by the reference |
| implementation's `ServiceCallContext` object, which is supplied to the `Service` during execution. |
| |
| There are three execution methods in the reference implementation service call API: |
| |
| * `execute(ServiceCallContext, Map)` - execute a service call to start a traversal |
| * `execute(ServiceCallContext, Traverser, Map)` - execute a service call mid-traversal streaming (one input) |
| * `execute(ServiceCallContext, TraverserSet, Map)` - execute a service call mid-traversal barrier |
| |
| The Map is the merged collection of all static and dynamic parameters. In the case of `Barrier` execution, notice |
| that there is one `Map` for many input. Since the `call()` API support dynamic parameters, this implies that all |
| input must reduce to the same set of parameters for `Barrier` execution. In the reference implementation, if more |
| than one parameter set is detected, this will cause an execution and the traversal will halt. Providers that |
| implement their own version of a call operation may decide on other strategies to handle this case - for example |
| it may be sensible to group traversers by Map in the case where multiple parameter sets are detected. |
| |
| The no-arg version of the `call()` API is meant to be a directory service and should only be used to start a traversal. |
| The reference implementation provides a default version, with will produce a list of service names or a service |
| description if run with `verbose=true`. Providers using the own implementation of the call operation must provide their |
| own directory listing service with the service name `"--list"`. |
| |
| *Exceptions* |
| |
| * If a named service does not support the execution mode implied by the traversal, for example, using a `Streaming` or |
| `Barrier` step as a traversal source, this will result in an `UnsupportedOperationException`. |
| * As mentioned above, dynamic property parameters (`Traversals`) that reduce to more than one property set for a chunk |
| of input is not supported in the reference implementation and will result in an `UnsupportedOperationException`. |
| * Use of the reference implementation's built-in directory service - `call()` or `call("--list")` - mid-traversal |
| will result in an `UnsupportedOperationException`. |
| |
| See: link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/CallStep.java[CallStep], |
| link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/service/Service.java[Service], |
| link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/service/ServiceRegistry.java[ServiceRegistry], |
| link:https://tinkerpop.apache.org/docs/x.y.z/reference/#call-step[reference] |
| |
| [[dedup-step]] |
| === dedup() |
| |
| *Description:* Removes repeatedly seen results from the Traversal Stream. |
| |
| *Syntax:* `dedup()` | `dedup(labels)` |
| |
| [width="100%",options="header"] |
| |========================================================= |
| |Start Step |Mid Step |Modulated |Domain |Range |
| |N |Y |`by()` |`any` |`any` |
| |========================================================= |
| |
| *Arguments:* |
| |
| * `labels` - If `dedup()` is provided a list of labels, then it will ensure that the de-duplication |
| is not with respect to the current traverser object, but to the path history of the traverser. |
| |
| For Example: |
| |
| [source,groovy] |
| ---- |
| g.V().as('a').out('created').as('b').in('created').as('c'). |
| dedup('a','b').select('a','b','c') |
| ---- |
| |
| will filter out any such `a` and `b` pairs which have previously been seen. |
| |
| *Modulation:* |
| |
| * `by()` - Performs dedup according to the property specified in the by modulation. For example: |
| `g.V().dedup().by("name")` will filter out vertices with duplicate names. |
| |
| *Considerations:* |
| |
| * There is no guarantee that ordering of results will be preserved across a dedup step. |
| * There is no guarantee which element is selected as the survivor when filtering duplicates. |
| |
| For example, given a graph with the following three vertices: |
| |
| [width="100%",options="header"] |
| |========================================================= |
| |name |age |
| |Alex |38 |
| |Bob |45 |
| |Chloe |38 |
| |========================================================= |
| |
| and the traversal of: |
| |
| [source,groovy] |
| ---- |
| g.V().order().by("name", Order.asc). |
| dedup().by("age").values("name") |
| ---- |
| |
| can return any of: |
| |
| `["Alex", "Bob"]`, `["Bob", "Alex"]`, `["Bob", "Chloe"]`, or `["Chloe", "Bob"]` |
| |
| See: link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/filter/DedupGlobalStep.java[source], |
| link:https://tinkerpop.apache.org/docs/x.y.z/reference/#dedup-step[reference] |
| |
| [[element-step]] |
| === element() |
| |
| *Description:* Traverse from `Property` to its `Element`. |
| |
| *Syntax:* `element()` |
| |
| [width="100%",options="header"] |
| |========================================================= |
| |Start Step |Mid Step |Modulated |Domain |Range |
| |N |Y |N |`Property` |`Element` |
| |========================================================= |
| |
| *Arguments:* |
| |
| None |
| |
| *Modulation:* |
| |
| None |
| |
| [[merge-e-step]] |
| === mergeE() |
| |
| *Description:* Provides upsert-like functionality for edges. |
| |
| *Syntax:* `mergeE()` | `mergeE(Map)` | `mergeE(Traversal)` |
| |
| [width="100%",options="header"] |
| |========================================================= |
| |Start Step |Mid Step |Modulated |Domain |Range |
| |Y |Y |`option()` |`Map`/`Vertex` |`Edge` |
| |========================================================= |
| |
| *Arguments:* |
| |
| * `searchCreate` - A `Map` used to match an `Edge` and if not found will be the default set of data to create the new one. |
| * `onCreate` - A `Map` used to specify additional existence criteria and/or properties not already specified in `searchCreate`. |
| * `onMatch` - A `Map` used to update the `Edge` that is found using the `searchCreate` criteria. |
| * `outV` - A `Vertex` that will be late-bound into the `searchCreate` and `onCreate` `Maps` for the `Direction.OUT` key, |
| or else another `Map` used to search for that `Vertex` |
| * `inV` - A `Vertex` that will be late-bound into the `searchCreate` and `onCreate` `Maps` for the `Direction.IN` key, |
| or else another `Map` used to search for that `Vertex` |
| |
| The `searchCreate` and `onCreate` `Map` instances must consist of any combination of: |
| |
| * `T` - `id`, `label` |
| * `Direction` - `IN` or `to`, `OUT` or `from` |
| * Arbitrary `String` keys (which are assumed to be vertex properties). |
| |
| The `onMatch` `Map` instance only allows for `String` keys as the `id` and `label` of a `Vertex` are immutable as are |
| the incident vertices. Values for these valid keys that are `null` will be treated according to the semantics of the |
| `addE()` step. |
| |
| The `Map` that is used as the argument for `searchCreate` may be assigned from the incoming `Traverser` for the no-arg |
| `mergeE()`. If `mergeE(Map)` is used, then it will override the incoming `Traverser`. If `mergeE(Traversal)` is used, |
| the `Traversal` argument must resolve to a `Map` and it would also override the incoming Traverser. The `onCreate` and |
| `onMatch` arguments are assigned via modulation as described below. |
| |
| If `onMatch` is triggered the `Traverser` becomes the matched `Edge`, but the traversal still must return a `Map` |
| instance to be applied. `Null` is considered semantically equivalent to an empty `Map`. |
| |
| [width="100%",options="header"] |
| |========================================================= |
| |Event |Empty `Map` (or Null) |
| |Search |Matches all edges |
| |Create |New edge with defaults |
| |Update |No update to matched edge |
| |========================================================= |
| |
| If `T.id` is used for `searchCreate` or `onCreate`, it may be ignored for edge creation if the `Graph` does not |
| support user supplied identifiers. `onCreate` inherits from `searchCreate` - values for `T.id`, `T.label`, and |
| `Direction.OUT/IN` do not need to be specified twice. Additionally, `onCreate` cannot override values in `searchCreate` |
| (i.e. `if (exists(x)) return(x) else create(y)` is not supported). |
| |
| *Modulation:* |
| |
| * `option(Merge, Map)` - Sets the `onCreate` or `onMatch` arguments directly. |
| * `option(Merge, Traversal)` - Sets the `onCreate` or `onMatch` arguments dynamically where the `Traversal` must |
| resolve to a `Map`. |
| * `option(Merge.outV/inV)` can also accept a `Traversal` that resolves to a `Vertex`, allowing `mergeE` to be combined |
| with `mergeV` via a `select` operation. |
| |
| *Exceptions* |
| |
| * `Map` arguments are validated for their keys resulting in exception if they do not meet requirements defined above. |
| * Use of `T.label` should always have a value that is a `String`. |
| * If `T.id`, `T.label`, and/or `Direction.IN/OUT` are specified in `searchCreate`, they cannot be overriden in `onCreate`. |
| * For late binding of the from and to vertices, `Direction.OUT` must be set to `Merge.outV` and `Direction.IN` must |
| be set to `Merge.inV`. Other combinations are not allowed and will result in exception. |
| |
| *Considerations:* |
| |
| * `mergeE()` (i.e. the zero-arg overload) can only be used mid-traversal. It is not a start step. |
| * As is common to Gremlin, it is expected that `Traversal` arguments may utilize `sideEffect()` steps. |
| |
| See: link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/MergeEdgeStep.java[source], |
| link:https://tinkerpop.apache.org/docs/x.y.z/reference/#mergee-step[reference] |
| |
| [[merge-v-step]] |
| === mergeV() |
| |
| *Description:* Provides upsert-like functionality for vertices. |
| |
| *Syntax:* `mergeV()` | `mergeV(Map)` | `mergeV(Traversal)` |
| |
| [width="100%",options="header"] |
| |========================================================= |
| |Start Step |Mid Step |Modulated |Domain |Range |
| |Y |Y |`option()` |`Map` |`Vertex` |
| |========================================================= |
| |
| *Arguments:* |
| |
| * `searchCreate` - A `Map` used to match a `Vertex` and if not found will be the default set of data to create the new one. |
| * `onCreate` - A `Map` used to specify additional existence criteria and/or properties not already specified in `searchCreate`. |
| * `onMatch` - A `Map` used to update the `Vertex` that is found using the `searchCreate` criteria. |
| |
| The `searchCreate` and `onCreate` `Map` instances must consists of any combination of `T.id`, `T.label`, or arbitrary |
| `String` keys (which are assumed to be vertex properties). The `onMatch` `Map` instance only allows for `String` keys |
| as the `id` and `label` of a `Vertex` are immutable. `null` Values for these valid keys are not allowed. |
| |
| The `Map` that is used as the argument for `searchCreate` may be assigned from the incoming `Traverser` for the no-arg |
| `mergeV()`. If `mergeV(Map)` is used, then it will override the incoming `Traverser`. If `mergeV(Traversal)` is used, |
| the `Traversal` argument must resolve to a `Map` and it would also override the incoming Traverser. The `onCreate` and |
| `onMatch` arguments are assigned via modulation as described below. |
| |
| If `onMatch` is triggered the `Traverser` becomes the matched `Vertex`, but the traversal still must return a `Map` |
| instance to be applied. `Null` is considered semantically equivalent to an empty `Map`. |
| |
| [width="100%",options="header"] |
| |========================================================= |
| |Event |Empty `Map` (or Null) |
| |Search |Matches all vertices |
| |Create |New vertex with defaults |
| |Update |No update to matched vertex |
| |========================================================= |
| |
| If `T.id` is used for `searchCreate` or `onCreate`, it may be ignored for vertex creation if the `Graph` does not |
| support user supplied identifiers. `onCreate` inherits from `searchCreate` - values for `T.id`, `T.label` do not need |
| to be specified twice. Additionally, `onCreate` cannot override values in `searchCreate` |
| (i.e. `if (exists(x)) return(x) else create(y)` is not supported). |
| |
| *Modulation:* |
| |
| * `option(Merge, Map)` - Sets the `onCreate` or `onMatch` arguments directly. |
| * `option(Merge, Traversal)` - Sets the `onCreate` or `onMatch` arguments dynamically where the `Traversal` must |
| resolve to a `Map`. |
| |
| *Exceptions* |
| |
| * `Map` arguments are validated for their keys resulting in exception if they do not meet requirements defined above. |
| * Use of `T.label` should always have a value that is a `String`. |
| * If `T.id` and/or `T.label` are specified in `searchCreate`, they cannot be overriden in `onCreate`. |
| |
| *Considerations:* |
| |
| * `mergeV()` (i.e. the zero-arg overload) can only be used mid-traversal. It is not a start step. |
| * As is common to Gremlin, it is expected that `Traversal` arguments may utilize `sideEffect()` steps. |
| |
| See: link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/MergeVertexStep.java[source], |
| link:https://tinkerpop.apache.org/docs/x.y.z/reference/#mergev-step[reference] |