| //// |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| //// |
| |
| = TinkerPop 3.4.0 |
| |
| image::https://raw.githubusercontent.com/apache/tinkerpop/master/docs/static/images/avant-gremlin.png[width=225] |
| |
| *Avant-Gremlin Construction #3 for Theremin and Flowers* |
| |
| == TinkerPop 3.4.9 |
| |
| *Release Date: NOT OFFICIALLY RELEASED YET* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.9/CHANGELOG.asciidoc#release-3-4-9[changelog] for a |
| complete list of all the modifications that are part of this release. |
| |
| === withEmbedded() |
| |
| The `AnonymousTraversalSource` was introduced in 3.3.5 and is most typically used for constructing remote |
| `TraversalSource` instances, but it also provides a way to construct a `TraversalSource` from an embedded `Graph` |
| instance: |
| |
| [source,text] |
| ---- |
| gremlin> g = traversal().withGraph(TinkerFactory.createModern()) |
| ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] |
| gremlin> g = traversal().withRemote(DriverRemoteConnection.using('localhost',8182)) |
| ==>graphtraversalsource[emptygraph[empty], standard] |
| ---- |
| |
| The `withGraph(Graph)` method is now deprecated in favor of the new `withEmbedded(Graph)` method, which is more explicit |
| about its intent: |
| |
| [source,text] |
| ---- |
| gremlin> g = traversal().withEmbedded(TinkerFactory.createModern()) |
| ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] |
| ---- |
| |
| This change is mostly applicable to JVM languages where embedded `Graph` instances are available. For Gremlin Language |
| Variants not on the JVM, the `withGraph(Graph)` method has simply been deprecated and not replaced (with the preference |
| to use `withRemote()` variants). |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2413[TINKERPOP-2413] |
| |
| === Upgrading for Providers |
| |
| ==== Graph System Providers |
| |
| ===== Custom TraverserSet |
| |
| It is now possible to provide a custom `TraverserSet` to `Step` implementations that make use of those objects to |
| introduce new logic for how they are populated and managed. Providers can take advantage of this capability by |
| constructing their own `Traversal` implementation and overriding the `getTraverserSetSupplier()` method. When new |
| `TraverserSet` instances are needed during traversal execution, steps will consult this method to get those instances. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2396[TINKERPOP-2396] |
| |
| == TinkerPop 3.4.8 |
| |
| *Release Date: August 3, 2020* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.8/CHANGELOG.asciidoc#release-3-4-8[changelog] for a |
| complete list of all the modifications that are part of this release. |
| |
| === Upgrading for Users |
| |
| ==== Gremlin.NET: Automatic Reconnect |
| |
| The Gremlin.NET driver now automatically tries to reconnect to a server when no open connection is available to submit |
| a request. This should enable the driver to handle cases where the server is only temporarily unavailable or where the |
| server has closed connections, which some graph providers do when no requests have been sent for some time. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2288[TINKERPOP-2288] |
| |
| == TinkerPop 3.4.7 |
| |
| *Release Date: June 1, 2020* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.7/CHANGELOG.asciidoc#release-3-4-7[changelog] for a |
| complete list of all the modifications that are part of this release. |
| |
| === Upgrading for Users |
| |
| ==== Clear Screen Command |
| |
| Gremlin Console now has the `:cls` command to clear the screen. This feature acts as an alternative to |
| platform-specific clear operations and provides a common way to perform that function. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2357[TINKERPOP-2357] |
| |
| == TinkerPop 3.4.6 |
| |
| *Release Date: February 20, 2020* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.6/CHANGELOG.asciidoc#release-3-4-6[changelog] for a |
| complete list of all the modifications that are part of this release. |
| |
| === Upgrading for Users |
| |
| ==== drop() Properties |
| |
| In 3.4.5 the equality of the `Property` object changed to allow language features like `dedup()` to work more |
| consistently. An unnoticed side-effect of that change was that calling `drop()` on properties that had the same values |
| would not properly remove all their instances. This problem affected edge and meta property instances but not the |
| properties of vertices as their equality definitions had not changed. |
| |
| This issue is now resolved with the side-effect being that the inclusion of `drop()` will disable `LazyBarrierStrategy` |
| which prevents automatic bulking. In most common cases, the impact of that optimization loss should be minimal and |
| could be added back manually with `barrier()` steps in the appropriate places. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2338[TINKERPOP-2338] |
| |
| == TinkerPop 3.4.5 |
| |
| *Release Date: February 3, 2020* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.5/CHANGELOG.asciidoc#release-3-4-5[changelog] for a |
| complete list of all the modifications that are part of this release. |
| |
| WARNING: A link:https://issues.apache.org/jira/browse/TINKERPOP-2338[bug] was noted in 3.4.5 soon after release and |
| was quickly patched. Users and providers should avoid version 3.4.5 and should instead prefer usage of 3.4.6. |
| |
| === Upgrading for Users |
| |
| ==== by(String) Modulator |
| |
| It is quite common to use the `by(String)` modulator when doing some form of selection operation where the `String` is |
| the key to the value of the current `Traverser`, demonstrated as follows: |
| |
| [source,text] |
| ---- |
| gremlin> g.V().project('name').by('name') |
| ==>[name:marko] |
| ==>[name:vadas] |
| ==>[name:lop] |
| ==>[name:josh] |
| ==>[name:ripple] |
| ==>[name:peter] |
| gremlin> g.V().order().by('name').values('name') |
| ==>josh |
| ==>lop |
| ==>marko |
| ==>peter |
| ==>ripple |
| ==>vadas |
| ---- |
| |
| Of course, this approach usually only works when the current `Traverser` is an `Element`. If it is not an element, the |
| error is swift and severe: |
| |
| [source,text] |
| ---- |
| gremlin> g.V().valueMap().project('x').by('name') |
| java.util.LinkedHashMap cannot be cast to org.apache.tinkerpop.gremlin.structure.Element |
| Type ':help' or ':h' for help. |
| Display stack trace? [yN]n |
| ---- |
| |
| and while it is perhaps straightforward to see the problem in the above example, it is not always clear exactly where |
| the mistake is. The above example is the typical misuse of `by(String)` and comes when one tries to treat a `Map` the |
| same way as an `Element` (which is quite reasonable). |
| |
| In 3.4.5, the issue of using `by(String)` on a `Map` and the error messaging have been resolved as follows: |
| |
| [source,text] |
| ---- |
| gremlin> g.V().valueMap().project('x').by('name') |
| ==>[x:[marko]] |
| ==>[x:[vadas]] |
| ==>[x:[lop]] |
| ==>[x:[josh]] |
| ==>[x:[ripple]] |
| ==>[x:[peter]] |
| gremlin> g.V().elementMap().order().by('name') |
| ==>[id:4,label:person,name:josh,age:32] |
| ==>[id:3,label:software,name:lop,lang:java] |
| ==>[id:1,label:person,name:marko,age:29] |
| ==>[id:6,label:person,name:peter,age:35] |
| ==>[id:5,label:software,name:ripple,lang:java] |
| ==>[id:2,label:person,name:vadas,age:27] |
| gremlin> g.V().values('name').project('x').by('name') |
| The by("name") modulator can only be applied to a traverser that is an Element or a Map - it is being applied to [marko] a String class instead |
| Type ':help' or ':h' for help. |
| Display stack trace? [yN]n |
| ---- |
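
The dispatch described above can be sketched as a small plain-Python model. This is not TinkerPop code: the `Element` class and the `by_key` function are hypothetical names introduced only for illustration, and a Python `dict` stands in for a Java `Map`.

```python
class Element:
    """Hypothetical stand-in for a graph element holding key/value properties."""

    def __init__(self, **properties):
        self.properties = properties


def by_key(traverser, key):
    """Model of by(String): resolve `key` against an Element or a Map."""
    if isinstance(traverser, Element):
        # Element: the key names a property of the element.
        return traverser.properties[key]
    if isinstance(traverser, dict):
        # Map: the key is looked up directly in the map.
        return traverser[key]
    # Anything else is rejected with a message naming the offending object.
    raise TypeError(
        'The by("%s") modulator can only be applied to a traverser that is an '
        'Element or a Map - it is being applied to [%s] a %s class instead'
        % (key, traverser, type(traverser).__name__))


from_element = by_key(Element(name="marko", age=29), "name")
from_map = by_key({"name": ["marko"], "age": [29]}, "name")
```

In this model the `Element` lookup yields the property value and the `dict` lookup yields whatever the map holds under that key, which is why the `valueMap()` example above returns list-wrapped values.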
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2314[TINKERPOP-2314] |
| |
| ==== hasKey() Step and hasValue() Step |
| |
| Previously, `hasKey()`-step and `hasValue()`-step only applied to vertex properties. In order to support more |
| generalized scenarios, the behavior of these steps was modified to allow them to be applied to both edge properties |
| and meta-properties. |
| |
| The original behavior is demonstrated as follows: |
| |
| [source,groovy] |
| ---- |
| gremlin> graph = TinkerFactory.createTheCrew() |
| ==>tinkergraph[vertices:6 edges:14] |
| gremlin> g = graph.traversal() |
| ==>graphtraversalsource[tinkergraph[vertices:6 edges:14], standard] |
| gremlin> g.E().properties().hasKey('since') |
| ==>TinkerProperty cannot be cast to Element |
| gremlin> g.V().properties("location").properties().hasKey("startTime") |
| ==>TinkerProperty cannot be cast to Element |
| gremlin> g.E().properties().hasValue(2010) |
| ==>TinkerProperty cannot be cast to Element |
| gremlin> g.V().properties("location").properties().hasValue(2005) |
| ==>TinkerProperty cannot be cast to Element |
| ---- |
| |
| The new behavior of `hasKey()` with edge properties can be seen as: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.E().properties().hasKey('since') |
| ==>p[since->2009] |
| ==>p[since->2010] |
| ==>p[since->2010] |
| ==>p[since->2011] |
| ==>p[since->2012] |
| ---- |
| |
| Similar behavior for `hasKey()` can be seen with meta-properties: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().properties().hasKey('location').properties().hasKey('startTime') |
| ==>p[startTime->1997] |
| ==>p[startTime->2001] |
| ==>p[startTime->2004] |
| ==>p[startTime->2004] |
| ==>p[startTime->2005] |
| ==>p[startTime->2005] |
| ==>p[startTime->1990] |
| ==>p[startTime->2000] |
| ==>p[startTime->2006] |
| ==>p[startTime->2007] |
| ==>p[startTime->2011] |
| ==>p[startTime->2014] |
| ==>p[startTime->1982] |
| ==>p[startTime->2009] |
| ---- |
| |
| The new behavior for `hasValue()` with edge properties is as follows: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.E().properties().hasValue(2010) |
| ==>p[since->2010] |
| ==>p[since->2010] |
| ---- |
| |
| and similarly with `hasValue()` for meta-properties: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().properties().hasKey('location').properties().hasValue(2005) |
| ==>p[endTime->2005] |
| ==>p[endTime->2005] |
| ==>p[startTime->2005] |
| ==>p[startTime->2005] |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1733[TINKERPOP-1733] |
| |
| ==== Properties Equality |
| |
| There was some inconsistency in terms of `Property` equality in earlier versions of Gremlin. Equality is now |
| defined as follows: two properties are equal only if their key and value are equal, even if their parent elements are |
| not equal. It makes sense when comparing properties regardless of parent elements to just focus on the property itself |
| as it yields more uniform and concise results to reason about. The benefit of this change is that the behavior of |
| property comparison and `dedup()`-step are predictable, and it will not affect the result if the property is detached |
| from the parent element. |
| |
| NOTE: "Property" here refers to edge properties and meta-properties, and thus excludes vertex properties. |
| |
| The old behavior can be shown using "The Crew" toy graph as follows: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.E().properties().count() |
| ==>13 |
| gremlin> g.E().properties() |
| ==>p[since->2009] |
| ==>p[since->2010] |
| ==>p[skill->4] |
| ==>p[skill->5] |
| ==>p[since->2010] |
| ==>p[since->2011] |
| ==>p[skill->5] |
| ==>p[skill->4] |
| ==>p[since->2012] |
| ==>p[skill->3] |
| ==>p[skill->3] |
| ==>p[skill->5] |
| ==>p[skill->3] |
| gremlin> g.E().properties().dedup().count() |
| ==>13 |
| gremlin> g.E().dedup().properties() |
| ==>p[since->2009] |
| ==>p[since->2010] |
| ==>p[skill->4] |
| ==>p[skill->5] |
| ==>p[since->2010] |
| ==>p[since->2011] |
| ==>p[skill->5] |
| ==>p[skill->4] |
| ==>p[since->2012] |
| ==>p[skill->3] |
| ==>p[skill->3] |
| ==>p[skill->5] |
| ==>p[skill->3] |
| ---- |
| |
| The new more consistent behavior is demonstrated below: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.E().properties().count() |
| ==>13 |
| gremlin> g.E().properties() |
| ==>p[since->2009] |
| ==>p[since->2010] |
| ==>p[skill->4] |
| ==>p[skill->5] |
| ==>p[since->2010] |
| ==>p[since->2011] |
| ==>p[skill->5] |
| ==>p[skill->4] |
| ==>p[since->2012] |
| ==>p[skill->3] |
| ==>p[skill->3] |
| ==>p[skill->5] |
| ==>p[skill->3] |
| gremlin> g.E().properties().dedup().count() |
| ==>7 |
| gremlin> g.E().properties().dedup() |
| ==>p[since->2009] |
| ==>p[since->2010] |
| ==>p[skill->4] |
| ==>p[skill->5] |
| ==>p[since->2011] |
| ==>p[since->2012] |
| ==>p[skill->3] |
| ---- |
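
The equality rule can be illustrated with a plain-Python model (not TinkerPop code). The `Property` class, its `parent_id` field, and the `dedup` helper are hypothetical names. For simplicity each property is given its own `parent_id`; the point is only that the parent is ignored when comparing. The key/value pairs are the thirteen edge properties listed above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Property:
    key: str
    value: int
    # The parent element is excluded from equality and hashing on purpose.
    parent_id: int = field(default=-1, compare=False)


def dedup(items):
    """Keep the first occurrence of each distinct item, preserving order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


pairs = [("since", 2009), ("since", 2010), ("skill", 4), ("skill", 5),
         ("since", 2010), ("since", 2011), ("skill", 5), ("skill", 4),
         ("since", 2012), ("skill", 3), ("skill", 3), ("skill", 5),
         ("skill", 3)]
edge_properties = [Property(k, v, parent_id=i) for i, (k, v) in enumerate(pairs)]
unique = dedup(edge_properties)
print(len(edge_properties), len(unique))  # prints: 13 7
```

Because only `key` and `value` take part in the comparison, the thirteen properties collapse to the same seven distinct entries shown in the console output above.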
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2318[TINKERPOP-2318] |
| |
| === Upgrading for Providers |
| |
| ==== Graph Driver Providers |
| |
| ===== GraphBinary API Change |
| |
| In GraphBinary serialization, the Java `GraphBinaryReader` and `GraphBinaryWriter`, along with the `TypeSerializer<T>` |
| interface, now take a `Buffer` instance instead of Netty's `ByteBuf`, which avoids exposing Netty's API in our own |
| public API. |
| |
| Using our own `Buffer` interface, which wraps Netty's buffer API, allowed us to move the `TypeSerializer<T>` |
| implementations, the reader and the writer to the `org.apache.tinkerpop.gremlin.structure.io.binary` package in `gremlin-core`. |
| |
| Additionally, `GraphBinaryReader` and `GraphBinaryWriter` methods now throw Java's `IOException` instead of our |
| own `SerializationException`. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2305[TINKERPOP-2305] |
| |
| == TinkerPop 3.4.4 |
| |
| *Release Date: October 14, 2019* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.4/CHANGELOG.asciidoc#release-3-4-4[changelog] for a complete list of all the modifications that are part of this release. |
| |
| === Upgrading for Users |
| |
| ==== Python GraphBinary |
| |
| There is now support for GraphBinary in Python. As with Java, it remains a working but experimental format that is |
| still under evaluation. This new serializer can be used by first ensuring that it is available on the server and then |
| configuring the connection as follows: |
| |
| [source,python] |
| ---- |
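| from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection |
| from gremlin_python.driver.serializer import GraphBinarySerializersV1 |
| from gremlin_python.structure.graph import Graph |
| |
| # Address of a Gremlin Server that has the GraphBinary serializer configured. |
| gremlin_server_url = "ws://172.17.0.2:45940/gremlin" |
| remote_conn = DriverRemoteConnection(gremlin_server_url, 'g', |
|                                      message_serializer=GraphBinarySerializersV1()) |
| g = Graph().traversal().withRemote(remote_conn) |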
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2279[TINKERPOP-2279] |
| |
| ==== elementMap() Step |
| |
| Since graph elements (i.e. `Vertex`, `Edge`, and `VertexProperty`) are returned from remote sources as references |
| (i.e. without properties), one of the more common needs for Gremlin users is the ability to easily return a `Map` |
| representation of the elements that they are querying. Typically, such transformations are handled by `valueMap()`: |
| |
| [source,text] |
| ---- |
| gremlin> g.V().has('person','name','marko').valueMap(true) |
| ==>[id:1,label:person,name:[marko],age:[29]] |
| gremlin> g.V().has('person','name','marko').valueMap().by(unfold()) |
| ==>[name:marko,age:29] |
| ---- |
| |
| or by way of `project()`: |
| |
| [source,text] |
| ---- |
| gremlin> g.V().has('person','name','marko'). |
| ......1> project('name','age','vid','vlabel'). |
| ......2> by('name'). |
| ......3> by('age'). |
| ......4> by(id). |
| ......5> by(label) |
| ==>[name:marko,age:29,vid:1,vlabel:person] |
| ---- |
| |
| While `valueMap()` works reasonably well for `Vertex` and `VertexProperty` transformations, it does less well for `Edge` |
| as it fails to include incident vertices: |
| |
| [source,text] |
| ---- |
| gremlin> g.E(11).valueMap(true) |
| ==>[id:11,label:created,weight:0.4] |
| ---- |
| |
| This limitation forces a fairly verbose use of `project()` for what is a reasonably common requirement: |
| |
| [source,text] |
| ---- |
| gremlin> g.E(12).union(valueMap(true), |
| ......1> project('inV','outV','inVLabel','outVLabel'). |
| ......2> by(inV().id()). |
| ......3> by(outV().id()). |
| ......4> by(inV().label()). |
| ......5> by(outV().label())).unfold(). |
| ......6> group(). |
| ......7> by(keys). |
| ......8> by(select(values)) |
| ==>[inV:3,id:12,inVLabel:software,weight:0.2,outVLabel:person,label:created,outV:6] |
| ---- |
| |
| By introducing `elementMap()`-step, there is now a single step that covers the most common transformation requirements |
| for all three graph elements: |
| |
| [source,text] |
| ---- |
| gremlin> g.V().has('person','name','marko').elementMap() |
| ==>[id:1,label:person,name:marko,age:29] |
| gremlin> g.V().has('person','name','marko').elementMap('name') |
| ==>[id:1,label:person,name:marko] |
| gremlin> g.V().has('person','name','marko').properties('name').elementMap() |
| ==>[id:0,key:name,value:marko] |
| gremlin> g.E(11).elementMap() |
| ==>[id:11,label:created,IN:[id:3,label:software],OUT:[id:4,label:person],weight:0.4] |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2284[TINKERPOP-2284], |
| link:https://tinkerpop.apache.org/docs/3.4.4/reference/#elementmap-step[Reference Documentation] |
| |
| == TinkerPop 3.4.3 |
| |
| *Release Date: August 5, 2019* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.3/CHANGELOG.asciidoc#release-3-4-3[changelog] for a complete list of all the modifications that are part of this release. |
| |
| === Upgrading for Users |
| |
| ==== Deprecated store() |
| |
| The `store()`-step and `aggregate()`-step do the same thing in different ways, where the former is lazy and the latter |
| is eager in the side-effect collection of objects from the traversal. The different behaviors can be thought of as |
| differing applications of `Scope` where `global` is eager and `local` is lazy. As a result, there is no need for both |
| steps when one will do. |
| |
| As of 3.4.3, `store(String)` is now deprecated in favor of `aggregate(Scope, String)` where the `Scope` should be set |
| to `local` to ensure the same functionality as `store()`. Note that `aggregate('x')` is the same as |
| `aggregate(global,'x')`. |
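
The lazy versus eager distinction can be sketched with Python generators. This is a plain-Python model rather than TinkerPop code: `lazy_collect`, `eager_collect`, and the string vertex labels are hypothetical, and `islice(..., 1)` plays the role of a downstream `limit(1)`.

```python
from itertools import islice


def lazy_collect(upstream, side_effect):
    """Model of store() / aggregate(local): record each object as it passes by."""
    for obj in upstream:
        side_effect.append(obj)
        yield obj


def eager_collect(upstream, side_effect):
    """Model of aggregate(global): drain the upstream fully before emitting."""
    barrier = list(upstream)
    side_effect.extend(barrier)
    yield from barrier


vertices = ["v[1]", "v[2]", "v[3]", "v[4]", "v[5]", "v[6]"]

lazy_x = []
lazy_out = list(islice(lazy_collect(iter(vertices), lazy_x), 1))

eager_x = []
eager_out = list(islice(eager_collect(iter(vertices), eager_x), 1))

print(len(lazy_x), len(eager_x))  # prints: 1 6
```

Both pipelines emit a single object downstream, but in this model the lazy side-effect only holds what was actually pulled, while the eager one holds everything the upstream produced. The exact number of objects a real traversal pulls through a lazy step may differ from this simplified model.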
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1553[TINKERPOP-1553] |
| |
| ==== Deprecate Gryo in Gremlin Server |
| |
| Gryo is now deprecated as a serialization format for Gremlin Server; however, it is still configured as a default |
| option in the sample configuration files packaged with the server. The preferred serialization option should now be |
| GraphBinary. Note that Gremlin Console is now configured to use GraphBinary by default. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2250[TINKERPOP-2250] |
| |
| === Upgrading for Providers |
| |
| ==== Graph Driver Providers |
| |
| ===== Gremlin Server Test Configuration |
| |
| Gremlin Server has a test configuration built into its Maven build process which all integration tests and Gremlin |
| Language Variants use to validate their operations. While this approach has worked really well for test automation |
| within Maven, there are often times where it would be helpful to simply have Gremlin Server running with that |
| configuration. This need is especially true when developing Gremlin Language Variants, which is done |
| outside of the JVM. |
| |
| This release introduces a Docker script that will start Gremlin Server with this test configuration. It can be started |
| with: |
| |
| [source,text] |
| docker/gremlin-server.sh |
| |
| Once started, it is then possible to run GLV tests directly from a debugger against this instance which should |
| hopefully reduce development friction. |
| |
| See: link:https://tinkerpop.apache.org/docs/3.4.3/dev/developer/#docker-integration[Developer Documentation] |
| |
| == TinkerPop 3.4.2 |
| |
| *Release Date: May 28, 2019* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.2/CHANGELOG.asciidoc#release-3-4-2[changelog] for a complete list of all the modifications that are part of this release. |
| |
| === Upgrading for Users |
| |
| ==== Per Request Options |
| |
| In 3.4.0, the notion of `RequestOptions` was added so that users could have an easier way to configure settings on |
| individual requests made through the Java driver. While that change provided a way to set those configurations for |
| script based requests, it didn't include options to make those settings in a `Traversal` submitted via `Bytecode`. In |
| this release those settings become available via the `with()` step on the `TraversalSource` as follows: |
| |
| [source,java] |
| ---- |
| GraphTraversalSource g = traversal().withRemote(conf); |
| List<Vertex> vertices = g.with(Tokens.ARGS_SCRIPT_EVAL_TIMEOUT, 500L).V().out("knows").toList(); |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2211[TINKERPOP-2211] |
| |
| ==== Gremlin Console Timeout |
| |
| The Gremlin Console timeout that is set by `:remote config timeout x` was client-side only in prior versions, which |
| meant that if the console timeout was less than the server timeout the client would time out but the server might still |
| be processing the request. Similarly, a longer timeout on the console would not change the server and the timeout |
| would occur sooner than expected. These discrepancies often led to confusion. |
| |
| As of 3.4.0, the Java Driver API allowed for timeout settings to be more easily passed per request, so the console |
| was modified for this current version to pass the console timeout for each remote submission thus yielding more |
| consistent and intuitive behavior. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2203[TINKERPOP-2203] |
| |
| === Upgrading for Providers |
| |
| ==== Graph System Providers |
| |
| ===== Warnings |
| |
| It is now possible to pass warnings over the Gremlin Server protocol using a `warnings` status attribute. The warnings |
| are expected to be a string value or a `List` of string values which can be consumed by the user or tools that check |
| for that status attribute. Note that Gremlin Console is one such tool that will respond to this status attribute - it |
| will print the messages to the console as they are detected when doing remote script submissions. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2216[TINKERPOP-2216] |
| |
| ==== Graph Driver Providers |
| |
| ===== GraphBinary API Change |
| |
| In GraphBinary serialization, the Java `write()` and `writeValue()` methods of the `TypeSerializer<T>` interface now take |
| a Netty `ByteBuf` instance instead of a `ByteBufAllocator` so that the same buffer instance is reused during the write |
| of a message. Additionally, we took the opportunity to remove the unused parameter from `ResponseMessageSerializer`. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2161[TINKERPOP-2161] |
| |
| == TinkerPop 3.4.1 |
| |
| *Release Date: March 18, 2019* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.1/CHANGELOG.asciidoc#release-3-4-1[changelog] for a complete list of all the modifications that are part of this release. |
| |
| === Upgrading for Users |
| |
| ==== Mix SPARQL and Gremlin |
| |
| In the initial release of `sparql-gremlin` it was only possible to execute a SPARQL query and have it translate to |
| Gremlin. Therefore, it was only possible to write a query like this: |
| |
| [source,text] |
| ---- |
| gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }") |
| ==>[name:marko,age:29] |
| ==>[name:vadas,age:27] |
| ==>[name:josh,age:32] |
| ==>[name:peter,age:35] |
| gremlin> g.sparql("SELECT * WHERE { }") |
| ==>v[1] |
| ==>v[2] |
| ==>v[3] |
| ==>v[4] |
| ==>v[5] |
| ==>v[6] |
| ---- |
| |
| In this release, however, it is now possible to further process that result with Gremlin steps: |
| |
| [source,text] |
| ---- |
| gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }").select("name") |
| ==>marko |
| ==>vadas |
| ==>josh |
| ==>peter |
| gremlin> g.sparql("SELECT * WHERE { }").out("knows").values("name") |
| ==>vadas |
| ==>josh |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2171[TINKERPOP-2171], |
| link:https://tinkerpop.apache.org/docs/3.4.1/reference/#sparql-with-gremlin[Reference Documentation] |
| |
| === Upgrading for Providers |
| |
| ==== Graph Database Providers |
| |
| ===== GraphBinary Serializer Changes |
| |
| In GraphBinary serialization, the Java `write()` and `writeValue()` methods of the `TypeSerializer<T>` interface now take |
| a Netty `ByteBuf` instance instead of a `ByteBufAllocator` so that the same buffer instance is reused during the write |
| of a message. Additionally, we took the opportunity to remove the unused parameter from `ResponseMessageSerializer`. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2161[TINKERPOP-2161] |
| |
| == TinkerPop 3.4.0 |
| |
| *Release Date: January 2, 2019* |
| |
| Please see the link:https://github.com/apache/tinkerpop/blob/3.4.0/CHANGELOG.asciidoc#release-3-4-0[changelog] for a complete list of all the modifications that are part of this release. |
| |
| === Upgrading for Users |
| |
| ==== sparql-gremlin |
| |
| The `sparql-gremlin` module is a link:https://en.wikipedia.org/wiki/SPARQL[SPARQL] to Gremlin compiler, which allows |
| SPARQL to be executed over any TinkerPop-enabled graph system. |
| |
| [source,groovy] |
| ---- |
| graph = TinkerFactory.createModern() |
| g = graph.traversal(SparqlTraversalSource) |
| g.sparql("""SELECT ?name ?age |
| WHERE { ?person v:name ?name . ?person v:age ?age } |
| ORDER BY ASC(?age)""") |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1878[TINKERPOP-1878], |
| link:https://tinkerpop.apache.org/docs/3.4.0/reference/#sparql-gremlin[Reference Documentation] |
| |
| ==== Gremlin.NET Driver Improvements |
| |
| The Gremlin.NET driver now uses request pipelining. This allows connections to be reused for different requests in |
| parallel which should lead to better utilization of connections. The `ConnectionPool` now also has a fixed size |
| whereas it could previously create an unlimited number of connections. Each `Connection` can handle up to |
| `MaxInProcessPerConnection` requests in parallel. If this limit is reached for all connections, then a |
| `NoConnectionAvailableException` is thrown which makes this a breaking change. |
| |
| These settings can be set as properties on the `ConnectionPoolSettings` instance that can be passed to the `GremlinClient`. |
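
The limit behavior described above can be modeled in a few lines of plain Python. This is not the Gremlin.NET implementation: the class, its method names, and the "pick the least busy connection" policy are assumptions made for illustration, and only the fixed pool size, the per-connection limit, and the exception on exhaustion come from the description above.

```python
class NoConnectionAvailableError(Exception):
    """Hypothetical Python stand-in for NoConnectionAvailableException."""


class ConnectionPoolModel:
    """Fixed-size pool where each connection serves a bounded number of requests."""

    def __init__(self, pool_size, max_in_process_per_connection):
        # One in-flight request counter per connection; the pool never grows.
        self.in_process = [0] * pool_size
        self.max_in_process = max_in_process_per_connection

    def acquire(self):
        """Reserve a slot on the least busy connection (an assumed policy)."""
        index = min(range(len(self.in_process)), key=self.in_process.__getitem__)
        if self.in_process[index] >= self.max_in_process:
            # Every connection is at its limit, so the request is rejected.
            raise NoConnectionAvailableError("all connections are at their limit")
        self.in_process[index] += 1
        return index

    def release(self, index):
        """Mark one request on the given connection as finished."""
        self.in_process[index] -= 1


pool = ConnectionPoolModel(pool_size=2, max_in_process_per_connection=2)
slots = [pool.acquire() for _ in range(4)]  # fills both connections
```

A fifth `acquire()` on this pool raises until one of the earlier requests is released, which is the situation callers upgrading to 3.4.0 need to be prepared to handle.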
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1774[TINKERPOP-1774], |
| link:https://issues.apache.org/jira/browse/TINKERPOP-1775[TINKERPOP-1775], |
| link:https://tinkerpop.apache.org/docs/3.4.0/reference/#_connection_pool[Reference Documentation] |
| |
| ==== Indexing of Collections |
| |
| TinkerPop 3.4.0 adds a new `index()`-step, which allows users to transform simple collections into index collections or maps. |
| |
| ``` |
| gremlin> g.V().hasLabel("software").values("name").fold(). |
| ......1> order(local). |
| ......2> index().unfold() |
| ==>[lop,0] |
| ==>[ripple,1] |
| gremlin> g.V().hasLabel("person").values("name").fold(). |
| ......1> order(local).by(decr). |
| ......2> index(). |
| ......3> with(WithOptions.indexer, WithOptions.map) |
| ==>[0:vadas,1:peter,2:marko,3:josh] |
| ``` |
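
The two indexer modes can be mirrored in plain Python to show the shape of the results. This is only a model: `index_as_list` and `index_as_map` are hypothetical helper names, and the input names are taken from the console session above.

```python
def index_as_list(collection):
    """Model of the default indexer: pair each value with its position."""
    return [[value, position] for position, value in enumerate(collection)]


def index_as_map(collection):
    """Model of the map indexer: key each value by its position."""
    return {position: value for position, value in enumerate(collection)}


software = sorted(["lop", "ripple"])
people = sorted(["marko", "vadas", "josh", "peter"], reverse=True)

print(index_as_list(software))  # prints: [['lop', 0], ['ripple', 1]]
print(index_as_map(people))     # prints: {0: 'vadas', 1: 'peter', 2: 'marko', 3: 'josh'}
```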
| |
| ==== Modulation of valueMap() |
| |
| The `valueMap()` step now supports `by` and `with` modulation, which also led to the deprecation of `valueMap(true)` overloads. |
| |
| ===== by() Modulation |
| |
| With the help of the `by()` modulator `valueMap()` result values can now be adjusted, which is particularly useful to turn multi-/list-values into single values. |
| |
| ``` |
| gremlin> g.V().hasLabel("person").valueMap() |
| ==>[name:[marko],age:[29]] |
| ==>[name:[vadas],age:[27]] |
| ==>[name:[josh],age:[32]] |
| ==>[name:[peter],age:[35]] |
| gremlin> g.V().hasLabel("person").valueMap().by(unfold()) |
| ==>[name:marko,age:29] |
| ==>[name:vadas,age:27] |
| ==>[name:josh,age:32] |
| ==>[name:peter,age:35] |
| ``` |
| |
| ===== with() Modulation |
| |
| The `with()` modulator can be used to include certain tokens (`id`, `label`, `key` and/or `value`). |
| |
| The old way (still valid, but deprecated): |
| |
| ``` |
| gremlin> g.V().hasLabel("software").valueMap(true) |
| ==>[id:10,label:software,name:[gremlin]] |
| ==>[id:11,label:software,name:[tinkergraph]] |
| gremlin> g.V().has("person","name","marko").properties("location").valueMap(true) |
| ==>[id:6,key:location,value:san diego,startTime:1997,endTime:2001] |
| ==>[id:7,key:location,value:santa cruz,startTime:2001,endTime:2004] |
| ==>[id:8,key:location,value:brussels,startTime:2004,endTime:2005] |
| ==>[id:9,key:location,value:santa fe,startTime:2005] |
| ``` |
| |
| The new way: |
| |
| ``` |
| gremlin> g.V().hasLabel("software").valueMap().with(WithOptions.tokens) |
| ==>[id:10,label:software,name:[gremlin]] |
| ==>[id:11,label:software,name:[tinkergraph]] |
| gremlin> g.V().has("person","name","marko").properties("location").valueMap().with(WithOptions.tokens) |
| ==>[id:6,key:location,value:san diego,startTime:1997,endTime:2001] |
| ==>[id:7,key:location,value:santa cruz,startTime:2001,endTime:2004] |
| ==>[id:8,key:location,value:brussels,startTime:2004,endTime:2005] |
| ==>[id:9,key:location,value:santa fe,startTime:2005] |
| ``` |
| |
| Furthermore, there is now finer control over which of the tokens should be included: |
| |
| ``` |
| gremlin> g.V().hasLabel("software").valueMap().with(WithOptions.tokens, WithOptions.labels) |
| ==>[label:software,name:[gremlin]] |
| ==>[label:software,name:[tinkergraph]] |
| gremlin> g.V().has("person","name","marko").properties("location").valueMap().with(WithOptions.tokens, WithOptions.values) |
| ==>[value:san diego,startTime:1997,endTime:2001] |
| ==>[value:santa cruz,startTime:2001,endTime:2004] |
| ==>[value:brussels,startTime:2004,endTime:2005] |
| ==>[value:santa fe,startTime:2005] |
| ``` |
| |
| As shown above, the support of the `with()` modulator for `valueMap()` makes the `valueMap(boolean)` overload |
| superfluous, hence this overload is now deprecated. This is a breaking API change, since `valueMap()` will now always |
| yield instances of type `Map<Object, Object>`. Prior to this change, only the `valueMap(boolean)` overload yielded |
| `Map<Object, Object>` objects; `valueMap()` without the boolean parameter used to yield instances of type |
| `Map<String, Object>`. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2059[TINKERPOP-2059] |
| |
| ==== Predicate Number Comparison |
| |
| In previous versions `within()` and `without()` performed strict number comparisons; that means these predicates did |
| not only compare number values, but also the type. This was inconsistent with how other predicates (like `eq`, `gt`, |
| etc.) work. All predicates will now ignore the number type and instead compare numbers only based on their value. |
| |
| Old behavior: |
| |
| ``` |
| gremlin> g.V().has("age", eq(32L)) |
| ==>v[4] |
| gremlin> g.V().has("age", within(32L, 35L)) |
| gremlin> |
| ``` |
| |
| New behavior: |
| |
| ``` |
| gremlin> g.V().has("age", eq(32L)) |
| ==>v[4] |
| gremlin> g.V().has("age", within(32L, 35L)) |
| ==>v[4] |
| ==>v[6] |
| ``` |
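
The difference between type-strict and value-based matching can be sketched in plain Java, independent of TinkerPop. The class and method names below are hypothetical and this is not how TinkerPop implements its predicates; it only illustrates why an `Integer` age of 32 would not match `within(32L, 35L)` under the old semantics.

```java
import java.math.BigDecimal;
import java.util.Arrays;

// Hypothetical illustration only; not TinkerPop's predicate implementation.
final class NumberComparisonSketch {

    // Type-strict membership: relies on equals(), so Integer 32 never equals Long 32L.
    static boolean strictWithin(Object value, Object... candidates) {
        return Arrays.asList(candidates).contains(value);
    }

    // Value-based membership: compares numeric values regardless of their Java type.
    // Assumes finite values (NaN and infinities cannot be converted to BigDecimal).
    static boolean valueWithin(Number value, Number... candidates) {
        BigDecimal v = new BigDecimal(value.toString());
        for (Number candidate : candidates) {
            if (v.compareTo(new BigDecimal(candidate.toString())) == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(strictWithin(32, 32L, 35L)); // false
        System.out.println(valueWithin(32, 32L, 35L));  // true
    }
}
```

Comparing through `BigDecimal.compareTo` is one simple way to ignore the numeric type; it is chosen here for brevity rather than performance.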
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2058[TINKERPOP-2058] |
| |
| ==== ReferenceElementStrategy |
| |
| Gremlin Server has had some inconsistent behavior in the serialization of the results it returns. Remote traversals |
| based on Gremlin bytecode always detach returned graph elements to "reference" (i.e. remove properties and only |
| include the `id` and `label`), but scripts would detach graph elements and include the properties. For 3.4.0, |
| TinkerPop introduces the `ReferenceElementStrategy` which can be configured on a `GraphTraversalSource` to always |
| detach to "reference". |
| |
| [source,text] |
| ---- |
| gremlin> graph = TinkerFactory.createModern() |
| ==>tinkergraph[vertices:6 edges:6] |
| gremlin> g = graph.traversal().withStrategies(ReferenceElementStrategy.instance()) |
| ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard] |
| gremlin> v = g.V().has('person','name','marko').next() |
| ==>v[1] |
| gremlin> v.class |
| ==>class org.apache.tinkerpop.gremlin.structure.util.reference.ReferenceVertex |
| gremlin> v.properties() |
| gremlin> |
| ---- |
| |
| The packaged initialization scripts that come with Gremlin Server now pre-configure the sample graphs with this |
| strategy to ensure that both scripts and bytecode based requests over any protocol (HTTP, websocket, etc) and |
| serialization format all return a "reference". To revert to the old form, simply remove the strategy in the |
| initialization script. |
| |
| It is recommended that users choose to configure their `GraphTraversalSource` instances with `ReferenceElementStrategy` |
| as working with "references" only is the recommended method for developing applications with TinkerPop. In the future, |
| it is possible that `ReferenceElementStrategy` will be configured by default for all graphs on or off Gremlin Server, |
| so it would be best to start utilizing it now and grooming existing Gremlin and related application code to account |
| for it. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2075[TINKERPOP-2075] |
| |
| ==== Text Predicates |
| |
| Gremlin now supports simple text predicates on top of the existing `P` predicates. Both the new `TextP` text |
| predicates and the old `P` predicates can be chained using `and()` and `or()`. |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().has("person","name", containing("o")).valueMap() |
| ==>[name:[marko],age:[29]] |
| ==>[name:[josh],age:[32]] |
| gremlin> g.V().has("person","name", containing("o").and(gte("j").and(endingWith("ko")))).valueMap() |
| ==>[name:[marko],age:[29]] |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2041[TINKERPOP-2041] |
| |
| ==== Changed Infix Behavior |
| |
| The infix notation of `and()` and `or()` now supports an arbitrary number of traversals and `ConnectiveStrategy` |
| produces a traversal with proper AND and OR semantics. |
| |
| ``` |
| Input: a.or.b.and.c.or.d.and.e.or.f.and.g.and.h.or.i |
| |
| *BEFORE* |
| Output: or(a, or(and(b, c), or(and(d, e), or(and(and(f, g), h), i)))) |
| |
| *NOW* |
| Output: or(a, and(b, c), and(d, e), and(f, g, h), i) |
| ``` |
| |
| Furthermore, previous versions failed to apply three or more `and()` steps using the infix notation; this is now fixed. |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().has("name","marko").and().has("age", lt(30)).or().has("name","josh").and().has("age", gt(30)).and().out("created") |
| ==>v[1] |
| ==>v[4] |
| ---- |
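
The grouping rule shown above (`and` binds tighter than `or`, and each connective collects all of its operands into one flat term) can be sketched on plain strings. This is a hypothetical illustration with made-up names; `ConnectiveStrategy` itself works on traversal steps, not on dotted strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; ConnectiveStrategy operates on traversal steps, not strings.
final class InfixGroupingSketch {

    // Groups a dotted infix expression so that "and" binds tighter than "or",
    // producing flat n-ary and(...) / or(...) terms.
    static String group(String infix) {
        List<String> orTerms = new ArrayList<>();
        for (String orPart : infix.split("\\.or\\.")) {
            String[] andParts = orPart.split("\\.and\\.");
            orTerms.add(andParts.length == 1
                    ? andParts[0]
                    : "and(" + String.join(", ", andParts) + ")");
        }
        return orTerms.size() == 1
                ? orTerms.get(0)
                : "or(" + String.join(", ", orTerms) + ")";
    }

    public static void main(String[] args) {
        System.out.println(group("a.or.b.and.c.or.d.and.e.or.f.and.g.and.h.or.i"));
        // prints: or(a, and(b, c), and(d, e), and(f, g, h), i)
    }
}
```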
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2029[TINKERPOP-2029] |
| |
| ==== GraphBinary |
| |
| GraphBinary is a new language-agnostic network serialization format designed to replace Gryo and GraphSON. At this |
| time it is only available on the JVM, but support will be added for other languages in upcoming releases. The |
| serializer has been configured in Gremlin Server's packaged configuration files. The serializer can be configured |
| using the Java driver as follows: |
| |
| [source,java] |
| ---- |
| Cluster cluster = Cluster.build("localhost").port(8182). |
| serializer(Serializers.GRAPHBINARY_V1D0).create(); |
| Client client = cluster.connect(); |
| List<Result> r = client.submit("g.V().has('person','name','marko')").all().join(); |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1942[TINKERPOP-1942], |
| link:https://tinkerpop.apache.org/docs/3.4.0/dev/io/#graphbinary[IO Documentation] |
| |
| ==== Status Attributes |
| |
| The Gremlin Server protocol allows for status attributes to be returned in responses. These attributes were typically |
| for internal use, but were designed with extensibility in mind so that providers could return their own |
| attributes to calling clients. Unfortunately, unless the client was being used with protocol level requests (which |
| wasn't convenient) those attributes were essentially hidden from view. As of this version however, status attributes |
| are fully retrievable for both successful requests and exceptions. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1913[TINKERPOP-1913] |
| |
| ==== with() Step |
| |
| This version of TinkerPop introduces the `with()`-step to Gremlin. It isn't really a step but is instead a step |
| modulator. This modulator allows the step it is modifying to accept configurations that can be used to alter the |
| behavior of the step itself. A good example of its usage is shown with the revised syntax of the `pageRank()`-step |
| which now uses `with()` to replace the old `by()` options: |
| |
| [source,groovy] |
| ---- |
| g.V().hasLabel('person'). |
| pageRank(). |
| with(PageRank.edges, __.outE('knows')). |
| with(PageRank.propertyName, 'friendRank'). |
| order(). |
| by('friendRank',desc). |
| valueMap('name','friendRank') |
| ---- |
| |
| A similar change was made for `peerPressure()`-step: |
| |
| [source,groovy] |
| ---- |
| g.V().hasLabel('person'). |
| peerPressure(). |
| with(PeerPressure.propertyName, 'cluster'). |
| group(). |
| by('cluster'). |
| by('name') |
| ---- |
| |
| Note that the `by()` modulators still work, but should be considered deprecated and open for removal in a future |
| release where breaking changes are allowed. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1975[TINKERPOP-1975], |
| link:https://tinkerpop.apache.org/docs/3.4.0/reference/#with-step[Reference Documentation] |
| |
| ==== shortestPath() Step |
| |
| Calculating the link:https://en.wikipedia.org/wiki/Shortest_path_problem[shortest path] between vertices is a common |
| graph use case. While the traversal to determine a shortest path can be expressed in Gremlin, this particular problem |
| is common enough that the feature has been encapsulated into its own step, demonstrated as follows: |
| |
| [source,text] |
| ---- |
| gremlin> g.withComputer().V().has('name','marko'). |
| ......1> shortestPath().with(ShortestPath.target, has('name','peter')) |
| ==>[v[1],v[3],v[6]] |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1990[TINKERPOP-1990], |
| link:https://tinkerpop.apache.org/docs/3.4.0/reference/#shortestpath-step[Reference Documentation] |
| |
| ==== connectedComponent() Step |
| |
| In prior versions of TinkerPop, it was recommended that the identification of |
| link:https://en.wikipedia.org/wiki/Connected_component_(graph_theory)[Connected Component] instances in a graph be |
| computed by way of a reasonably complex bit of Gremlin that looked something like this: |
| |
| [source,groovy] |
| ---- |
| g.V().emit(cyclicPath().or().not(both())).repeat(both()).until(cyclicPath()). |
| path().aggregate("p"). |
| unfold().dedup(). |
| map(__.as("v").select("p").unfold(). |
| filter(unfold().where(eq("v"))). |
| unfold().dedup().order().by(id).fold()). |
| dedup() |
| ---- |
| |
| The above approach had a number of drawbacks that included a large execution cost as well as incompatibilities in OLAP. |
| To simplify usage of this commonly used graph algorithm, TinkerPop 3.4.0 introduces the `connectedComponent()` step |
| which reduces the above operation to: |
| |
| [source,groovy] |
| ---- |
| g.withComputer().V().connectedComponent() |
| ---- |
| |
| It is important to note that this step does require the use of a `GraphComputer` to work, as it utilizes a |
| `VertexProgram` behind the scenes. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1967[TINKERPOP-1967], |
| link:https://tinkerpop.apache.org/docs/3.4.0/reference/#connectedcomponent-step[Reference Documentation] |
| |
| ==== io() Step |
| |
| There have been some important changes to IO operations for reading and writing graph data. The use of `Graph.io()` |
| has been deprecated to further remove dependence on the Graph (Structure) API for users and to extend these basic |
| operations to GLV users by making these features available as part of the Gremlin language. |
| |
| It is now possible to simply use Gremlin: |
| |
| [source,groovy] |
| ---- |
| graph = ... |
| g = graph.traversal() |
| g.io(someInputFile).read().iterate() |
| g.io(someOutputFile).write().iterate() |
| ---- |
| |
| While `io()`-step is still single-threaded for OLTP style loading, it can be utilized in conjunction with OLAP which |
| internally uses `CloneVertexProgram` and therefore any graph `InputFormat` or `OutputFormat` can be configured in |
| conjunction with this step for parallel loads of large datasets. |
| |
| It is also worth noting that the `io()`-step may be overridden by graph providers to utilize their native bulk-loading |
| features, so consult the documentation of the implementation being used to determine if there are any improved |
| efficiencies there. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1996[TINKERPOP-1996], |
| link:https://tinkerpop.apache.org/docs/3.4.0/reference/#io-step[Reference Documentation] |
| |
| ==== Per Request Options |
| |
| The Java driver now allows for various options to be set on a per-request basis via new overloads to `submit()` that |
| accept `RequestOptions` instances. A good use-case for this feature is to set a per-request override to the |
| `scriptEvaluationTimeout` so that it only applies to the current request. |
| |
| [source,java] |
| ---- |
| Cluster cluster = Cluster.open(); |
| Client client = cluster.connect(); |
| RequestOptions options = RequestOptions.build().timeout(500).create(); |
| List<Result> result = client.submit("g.V()", options).all().get(); |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1342[TINKERPOP-1342] |
| |
| ==== min() max() and Comparable |
| |
| Previously, `min()` and `max()` only worked for numeric values. This has been changed and these steps can now |
| operate over any `Comparable` value. The common workaround was the combination of `order().by()` and `limit()` as |
| shown here: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().values('name').order().by().limit(1) // workaround for min() |
| ==>josh |
| gremlin> g.V().values('name').order().by(decr).limit(1) // workaround for max() |
| ==>vadas |
| ---- |
| |
| Any attempt to use `min()` or `max()` on non-numeric values led to an exception: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().values('name').min() |
| java.lang.String cannot be cast to java.lang.Number |
| Type ':help' or ':h' for help. |
| Display stack trace? [yN] |
| ---- |
| |
| With the changes in this release, these kinds of queries are now much easier: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().values('name').min() |
| ==>josh |
| gremlin> g.V().values('name').max() |
| ==>vadas |
| ---- |
| |
| ==== Nested Loop Support |
| |
| Traversals now support nesting of `repeat()` loops. |
| |
| These can now be used to repeat another traversal while in a looped context, either inside the body of a `repeat()` or |
| in its step modifiers (`until()` or `emit()`). |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().repeat(__.in('traverses').repeat(__.in('develops')).emit()).emit().values('name') |
| ==>stephen |
| ==>matthias |
| ==>marko |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-967[TINKERPOP-967] |
| |
| ==== EventStrategy API |
| |
| There were some minor modifications to how `EventStrategy` is constructed and what can be expected from events raised |
| from the addition of new properties. |
| |
| With respect to the change in terms of `EventStrategy` construction, the `detach()` builder method formerly took a |
| `Class` as an argument and that `Class` was meant to be one of the various "detachment factories" or `null`. That |
| approach was a bit confusing, so that signature has changed to `detach(EventStrategy.Detachment)` where the argument |
| is a more handy enum of detachment options. |
| |
| As for the changes related to events themselves, it is first worth noting that the previously deprecated |
| `vertexPropertyChanged(Vertex, Property, Object, Object...)` on `MutationListener` has been removed for what should |
| have originally been the correct signature of `vertexPropertyChanged(Vertex, VertexProperty, Object, Object...)`. In |
| prior versions when this method and its related `edgePropertyChanged()` and `vertexPropertyPropertyChanged()` were |
| triggered by way of the addition of a new property a "fake" property was included with a `null` value for the |
| "oldValue" argument to these methods (as it did not exist prior to this event). That was a bit awkward to reason about |
| when dealing with that event. To make this easier, the event is now raised with a `KeyedVertexProperty` or |
| `KeyedProperty` instance, which contains only a property key and no value. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1831[TINKERPOP-1831] |
| |
| ==== Reducing Barrier Steps |
| |
| The behavior of `min()`, `max()`, `mean()` and `sum()` has been modified to return no result if there's no input. |
| Previously these steps yielded the internal seed value: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().values('foo').min() |
| ==>NaN |
| gremlin> g.V().values('foo').max() |
| ==>NaN |
| gremlin> g.V().values('foo').mean() |
| ==>NaN |
| gremlin> g.V().values('foo').sum() |
| ==>0 |
| ---- |
| |
| These traversals will no longer emit a result. Note that this also affects more complex scenarios, e.g. if these |
| steps are used in `by()` modulators: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().group(). |
| ......1> by(label). |
| ......2> by(outE().values("weight").sum()) |
| ==>[software:0,person:3.5] |
| ---- |
| |
| Since software vertices have no outgoing edges and thus no weight values to sum, `software` will no longer show up in |
| the result. In order to get the same result as before, one would have to add a `coalesce()`-step: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V().group(). |
| ......1> by(label). |
| ......2> by(outE().values("weight").sum()) |
| ==>[person:3.5] |
| gremlin> g.V().group(). |
| ......1> by(label). |
| ......2> by(coalesce(outE().values("weight"), constant(0)).sum()) |
| ==>[software:0,person:3.5] |
| ---- |
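
The change from a seeded to an unseeded reduction can be sketched with Java streams. The names below are hypothetical and the code does not reflect TinkerPop's internal step implementation; it only shows why empty input used to produce the seed and now produces nothing, and how supplying an explicit default restores the old result (roughly what the `coalesce()` workaround achieves).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not TinkerPop's reducing barrier implementation.
final class ReducingBarrierSketch {

    // Old-style reduction: starts from a seed, so empty input still yields the seed (0).
    static double seededSum(List<Double> values) {
        return values.stream().reduce(0.0, Double::sum);
    }

    // New-style reduction: no seed, so empty input yields no result at all.
    static Optional<Double> sum(List<Double> values) {
        return values.stream().reduce(Double::sum);
    }

    public static void main(String[] args) {
        List<Double> none = Collections.emptyList();
        System.out.println(seededSum(none));                     // 0.0
        System.out.println(sum(none).isPresent());               // false
        // Supplying the default explicitly restores the old result.
        System.out.println(sum(none).orElse(0.0));               // 0.0
        System.out.println(sum(Arrays.asList(1.0, 2.5)).get());  // 3.5
    }
}
```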
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1777[TINKERPOP-1777] |
| |
| ==== Order of select() Scopes |
| |
| The order of select scopes has been changed to: maps, side-effects, paths. Previously the order was: side-effects, |
| maps, paths - which made it almost impossible to select a specific map entry if a side-effect with the same name |
| existed. |
| |
| The following snippets illustrate the changed behavior: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V(1). |
| ......1> group("a"). |
| ......2> by(__.constant("a")). |
| ......3> by(__.values("name")). |
| ......4> select("a") |
| ==>[a:marko] |
| gremlin> g.V(1). |
| ......1> group("a"). |
| ......2> by(__.constant("a")). |
| ......3> by(__.values("name")). |
| ......4> select("a").select("a") |
| ==>[a:marko] |
| ---- |
| |
| Above is the old behavior: the second `select("a")` has no effect because it selects the side-effect `a` again, |
| although one would expect to get the map entry `a`. What follows is the new behavior: |
| |
| [source,groovy] |
| ---- |
| gremlin> g.V(1). |
| ......1> group("a"). |
| ......2> by(__.constant("a")). |
| ......3> by(__.values("name")). |
| ......4> select("a") |
| ==>[a:marko] |
| gremlin> g.V(1). |
| ......1> group("a"). |
| ......2> by(__.constant("a")). |
| ......3> by(__.values("name")). |
| ......4> select("a").select("a") |
| ==>marko |
| ---- |
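
The effect of the lookup order can be sketched as a search through an ordered list of scopes, where the first scope containing the key wins. This is a simplified model with hypothetical names, not the actual `select()` implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; a simplified model of scope resolution.
final class SelectScopeSketch {

    // Resolves a key against the scopes in the given order; the first scope containing it wins.
    static Object select(String key, List<Map<String, Object>> scopesInOrder) {
        for (Map<String, Object> scope : scopesInOrder) {
            if (scope.containsKey(key)) {
                return scope.get(key);
            }
        }
        return null; // key not found in any scope
    }

    public static void main(String[] args) {
        // State after the first select("a"): the current traverser is the map [a:marko],
        // and a side-effect named "a" holds that same map.
        Map<String, Object> currentMap = Collections.singletonMap("a", "marko");
        Map<String, Object> sideEffects = Collections.singletonMap("a", currentMap);
        Map<String, Object> path = Collections.emptyMap();

        // Old order: side-effects, maps, paths
        System.out.println(select("a", Arrays.asList(sideEffects, currentMap, path))); // {a=marko}
        // New order: maps, side-effects, paths
        System.out.println(select("a", Arrays.asList(currentMap, sideEffects, path))); // marko
    }
}
```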
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1522[TINKERPOP-1522] |
| |
| ==== GraphSON BulkSet |
| |
| In earlier versions of TinkerPop, `BulkSet` was coerced to a `List` for GraphSON which was convenient in that it |
| didn't add a new data type to support, but inconvenient in that it meant that certain process tests were not consistent |
| in terms of how they ran and the benefits of the `BulkSet` were "lost" in that the "bulk" was being resolved server |
| side. With the addition of `BulkSet` as a GraphSON type the "bulk" is now resolved on the client side by the language |
| variant. How that resolution occurs depends upon the language variant. For Java, there is a `BulkSet` object which |
| maintains that structure sent from the server. For the other variants, the `BulkSet` is deserialized to a `List` form |
| which results in a much larger memory footprint than what is contained in the `BulkSet`. |
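
To illustrate why the `List` form is larger, the following sketch expands a bulked representation (each distinct object stored once with a count) into a flat list. The class and method names are hypothetical and a plain `Map` stands in for `BulkSet`; the real deserializers in each language variant may work differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; a Map of counts stands in for BulkSet.
final class BulkSketch {

    // Expands a bulked representation (object -> count) into a flat list with repeats.
    static <T> List<T> expand(Map<T, Long> bulked) {
        List<T> flat = new ArrayList<>();
        for (Map.Entry<T, Long> entry : bulked.entrySet()) {
            for (long i = 0; i < entry.getValue(); i++) {
                flat.add(entry.getKey());
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        Map<String, Long> bulked = new LinkedHashMap<>();
        bulked.put("lop", 3L);
        bulked.put("ripple", 1L);
        System.out.println(bulked.size());   // 2 entries held in bulked form
        System.out.println(expand(bulked));  // [lop, lop, lop, ripple]
    }
}
```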
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2111[TINKERPOP-2111] |
| |
| ==== Python Bindings |
| |
| Bindings were formerly created using a Python 2-tuple as a bit of syntactic sugar, but all other language variants |
| used an explicit `Bindings` object which `gremlin-python` already had in place. To make all variants behave |
| consistently, the 2-tuple syntax has been removed in favor of the explicit `Bindings.of()` option. |
| |
| [source,python] |
| ---- |
| g.V(Bindings.of('id',1)).out('created').map(lambda: ("it.get().value('name').length()", "gremlin-groovy")).sum() |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2116[TINKERPOP-2116] |
| |
| ==== Deprecation and Removal |
| |
| This section describes newly deprecated classes, methods, components and patterns of usage as well as which previously |
| deprecated features have been officially removed or repurposed. |
| |
| ===== Moving of RemoteGraph |
| |
| `RemoteGraph` was long ago deprecated in favor of `withRemote()`. It became even less useful with the introduction of |
| the `AnonymousTraversalSource` concept in 3.3.5. Its only real use was for testing remote bytecode based traversals |
| in the test suite as the test suite requires an actual `Graph` object to function properly. As such, `RemoteGraph` has |
| been moved to `gremlin-test`. It should no longer be used in any capacity besides that. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2079[TINKERPOP-2079] |
| |
| ===== Removal of Giraph Support |
| |
| Support for Giraph has been removed as of this version. There were a number of reasons for this decision which were |
| discussed in the community prior to taking this step. Users should switch to Spark for their OLAP based graph-computing |
| needs. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1930[TINKERPOP-1930] |
| |
| ===== Removal of Rebindings Options |
| |
| The "rebindings" option is no longer supported for clients. It was deprecated long ago in 3.1.0. The server will not |
| respond to it on any channel - websockets, nio or HTTP. Use the "aliases" option instead. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1705[TINKERPOP-1705] |
| |
| ===== gremlin-server.sh -i Removal |
| |
| The `-i` option for installing dependencies in Gremlin Server was long ago deprecated and has now been removed. Please |
| use `install` as its replacement going forward. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2031[TINKERPOP-2031] |
| |
| ===== Deprecation Removal |
| |
| The following deprecated classes, methods or fields have been removed in this version: |
| |
| * `gremlin-core` |
| ** `org.apache.tinkerpop.gremlin.jsr223.ImportCustomizer#GREMLIN_CORE` |
| ** `org.apache.tinkerpop.gremlin.process.remote.RemoteGraph` - moved to `gremlin-test` |
| ** `org.apache.tinkerpop.gremlin.process.remote.RemoteConnection.submit(Traversal)` |
| ** `org.apache.tinkerpop.gremlin.process.remote.RemoteConnection.submit(Bytecode)` |
| ** `org.apache.tinkerpop.gremlin.process.remote.traversal.strategy.decoration.RemoteStrategy#identity()` |
| ** `org.apache.tinkerpop.gremlin.process.traversal.TraversalEngine` |
| ** `org.apache.tinkerpop.gremlin.process.traversal.engine.*` |
| ** `org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.PartitionStrategy.Builder#addReadPartition(String)` |
| ** `org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.SubgraphStrategy.Builder#edgeCriterion(Traversal)` |
| ** `org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.SubgraphStrategy.Builder#vertexCriterion(Traversal)` |
| ** `org.apache.tinkerpop.gremlin.process.traversal.step.map.LambdaCollectingBarrierStep.Consumers` |
| ** `org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer#makeHasContainers(String, P)` |
| ** `org.apache.tinkerpop.gremlin.process.traversal.step.util.event.MutationListener#vertexPropertyChanged(Vertex, Property, Object, Object...)` |
| ** `org.apache.tinkerpop.gremlin.structure.Element.Exceptions#elementAlreadyRemoved(Class, Object)` |
| ** `org.apache.tinkerpop.gremlin.structure.Graph.Exceptions#elementNotFound(Class, Object)` |
| ** `org.apache.tinkerpop.gremlin.structure.Graph.Exceptions#elementNotFound(Class, Object, Exception)` |
| * `gremlin-driver` |
| ** `org.apache.tinkerpop.gremlin.driver.Client#rebind(String)` |
| ** `org.apache.tinkerpop.gremlin.driver.Client.ReboundClusteredClient` |
| ** `org.apache.tinkerpop.gremlin.driver.Tokens#ARGS_REBINDINGS` |
| * `gremlin-groovy` |
| ** `org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.close()` - no longer implements `AutoCloseable` |
| * `gremlin-server` |
| ** `org.apache.tinkerpop.gremlin.server.GraphManager#getGraphs()` |
| ** `org.apache.tinkerpop.gremlin.server.GraphManager#getTraversalSources()` |
| ** `org.apache.tinkerpop.gremlin.server.Settings#serializedResponseTimeout` |
| ** `org.apache.tinkerpop.gremlin.server.Settings.AuthenticationSettings#className` |
| ** `org.apache.tinkerpop.gremlin.server.handler.OpSelectorHandler(Settings, GraphManager, GremlinExecutor, ScheduledExecutorService)` |
| ** `org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor#makeFrame(ChannelHandlerContext, RequestMessage, MessageSerializer serializer, boolean, List, ResponseStatusCode code)` |
| * `hadoop-graph` |
| ** `org.apache.tinkerpop.gremlin.hadoop.structure.HadoopConfiguration#getGraphInputFormat()` |
| ** `org.apache.tinkerpop.gremlin.hadoop.structure.HadoopConfiguration#getGraphOutputFormat()` |
| |
| Please see the javadoc deprecation notes or upgrade documentation specific to when the deprecation took place to |
| understand how to resolve this breaking change. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1143[TINKERPOP-1143], |
| link:https://issues.apache.org/jira/browse/TINKERPOP-1296[TINKERPOP-1296], |
| link:https://issues.apache.org/jira/browse/TINKERPOP-1705[TINKERPOP-1705], |
| link:https://issues.apache.org/jira/browse/TINKERPOP-1707[TINKERPOP-1707], |
| link:https://issues.apache.org/jira/browse/TINKERPOP-1954[TINKERPOP-1954], |
| link:https://issues.apache.org/jira/browse/TINKERPOP-1986[TINKERPOP-1986], |
| link:https://issues.apache.org/jira/browse/TINKERPOP-2079[TINKERPOP-2079], |
| link:https://issues.apache.org/jira/browse/TINKERPOP-2103[TINKERPOP-2103] |
| |
| ===== Deprecated GraphSONMessageSerializerGremlinV2d0 |
| |
| The `GraphSONMessageSerializerGremlinV2d0` serializer is now analogous to `GraphSONMessageSerializerV2d0` and therefore |
| redundant. It has technically always been equivalent in terms of functionality as both serialized to the same format |
| (i.e. GraphSON 2.0 with embedded types). It is no longer clear why these two classes were established this way, but |
| it does carry the negative effect where multiple serializer versions could not be bound to Gremlin Server's HTTP |
| endpoint as the MIME types conflicted on `application/json`. By simply making both message serializers support |
| `application/json` and `application/vnd.gremlin-v2.0+json`, it then became possible to overcome that limitation. In |
| short, prefer use of `GraphSONMessageSerializerV2d0` when possible. |
| |
| Note that this is a breaking change in the sense that `GraphSONMessageSerializerV2d0` will no longer set the header of |
| request messages to `application/json`. As a result, older versions of Gremlin Server not configured with |
| `GraphSONMessageSerializerGremlinV2d0` will not find a deserializer to match the request. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1984[TINKERPOP-1984] |
| |
| ===== Removed groovy-sql Dependency |
| |
| Gremlin Console and Gremlin Server no longer include groovy-sql. If you depend on groovy-sql, |
| you can install it in Gremlin Console or Gremlin Server using the plugin system. |
| |
| Console: |
| ``` |
| :install org.codehaus.groovy groovy-sql 2.5.2 |
| ``` |
| |
| Server: |
| ``` |
| bin/gremlin-server.sh install org.codehaus.groovy groovy-sql 2.5.2 |
| ``` |
| |
| If your project depended on groovy-sql transitively, simply include it in your project's build file (e.g. maven: pom.xml). |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2037[TINKERPOP-2037] |
| |
| === Upgrading for Providers |
| |
| ==== Graph Database Providers |
| |
| ===== io() Step |
| |
| The new `io()`-step that was introduced provides some new changes to consider. Note that `Graph.io()` has been |
| deprecated and users are no longer instructed to utilize that method. It is not yet decided when that method will be |
| removed completely, but given the public nature of it and the high chance of common usage, it should be hanging around |
| for some time. |
| |
| As with any step in Gremlin, it is possible to replace it with a more provider specific implementation that could be |
| more efficient. Developing a `TraversalStrategy` to do this is encouraged, especially for those graph providers who |
| might have special bulk loaders that could be abstracted by this step. Examples of this are already shown with |
| `HadoopGraph` which replaces the simple single-threaded loader with `CloneVertexProgram`. Graph providers are |
| encouraged to use the `with()` step to capture any necessary configurations required for their underlying loader to |
| work. Graph providers should not feel restricted to `graphson`, `gryo` and `graphml` formats either. If a graph |
| supports CSV or some custom graph specific format, it shouldn't be difficult to gather the configurations necessary to |
| make that available to users. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1996[TINKERPOP-1996] |
| |
| ===== Caching Graph Features |
| |
| For graph implementations that have expensive creation times, it can be time consuming to run the TinkerPop test suite |
| as each test run requires a `Graph` instance even if the test is ultimately ignored because it doesn't pass the feature |
| checks. To possibly help alleviate this problem, the `GraphProvider` interface now includes this method: |
| |
| [source,java] |
| ---- |
| public default Optional<Graph.Features> getStaticFeatures() { |
| return Optional.empty(); |
| } |
| ---- |
| |
| This method can be implemented to return a cacheable set of features for a `Graph` generated from that `GraphProvider`. |
| Assuming this method is faster than the cost of creating a new `Graph` instance, the test suite should execute |
| significantly faster depending on how many tests end up being ignored. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1518[TINKERPOP-1518] |
| |
| ===== Configuring Interface |
| |
| There were some changes to interfaces that were related to `Step`. A new `Configuring` interface was added that was |
| helpful in the implementation of the `with()`-step modulator. This new interface extends the `Parameterizing` interface |
| (which moved to the `org.apache.tinkerpop.gremlin.process.traversal.step` package with the other step interfaces) and |
| in turn is extended by the `Mutating` interface. Making this change meant that the `Mutating.addPropertyMutations()` |
| method could be removed in favor of the new `Configuring.configure()` method. |
| |
| In short, if the `Mutating` interface was being used in prior versions, the |
| `addPropertyMutations()` method simply needs to be changed to `configure()`. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1975[TINKERPOP-1975] |
| |
| ===== OptionsStrategy |
| |
| `OptionsStrategy` is a `TraversalStrategy` that makes it possible for users to set arbitrary configurations on a |
| `Traversal`. These configurations can be used by graph providers to allow for traversal-level configurations to be |
| accessible to their custom steps. A user would write something like: |
| |
| [source,java] |
| ---- |
| g.withStrategies(OptionsStrategy.build().with("specialLimit", 10000).create()).V(); |
| ---- |
| |
| The `OptionsStrategy` is really only the carrier for the configurations and while users can choose to utilize that |
| more verbose method for constructing it shown above, it is more elegantly constructed as follows using `with()`-step: |
| |
| [source,java] |
| ---- |
| g.with("specialLimit", 10000).V(); |
| ---- |
| |
| The graph provider could then access that value of "specialLimit" in their custom step (or elsewhere) as follows: |
| |
| [source,java] |
| ---- |
| OptionsStrategy strategy = this.getTraversal().asAdmin().getStrategies().getStrategy(OptionsStrategy.class).get(); |
| int specialLimit = (int) strategy.getOptions().get("specialLimit"); |
| ---- |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2053[TINKERPOP-2053] |
| |
| ===== Removed hadoop-gremlin Test Artifact |
| |
| The `hadoop-gremlin` module no longer generates a test jar that can be used as a test dependency in other modules. |
| Generally speaking, that approach tends to be a bad practice and can cause build problems with Maven that aren't always |
| obvious to troubleshoot. With the removal of `giraph-gremlin` for 3.4.0, it seemed even less useful to have this |
| test artifact present. All tests are still present. The following provides a basic summary of how this refactoring |
| occurred: |
| |
| * A new `AbstractFileGraphProvider` was created in `gremlin-test` which provided a lot of the features that |
| `HadoopGraphProvider` was exposing. Both `HadoopGraphProvider` and `SparkHadoopGraphProvider` extend from that class |
| now. |
| * `ToyIoRegistry` and related classes were moved to `gremlin-test`. |
| * The various tests that validated capabilities of `Storage` have been moved to `spark-gremlin` and are part of those |
| tests now. Obviously, that makes those tests specific to Spark testing now. If that location creates a problem for some |
| reason, that decision can be revisited at some point. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1410[TINKERPOP-1410] |
| |
| ===== TraversalEngine Moved |
| |
| The `TraversalEngine` interface was deprecated in 3.2.0 along with all related methods that used it and classes that |
| implemented it. It was replaced by the `Computer` interface, which provided a much nicer way to plug different |
| implementations of `Computer` into a traversal. `TraversalEngine` was never wholly removed however as it had some deep |
| dependencies in the inner workings of the test suite. That infrastructure has largely remained as is until now. |
| |
| As of 3.4.0, `TraversalEngine` is no longer in `gremlin-core` and can instead be found in `gremlin-test`, as it is |
| effectively a "test-only" component and serves no other real function. As explained in the javadocs going back to |
| 3.2.0, providers should implement the `Computer` interface and use that instead. Graph providers should have moved to |
| the `Computer` infrastructure long ago, as the methods for constructing a `TraversalSource` with a `TraversalEngine` |
| have already been removed. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1143[TINKERPOP-1143] |
| |
| ===== Upsert Graph Feature |
| |
| Some `Graph` implementations may be able to offer upsert functionality for vertices and edges, which can help improve |
| usability and performance. To help make it clear to users that a graph operates in this fashion, the `supportsUpsert()` |
| feature has been added to both `Graph.VertexFeatures` and `Graph.EdgeFeatures`. By default, both of these methods will |
| return `false`. |
| |
| Should a provider wish to support this feature, the behavior of `addV()` and/or `addE()` should change such that when |
| a vertex or edge with the same identifier is provided, the respective step will insert the new element if that value |
| is not present or update an existing element if it is found. The method by which the provider "identifies" an element |
| is completely up to the capabilities of that provider. In the simplest case, a graph could check the |
| value of the supplied `T.id`; however, graphs that support some form of schema will likely have other methods for |
| determining whether or not an existing element is present. |
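| |
The insert-or-update behavior described above can be sketched in plain Java. This is only an illustration under stated assumptions: `UpsertSketch` and its methods are hypothetical stand-ins rather than TinkerPop classes, and the sketch assumes the simplest identification scheme, where the supplied id alone decides whether an element already exists.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory store illustrating upsert semantics; not a TinkerPop class. */
class UpsertSketch {
    // Vertex properties keyed by the supplied identifier (assumed identification scheme).
    private final Map<Object, Map<String, Object>> vertices = new LinkedHashMap<>();

    /**
     * Inserts a vertex when the id is unknown, otherwise merges the supplied
     * properties into the existing vertex. Returns true if a new vertex was created.
     */
    public boolean addVertex(Object id, Map<String, Object> properties) {
        Map<String, Object> existing = vertices.get(id);
        if (existing == null) {
            vertices.put(id, new LinkedHashMap<>(properties));
            return true;
        }
        existing.putAll(properties);
        return false;
    }

    /** Returns the stored properties for the id, or null if no such vertex exists. */
    public Map<String, Object> properties(Object id) {
        return vertices.get(id);
    }

    /** Returns the number of distinct vertices stored. */
    public int size() {
        return vertices.size();
    }

    public static void main(String[] args) {
        UpsertSketch graph = new UpsertSketch();
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("name", "marko");
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("age", 29);

        System.out.println(graph.addVertex(1, first));  // true: id 1 was absent, so it is inserted
        System.out.println(graph.addVertex(1, second)); // false: id 1 exists, so it is updated
        System.out.println(graph.size());               // 1
    }
}
```

A provider with a schema would likely replace the id lookup with whatever key or constraint identifies an element in that system.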
| |
| The extent to which TinkerPop tests "upsert" is fairly narrow. Graph providers that choose to support this feature |
| should consider their own test suites carefully to ensure appropriate coverage. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-1685[TINKERPOP-1685] |
| |
| ===== TypeTranslator Changes |
| |
| The `TypeTranslator` experienced a change in its API and `GroovyTranslator` a change in expectations. |
| |
| `TypeTranslator` now implements `BiFunction` and takes the graph traversal source name as an argument along with the |
| object to translate. This interface is implemented by default for Groovy with `GroovyTranslator.DefaultTypeTranslator` |
| which encapsulates all the functionality of what `GroovyTranslator` formerly did by default. To provide customized |
| translation, simply extend the `DefaultTypeTranslator` and override the methods. |
| |
| `GroovyTranslator` now expects that the `TypeTranslator` provided to it as part of its `of()` static method overload |
| is "complete" - i.e. that it provides all the functionality to translate the types passed to it. Thus, a "complete" |
| `TypeTranslator` is one that does everything that `DefaultTypeTranslator` does as a minimum requirement. Therefore, |
| the extension model described above is the easiest way to get going with a custom `TypeTranslator` approach. |
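| |
The extension model can be sketched with plain Java types. The classes below are hypothetical stand-ins and not the TinkerPop API: `DefaultTranslatorSketch` plays the role of a "complete" default translator, `convertToString` is an assumed extension point, and the `Duration` handling is purely illustrative. The subclass adds one type and defers to the default for everything else, so it stays "complete".

```java
import java.time.Duration;
import java.util.function.BiFunction;

/** Stand-in for a "complete" default translator; hypothetical, not the TinkerPop class. */
class DefaultTranslatorSketch implements BiFunction<String, Object, String> {
    @Override
    public String apply(String traversalSource, Object o) {
        // The traversal source name (e.g. "g") arrives alongside the object to
        // translate; this sketch does not need it for plain values.
        return convertToString(o);
    }

    /** Assumed extension point that subclasses override to handle custom types. */
    protected String convertToString(Object o) {
        if (o instanceof String) return "\"" + o + "\"";
        if (o instanceof Long) return o + "L";
        return String.valueOf(o);
    }
}

/** Custom translator that handles one extra type and defers to the default otherwise. */
class CustomTranslatorSketch extends DefaultTranslatorSketch {
    @Override
    protected String convertToString(Object o) {
        if (o instanceof Duration) return "Duration.parse(\"" + o + "\")";
        return super.convertToString(o);
    }

    public static void main(String[] args) {
        BiFunction<String, Object, String> translator = new CustomTranslatorSketch();
        System.out.println(translator.apply("g", "marko"));               // "marko"
        System.out.println(translator.apply("g", 5L));                    // 5L
        System.out.println(translator.apply("g", Duration.ofSeconds(5))); // Duration.parse("PT5S")
    }
}
```

Because the subclass falls back to `super.convertToString`, it never translates fewer types than the default does, which is the property the "complete" requirement asks for.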
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2072[TINKERPOP-2072] |
| |
| ==== Graph Driver Providers |
| |
| ===== Deprecation Removal in RemoteConnection |
| |
| The two deprecated synchronous `submit()` methods on the `RemoteConnection` interface have been removed, which means |
| that `RemoteConnection` implementations will need to implement `submitAsync(Bytecode)` if they have not already done |
| so. |
| |
| See: link:https://issues.apache.org/jira/browse/TINKERPOP-2103[TINKERPOP-2103] |