| //// |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| //// |
| |
| image::apache-tinkerpop-logo.png[width=500,link="https://tinkerpop.apache.org"] |
| |
| *x.y.z* |
| |
| == Gremlin Language Variants |
| |
| Gremlin is an embeddable query language that can be represented using the constructs of a host programming language. |
| Any programming language that supports link:https://en.wikipedia.org/wiki/Function_composition[function composition] |
| (e.g. fluent chaining) and link:https://en.wikipedia.org/wiki/Nested_function[function nesting] (e.g. call stacks) |
| can support Gremlin. Nearly every modern programming language is capable of meeting both requirements. |
| With Gremlin, the distinction between a programming language and a query language is not as large as they |
| have historically been. For instance, with Gremlin-Java, the developer is able to have their application code and their |
| graph database queries at the same level of abstraction -- both written in Java. A simple example is presented below |
| where the `MyApplication` Java class contains both application-level and database-level code written in Java. |
| |
| image::gremlin-house-of-mirrors.png[width=1024] |
| |
| WARNING: This is an advanced tutorial intended for experts knowledgeable in Gremlin in particular and TinkerPop in |
| general. Moreover, the audience should understand advanced programming language concepts such as reflection, |
| meta-programming, source code generation, and virtual machines. |
| |
| [source,java] |
| ---- |
| public class MyApplication { |
| |
| public static void run(String[] args) { |
| |
| // assumes args[0] is a configuration file location |
| Graph graph = GraphFactory.open(args[0]); |
| GraphTraversalSource g = traversal().withEmbedded(graph); |
| |
| // assumes that args[1] and args[2] are range boundaries |
| Iterator<Map<String,Double>> result = |
| g.V().hasLabel("product"). |
| order().by("unitPrice", asc). |
| range(Integer.valueOf(args[1]), Integer.valueOf(args[2])). |
| valueMap("name", "unitPrice") |
| |
| while(result.hasNext()) { |
| Map<String,Double> map = result.next(); |
| System.out.println(map.get("name") + " " + map.get("unitPrice")); |
| } |
| } |
| } |
| ---- |
| |
| In query languages like link:https://en.wikipedia.org/wiki/SQL[SQL], the user must construct a string representation of |
| their query and submit it to the database for evaluation. This is because SQL cannot be expressed in Java as they use |
| fundamentally different constructs in their expression. The same example above is presented below using SQL and the |
| link:https://en.wikipedia.org/wiki/Java_Database_Connectivity[JDBC] interface. The take home point is that Gremlin does |
| not exist outside the programming language in which it will be used. Gremlin was designed to be able to be |
| embedded in any modern programming language and thus, always free from the complexities of string manipulation as seen |
| in other database and analytics query languages. |
| |
| [source,java] |
| ---- |
| public class MyApplication { |
| |
| public static void run(final String[] args) { |
| |
| // assumes args[0] is a URI to the database |
| Connection connection = DriverManager.getConnection(args[0]) |
| Statement statement = connection.createStatement(); |
| |
| // assumes that args[1] and args[2] are range boundaries |
| ResultSet result = statement.executeQuery( |
| "SELECT Products.ProductName, Products.UnitPrice \n" + |
| " FROM (SELECT ROW_NUMBER() \n" + |
| " OVER ( \n" + |
| " ORDER BY UnitPrice) AS [ROW_NUMBER], \n" + |
| " ProductID \n" + |
| " FROM Products) AS SortedProducts \n" + |
| " INNER JOIN Products \n" + |
| " ON Products.ProductID = SortedProducts.ProductID \n" + |
| " WHERE [ROW_NUMBER] BETWEEN " + args[1] + " AND " + args[2] + " \n" + |
| "ORDER BY [ROW_NUMBER]" |
| |
| while(result.hasNext()) { |
| result.next(); |
| System.out.println(result.getString("Products.ProductName") + " " + result.getDouble("Products.UnitPrice")); |
| } |
| } |
| } |
| ---- |
| |
| The purpose of this tutorial is to explain how to develop a _Gremlin language variant_. That is, for those developers |
| who are interested in supporting Gremlin in their native language and there currently does not exist a (good) Gremlin |
| variant in their language, they can develop one for the Apache TinkerPop community (and their language community in |
| general). In this tutorial, link:https://www.python.org/[Python] will serve as the host language and two typical |
| implementation models will be presented. |
| |
| 1. <<using-jython-and-the-jvm,**Using Jython and the JVM**>>: This is perhaps the easiest way to produce a Gremlin |
| language variant. With link:https://www.jcp.org/en/jsr/detail?id=223[JSR-223], any language compiler written for the JVM |
| can directly access the JVM and any of its libraries (including Gremlin-Java). |
| |
| 2. <<using-python-and-remoteconnection,**Using Python and GremlinServer**>>: This model requires that there exist a Python |
| class that mimics Gremlin-Java's `GraphTraversal` API. With each method call of this Python class, Gremlin `Bytecode` is |
| generated which is ultimately translated into a Gremlin variant that can execute the traversal (e.g. Gremlin-Java). |
| |
| IMPORTANT: Apache TinkerPop's Gremlin-Java is considered the idiomatic, standard implementation of Gremlin. |
| Any Gremlin language variant, regardless of the implementation model chosen, **must**, within the constraints of the |
| host language, be in 1-to-1 correspondence with Gremlin-Java. This ensures that language variants are collectively |
| consistent and easily leveraged by anyone versed in Gremlin. |
| |
| IMPORTANT: The "Gremlin-Python" presented in this tutorial is basic and provided to show the primary techniques used to |
| construct a Gremlin language variant. Apache TinkerPop distributes with a full fledged |
| link:https://tinkerpop.apache.org/docs/x.y.z/reference/#gremlin-python[Gremlin-Python] variant that uses many of the |
| techniques presented in this tutorial. |
| |
| [[language-drivers-vs-language-variants]] |
| == Language Drivers vs. Language Variants |
| |
| Before discussing how to implement a Gremlin language variant in Python, it is necessary to understand two concepts |
| related to Gremlin language development. There is a difference between a _language driver_ and a _language variant_ |
| and it is important that these two concepts (and their respective implementations) remain separate. |
| |
| === Language Drivers |
| |
| image:language-drivers.png[width=375,float=right] A Gremlin language driver is a software library that is able to |
| communicate with a TinkerPop-enabled graph system whether directly via the JVM or indirectly via |
| link:https://tinkerpop.apache.org/docs/x.y.z/reference/#connecting-gremlin-server[Gremlin Server] Gremlin Server or some |
| other link:https://tinkerpop.apache.org/docs/x.y.z/reference/#connecting-rgp[RemoteConnection] enabled graph system. |
| Language drivers are responsible for submitting Gremlin traversals to a TinkerPop-enabled graph system and |
| returning results to the developer that are within the developer's language's type system. |
| For instance, resultant doubles should be coerced to floats in Python. |
| |
| This tutorial is not about language drivers, but about language variants. Moreover, community libraries should make |
| this distinction clear and **should not** develop libraries that serve both roles. Language drivers will be useful to |
| a collection of Gremlin variants within a language community -- able to support `GraphTraversal`-variants as well as |
| also other link:https://en.wikipedia.org/wiki/Domain-specific_language[DSL]-variants (e.g. `SocialTraversal`). |
| |
| NOTE: `GraphTraversal` is a particular Gremlin domain-specific language (link:https://en.wikipedia.org/wiki/Domain-specific_language[DSL]), |
| albeit the most popular and foundational DSL. If another DSL is created, then the same techniques discussed in this |
| tutorial for `GraphTraversal` apply to `XXXTraversal`. |
| |
| === Language Variants |
| |
| image:language-variants.png[width=375,float=right] A Gremlin language variant is a software library that allows a |
| developer to write a Gremlin traversal within their native programming language. The language variant is responsible |
| for creating Gremlin `Bytecode` that will ultimately be translated and compiled to a `Traversal` by a |
| TinkerPop-enabled graph system. |
| |
| Every language variant, regardless of the implementation details, will have to account for the four core concepts below: |
| |
| 1. `Graph` (**data**): The source of the graph data to be traversed and the interface which enables the creation of a |
| `GraphTraversalSource` (via `traversal().withEmbedded(graph)`). |
| |
| 2. `GraphTraversalSource` (**compiler**): This is the typical `g` reference. A `GraphTraversalSource` maintains the |
| `withXXX()`-strategy methods as well as the "traversal spawn"-methods such as `V()`, `E()`, `addV()`, etc. |
| A traversal source's registered `TraversalStrategies` determine how the submitted traversal will be ultimately |
| evaluated. |
| |
| 3. `GraphTraversal` (**function composition**): A graph traversal maintains the computational steps such as `out()`, `groupCount()`, |
| `match()`, etc. This fluent interface supports method chaining and thus, a linear "left-to-right" representation of a |
| traversal/query. |
| |
| 4. `__` (**function nesting**) : The anonymous traversal class is used for passing a traversal as an argument to a |
| parent step. For example, in `+repeat(__.out())+`, `+__.out()+` is an anonymous traversal passed to the traversal parent |
| `repeat()`. Anonymous traversals enable the "top-to-bottom" representation of a traversal. |
| |
| 5. `Bytecode` (**language agnostic encoding**): The source and traversal steps and their arguments are encoded in a |
| language agnostic representation called Gremlin bytecode. This representation is a nested list of the form |
| `[step,[args*]]*`. |
| |
| Both `GraphTraversal` and `+__+` define the structure of the Gremlin language. Gremlin is a _two-dimensional language_ |
| supporting linear, nested step sequences. Historically, many Gremlin language variants have failed to make the |
| distinctions above clear and in doing so, either complicate their implementations or yield variants that are not in |
| 1-to-1 correspondence with Gremlin-Java. By keeping these concepts clear when designing a language variant, the |
| construction of the Gremlin bytecode representation is easy. |
| |
| IMPORTANT: The term "Gremlin-Java" denotes the language that is defined by `GraphTraversalSource`, `GraphTraversal`, |
| and `__`. These three classes exist in `org.apache.tinkerpop.gremlin.process.traversal.dsl.graph` and form the |
| definitive representation of the Gremlin traversal language. |
| |
| == Gremlin-Jython and Gremlin-Python |
| |
| [[using-jython-and-the-jvm]] |
| === Using Jython and the JVM |
| |
| image:jython-logo.png[width=200,float=left,link="http://www.jython.org/"] link:http://www.jython.org/[Jython] provides a |
| link:https://www.jcp.org/en/jsr/detail?id=223[JSR-223] `ScriptEngine` implementation that enables the evaluation of |
| Python on the link:https://en.wikipedia.org/wiki/Java_virtual_machine[Java virtual machine]. In other words, Jython's |
| virtual machine is not the standard link:https://wiki.python.org/moin/CPython[CPython] reference implementation |
| distributed with most operating systems, but instead the JVM. The benefit of Jython is that Python code and classes |
| can easily interact with the Java API and any Java packages on the `CLASSPATH`. In general, any JSR-223 Gremlin language |
| variant is trivial to "implement." |
| |
| [source,python] |
| ---- |
| Jython 2.7.0 (default:9987c746f838, Apr 29 2015, 02:25:11) |
| [Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_40 |
| Type "help", "copyright", "credits" or "license" for more information. |
| >>> import sys |
| # this list is longer than displayed, including all jars in lib/, not just Apache TinkerPop jars |
| # there is probably a more convenient way of importing jars in Jython though, at the time of writing, no better solution was found. |
| >>> sys.path.append("/usr/local/apache-gremlin-console-x.y.z-standalone/lib/gremlin-console-x.y.z.jar") |
| >>> sys.path.append("/usr/local/apache-gremlin-console-x.y.z-standalone/lib/gremlin-core-x.y.z.jar") |
| >>> sys.path.append("/usr/local/apache-gremlin-console-x.y.z-standalone/lib/gremlin-driver-x.y.z.jar") |
| >>> sys.path.append("/usr/local/apache-gremlin-console-x.y.z-standalone/lib/gremlin-shaded-x.y.z.jar") |
| >>> sys.path.append("/usr/local/apache-gremlin-console-x.y.z-standalone/ext/tinkergraph-gremlin/lib/tinkergraph-gremlin-x.y.z.jar") |
| # import Java classes |
| >>> from org.apache.tinkerpop.gremlin.tinkergraph.structure import TinkerFactory |
| >>> from org.apache.tinkerpop.gremlin.process.traversal.dsl.graph import __ |
| >>> from org.apache.tinkerpop.gremlin.process.traversal import * |
| >>> from org.apache.tinkerpop.gremlin.structure import * |
| # create the toy "modern" graph and spawn a GraphTraversalSource |
| >>> graph = TinkerFactory.createModern() |
| >>> g = graph.traversal() |
| # The Jython shell does not automatically iterate Iterators like the GremlinConsole |
| >>> g.V().hasLabel("person").out("knows").out("created") |
| [GraphStep(vertex,[]), HasStep([~label.eq(person)]), VertexStep(OUT,[knows],vertex), VertexStep(OUT,[created],vertex)] |
| # toList() will do the iteration and return the results as a list |
| >>> g.V().hasLabel("person").out("knows").out("created").toList() |
| [v[5], v[3]] |
| >>> g.V().repeat(__.out()).times(2).values("name").toList() |
| [ripple, lop] |
| # results can be interacted with using Python |
| >>> g.V().repeat(__.out()).times(2).values("name").toList()[0] |
| u'ripple' |
| >>> g.V().repeat(__.out()).times(2).values("name").toList()[0][0:3].upper() |
| u'RIP' |
| >>> |
| ---- |
| |
| Most every JSR-223 `ScriptEngine` language will allow the developer to immediately interact with `GraphTraversal`. |
| The benefit of this model is that nearly every major programming language has a respective `ScriptEngine`: |
| link:https://en.wikipedia.org/wiki/Nashorn_(JavaScript_engine)[JavaScript], link:http://groovy-lang.org/[Groovy], |
| link:http://www.scala-lang.org/[Scala], Lisp (link:https://clojure.org/[Clojure]), link:http://jruby.org/[Ruby], etc. A |
| list of implementations is provided link:https://en.wikipedia.org/wiki/List_of_JVM_languages[here]. |
| |
| ==== Traversal Wrappers |
| |
| While it is possible to simply interact with Java classes in a `ScriptEngine` implementation, such Gremlin language |
| variants will not leverage the unique features of the host language. It is for this reason that JVM-based language |
| variants such as link:https://github.com/mpollmeier/gremlin-scala[Gremlin-Scala] were developed. Scala provides many |
| syntax niceties not available in Java. To leverage these niceties, Gremlin-Scala "wraps" `GraphTraversal` in order to |
| provide Scala-idiomatic extensions. Another example is Apache TinkerPop's Gremlin-Groovy which does the same via the |
| link:https://tinkerpop.apache.org/docs/x.y.z/reference/#sugar-plugin[Sugar plugin], but uses |
| link:http://groovy-lang.org/metaprogramming.html[meta-programming] instead of object wrapping, where "behind the |
| scenes," Groovy meta-programming is doing object wrapping. |
| |
| The Jython example below uses Python meta-programming to add functionality to `GraphTraversal`. In particular, the |
| `+__getitem__+` and `+__getattr__+` "magic methods" are leveraged. |
| |
| [source,python] |
| ---- |
| def getitem_bypass(self, index): |
| if isinstance(index,int): |
| return self.range(index,index+1) |
| elif isinstance(index,slice): |
| return self.range(index.start,index.stop) |
| else: |
| return TypeError('Index must be int or slice')"); |
| GraphTraversal.__getitem__ = getitem_bypass |
| GraphTraversal.__getattr__ = lambda self, key: self.values(key) |
| ---- |
| |
| The two methods `+__getitem__+` and `+__getattr__+` support Python _slicing_ and _object attribute interception_, |
| respectively. In this way, the host language is able to use its native constructs in a meaningful way within a |
| Gremlin traversal. |
| |
| IMPORTANT: Gremlin-Java serves as the standard/default representation of the Gremlin traversal language. Any Gremlin |
| language variant **must** provide all the same functionality (methods) as `GraphTraversal`, but **can** extend it |
| with host language specific constructs. This means that the extensions **must** compile to `GraphTraversal`-specific |
| steps. A Gremlin language variant **should not** add steps/methods that do not exist in `GraphTraversal`. If an extension |
| is desired, the language variant designer should submit a proposal to link:https://tinkerpop.apache.org[Apache TinkerPop] |
| to have the extension added to a future release of Gremlin. |
| |
| [[using-python-and-remoteconnection]] |
| === Using Python and RemoteConnection |
| |
| image:python-logo.png[width=125,float=left,link="https://www.python.org/"] The JVM is a powerful piece of technology |
| that has, over the years, become a meeting ground for developers from numerous language communities. However, not all |
| applications will use the JVM. Given that Apache TinkerPop is a Java-framework, there must be a way for two different |
| virtual machines to communicate traversals and their results. This section presents the second Gremlin language |
| variant implementation model which does just that. |
| |
| NOTE: Apache TinkerPop is a JVM-based graph computing framework. Most graph databases and processors today are built |
| on the JVM. This makes it easy for these graph system providers to implement Apache TinkerPop. However, TinkerPop is more |
| than its graph API and tools -- it is also the Gremlin traversal machine and language. While Apache's Gremlin traversal |
| machine was written for the JVM, its constructs are simple and can/should be ported to other VMs for those graph systems |
| that are not JVM-based. A theoretical review of the concepts behind the Gremlin traversal machine is provided in |
| link:http://arxiv.org/abs/1508.03843[this article]. |
| |
| This section's Gremlin language variant design model does not leverage the JVM directly. Instead, it constructs a |
| `Bytecode` representation of a `Traversal` that will ultimately be evaluated by `RemoteConnection` (e.g. GremlinServer). |
| It is up to the language variant designer to choose a _language driver_ to use for submitting the generated bytecode and |
| coercing its results. The language driver is the means by which, for this example, the CPython |
| VM communicates with the JVM. |
| |
| [source,bash] |
| ---- |
| # sudo easy_install pip |
| $ pip install gremlinpython |
| ---- |
| |
| The Groovy source code below uses Java reflection to generate a Python class that is in 1-to-1 correspondence with |
| Gremlin-Java. |
| |
| [source,groovy] |
| ---- |
| class GraphTraversalSourceGenerator { |
| |
| public static void create(final String graphTraversalSourceFile) { |
| |
| final StringBuilder pythonClass = new StringBuilder() |
| |
| pythonClass.append("from .traversal import Traversal\n") |
| pythonClass.append("from .traversal import TraversalStrategies\n") |
| pythonClass.append("from .traversal import Bytecode\n") |
| pythonClass.append("from ..driver.remote_connection import RemoteStrategy\n") |
| pythonClass.append("from .. import statics\n\n") |
| |
| ////////////////////////// |
| // GraphTraversalSource // |
| ////////////////////////// |
| pythonClass.append( |
| """class GraphTraversalSource(object): |
| def __init__(self, graph, traversal_strategies, bytecode=None): |
| self.graph = graph |
| self.traversal_strategies = traversal_strategies |
| if bytecode is None: |
| bytecode = Bytecode() |
| self.bytecode = bytecode |
| def __repr__(self): |
| return "graphtraversalsource[" + str(self.graph) + "]" |
| """) |
| GraphTraversalSource.getMethods(). // SOURCE STEPS |
| findAll { GraphTraversalSource.class.equals(it.returnType) }. |
| findAll { |
| !it.name.equals("clone") && |
| !it.name.equals(TraversalSource.Symbols.withRemote) |
| }. |
| collect { SymbolHelper.toPython(it.name) }. |
| unique(). |
| sort { a, b -> a <=> b }. |
| forEach { method -> |
| pythonClass.append( |
| """ def ${method}(self, *args): |
| source = GraphTraversalSource(self.graph, TraversalStrategies(self.traversal_strategies), Bytecode(self.bytecode)) |
| source.bytecode.add_source("${SymbolHelper.toJava(method)}", *args) |
| return source |
| """) |
| } |
| pythonClass.append( |
| """ def withRemote(self, remote_connection): |
| source = GraphTraversalSource(self.graph, TraversalStrategies(self.traversal_strategies), Bytecode(self.bytecode)) |
| source.traversal_strategies.add_strategies([RemoteStrategy(remote_connection)]) |
| return source |
| """) |
| GraphTraversalSource.getMethods(). // SPAWN STEPS |
| findAll { GraphTraversal.class.equals(it.returnType) }. |
| collect { SymbolHelper.toPython(it.name) }. |
| unique(). |
| sort { a, b -> a <=> b }. |
| forEach { method -> |
| pythonClass.append( |
| """ def ${method}(self, *args): |
| traversal = GraphTraversal(self.graph, self.traversal_strategies, Bytecode(self.bytecode)) |
| traversal.bytecode.add_step("${SymbolHelper.toJava(method)}", *args) |
| return traversal |
| """) |
| } |
| pythonClass.append("\n\n") |
| |
| //////////////////// |
| // GraphTraversal // |
| //////////////////// |
| pythonClass.append( |
| """class GraphTraversal(Traversal): |
| def __init__(self, graph, traversal_strategies, bytecode): |
| Traversal.__init__(self, graph, traversal_strategies, bytecode) |
| def __getitem__(self, index): |
| if isinstance(index, int): |
| return self.range(index, index + 1) |
| elif isinstance(index, slice): |
| return self.range(index.start, index.stop) |
| else: |
| raise TypeError("Index must be int or slice") |
| def __getattr__(self, key): |
| return self.values(key) |
| """) |
| GraphTraversal.getMethods(). |
| findAll { GraphTraversal.class.equals(it.returnType) }. |
| findAll { !it.name.equals("clone") }. |
| collect { SymbolHelper.toPython(it.name) }. |
| unique(). |
| sort { a, b -> a <=> b }. |
| forEach { method -> |
| pythonClass.append( |
| """ def ${method}(self, *args): |
| self.bytecode.add_step("${SymbolHelper.toJava(method)}", *args) |
| return self |
| """) |
| }; |
| pythonClass.append("\n\n") |
| |
| //////////////////////// |
| // AnonymousTraversal // |
| //////////////////////// |
| pythonClass.append("class __(object):\n"); |
| __.class.getMethods(). |
| findAll { GraphTraversal.class.equals(it.returnType) }. |
| findAll { Modifier.isStatic(it.getModifiers()) }. |
| collect { SymbolHelper.toPython(it.name) }. |
| unique(). |
| sort { a, b -> a <=> b }. |
| forEach { method -> |
| pythonClass.append( |
| """ @staticmethod |
| def ${method}(*args): |
| return GraphTraversal(None, None, Bytecode()).${method}(*args) |
| """) |
| }; |
| pythonClass.append("\n\n") |
| // add to gremlin.python.statics |
| __.class.getMethods(). |
| findAll { GraphTraversal.class.equals(it.returnType) }. |
| findAll { Modifier.isStatic(it.getModifiers()) }. |
| findAll { !it.name.equals("__") }. |
| collect { SymbolHelper.toPython(it.name) }. |
| unique(). |
| sort { a, b -> a <=> b }. |
| forEach { |
| pythonClass.append("def ${it}(*args):\n").append(" return __.${it}(*args)\n\n") |
| pythonClass.append("statics.add_static('${it}', ${it})\n\n") |
| } |
| pythonClass.append("\n\n") |
| |
| // save to a python file |
| final File file = new File(graphTraversalSourceFile); |
| file.delete() |
| pythonClass.eachLine { file.append(it + "\n") } |
| } |
| } |
| ---- |
| |
| When the above Groovy script is evaluated (e.g. in GremlinConsole), **Gremlin-Python** is born. The generated Python |
| file is similar to the one available at |
| link:https://github.com/apache/tinkerpop/blob/x.y.z/gremlin-python/src/main/jython/gremlin_python/process/graph_traversal.py[graph_traversal.py]. |
| It is important to note that there is a bit more to Gremlin-Python in that there also exists Python implementations |
| of `TraversalStrategies`, `Traversal`, `Bytecode`, etc. Please review the full implementation of Gremlin-Python |
| link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-python/src/main/jython/gremlin_python[here]. |
| |
| NOTE: In practice, TinkerPop uses the Groovy's `GStringTemplateEngine` to help with the code generation task described |
| above and automates that generation as part of the standard build with Maven using the `gmavenplus-plugin`. See the |
| `gremlin-python` link:https://github.com/apache/tinkerpop/blob/x.y.z/gremlin-python/pom.xml[pom.xml] for more details. |
| |
| Of particular importance is Gremlin-Python's implementation of `Bytecode`. |
| |
| [source,python] |
| ---- |
| class Bytecode(object): |
| def __init__(self, bytecode=None): |
| self.source_instructions = [] |
| self.step_instructions = [] |
| self.bindings = {} |
| if bytecode is not None: |
| self.source_instructions = list(bytecode.source_instructions) |
| self.step_instructions = list(bytecode.step_instructions) |
| |
| def add_source(self, source_name, *args): |
| newArgs = () |
| for arg in args: |
| newArgs = newArgs + (self.__convertArgument(arg),) |
| self.source_instructions.append((source_name, newArgs)) |
| return |
| |
| def add_step(self, step_name, *args): |
| newArgs = () |
| for arg in args: |
| newArgs = newArgs + (self.__convertArgument(arg),) |
| self.step_instructions.append((step_name, newArgs)) |
| return |
| |
| def __convertArgument(self,arg): |
| if isinstance(arg, Traversal): |
| self.bindings.update(arg.bytecode.bindings) |
| return arg.bytecode |
| elif isinstance(arg, tuple) and 2 == len(arg) and isinstance(arg[0], str): |
| self.bindings[arg[0]] = arg[1] |
| return Binding(arg[0],arg[1]) |
| else: |
| return arg |
| ---- |
| |
| As `GraphTraversalSource` and `GraphTraversal` are manipulated, the step-by-step instructions are written to `Bytecode`. |
| This bytecode is simply a list of lists. For instance, `g.V(1).repeat(out('knows').hasLabel('person')).times(2).name` has |
| the `Bytecode` form: |
| |
| [source,json] |
| ---- |
| [ |
| ["V", [1]], |
| ["repeat", [[ |
| ["out", ["knows"]] |
| ["hasLabel", ["person"]]]]] |
| ["times", [2]] |
| ["values", ["name"]] |
| ] |
| ---- |
| |
| This nested list representation is ultimately converted by the language variant into link:https://tinkerpop.apache.org/docs/x.y.z/reference/#graphson[GraphSON] |
| for serialization to a `RemoteConnection` such as GremlinServer. |
| |
| [source,bash] |
| ---- |
| $ bin/gremlin-server.sh install org.apache.tinkerpop gremlin-python x.y.z |
| $ bin/gremlin-server.sh conf/gremlin-server-modern-py.yaml |
| [INFO] GremlinServer - |
| \,,,/ |
| (o o) |
| ---oOOo-(3)-oOOo--- |
| |
| [INFO] GremlinServer - Configuring Gremlin Server from conf/gremlin-server-modern-py.yaml |
| [INFO] MetricManager - Configured Metrics Slf4jReporter configured with interval=180000ms and loggerName=org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics |
| [INFO] GraphManager - Graph [graph] was successfully configured via [conf/tinkergraph-empty.properties]. |
| [INFO] ServerGremlinExecutor - Initialized Gremlin thread pool. Threads in pool named with pattern gremlin-* |
| [INFO] ServerGremlinExecutor - Initialized GremlinExecutor and configured ScriptEngines. |
| [INFO] Logger - 56 attributes loaded from 90 stream(s) in 21ms, 56 saved, 1150 ignored: ["Ant-Version", "Archiver-Version", "Bnd-LastModified", "Boot-Class-Path", "Build-Jdk", "Build-Version", "Built-By", "Bundle-Activator", "Bundle-BuddyPolicy", "Bundle-ClassPath", "Bundle-Description", "Bundle-DocURL", "Bundle-License", "Bundle-ManifestVersion", "Bundle-Name", "Bundle-RequiredExecutionEnvironment", "Bundle-SymbolicName", "Bundle-Vendor", "Bundle-Version", "Can-Redefine-Classes", "Created-By", "DynamicImport-Package", "Eclipse-BuddyPolicy", "Export-Package", "Extension-Name", "Extension-name", "Fragment-Host", "Gremlin-Plugin-Dependencies", "Ignore-Package", "Implementation-Build", "Implementation-Title", "Implementation-URL", "Implementation-Vendor", "Implementation-Vendor-Id", "Implementation-Version", "Import-Package", "Include-Resource", "JCabi-Build", "JCabi-Date", "JCabi-Version", "Main-Class", "Main-class", "Manifest-Version", "Originally-Created-By", "Package", "Private-Package", "Require-Capability", "Specification-Title", "Specification-Vendor", "Specification-Version", "Tool", "Url", "X-Compile-Source-JDK", "X-Compile-Target-JDK", "hash", "version"] |
| [INFO] ServerGremlinExecutor - A GraphTraversalSource is now bound to [g] with graphtraversalsource[tinkergraph[vertices:0 edges:0], standard] |
| [INFO] OpLoader - Adding the standard OpProcessor. |
| [INFO] OpLoader - Adding the session OpProcessor. |
| [INFO] OpLoader - Adding the traversal OpProcessor. |
| [INFO] TraversalOpProcessor - Initialized cache for TraversalOpProcessor with size 1000 and expiration time of 600000 ms |
| [INFO] GremlinServer - Executing start up LifeCycleHook |
| [INFO] Logger$info - Executed once at startup of Gremlin Server. |
| [WARN] AbstractChannelizer - The org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0 serialization class is deprecated. |
| [INFO] AbstractChannelizer - Configured application/vnd.gremlin-v3.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0 |
| [WARN] AbstractChannelizer - The org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0 serialization class is deprecated. |
| [INFO] AbstractChannelizer - Configured application/vnd.gremlin-v3.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0 |
| [INFO] AbstractChannelizer - Configured application/vnd.gremlin-v3.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0 |
| [INFO] AbstractChannelizer - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0 |
| [INFO] AbstractChannelizer - Configured application/vnd.graphbinary-v1.0 with org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1 |
| [INFO] AbstractChannelizer - Configured application/vnd.graphbinary-v1.0-stringd with org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1 |
| [INFO] GremlinServer$1 - Gremlin Server configured with worker thread pool of 1, gremlin pool of 8 and boss thread pool of 1. |
| [INFO] GremlinServer$1 - Channel started at port 8182. |
| ---- |
| |
| Within the CPython console, it is possible to evaluate the following. |
| |
| [source,python] |
| ---- |
| Python 2.7.2 (default, Oct 11 2012, 20:14:37) |
| [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin |
| Type "help", "copyright", "credits" or "license" for more information. |
| >>> from gremlin_python import statics |
| >>> from gremlin_python.structure.graph import Graph |
| >>> from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection |
| # loading statics enables __.out() to be out() and P.gt() to be gt() |
| >>> statics.load_statics(globals()) |
| >>> graph = Graph() |
| >>> g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g')) |
| # nested traversal with Python slicing and attribute interception extensions |
| >>> g.V().hasLabel("person").repeat(both()).times(2).name[0:2].toList() |
| [u'marko', u'marko'] |
| >>> g = g.withComputer() |
| >>> g.V().hasLabel("person").repeat(both()).times(2).name[0:2].toList() |
| [u'peter', u'peter'] |
| # a complex, nested multi-line traversal |
| >>> g.V().match( \ |
| ... as_("a").out("created").as_("b"), \ |
| ... as_("b").in_("created").as_("c"), \ |
| ... as_("a").out("knows").as_("c")). \ |
| ... select("c"). \ |
| ... union(in_("knows"),out("created")). \ |
| ... name.toList() |
| [u'ripple', u'marko', u'lop'] |
| >>> |
| ---- |
| |
| IMPORTANT: Learn more about Apache TinkerPop's distribution of Gremlin-Python link:https://tinkerpop.apache.org/docs/x.y.z/reference/#gremlin-python[here]. |
| |
| [[gremlin-language-variant-conventions]] |
| == Gremlin Language Variant Conventions |
| |
| Every programming language is different and a Gremlin language variant must ride the fine line between leveraging the |
| conventions of the host language and ensuring consistency with Gremlin-Java. A collection of conventions for navigating |
| this dual-language bridge are provided. |
| |
| * If camelCase is not an accepted method naming convention in the host language, then the host language's convention can be used instead. For instance, in a Gremlin-Ruby implementation, `outE("created")` may be `out_e("created")`. |
| * If Gremlin-Java step names conflict with the host language's reserved words, then a consistent amelioration should be used. For instance, in Python `as` is a reserved word, thus, Gremlin-Python uses `as_`. |
| * If the host language does not use dot-notion for method chaining, then its method chaining convention should be used instead of going the route of operator overloading. For instance, a Gremlin-PHP implementation should do `$g->V()->out()`. |
| * If a programming language does not support method overloading, then varargs and type introspection should be used. In Gremlin-Python, `*args` does just that. |
| |
| == Conclusion |
| |
| Gremlin is a simple language because it uses two fundamental programming language constructs: *function composition* |
| and *function nesting*. Because of this foundation, it is relatively easy to implement Gremlin in any modern programming |
| language. Two ways of doing this for the Python language were presented in this tutorial. One using Jython (on the JVM) and one using Python |
| (on CPython). It is strongly recommended that language variant designers leverage (especially when not on the JVM) |
| the reflection-based source code generation technique presented. This method ensures that the language |
| variant is always in sync with the corresponding Apache TinkerPop Gremlin-Java release version. Moreover, it reduces |
| the chance of missing methods or creating poorly implemented methods. While Gremlin is simple, there are nearly 200 |
| step variations in `GraphTraversal`. As such, mechanical means of host language embedding are strongly advised. |