Posted to dev@tinkerpop.apache.org by Marko Rodriguez <ok...@gmail.com> on 2016/10/13 12:37:19 UTC

On the concept of BytecodeStrategies

Hello,

There are two types of “programs” in Gremlin: Bytecode and Traversals.

	Bytecode => Virtual machine instructions (like Java bytecode)
	Traversals => Machine instructions (like Intel machine code)

The core of Gremlin’s compiler is its TraversalStrategies. A traversal strategy works traversal by traversal, walking the traversal tree and rewriting sections of it into (typically) more optimal forms.

void TraversalStrategy.apply(Traversal<S,E> traversal)

Working at the Traversal object level is important because the Gremlin language steps (has(), out(), in(), etc.) don’t always map one-to-one with the machine instructions (HasStep, VertexStep, VertexStep). It’s better to work at the machine level because there are more knick-knack mutations one can do at that level. However, as you can see, traversal strategies are “machine dependent.” That is, they are tied to the Gremlin traversal machine implementation.
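To make the mapping point concrete, here is a tiny Python sketch. The lookup table and the function name are hypothetical illustrations, not part of any Gremlin API; only the three step pairs come from the text above.

```python
# Hypothetical table: Gremlin language step -> traversal machine step.
# Only these three pairs are taken from the text; the structure is illustrative.
LANGUAGE_TO_MACHINE = {
    "has": "HasStep",
    "out": "VertexStep",
    "in": "VertexStep",
}


def machine_steps(language_steps):
    """Translate a sequence of language step names into machine step names."""
    return [LANGUAGE_TO_MACHINE[name] for name in language_steps]


# out() and in() both land on VertexStep, so the mapping is not one-to-one.
print(machine_steps(["has", "out", "in"]))
```

Two different language steps collapsing into one machine step is the reason a strategy written against machine steps sees a different program than one written against bytecode.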

While there is currently only one Gremlin virtual machine (Gremlin-Java machine), there are many Gremlin language variants — Gremlin-Java, -Groovy, -Python, SQL-Gremlin, SPARQL-Gremlin, etc. When these languages communicate with a/the Gremlin traversal machine, they communicate via Gremlin bytecode. Now, it is possible to optimize bytecode. In principle, we can do “client side” optimizations on the bytecode prior to sending it to the Gremlin traversal machine for execution. Why would we want to do this?

	1. We can reduce the amount of work (clock cycles) required of “the server” which would ultimately do the TraversalStrategy optimization.
	2. We can have optimizations that are machine independent and thus, can be useful against any Gremlin traversal machine implementation.
	3. While the server is “streaming in” the Bytecode, it can also optimize the bytecode prior to applying TraversalStrategy optimizations.

[Gremlin-Java Traversal Machine] <== network connection ==> [Gremlin-XXX Language Variant]
  * pre-process bytecode                                      * pre-process bytecode 
    before translating to traversal                             before sending over network				     
  * apply traversal strategies
  * execute traversal
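The diagram above could be sketched roughly as follows in Python. Everything here is an assumption made for illustration: a "program" is just a mutable list, a strategy is any callable that mutates it, and client_submit, server_compile, and record are hypothetical names rather than real Gremlin APIs.

```python
# Sketch only: a "program" is any mutable object, and a strategy is any
# callable that mutates it in place. None of these names are real Gremlin APIs.

def apply_strategies(program, strategies):
    """Apply each strategy in order to the same program object."""
    for strategy in strategies:
        strategy(program)
    return program


def client_submit(bytecode, bytecode_strategies, send):
    """Language variant side: pre-process bytecode, then send it over the network."""
    send(apply_strategies(bytecode, bytecode_strategies))


def server_compile(bytecode, bytecode_strategies, translate, traversal_strategies):
    """Traversal machine side: pre-process bytecode, translate, optimize the traversal."""
    apply_strategies(bytecode, bytecode_strategies)
    traversal = translate(bytecode)
    return apply_strategies(traversal, traversal_strategies)


# Usage: record the order in which the three optimization stages run.
stages = []


def record(stage):
    """Build a do-nothing strategy that only logs that its stage ran."""
    return lambda program: stages.append(stage)


wire = []  # stands in for the network connection
client_submit([["V"], ["out", "knows"]], [record("client bytecode")], wire.append)
traversal = server_compile(
    wire[0],
    [record("server bytecode")],
    translate=list,  # placeholder for real bytecode-to-traversal translation
    traversal_strategies=[record("traversal")],
)
print(stages)
```

The same apply_strategies helper serves both sides, which is the point of machine-independent bytecode strategies: only the final stage needs to know about the traversal machine.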

What would Bytecode strategies look like? Here is an idea:

void TraversalStrategy.apply(Bytecode bytecode)

Let’s look at a simple strategy. IdentityRemovalStrategy will turn traversals of the form g.V().identity().as("a").identity() into g.V().as("a"). Here is this strategy written in both Java and Python:

	https://gist.github.com/okram/7bb2512935f8955551f9e3f87623b488
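For readers who cannot open the gist, here is a minimal self-contained Python sketch of the idea. It is not the code from the gist: the Bytecode class is a stand-in written for this example, and the method names apply_traversal and apply_bytecode are assumptions chosen to mirror the two kinds of programs.

```python
class Bytecode:
    """Minimal stand-in for a bytecode container (hypothetical, for illustration)."""

    def __init__(self):
        # Each instruction is a list: [operator, *arguments].
        self.step_instructions = []

    def add_step(self, operator, *args):
        self.step_instructions.append([operator, *args])
        return self


class IdentityRemovalStrategy:
    def apply_traversal(self, traversal):
        # Without a traversal machine on this side there is nothing to rewrite.
        pass

    def apply_bytecode(self, bytecode):
        # Drop every identity() instruction; all other instructions keep their order.
        bytecode.step_instructions[:] = [
            inst for inst in bytecode.step_instructions if inst[0] != "identity"
        ]


# g.V().identity().as("a").identity() expressed as bytecode
bytecode = Bytecode().add_step("V").add_step("identity").add_step("as", "a").add_step("identity")
IdentityRemovalStrategy().apply_bytecode(bytecode)
print(bytecode.step_instructions)  # [['V'], ['as', 'a']]
```

Because the strategy only inspects operator names, it needs no knowledge of HasStep, VertexStep, or any other machine-level class.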

Given that there (currently) is no Gremlin-Python traversal machine implementation, __apply_traversal(traversal) does nothing. However, given that there is a Gremlin-Python language variant, __apply_bytecode(traversal) does something. Moreover, note that we already have IdentityRemovalStrategy in Gremlin-Python, but, as you can see, it does nothing as (currently) strategies only operate on traversals.

	https://github.com/apache/tinkerpop/blob/master/gremlin-python/src/main/jython/gremlin_python/process/strategies.py#L117-L119

AS AN ASIDE: The reason strategies exist in Gremlin-Python is so that users can do stuff like:
	https://github.com/apache/tinkerpop/blob/master/gremlin-python/src/main/jython/tests/driver/test_driver_remote_connection.py#L80-L98

Anywho, so there you have it. I’ve made a ticket:
	https://issues.apache.org/jira/browse/TINKERPOP-1501

Your thoughts on the idea are more than appreciated.

Take care,
Marko.

http://markorodriguez.com