Posted to dev@tinkerpop.apache.org by Marko Rodriguez <ok...@gmail.com> on 2015/03/09 18:13:40 UTC

On the Concept of TraversalContext

Hi everyone,

We have reached a wall regarding the following problems:

	1. Implementing GraphStrategies via wrappers is not working for various vendors.
	2. TraversalEngine is specific to a Graph and is thus global to all traversals spawned off that graph.
	3. User-defined Traversal DSLs are not easily created and are not amenable to OLAP processing.

As a solution, Stephen and I are bouncing around the idea of a TraversalContext:

Graph graph = GraphFactory.open(configuration);
GraphTraversalContext g = graph.traversal(GraphTraversal.of()
				.engine(StandardTraversalEngine.instance())
				.strategy(ReadOnlyTraversalStrategy.instance()));
g.V().out().values("name").iterate();
g.V().values("age").iterate(); // spawn as many traversals as you want off of g

In essence, we want to introduce one new level of indirection from the Graph to the Traversal. This new level is called a "TraversalContext" (no better name yet) and it bundles the following objects:

	1. Graph (the raw data structure)
	2. Traversal DSL (GraphTraversal, SocialTraversal, etc. etc.)
	3. TraversalEngine (Spark, Giraph, Standard, etc.)
	4. TraversalStrategies (ReadOnlyTraversalStrategy, IdTraversalStrategy, PartitionTraversalStrategy, etc.)
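
To make the bundling concrete, here is a minimal Java sketch of a context as an immutable bundle assembled by a small builder. Every type name below (SketchGraph, SketchEngine, SketchStrategy, SketchTraversalContext) is a hypothetical stand-in, not the API in the gist, and the DSL dimension is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins; none of these are real TinkerPop types.
interface SketchGraph {}
interface SketchEngine { String name(); }
interface SketchStrategy { String name(); }

/** Immutable bundle of graph + engine + strategies; many traversals can share one. */
final class SketchTraversalContext {
    private final SketchGraph graph;
    private final SketchEngine engine;
    private final List<SketchStrategy> strategies;

    private SketchTraversalContext(SketchGraph graph, SketchEngine engine,
                                   List<SketchStrategy> strategies) {
        this.graph = graph;
        this.engine = engine;
        // Defensive copy so later builder changes cannot leak into the context.
        this.strategies = Collections.unmodifiableList(new ArrayList<>(strategies));
    }

    static Builder of(SketchGraph graph) { return new Builder(graph); }

    SketchGraph graph() { return graph; }
    SketchEngine engine() { return engine; }
    List<SketchStrategy> strategies() { return strategies; }

    /** Collects the engine and strategies, then freezes them into a context. */
    static final class Builder {
        private final SketchGraph graph;
        private SketchEngine engine = () -> "standard"; // assumed default engine
        private final List<SketchStrategy> strategies = new ArrayList<>();

        private Builder(SketchGraph graph) { this.graph = graph; }

        Builder engine(SketchEngine engine) { this.engine = engine; return this; }
        Builder strategy(SketchStrategy strategy) { strategies.add(strategy); return this; }
        SketchTraversalContext create() {
            return new SketchTraversalContext(graph, engine, strategies);
        }
    }
}
```

Under these assumptions the engine and strategies belong to the context rather than to the graph, so two contexts over the same graph can use different engines without touching shared state.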

You can see a working implementation of GraphTraversalContext here:
	https://gist.github.com/okram/e67252705a920cd34571

What problems does this solve beyond 1, 2, and 3 above?

	1. Graph.Iterators.vertexIterator() -> Graph.vertices() // back to a "Blueprints"-style API for people wanting to work with the graph object directly
	2. Graph.engine() goes away -- no more ThreadLocal hack.
	3. Graph.of(Traversal.class) goes away -- we will be DSL friendly with TraversalContext.
	4. Graph.V() goes away -- it's all in terms of the traversal context. GraphTraversalContext.V() exists.

One big pill that must be swallowed with this model -- Vertex.outE() doesn't exist:

Vertex v = g.V().out().next();
String name = g.V(v).out().<String>values("name").next();

Graph, Vertex, Edge, etc. no longer have Traversal methods off of them (that is NOT DSL friendly). Therefore, everything is off of TraversalContext. This is actually going to make DSL execution on GraphComputer extremely easy, and it's going to simplify vendor strategy code a lot -- strategies are simply cached with respect to MyGraph.class.
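
As a rough illustration of caching strategies with respect to the graph class, the following Java sketch keys a concurrent map on the implementation class and computes the strategy list only on first use. SketchStrategyCache is a hypothetical name, and plain strings stand in for strategy objects; nothing here is taken from the actual codebase.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical registry: strategies are looked up by the Graph implementation class. */
final class SketchStrategyCache {
    private static final Map<Class<?>, List<String>> CACHE = new ConcurrentHashMap<>();

    private SketchStrategyCache() {}

    /**
     * Returns the strategies cached for graphClass. The loader runs only the
     * first time a given class is seen; later calls reuse the cached list.
     */
    static List<String> strategiesFor(Class<?> graphClass,
                                      Function<Class<?>, List<String>> loader) {
        return CACHE.computeIfAbsent(graphClass, loader);
    }
}
```

Because the key is the class and not the instance, every context created over any MyGraph instance would share one strategy list in this sketch.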

Anywho… it's a big deal. Functionally, things don't really change. It's just a reorganization that will ultimately solve problems 1-3 listed at the beginning, which need solving before we release GA.

If anyone has any thoughts/concerns with the desired change, please raise them.

Thanks,
Marko.

http://markorodriguez.com

Re: On the Concept of TraversalContext

Posted by Matt Frantz <ma...@gmail.com>.

I should have started with a usage example.  I would have realized that we
don't really need a new interface; we just need to reappropriate existing
Graph methods so that they build facades.  That is, there is an internal
object that represents what we now call Graph, but which we might want to
call GraphImpl.  Everything public is a facade to that object.  The facade
could be called Graph.  It would have factory methods for new facades.

I like that GraphTraversal becomes yet another DSL.  I couldn't see from
your gist how the DSL interface emerges from the GraphTraversalContext, so
I address that by keeping the "of" syntax.

Here goes.

Graph graph = GraphFactory.open(configuration)

// You can't do any traversal without specifying a Traversal class with "of".
// Depending on the vendor, there might be a default DSL (e.g. GraphTraversal),
// so this could work, or it could throw a runtime error.
graph.of().V().has('foo')...

// If I want a different DSL, we generate one like we do today.
graph.of(SocialTraversal).knows('foo')...

// If I want a different engine, we use the mutation syntax, except it
// produces a new Graph and leaves the original alone.
Graph olap = graph.engine(MyEngine)

// If we want a new strategy, we use the mutation syntax, except it
// produces a new Graph and leaves the original alone.
Graph secure = graph.strategy(MySecurityStrategy)

This Graph is more of an "immutable" API than today's, which solves the
problem of concurrently configuring Graph.  The underlying GraphImpl object
does not have to be reconstructed for each facade, so it is efficient.
Graph is still a fluent API, so it composes new Graphs with a convenient
syntax.  Essentially Graph = GraphImpl + engine + strategies, and DSL is
specified as the starting point for any new traversal.
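
The "Graph = GraphImpl + engine + strategies" idea can be sketched in Java as an immutable facade whose mutator-style methods return a new facade over the same underlying object. SketchGraphImpl and SketchGraphFacade are hypothetical names, and strings stand in for engine and strategy types; this is an illustration of the pattern under those assumptions, not a proposed API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the shared, vendor-specific graph object. */
final class SketchGraphImpl {}

/** Immutable facade: "mutators" return a new facade over the same SketchGraphImpl. */
final class SketchGraphFacade {
    private final SketchGraphImpl impl;
    private final String engine;
    private final List<String> strategies;

    SketchGraphFacade(SketchGraphImpl impl) {
        this(impl, "standard", Collections.emptyList()); // assumed default engine
    }

    private SketchGraphFacade(SketchGraphImpl impl, String engine, List<String> strategies) {
        this.impl = impl;
        this.engine = engine;
        this.strategies = Collections.unmodifiableList(new ArrayList<>(strategies));
    }

    /** Returns a new facade with a different engine; this facade is unchanged. */
    SketchGraphFacade engine(String newEngine) {
        return new SketchGraphFacade(impl, newEngine, strategies);
    }

    /** Returns a new facade with one more strategy; this facade is unchanged. */
    SketchGraphFacade strategy(String strategy) {
        List<String> copy = new ArrayList<>(strategies);
        copy.add(strategy);
        return new SketchGraphFacade(impl, engine, copy);
    }

    SketchGraphImpl impl() { return impl; }
    String engine() { return engine; }
    List<String> strategies() { return strategies; }
}
```

Only the small facade object is copied on each call, which matches the point that the underlying GraphImpl does not have to be reconstructed per facade.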

You could then prune Graph down to a minimal API.  The ability to do
server-side actions would be exposed either by generating a special DSL
with "of", or, since we might have a server-side API that is NOT a
Traversal subtype, via a distinct "admin" API.

MyAdmin admin = graph.admin(MyAdmin)
admin.restoreFromBackup('/var/log/backup.zip')

On Fri, Mar 13, 2015 at 9:51 AM, Marko Rodriguez <ok...@gmail.com>
wrote:

> Hi Matt,
>
> Can you explain a bit more of what you mean? Do you mean:
>
> Graph graph = GraphFactory.open(configuration)
> Graph g = graph.facade(GiraphGraphComputer)
> g.V() …
>
> We could do this, but we are interested in a level of indirection from the
> Graph API for the following reasons:
>
>         1. GremlinServer can expose TraversalContexts and thus, remote users
>            only have what is available via the context.
>                 - e.g. no Graph.close(), Graph.io(), Graph.V() when you want
>                   them to use a particular DSL for your system.
>         2. Graph no longer has V() and E() methods… that is a GraphTraversal
>            (DSL) concept.
>                 - This makes providing other DSLs as natural as providing
>                   GraphTraversals. GraphTraversal isn't "special."
>
> Thoughts?,
> Marko.
>
> http://markorodriguez.com

Re: On the Concept of TraversalContext

Posted by Marko Rodriguez <ok...@gmail.com>.

Hi Matt,

Can you explain a bit more of what you mean? Do you mean:

Graph graph = GraphFactory.open(configuration)
Graph g = graph.facade(GiraphGraphComputer)
g.V() …

We could do this, but we are interested in a level of indirection from the Graph API for the following reasons:

	1. GremlinServer can expose TraversalContexts and thus, remote users only have what is available via the context.
		- e.g. no Graph.close(), Graph.io(), Graph.V() when you want them to use a particular DSL for your system.
	2. Graph no longer has V() and E() methods… that is a GraphTraversal (DSL) concept.
		- This makes providing other DSLs as natural as providing GraphTraversals. GraphTraversal isn't "special."

Thoughts?,
Marko. 

http://markorodriguez.com

On Mar 13, 2015, at 9:04 AM, Matt Frantz <ma...@gmail.com> wrote:

> Have you considered a GraphFacade approach (which IS-A Graph), in which a
> Graph acts as it does today, but can spawn new facades, each of which
> internally HAS-A context (graph + DSL + engine + strategies)?  That would
> accomplish the separation that you seek, but would simplify the simplest
> use case.

Re: On the Concept of TraversalContext

Posted by Matt Frantz <ma...@gmail.com>.

Have you considered a GraphFacade approach (which IS-A Graph), in which a
Graph acts as it does today, but can spawn new facades, each of which
internally HAS-A context (graph + DSL + engine + strategies)?  That would
accomplish the separation that you seek, but would simplify the simplest
use case.

Re: On the Concept of TraversalContext

Posted by Stephen Mallette <sp...@gmail.com>.

That hasn't been completely determined yet, but I think the configuration
of the server would have to change to allow hosting of Graphs and, for each
Graph, the hosting of one or more GraphTraversalContexts.  I think we would
then want further configuration to say whether the Graph instance is exposed
as a binding to the ScriptEngine.  I think that should be optional, as not
everyone will want to hide their Graph from view.
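
Since the server design is still open, the following is only a Java sketch of the optional-binding idea: the contexts for a hosted graph are always bound, and the raw graph is bound only when a flag says so. SketchServerBindings, bindingsFor and the exposeGraph flag are all hypothetical and do not correspond to any existing Gremlin Server setting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that builds script-engine bindings for one hosted graph. */
final class SketchServerBindings {
    private SketchServerBindings() {}

    /**
     * Always binds the given contexts; binds the raw graph under graphName
     * only when exposeGraph is true.
     */
    static Map<String, Object> bindingsFor(String graphName, Object graph,
                                           Map<String, Object> contexts,
                                           boolean exposeGraph) {
        Map<String, Object> bindings = new LinkedHashMap<>(contexts);
        if (exposeGraph) {
            bindings.put(graphName, graph);
        }
        return bindings;
    }
}
```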

On Tue, Mar 10, 2015 at 8:19 PM, Matthias Broecheler <me...@matthiasb.com>
wrote:

> Hi Marko,
>
> how does this play with Gremlin-Server? In Gremlin-Shell you would probably
> have a shortcut of some sort:
>
> g = GraphFactory.open(configuration).standardTraversal()
>
> but how is this additional level accommodated inside the server?
>
> Thanks,
> Matthias
> --
> Matthias Broecheler
> http://www.matthiasb.com
>

Re: On the Concept of TraversalContext

Posted by Matthias Broecheler <me...@matthiasb.com>.

Hi Marko,

how does this play with Gremlin-Server? In Gremlin-Shell you would probably
have a shortcut of some sort:

g = GraphFactory.open(configuration).standardTraversal()

but how is this additional level accommodated inside the server?

Thanks,
Matthias

-- 
Matthias Broecheler
http://www.matthiasb.com