You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@tinkerpop.apache.org by Matt Frantz <ma...@gmail.com> on 2015/06/05 18:40:26 UTC

Subroutines

In our application, we build complex traversals.  We use a modular
technique in which lower-level traversals are used to compose higher-level
traversals.  For example, we might have an API like this:

void computeFoo(GraphTraversal t);

The assumption is that the "root" traversal has some state in the form of
its traverser and path (and possible sack and side effects, although we're
not using those at the moment).  Importantly, all of this state is not
fully realized, as the goal is to build another traversal before executing
it all at once.  To use the "foo" traversal in the "bar" traversal, I might
do this:

void computeBar(GraphTraversal t) {
  // Perform some part of the "bar" calculation, including setting up the
parameters of "foo"
  t.out()...;
  // Attach the "foo" logic.
  computeFoo(t);
  // Continue with more "bar" logic.
  t.blah()...;
}

In the implementation of these modular traversals, we end up exposing
certain implementation details, especially in the form of labeled steps
(path keys).  It is often the case that we don't necessarily want to leak
this state.  We often want these traversals to operate as if they were
single steps, discarding any evidence of graph wandering that might have
occurred.

What if there were a step which acted as a "stack" for the path?  (It might
also act as a stack for the sack; we figured we would need that if we ever
decide to use sack.)  Not only would this provide encapsulation, it would
allow the path to be pruned and thus improve runtime efficiency.

We want traversal subroutines.  How about this signature?

GraphTraversal sub(Traversal subTraversal);

What this would do is guarantee that no path state (or sack state) would
escape.  The only effect would be on the traverser.  I think it could be
implemented using a MapStep, or possibly FlatMapStep in order to realize a
subTraversal that branches.

What state would subTraversal have access to?  By default, it could only
see the traverser.  It has its own key namespace for the path, so it cannot
see the outer path.  It has its own sack.  It should probably have its own
side effect key namespace, too.

If you want to opt-in state into the subTraversal, then perhaps you could
list the keys to form a pseudo-path:

GraphTraversal sub(Traversal subTraversal, String...stepOrSideEffectLabels);

One of the interesting things that this might provide is the ability to
hang OLTP traversal off of OLAP traversals.  I had discussed this need a
while ago in the context of vertices that act as OLAP "computation loci".
A "sub" could use a different engine, and thus embed a standard traversal
within a computer traversal.

This would change our modular programming paradigm.  We could safely
redesign each module to produce an anonymous traversal like so:

GraphTraversal computeFoo();
GraphTraversal computeBar() {
  // Perform some part of the "bar" calculation, including setting up the
parameters of "foo"
  GraphTraversal t = __.out()...;
  // Attach the "foo" logic.
  t.sub(computeFoo(t));
  // Continue with more "bar" logic.
  return t.in()...;
}

Thoughts?

Re: Subroutines

Posted by Matt Frantz <ma...@gmail.com>.

https://issues.apache.org/jira/browse/TINKERPOP3-716

On Fri, Jun 5, 2015 at 9:40 AM, Matt Frantz <ma...@gmail.com>
wrote:

> In our application, we build complex traversals.  We use a modular
> technique in which lower-level traversals are used to compose higher-level
> traversals.  For example, we might have an API like this:
>
> void computeFoo(GraphTraversal t);
>
> The assumption is that the "root" traversal has some state in the form of
> its traverser and path (and possible sack and side effects, although we're
> not using those at the moment).  Importantly, all of this state is not
> fully realized, as the goal is to build another traversal before executing
> it all at once.  To use the "foo" traversal in the "bar" traversal, I might
> do this:
>
> void computeBar(GraphTraversal t) {
>   // Perform some part of the "bar" calculation, including setting up the
> parameters of "foo"
>   t.out()...;
>   // Attach the "foo" logic.
>   computeFoo(t);
>   // Continue with more "bar" logic.
>   t.blah()...;
> }
>
> In the implementation of these modular traversals, we end up exposing
> certain implementation details, especially in the form of labeled steps
> (path keys).  It is often the case that we don't necessarily want to leak
> this state.  We often want these traversals to operate as if they were
> single steps, discarding any evidence of graph wandering that might have
> occurred.
>
> What if there were a step which acted as a "stack" for the path?  (It
> might also act as a stack for the sack; we figured we would need that if we
> ever decide to use sack.)  Not only would this provide encapsulation, it
> would allow the path to be pruned and thus improve runtime efficiency.
>
> We want traversal subroutines.  How about this signature?
>
> GraphTraversal sub(Traversal subTraversal);
>
> What this would do is guarantee that no path state (or sack state) would
> escape.  The only effect would be on the traverser.  I think it could be
> implemented using a MapStep, or possibly FlatMapStep in order to realize a
> subTraversal that branches.
>
> What state would subTraversal have access to?  By default, it could only
> see the traverser.  It has its own key namespace for the path, so it cannot
> see the outer path.  It has its own sack.  It should probably have its own
> side effect key namespace, too.
>
> If you want to opt-in state into the subTraversal, then perhaps you could
> list the keys to form a pseudo-path:
>
> GraphTraversal sub(Traversal subTraversal,
> String...stepOrSideEffectLabels);
>
> One of the interesting things that this might provide is the ability to
> hang OLTP traversal off of OLAP traversals.  I had discussed this need a
> while ago in the context of vertices that act as OLAP "computation loci".
> A "sub" could use a different engine, and thus embed a standard traversal
> within a computer traversal.
>
> This would change our modular programming paradigm.  We could safely
> redesign each module to produce an anonymous traversal like so:
>
> GraphTraversal computeFoo();
> GraphTraversal computeBar() {
>   // Perform some part of the "bar" calculation, including setting up the
> parameters of "foo"
>   GraphTraversal t = __.out()...;
>   // Attach the "foo" logic.
>   t.sub(computeFoo(t));
>   // Continue with more "bar" logic.
>   return t.in()...;
> }
>
> Thoughts?
>
>