You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by "Matt Frantz (JIRA)" <ji...@apache.org> on 2015/09/04 20:50:45 UTC
[jira] [Comment Edited] (TINKERPOP3-716) subroutine step

    [ https://issues.apache.org/jira/browse/TINKERPOP3-716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717084#comment-14717084 ] 

Matt Frantz edited comment on TINKERPOP3-716 at 9/4/15 6:50 PM:
----------------------------------------------------------------

The motivation for this RFE derives from my application design, which consists of traversals that are independently specified and tested, and then used to compose other traversals.

Suppose I compose the _Bar_ algorithm from the _Foo_ algorithm.  Each is implemented as a {{Traversal}}.  The Foo algorithm is the _sub-traversal_ and the Bar algorithm is the _super-traversal_.  The relationship is similar to a parent-child traversal relationship, but with additional restrictions on shared state.

There are four principal state variables that we'll consider in the following sections:
* Object (value of the traverser)
* Path
* Sack
* Side-Effects

We seek _encapsulation_ of this state, such that a sub-traversal's state does not blend with the super-traversal's state except as explicitly designated by the traversal author.

*Object*
Encapsulating the logic that modulates Object is already handled via {{map}} and {{flatMap}}.  The {{sub}} step would act similarly.  Since {{flatMap}} is more general, {{sub}} should probably behave as {{flatMap}}.

*Path*
Path encapsulation involves the step label namespace and the visibility of the path at both ends of the sub-traversal.  The guiding principles for proper encapsulation are as follows:
* A sub-traversal should (by default) have no visibility into the path of the super-traversal.
* A super-traversal should (by default) have no visibility into the path of the sub-traversal.
* A super-traversal MAY provide visibility into specific step labels of the super-traversal when invoking the sub-traversal.
* A sub-traversal MAY require visibility into specific step labels of the super-traversal through a binding/renaming option.
* A sub-traversal MAY provide visibility into specific step labels of the sub-traversal through an exporting option.
* A super-traversal MAY require visibility into specific step labels of the sub-traversal through a binding/renaming option.

These operations can be efficient, since they involve labeling steps, and possibly constructing a {{Path}} "view" with limited visibility into the underlying "real path".

_Path Importing_
No matter what Foo (the sub-traversal) does, it should not have access to the path of Bar (the super-traversal), unless Bar exports it.  That is, Foo could require explicitly importing a portion of Bar's path using something like a {{withPath}} modifier of the {{sub}} step.

{code}
Traversal computeFoo() {
  // Use path steps 'x', 'y', 'z' provided from super-traversal.
  Traversal t = select('x', 'y', 'z');
  ...
  return t;
}

Traversal computeBar() {
  Traversal t = ...;
  // Provide Foo with the x-y-z path it requires from our own a-b-c steps.
  t.sub(computeFoo())
    .withPath('x', 'y', 'z')
    .by(select('a'))
    .by(select('b'))
    .by(select('c'));
  ...
  return t;
}
{code}

_Path Exporting_
No matter what Bar (the super-traversal) does, it should not have access to labeled steps of Foo (the sub-traversal), unless Foo exports them.  That is, Foo could explicitly export a portion of its path using something like a {{deselect}} (a.k.a. {{makePath}}) step.

{code}
Traversal computeFoo() {
  Traversal t = ...;
  return t.deselect('x', 'y');
}

Traversal computeBar() {
  Traversal t = ...as('a');
  t.sub(computeFoo()).as('foo');
  t.path();  // would see steps 'a', 'foo.x', and 'foo.y'
  t.select('a', 'foo.x', 'foo.y');  // OK
  return t;
}
{code}

*Sack*
The sack feature is useful, but it is limited because the sack type can only be specified at the source.  Here is a use case that involves a "scoped sack" implementation.  The "Foo" algorithm uses sack with a {{FooSack}} instance.  However, "Bar" also wants to use sack, but wants to use its own {{BarSack}} instance.

{code}
Traversal computeFoo() {
  return __.withSack(new FooSack())...;
}

Traversal computeBar() {
  Traversal bar = __.withSack(new BarSack())...;
  bar.sub(computeFoo())...;
  return bar;
}
{code}

Thus, this issue is in part about being able to perform the {{withSack}} step on some interior traversal.

*Side-Effect*
As a global namespace, the side-effects are powerful, but it can also be difficult to coordinate partitioning of this global namespace in a modular traversal.  The {{sub}} step would enforce encapsulation of the side-effect namespace by default.  Each {{sub}} step would effectively provide a new {{SideEffects}} object.

{code}
Traversal computeFoo() {
  return __...store('a')...;
}

Traversal computeBar() {
  Traversal t = ...;
  t.sub(computeBar());
  ...
  t.store('a');  // This is NOT the same 'a' that Foo sees!
  ...
  return t;
}
{code}

You could selectively populate the side-effect of the sub-traversal using {{withSideEffect}} as a modifier.

{code}
Traversal computeFoo() {
   return __...select('a')...;  // Access side-effect 'a' which must be provided by super-traversal.
}

Traversal computeBar() {
  Traversal t = ...;
  // Provide Foo with its required side-effect key 'a' from our own 'x'.
  t.sub(computeBar())
    .withSideEffect('a', select('x'));
  ...
  return t;
}
{code}

If you want to allow exporting of side-effects directly from the sub-traversal into the super-traversal, then something like {{deselect}} or {{makeSideEffects}} could be used.

{code}
Traversal computeFoo() {
  Traversal t = ...;
  t.store('x')...store('y')...;
  return t.deselect('x', 'y');
}

Traversal computeBar() {
  Traversal t = ...as('a');
  t.sub(computeFoo()).as('foo');

  // OK, since we resolve from our own path for 'a', plus scoped, exported side-effect of Foo.
  t.select('a', 'foo.x', 'foo.y');

  return t;
}
{code}



was (Author: mhfrantz):
The motivation for this RFE derives from my application design, which consists of traversals that are independently specified and tested, and then used to compose other traversals.

Suppose I compose the _Bar_ algorithm from the _Foo_ algorithm.  Each is implemented as a {{Traversal}}.  The Foo algorithm is the _sub-traversal_ and the Bar algorithm is the _super-traversal_.  The relationship is similar to a parent-child traversal relationship, but with additional restrictions on shared state.

There are four principle state variables that we'll consider in the following sections:
* Object (value of the traverser)
* Path
* Sack
* Side-Effects

We seek _encapsulation_ of this state, such that a sub-traversal's state does not blend with the super-traversal's state except as explicitly designated by the traversal author.

*Object*
Encapsulating the logic that modulates Object is already handled via {{map}} and {{flatMap}}.  The {{sub}} step would act similarly.  Since {{flatMap}} is more general, {{sub}} should probably behave as {{flatMap}}.

*Path*
Path encapsulation involves the step label namespace and the visibility of the path at both ends of the sub-traversal.  The guiding principles for proper encapsulation are as follows:
* A sub-traversal should (by default) have no visibility into the path of the super-traversal.
* A super-traversal should (by default) have no visibility into the path of the sub-traversal.
* A super-traversal MAY provide visibility into specific step labels of the super-traversal when invoking the sub-traversal.
* A sub-traversal MAY require visibility into specific step labels of the super-traversal through a binding/renaming option.
* A sub-traversal MAY provide visibility into specific step labels of the sub-traversal through an exporting option.
* A super-traversal MAY require visibility into specific step labels of the sub-traversal through a binding/renaming option.

These operations can be efficient, since they involve labeling steps, and possibly constructing a {{Path}} "view" with limited visibility into the underlying "real path".

_Path Importing_
No matter what Foo (the sub-traversal) does, it should not have access to the path of Bar (the super-traversal), unless Bar exports it.  That is, Foo could require explicitly importing a portion of Bar's path using something like a {{withPath}} modifier of the {{sub}} step.

{code}
Traversal computeFoo() {
  // Use path steps 'x', 'y', 'z' provided from super-traversal.
  Traversal t = select('x', 'y', 'z');
  ...
  return t;
}

Traversal computeBar() {
  Traversal t = ...;
  // Provide Foo with the x-y-z path it requires from our own a-b-c steps.
  t.sub(computeFoo())
    .withPath('x', 'y', 'z')
    .by(select('a'))
    .by(select('b'))
    .by(select('c'));
  ...
  return t;
}
{code}

_Path Exporting_
No matter what Bar (the super-traversal) does, it should not have access to labeled steps of Foo (the sub-traversal), unless Foo exports them.  That is, Foo could explicitly export a portion of its path using something like a {{deselect}} (a.k.a. {{makePath}}) step.

{code}
Traversal computeFoo() {
  Traversal t = ...;
  return t.deselect('x', 'y');
}

Traversal computeBar() {
  Traversal t = ...as('a');
  t.sub(computeFoo()).as('foo');
  t.path();  // would see steps 'a', 'foo.x', and 'foo.y'
  t.select('a', 'foo.x', 'foo.y');  // OK
  return t;
}
{code}

*Sack*
The sack feature is useful, but it is limited because the sack type can only be specified at the source.  Here is a use case that involves a "scoped sack" implementation.  The "Foo" algorithm uses sack with a {{FooSack}} instance.  However, "Bar" also wants to use sack, but wants to use its own {{BarSack}} instance.

{code}
Traversal computeFoo() {
  return __.withSack(new FooSack())...;
}

Traversal computeBar() {
  Traversal bar = __.withSack(new BarSack())...;
  bar.sub(computeFoo())...;
  return bar;
}
{code}

Thus, this issue is in part about being able to perform the {{withSack}} step on some interior traversal.

*Side-Effect*
As a global namespace, the side-effects are powerful, but it can also be difficult to coordinate partitioning of this global namespace in a modular traversal.  The {{sub}} step would enforce encapsulation of the side-effect namespace by default.  Each {{sub}} step would effectively provide a new {{SideEffects}} object.

{code}
Traversal computeFoo() {
  return __...store('a')...;
}

Traversal computeBar() {
  Traversal t = ...;
  t.sub(computeBar());
  ...
  t.store('a');  // This is NOT the same 'a' that Foo sees!
  ...
  return t;
}
{code}

You could selectively populate the side-effect of the sub-traversal using {{withSideEffect}} as a modifier.

{code}
Traversal computeFoo() {
   return __...select('a')...;  // Access side-effect 'a' which must be provided by super-traversal.
}

Traversal computeBar() {
  Traversal t = ...;
  // Provide Foo with its required side-effect key 'a' from our own 'x'.
  t.sub(computeBar())
    .withSideEffect('a', select('x'));
  ...
  return t;
}
{code}

If you want to allow exporting of side-effects directly from the sub-traversal into the super-traversal, then something like {{deselect}} or {{makeSideEffects}} could be used.

{code}
Traversal computeFoo() {
  Traversal t = ...;
  t.store('x')...store('y')...;
  return t.deselect('x', 'y');
}

Traversal computeBar() {
  Traversal t = ...as('a');
  t.sub(computeFoo()).as('foo');

  // OK, since we resolve from our own path for 'a', plus scoped, exported side-effect of Foo.
  t.select('a', 'foo.x', 'foo.y');

  return t;
}
{code}


> subroutine step
> ---------------
>
>                 Key: TINKERPOP3-716
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-716
>             Project: TinkerPop 3
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.0.1-incubating
>            Reporter: Matt Frantz
>            Assignee: Marko A. Rodriguez
>
> In our application, we build complex traversals.  We use a modular technique in which lower-level traversals are used to compose higher-level traversals.  For example, we might have an API like this:
> {code}
> void computeFoo(GraphTraversal t);
> {code}
> The assumption is that the "root" traversal has some state in the form of its traverser and path (and possible sack and side effects, although we're not using those at the moment).  Importantly, all of this state is not fully realized, as the goal is to build another traversal before executing it all at once.  To use the "foo" traversal in the "bar" traversal, I might do this:
> {code}
> void computeBar(GraphTraversal t) {
>   // Perform some part of the "bar" calculation, including setting up the parameters of "foo"
>   t.out()...;
>   // Attach the "foo" logic.
>   computeFoo(t);
>   // Continue with more "bar" logic.
>   t.blah()...;
> }
> {code}
> In the implementation of these modular traversals, we end up exposing certain implementation details, especially in the form of labeled steps (path keys).  It is often the case that we don't necessarily want to leak this state.  We often want these traversals to operate as if they were single steps, discarding any evidence of graph wandering that might have occurred.
> What if there were a step which acted as a "stack" for the path?  (It might also act as a stack for the sack; we figured we would need that if we ever decide to use sack.)  Not only would this provide encapsulation, it would allow the path to be pruned and thus improve runtime efficiency.
> We want traversal subroutines.  How about this signature?
> {code}
> GraphTraversal sub(Traversal subTraversal);
> {code}
> What this would do is guarantee that no path state (or sack state) would escape.  The only effect would be on the traverser.  I think it could be implemented using a {{MapStep}}, or possibly {{FlatMapStep}} in order to realize a subTraversal that branches.
> What state would subTraversal have access to?  By default, it could only see the traverser.  It has its own key namespace for the path, so it cannot see the outer path.  It has its own sack.  It should probably have its own side effect key namespace, too.
> If you want to opt-in state into the subTraversal, then perhaps you could list the keys to form a pseudo-path:
> {code}
> GraphTraversal sub(Traversal subTraversal, String...stepOrSideEffectLabels);
> {code}
> One of the interesting things that this might provide (eventually) is the ability to hang OLTP traversal off of OLAP traversals.  I had discussed this need a while ago in the context of vertices that act as OLAP "computation loci".  A "sub" could use a different engine, and thus embed a standard traversal within a computer traversal.
> This would change our modular programming paradigm.  We could safely redesign each module to produce an anonymous traversal like so:
> {code}
> GraphTraversal computeFoo();
> GraphTraversal computeBar() {
>   // Perform some part of the "bar" calculation, including setting up the parameters of "foo"
>   GraphTraversal t = __.out()...;
>   // Attach the "foo" logic.
>   t.sub(computeFoo(t));
>   // Continue with more "bar" logic.
>   return t.in()...;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)