Posted to dev@tinkerpop.apache.org by "Ted Wilmes (JIRA)" <ji...@apache.org> on 2016/06/16 22:24:05 UTC

[jira] [Comment Edited] (TINKERPOP-1254) Support dropping traverser path information when it is no longer needed.

    [ https://issues.apache.org/jira/browse/TINKERPOP-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334745#comment-15334745 ] 

Ted Wilmes edited comment on TINKERPOP-1254 at 6/16/16 10:23 PM:
-----------------------------------------------------------------

[~okram] I pushed a very, very rough version.  It's not ready for a PR yet, as I need to clean up the code and add more tests.  Having said that, if you're curious and want to take a look at it in its current state, I'd appreciate any feedback.  I think I've captured the basic ideas.  The diff can be seen at https://github.com/apache/tinkerpop/compare/master...TINKERPOP-1254  

I was wondering what your thoughts were on unit testing the strategy application.  It's different from the current strategy tests because I'm not rewriting the traversal, but instead setting the keep labels on certain steps.  My simple idea was to just add a {{getKeepLabels}} method to the {{PathProcessor}} interface so I could introspect for my unit tests, but I thought you might have a better idea.
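
To make the testing question concrete, here is a minimal sketch of what an introspection-style check could look like. It is not TinkerPop code and not the code on the branch: {{SketchStep}}, {{KeepLabelsSketch}}, and the backward pass are hypothetical stand-ins, and only the accessor name {{getKeepLabels}} is taken from the comment above. It assumes that a step's keep labels are the labels some later step still reads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a traversal step; not a TinkerPop class.
final class SketchStep {
    final String name;
    final Set<String> requiredLabels;          // labels this step reads, e.g. where(neq('a'))
    Set<String> keepLabels = new HashSet<>();  // filled in by the strategy sketch below

    SketchStep(String name, String... requiredLabels) {
        this.name = name;
        this.requiredLabels = new HashSet<>(Arrays.asList(requiredLabels));
    }

    // Accessor a unit test could use to inspect what the strategy decided.
    Set<String> getKeepLabels() {
        return keepLabels;
    }
}

// Hypothetical strategy: annotates steps instead of rewriting the traversal.
final class KeepLabelsSketch {
    // Walk backwards so each step keeps only labels that a later step still reads.
    static void apply(List<SketchStep> steps) {
        Set<String> neededLater = new HashSet<>();
        for (int i = steps.size() - 1; i >= 0; i--) {
            SketchStep step = steps.get(i);
            step.keepLabels = new HashSet<>(neededLater);
            neededLater.addAll(step.requiredLabels);
        }
    }

    // Models g.V().as('a').out().as('b').out().where(neq('a').and().neq('b')).both().name
    static List<SketchStep> example() {
        List<SketchStep> steps = new ArrayList<>(Arrays.asList(
                new SketchStep("V"), new SketchStep("out"), new SketchStep("out"),
                new SketchStep("where", "a", "b"), new SketchStep("both"),
                new SketchStep("name")));
        apply(steps);
        return steps;
    }

    public static void main(String[] args) {
        for (SketchStep step : example()) {
            System.out.println(step.name + " keeps " + step.getKeepLabels());
        }
    }
}
```

A test written this way asserts on the annotations (for example, that the {{where}} step keeps nothing while the {{out}} step before it keeps both labels) rather than comparing a rewritten traversal against an expected one.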



> Support dropping traverser path information when it is no longer needed.
> ------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1254
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1254
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.1-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Ted Wilmes
>
> The most expensive traversals (especially in OLAP) are those that cannot be "bulked." There are various reasons why two traversers at the same object cannot be bulked, but the primary reason is {{PATH}} or {{LABELED_PATH}}. That is, when the history of the traverser is required, the probability of two traversers having the same history is low.
> A key to making traversals more efficient is to do as much as possible to remove historic information from a traverser so it can get bulked. How does one do this? 
> {code}
> g.V().as('a').out().as('b').out().where(neq('a').and().neq('b')).both().name
> {code}
> The {{LABELED_PATH}} entries for "a" and "b" are required up to the {{where()}}; from {{both()}} onward they are no longer needed. It would be smart to support:
> {code}
> traverser.dropLabels(Set<String>)
> traverser.dropPath()
> {code}
> We would then, via a {{TraversalOptimizationStrategy}}, insert a step between {{where()}} and {{both()}} called {{PathPruneStep}}, which would be a {{SideEffectStep}}. The strategy would know which labels are no longer needed (via forward lookahead) and then do:
> {code}
> public class PathPruneStep<S> extends SideEffectStep<S> {
>   final Set<String> dropLabels = ...
>   final boolean dropPath = ...
>   // dropPath() and dropLabels() are the traverser methods proposed above
>   protected void sideEffect(final Traverser.Admin<S> traverser) {
>     if(this.dropPath) traverser.dropPath();
>     else traverser.dropLabels(this.dropLabels);
>   }
> }
> {code}
> Again, the more we can prune historic path data no longer needed, the higher the probability of bulking. Think about this in terms of {{match()}}.
> {code}
> g.V().match(
>   __.as("a").out().as("b"),
>   __.as("b").out().as("c"),
>   where("c", neq("a")),
>   __.as("c").out().as("b")
> ).select("a")
> {code}
> All we need is "a" at the end. Thus, once a pattern has been passed and no future patterns require that label, drop it! 
> This idea is related to TINKERPOP-331, but I don't think we should deal with manipulating the species. Thus, I think 331 is too "low level."
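
The effect described in the issue, that retained labels prevent bulking, can be illustrated with a small standalone sketch. Everything in it is hypothetical ({{BulkingSketch}}, its nested {{Traverser}}, {{bulk}}, and {{prune}} are not TinkerPop classes or methods); it assumes two traversers can be merged only when they sit at the same location and carry identical retained labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model of bulking; not TinkerPop code.
final class BulkingSketch {

    // A traverser reduced to its current location and its labeled history.
    static final class Traverser {
        final String location;
        final Map<String, String> labels;

        Traverser(String location, Map<String, String> labels) {
            this.location = location;
            this.labels = new TreeMap<>(labels);  // sorted copy gives a stable merge key
        }

        // Returns a copy of this traverser without the given labels.
        Traverser dropLabels(Set<String> drop) {
            Map<String, String> kept = new TreeMap<>(labels);
            kept.keySet().removeAll(drop);
            return new Traverser(location, kept);
        }
    }

    // Merge traversers with equal location and equal labels; the value is the bulk.
    static Map<String, Long> bulk(List<Traverser> traversers) {
        Map<String, Long> bulks = new LinkedHashMap<>();
        for (Traverser t : traversers) {
            bulks.merge(t.location + t.labels, 1L, Long::sum);
        }
        return bulks;
    }

    // Drop the given labels from every traverser.
    static List<Traverser> prune(List<Traverser> traversers, Set<String> drop) {
        List<Traverser> pruned = new ArrayList<>();
        for (Traverser t : traversers) {
            pruned.add(t.dropLabels(drop));
        }
        return pruned;
    }

    private static Map<String, String> history(String a, String b) {
        Map<String, String> labels = new TreeMap<>();
        labels.put("a", a);
        labels.put("b", b);
        return labels;
    }

    // Three traversers at the same vertex that arrived via different "a"/"b" bindings.
    static List<Traverser> example() {
        return Arrays.asList(
                new Traverser("v5", history("v1", "v2")),
                new Traverser("v5", history("v1", "v3")),
                new Traverser("v5", history("v4", "v3")));
    }

    public static void main(String[] args) {
        List<Traverser> ts = example();
        Set<String> drop = new HashSet<>(Arrays.asList("a", "b"));
        System.out.println("distinct before pruning: " + bulk(ts).size());
        System.out.println("distinct after pruning:  " + bulk(prune(ts, drop)).size());
    }
}
```

With the labels retained, the three traversers in {{example()}} stay separate because their histories differ; once "a" and "b" are dropped they collapse into a single entry whose bulk is 3.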


