Posted to dev@tinkerpop.apache.org by "Ran Magen (JIRA)" <ji...@apache.org> on 2015/08/13 11:49:45 UTC
[jira] [Commented] (TINKERPOP3-702) Buffer input to inner traversals

    [ https://issues.apache.org/jira/browse/TINKERPOP3-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694969#comment-14694969 ] 

Ran Magen commented on TINKERPOP3-702:
--------------------------------------

Hey [~okram],  
I don't completely understand why local traversals are needed to reduce memory. Isn't this something that should be handled by the steps inside the traversal? If a step "over-reaches" by draining its entire input, like this:
{code:java}
while(this.starts.hasNext()) { doSomething(this.starts.next()); }
{code}
then there will probably be a memory issue.  
But if the steps act "responsibly" by doing:  
{code:java}
int count = 0;
while (this.starts.hasNext() && count < REASONABLE_BULK_SIZE) { doSomething(this.starts.next()); count++; }
{code}
then there shouldn't be a problem. What am I missing here?  
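
To make the "responsible" variant concrete, here is a minimal, self-contained sketch of a bounded drain. It deliberately does not use any TinkerPop classes: {{BoundedBatcher}}, {{nextBatch}} and {{maxBatchSize}} are hypothetical names chosen for illustration, and a plain {{java.util.Iterator}} stands in for {{this.starts}}.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of TinkerPop: pulls at most maxBatchSize
// items from an upstream iterator so they can be queried together.
final class BoundedBatcher {
    private BoundedBatcher() {}

    static <T> List<T> nextBatch(Iterator<T> starts, int maxBatchSize) {
        if (maxBatchSize <= 0) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<T> batch = new ArrayList<>();
        // Stop at the cap even if more input is available; the rest stays upstream.
        while (batch.size() < maxBatchSize && starts.hasNext()) {
            batch.add(starts.next());
        }
        return batch;
    }
}
```

With this shape the memory held by the step is bounded by the batch size rather than by the size of the input, while a single backend query can still cover the whole batch. It only helps, of course, if the upstream iterator actually has more than one element available when the step asks for it.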

Why should something like {{g.V().repeat(out()).times(4)}} be more memory-limiting than {{g.V().out().out().out().out()}}? The same goes for {{CoalesceStep}}, {{UnionStep}}, etc.
Bulking a reasonable number of requests to the backend storage makes a HUGE difference in performance. I believe the current balance between bulking and a small memory footprint is skewed too far towards the small memory footprint. Shouldn't vendors have the option to decide what's more important for them?

> Buffer input to inner traversals
> --------------------------------
>
>                 Key: TINKERPOP3-702
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-702
>             Project: TinkerPop 3
>          Issue Type: Improvement
>          Components: process
>            Reporter: Ran Magen
>            Assignee: Marko A. Rodriguez
>             Fix For: 3.0.0-incubating
>
>
> In elastic-gremlin we implement an optimized VertexStep. Part of its job is to batch/buffer/bulk different traversers and query them together in order to minimize the number of queries.
> You can see the implementation here: https://github.com/rmagen/elastic-gremlin/blob/master/src/main/java/org/elasticgremlin/process/optimize/ElasticVertexStep.java#L36
> This works great in regular traversals, where the "starts" iterator returns as many traversers as the previous step gave out.
> But when the step is in an innerTraversal (e.g. g.V().repeat(__.out()).times(8)), the "starts" iterator only returns one traverser, and will return the next traverser only in the next call to processNextStart. Thus, there is no way to run a bulk query.


