Posted to dev@tinkerpop.apache.org by "Marko A. Rodriguez (JIRA)" <ji...@apache.org> on 2016/02/01 18:03:39 UTC

[jira] [Commented] (TINKERPOP-962) Provide "vertex query" selectivity when importing data in OLAP.

    [ https://issues.apache.org/jira/browse/TINKERPOP-962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126573#comment-15126573 ] 

Marko A. Rodriguez commented on TINKERPOP-962:
----------------------------------------------

This would be a lot easier and more memory-efficient if the submitted {{Traversal}} filters could only analyze vertex properties/labels/ids for {{vertices()}} and edge properties/labels/ids for {{edges()}}. Perhaps we make that a hard constraint? I'm already thinking that, for providers that want to use this, if the vertex filter is {{outE("know").count().is(gt(10))}}, then it's basically a full graph load :|.
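
To illustrate the proposed constraint, here is a minimal sketch in plain Java. It is not TinkerPop code: {{VertexRecord}} and {{LocalFilterSketch.loadVertices}} are hypothetical stand-ins invented for this example. The point it tries to show is that a filter limited to a vertex's own id, label, and properties can accept or reject each vertex as it streams in, without touching any edge data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for a vertex as read from the input; not a TinkerPop class.
final class VertexRecord {
    final long id;
    final String label;
    final Map<String, Object> properties;

    VertexRecord(long id, String label, Map<String, Object> properties) {
        this.id = id;
        this.label = label;
        this.properties = properties;
    }
}

final class LocalFilterSketch {
    // Makes a single pass over the input and keeps only the vertices the filter accepts.
    // The filter sees id, label, and properties only, so no adjacency is needed to decide.
    static List<VertexRecord> loadVertices(Iterable<VertexRecord> input,
                                           Predicate<VertexRecord> vertexFilter) {
        List<VertexRecord> loaded = new ArrayList<>();
        for (VertexRecord v : input) {
            if (vertexFilter.test(v)) {
                loaded.add(v);
            }
        }
        return loaded;
    }
}
```

A filter such as {{outE("know").count().is(gt(10))}} has no equivalent in this sketch, because deciding it requires the vertex's edges to be read first.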

> Provide "vertex query" selectivity when importing data in OLAP.
> ---------------------------------------------------------------
>
>                 Key: TINKERPOP-962
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-962
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.0-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez
>              Labels: breaking
>             Fix For: 3.2.0-incubating
>
>
> Currently, when you do:
> {code}
> graph.compute().program(PageRankVertexProgram).submit()
> {code}
> We are pulling the entire {{graph}} into the OLAP engine. We should allow the user to limit the amount of data pulled via "vertex query"-type filter. For instance, we could support the following two new methods on {{GraphComputer}}.
> {code}
> graph.compute().program(PageRankVertexProgram).vertices(hasLabel('person')).edges(out, hasLabel('knows','friend').has('weight',gt(0.8))).submit()
> {code}
> The two methods would be defined as:
> {code}
> public interface GraphComputer {
> ...
> GraphComputer vertices(final Traversal<Vertex,Vertex> vertexFilter);
> GraphComputer edges(final Direction direction, final Traversal<Edge,Edge> edgeFilter);
> }
> {code}
> If the user does NOT provide a {{vertices()}} (or {{edges()}}) call, then the {{Traversal}} is assumed to be {{IdentityTraversal}}. Finally, in terms of execution order, the {{vertices()}} filter is evaluated first; if it returns "false", the edge filter is not called. Otherwise, the edge filter is applied to all of the vertex's incoming and outgoing edges. I don't really like {{Direction}} there, and perhaps it's just:
> {code}
> GraphComputer edges(final Traversal<Vertex,Edge> edgeFilter)
> {code}
> And then all edges that pass through are added to the OLAP vertex. You don't want {{both}}? Then it's {{outE('knows','friend').has('weight',gt(0.8))}}.
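
The execution order described in the issue can be sketched roughly as follows. This is plain Java under stated assumptions, not the TinkerPop implementation: {{EdgeRecord}}, {{VertexWithEdges}}, and {{FilterOrderSketch.load}} are hypothetical names, and ordinary predicates stand in for the {{Traversal}} filters. The vertex filter runs first; the edge filter is applied only to the edges of vertices that passed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for this sketch; these are not TinkerPop classes.
final class EdgeRecord {
    final String label;
    final double weight;
    final boolean outgoing; // true if this is an out-edge of the owning vertex

    EdgeRecord(String label, double weight, boolean outgoing) {
        this.label = label;
        this.weight = weight;
        this.outgoing = outgoing;
    }
}

final class VertexWithEdges {
    final String label;
    final List<EdgeRecord> edges; // incident edges, both incoming and outgoing

    VertexWithEdges(String label, List<EdgeRecord> edges) {
        this.label = label;
        this.edges = edges;
    }
}

final class FilterOrderSketch {
    static List<VertexWithEdges> load(List<VertexWithEdges> input,
                                      Predicate<VertexWithEdges> vertexFilter,
                                      Predicate<EdgeRecord> edgeFilter) {
        List<VertexWithEdges> loaded = new ArrayList<>();
        for (VertexWithEdges v : input) {
            if (!vertexFilter.test(v)) {
                continue; // vertex rejected: the edge filter is never called for it
            }
            List<EdgeRecord> kept = new ArrayList<>();
            for (EdgeRecord e : v.edges) {
                if (edgeFilter.test(e)) {
                    kept.add(e); // only edges that pass are attached to the loaded vertex
                }
            }
            loaded.add(new VertexWithEdges(v.label, kept));
        }
        return loaded;
    }
}
```

In this sketch, the direction restriction is folded into the edge predicate (for example, {{e -> e.outgoing && e.label.equals("knows") && e.weight > 0.8}}), which mirrors the idea of dropping the separate {{Direction}} argument.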


