Posted to issues@tez.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2015/04/02 19:53:53 UTC

[jira] [Commented] (TEZ-1190) Allow multiple edges between two vertexes

    [ https://issues.apache.org/jira/browse/TEZ-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393033#comment-14393033 ] 

Rohini Palaniswamy commented on TEZ-1190:
-----------------------------------------

 In the past two weeks I have encountered a couple of queries whose performance suffers because of this. Currently we write the data out to an extra dummy vertex to avoid multiple edges, and this adds overhead. The common patterns are:
    1) People split the data, perform some foreach transformations/filters, union the results and then do some operation like group by or join with other data.
    2) People split the data, perform some foreach transformations/filters and self join the results. No union in this case.
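
To picture the dummy-vertex workaround for pattern 2, here is a rough sketch using a toy model in Python. It is not the Tez or Pig API: the class ToyDag, its methods and the vertex names are all made up for illustration, and the only assumption taken from this issue is that a second edge between the same two vertexes is rejected.

```python
# Toy model only: NOT the Tez API. All names here are hypothetical.
class ToyDag:
    """A DAG that allows at most one edge per (source, target) pair."""

    def __init__(self):
        self.edges = {}  # (source, target) -> edge label

    def add_edge(self, source, target, label):
        # Mirrors the restriction discussed in this issue.
        if (source, target) in self.edges:
            raise ValueError(
                f"multiple edges between {source} and {target} are not allowed"
            )
        self.edges[(source, target)] = label

    def add_edge_with_workaround(self, source, target, label):
        """Add the edge directly if possible, else route it via a dummy vertex.

        Returns the name of the dummy vertex, or None if none was needed.
        """
        if (source, target) not in self.edges:
            self.add_edge(source, target, label)
            return None
        dummy = f"dummy_{source}_{target}_{len(self.edges)}"
        self.add_edge(source, dummy, label)           # extra write: source -> dummy
        self.add_edge(dummy, target, label + "_fwd")  # extra hop: dummy -> target
        return dummy


# Pattern 2: split the data, transform each branch, then self join.
dag = ToyDag()
first = dag.add_edge_with_workaround("split", "join", "branch_a")
second = dag.add_edge_with_workaround("split", "join", "branch_b")
vertices = {v for edge in dag.edges for v in edge}
```

In this sketch the second branch costs one extra vertex and two edges instead of one, which is the overhead described above.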

Vertex groups accept multiple edges from the same vertex, so we can optimize the multi-query planning for 1) when we know there is a vertex group. Can we rely on that behavior not changing?

> Allow multiple edges between two vertexes
> -----------------------------------------
>
>                 Key: TEZ-1190
>                 URL: https://issues.apache.org/jira/browse/TEZ-1190
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Daniel Dai
>
> This will be helpful in some scenarios. In one particular example, we can merge two small pipelines together into one pair of vertexes. Note that the edge types between the two vertexes may be different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)