Posted to issues@tez.apache.org by "Zhiyuan Yang (JIRA)" <ji...@apache.org> on 2017/04/04 21:34:41 UTC

[jira] [Commented] (TEZ-3654) CartesianProduct edge won't work with GroupInputEdge

    [ https://issues.apache.org/jira/browse/TEZ-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955891#comment-15955891 ] 

Zhiyuan Yang commented on TEZ-3654:
-----------------------------------

Thanks [~sseth] for the review! I've uploaded a new patch to address your comments.

{quote}
In VertexManager -> inputGroups.put(group.getGroupName(), group.getGroupVertices());
getGroupVertices should be cloned, to prevent plugins from changing these.
{quote}
I've changed getGroupVertices to return an unmodifiable list.
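
A minimal sketch of the idea, assuming a simplified stand-in class (GroupSketch and its fields are hypothetical and are not the actual Tez classes or the code in the patch). It shows how returning a read-only view keeps callers such as plugins from changing the underlying list:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a vertex group; not the actual Tez class.
final class GroupSketch {
    private final String groupName;
    private final List<String> groupVertices;

    GroupSketch(String groupName, List<String> vertices) {
        this.groupName = groupName;
        // Copy defensively so later changes to the caller's list do not leak in.
        this.groupVertices = new ArrayList<>(vertices);
    }

    String getGroupName() {
        return groupName;
    }

    // Returns a read-only view: mutators such as add() or remove()
    // throw UnsupportedOperationException.
    List<String> getGroupVertices() {
        return Collections.unmodifiableList(groupVertices);
    }
}
```

Compared with cloning on every call, the unmodifiable view avoids a copy per access while still rejecting any attempt to modify the list through the getter.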

{quote}
CartesianProductVertexManagerConfig: Why remove the getters and setters in favor of package level direct access?
{quote}
Unlike CartesianProductConfig, which is available to end users, CartesianProductVertexManagerConfig is only used internally by vertex managers and tests. Removing the getters/setters makes the code more concise.

{quote}
Think we need to rename several parameters, and add a bunch of documentation.
{quote}
Extensive renaming has been done in the new patch to improve readability.


> CartesianProduct edge won't work with GroupInputEdge
> ----------------------------------------------------
>
>                 Key: TEZ-3654
>                 URL: https://issues.apache.org/jira/browse/TEZ-3654
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Zhiyuan Yang
>            Assignee: Zhiyuan Yang
>         Attachments: TEZ-3654.1.patch, TEZ-3654.2.patch, TEZ-3654.3.patch, TEZ-3654.4.patch
>
>
> If a vertex group is used as the source of a cartesian product, it expands into multiple vertices that share the same edge properties, and each CP edge is treated as an individual cartesian product source by the CP vertex manager. The CP vertex manager then finds more CP edges than expected and aborts the AM.
> Ideally the group edge semantics should be fixed: both the task and the vertex manager should see the same number of sources; also, not every edge can simply be duplicated.
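
The mismatch described in the issue can be illustrated with a small sketch. Everything here is hypothetical (SourceCountSketch, its methods, and the map layout are made up for illustration and are not Tez APIs): the map goes from source vertex name to group name, with null meaning the vertex is not part of a group.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the Tez vertex manager.
final class SourceCountSketch {
    // Counts every incoming edge as its own source, which is the
    // behaviour the issue describes for an expanded vertex group.
    static int countPerEdge(Map<String, String> vertexToGroup) {
        return vertexToGroup.size();
    }

    // Counts each vertex group once; vertices with a null group
    // are counted individually.
    static int countPerLogicalSource(Map<String, String> vertexToGroup) {
        Set<String> groups = new HashSet<>();
        int ungrouped = 0;
        for (String group : vertexToGroup.values()) {
            if (group == null) {
                ungrouped++;
            } else {
                groups.add(group);
            }
        }
        return groups.size() + ungrouped;
    }
}
```

With vertices v1 and v2 in group g1 plus an ungrouped v3, the per-edge count is 3 while the logical source count is 2, which is the kind of discrepancy that makes the vertex manager see more sources than expected.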


