Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2013/08/30 02:59:52 UTC

[jira] [Updated] (TEZ-410) Refactor EdgeProperties to be more clear

     [ https://issues.apache.org/jira/browse/TEZ-410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bikas Saha updated TEZ-410:
---------------------------

    Summary: Refactor EdgeProperties to be more clear  (was: Refactor Edge Connection Pattern to be more clear)
    
> Refactor EdgeProperties to be more clear
> ----------------------------------------
>
>                 Key: TEZ-410
>                 URL: https://issues.apache.org/jira/browse/TEZ-410
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: TEZ-410.1.patch, TEZ-410.2.patch, TEZ-410.3.patch, TEZ-410.4.patch, TEZ-410.5.patch
>
>
> During discussion with users there was feedback that edge properties need to be named better to make them clearer. There was a suggestion to look at MPI for inspiration. Based on that feedback, the proposal is to rename ConnectionPattern to DataMovement, as that is essentially what the property defines. A Bipartite connection pattern can be constructed from both broadcast and scatter-gather data movement types. There will be 3 kinds of data movement initially.
> ONE_TO_ONE - Defines that an output produced by the ith upstream task is available to the ith downstream task.
> BROADCAST - Defines that an output produced by any upstream task is available to all downstream tasks.
> SCATTER_GATHER - Defines that the ith output produced by all upstream tasks is available to the same downstream task. Upstream tasks scatter their outputs and they are gathered by designated downstream tasks.
> To be clear, output being available to a task does not imply that the entire output is transferred to or read by it. The task can choose to read any amount of the total data.
> Current users: In the EdgeProperty object
> Please change EdgeConnectionPattern.BIPARTITE -> DataMovementType.SCATTER_GATHER
> Please change SourceType.STABLE -> DataSourceType.PERSISTED
> Please add SchedulingType.SEQUENTIAL to EdgeProperty objects.
> The getter methods have similar name changes.
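
For illustration only, the three data movement types described above can be sketched as a mapping from an upstream output to the downstream tasks that may read it. This is a standalone sketch, not Tez code: the enum here is a local stand-in for the proposed DataMovementType, and the class DataMovementSketch and the method consumers() are hypothetical names. It also assumes that under SCATTER_GATHER the ith output goes to downstream task i; the description only says "the same downstream task".

```java
import java.util.ArrayList;
import java.util.List;

public class DataMovementSketch {
    // Local stand-in for the proposed enum; not the actual Tez class.
    enum DataMovementType { ONE_TO_ONE, BROADCAST, SCATTER_GATHER }

    /**
     * Returns the indices of the downstream tasks to which output number
     * outputIndex of upstream task upstreamTask is made available.
     * "Available" does not mean the data is read in full, or at all.
     */
    static List<Integer> consumers(DataMovementType type, int upstreamTask,
                                   int outputIndex, int numDownstream) {
        List<Integer> result = new ArrayList<>();
        switch (type) {
            case ONE_TO_ONE:
                // Output of the ith upstream task goes to the ith downstream task.
                result.add(upstreamTask);
                break;
            case BROADCAST:
                // Output of any upstream task goes to every downstream task.
                for (int d = 0; d < numDownstream; d++) {
                    result.add(d);
                }
                break;
            case SCATTER_GATHER:
                // The ith output of every upstream task goes to one downstream
                // task. Assumption: that task is task i.
                result.add(outputIndex);
                break;
        }
        return result;
    }

    public static void main(String[] args) {
        // Upstream task 1, output 2, four downstream tasks.
        for (DataMovementType t : DataMovementType.values()) {
            System.out.println(t + " -> " + consumers(t, 1, 2, 4));
        }
    }
}
```

Note that under SCATTER_GATHER the result does not depend on which upstream task produced the output, which is what lets a designated downstream task gather the same partition from every upstream task.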
