Posted to issues@tez.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2014/07/24 07:38:38 UTC

[jira] [Updated] (TEZ-1157) Optimize broadcast :- Tasks pertaining to same job in same machine should not download multiple copies of broadcast data

     [ https://issues.apache.org/jira/browse/TEZ-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated TEZ-1157:
-------------------------

    Attachment: TEZ-broadcast-shuffle+vertex-parallelism.patch

Vertex parallelism and shuffle payload changes.

The parallelism is set up after -1 is resolved to the actual parallelism of the task.

> Optimize broadcast :- Tasks pertaining to same job in same machine should not download multiple copies of broadcast data
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-1157
>                 URL: https://issues.apache.org/jira/browse/TEZ-1157
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: performance
>         Attachments: TEZ-1152.WIP.patch, TEZ-broadcast-shuffle+vertex-parallelism.patch
>
>
> Currently, tasks belonging to the same job and running on the same machine each download their own copy of the broadcast data. An optimization would be to download one copy per machine and have the rest of the tasks refer to this downloaded copy.
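
The "download once per machine" idea in the description can be sketched roughly as follows. This is only an illustration and is not taken from the attached patches or from the Tez API: the class name SharedBroadcastFetch, the method fetchOnce, the shared directory layout, and the downloader callback are all hypothetical. It assumes that tasks on one machine can see a common local directory and that the key contains no path separators. The first task to take the lock downloads the data and publishes it with an atomic rename; later tasks find the file and reuse it.

```java
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Tez: shares one local copy of broadcast
// data among tasks that run on the same machine.
public class SharedBroadcastFetch {

    // Returns the path of the shared copy, downloading it only if no other
    // task has already done so. "synchronized" serializes threads inside one
    // JVM; the file lock serializes separate task processes on the machine.
    public static synchronized Path fetchOnce(Path sharedDir, String key,
                                              Callable<byte[]> downloader) throws Exception {
        Files.createDirectories(sharedDir);
        Path target = sharedDir.resolve(key + ".data");
        Path lockFile = sharedDir.resolve(key + ".lock");

        try (FileChannel channel = FileChannel.open(lockFile,
                 StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {
            if (!Files.exists(target)) {
                // Write to a temporary file first so that readers never see
                // a partially written copy, then publish it atomically.
                Path tmp = Files.createTempFile(sharedDir, key, ".tmp");
                Files.write(tmp, downloader.call());
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            }
        }
        return target;
    }
}
```

With this sketch, each task would call fetchOnce with the same directory and key, and only the first caller would invoke the downloader. Cleanup of the shared copy when the job finishes is left out here and would need separate handling.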



--
This message was sent by Atlassian JIRA
(v6.2#6252)