Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2015/11/25 19:40:10 UTC

[jira] [Commented] (TEZ-2962) Use per partition stats in shuffle vertex manager auto parallelism

    [ https://issues.apache.org/jira/browse/TEZ-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027345#comment-15027345 ] 

Bikas Saha commented on TEZ-2962:
---------------------------------

E.g., instead of extrapolating the overall total size to get a new numPartitions and dividing the data equally among the new partitions, we could coalesce partitions until the desired per-partition size is reached.
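
A rough sketch of the coalescing idea, to make it concrete. This is not the ShuffleVertexManager code: the class name PartitionCoalescer, the method coalesce, and the assumption that per-partition sizes are already available as a long[] are all hypothetical. It greedily groups consecutive partitions until each group reaches the desired size, and the last group may end up smaller than the target.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Tez.
public class PartitionCoalescer {

    /**
     * Greedily groups consecutive partitions until each group holds at least
     * desiredSize bytes. Returns ranges as {startInclusive, endExclusive}.
     * The final range may be smaller than desiredSize.
     */
    public static List<int[]> coalesce(long[] partitionSizes, long desiredSize) {
        if (desiredSize <= 0) {
            throw new IllegalArgumentException("desiredSize must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        int start = 0;
        long accumulated = 0;
        for (int i = 0; i < partitionSizes.length; i++) {
            accumulated += partitionSizes[i];
            if (accumulated >= desiredSize) {
                // Close the current group once the target size is reached.
                ranges.add(new int[] {start, i + 1});
                start = i + 1;
                accumulated = 0;
            }
        }
        if (start < partitionSizes.length) {
            // Leftover partitions form one last, possibly undersized, group.
            ranges.add(new int[] {start, partitionSizes.length});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Assumed per-partition sizes (e.g. aggregated from completed tasks).
        long[] sizes = {10, 5, 40, 2, 3, 60, 1};
        for (int[] r : coalesce(sizes, 50)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

Unlike an equal split, the groups here cover different numbers of original partitions, so this only applies if downstream tasks can be assigned non-uniform partition ranges.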

> Use per partition stats in shuffle vertex manager auto parallelism
> ------------------------------------------------------------------
>
>                 Key: TEZ-2962
>                 URL: https://issues.apache.org/jira/browse/TEZ-2962
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Bikas Saha
>
> The original code used the output size sent by completed tasks. Recently, per-partition stats have been added that provide more granular information. Using the partition stats may be more accurate and would also remove the duplicate counting of data size in both the partition stats and the per-task overall size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)