Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2014/08/09 20:52:11 UTC

[jira] [Commented] (TEZ-1396) Grouping should generate consistent groups when given the same set of Splits

    [ https://issues.apache.org/jira/browse/TEZ-1396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091851#comment-14091851 ] 

Bikas Saha commented on TEZ-1396:
---------------------------------

This is not always desirable. In a busy cluster, when a data set is hot, there are equally good reasons to spread different consumers around to avoid hot spots. This JIRA mainly helps cases where some active service is trying to cache data.

> Grouping should generate consistent groups when given the same set of Splits
> ----------------------------------------------------------------------------
>
>                 Key: TEZ-1396
>                 URL: https://issues.apache.org/jira/browse/TEZ-1396
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>
> Currently, Grouping appears to generate a different set of groups on different invocations with the same set of splits and target tasks.
> The order is likely affected by the randomization in the block location report from HDFS.
> Grouping should be consistent for better cache utilization.
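
To illustrate the kind of consistency the description asks for, here is a minimal sketch in Java. It is not the Tez grouping code: the Split class, the group method, and the contiguous-chunk strategy are all hypothetical and exist only for this example. The idea shown is to canonicalize each split's host list and sort the splits by a stable key before grouping, so the order in which block locations are reported cannot change the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ConsistentGrouping {

    /** Hypothetical stand-in for an input split; not the Tez or Hadoop type. */
    static final class Split {
        final String path;
        final long offset;
        final List<String> hosts;

        Split(String path, long offset, List<String> hosts) {
            this.path = path;
            this.offset = offset;
            // Sort a private copy so the reported host order cannot matter.
            List<String> sorted = new ArrayList<>(hosts);
            Collections.sort(sorted);
            this.hosts = sorted;
        }

        String primaryHost() {
            return hosts.isEmpty() ? "" : hosts.get(0);
        }

        @Override
        public String toString() {
            return path + "@" + offset;
        }
    }

    /** Assumed strategy: sort by a stable key, then cut into contiguous chunks. */
    static List<List<String>> group(List<Split> splits, int numGroups) {
        if (numGroups <= 0) {
            throw new IllegalArgumentException("numGroups must be positive");
        }
        List<Split> ordered = new ArrayList<>(splits);
        ordered.sort(Comparator.comparing(Split::primaryHost)
                .thenComparing((Split s) -> s.path)
                .thenComparingLong(s -> s.offset));

        List<List<String>> groups = new ArrayList<>();
        for (int i = 0; i < numGroups; i++) {
            groups.add(new ArrayList<>());
        }
        // Ceiling division so every split lands in one of numGroups chunks.
        int perGroup = (ordered.size() + numGroups - 1) / numGroups;
        for (int i = 0; i < ordered.size(); i++) {
            groups.get(i / perGroup).add(ordered.get(i).toString());
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Split> first = Arrays.asList(
                new Split("/data/part-0", 0, Arrays.asList("h2", "h1")),
                new Split("/data/part-1", 0, Arrays.asList("h3", "h1")));
        // Same splits, different list order and different host order.
        List<Split> second = Arrays.asList(
                new Split("/data/part-1", 0, Arrays.asList("h1", "h3")),
                new Split("/data/part-0", 0, Arrays.asList("h1", "h2")));
        System.out.println(group(first, 2).equals(group(second, 2)));
    }
}
```

A fixed ordering like this favors cache reuse at the cost of always steering the same data to the same places, which is the trade-off raised in the comment above.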



--
This message was sent by Atlassian JIRA
(v6.2#6252)