Posted to dev@chukwa.apache.org by "Ari Rabkin (JIRA)" <ji...@apache.org> on 2012/07/17 01:51:34 UTC

[jira] [Resolved] (CHUKWA-647) Spread out intermediate data with the same ReduceType into different Reduce Tasks

     [ https://issues.apache.org/jira/browse/CHUKWA-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ari Rabkin resolved CHUKWA-647.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 0.6.0
         Assignee: Ari Rabkin

I just committed this to Trunk. Thanks!

NOTE: I made some slight changes to the patch so that it applies correctly to Trunk.
                
> Spread out intermediate data with the same ReduceType into different Reduce Tasks
> ---------------------------------------------------------------------------------
>
>                 Key: CHUKWA-647
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-647
>             Project: Chukwa
>          Issue Type: Improvement
>          Components: Data Processors
>    Affects Versions: 0.4.0, 0.6.0
>            Reporter: Jie Huang
>            Assignee: Ari Rabkin
>            Priority: Minor
>             Fix For: 0.6.0
>
>         Attachments: Chukwa-647-0_4.patch
>
>
> We have found that when the map output data is partitioned according to ReduceType, the data is skewed in some HiTune cases. One or two Reduce Tasks then slow down the whole Demux job, since those reduce tasks have to process more input data than the others.
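
For readers of the archive, the skew described above can be illustrated with a small sketch. This is not the committed patch and it uses no Hadoop or Chukwa classes; the class name PartitionSketch, its methods, and the "SystemMetrics" and "key-N" strings are all hypothetical. It only assumes the common hash-and-modulo partitioning idiom, and compares partitioning on the reduce type alone with partitioning on the reduce type plus the rest of the key.

```java
public class PartitionSketch {

    // Partition on the reduce type alone (hypothetical helper).
    // Every record of one type maps to the same reducer index.
    static int byReduceType(String reduceType, int numReducers) {
        return (reduceType.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Partition on the reduce type plus the rest of the key (hypothetical helper).
    // Records of one type can now map to different reducer indexes.
    static int byTypeAndKey(String reduceType, String restOfKey, int numReducers) {
        int h = 31 * reduceType.hashCode() + restOfKey.hashCode();
        return (h & Integer.MAX_VALUE) % numReducers;
    }

    // Count how many records of a single type each reducer would receive
    // when only the reduce type is used for partitioning.
    static int[] countByTypeOnly(String reduceType, int records, int numReducers) {
        int[] counts = new int[numReducers];
        for (int i = 0; i < records; i++) {
            counts[byReduceType(reduceType, numReducers)]++;
        }
        return counts;
    }

    // Same count, but with the rest of the key included in the hash.
    // The "key-" + i strings stand in for whatever distinguishes records.
    static int[] countByTypeAndKey(String reduceType, int records, int numReducers) {
        int[] counts = new int[numReducers];
        for (int i = 0; i < records; i++) {
            counts[byTypeAndKey(reduceType, "key-" + i, numReducers)]++;
        }
        return counts;
    }

    // Number of reducers that receive at least one record.
    static int nonEmpty(int[] counts) {
        int n = 0;
        for (int c : counts) {
            if (c > 0) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        int[] skewed = countByTypeOnly("SystemMetrics", 1000, 4);
        int[] spread = countByTypeAndKey("SystemMetrics", 1000, 4);
        System.out.println("type only:    " + java.util.Arrays.toString(skewed)
                + " (busy reducers: " + nonEmpty(skewed) + ")");
        System.out.println("type and key: " + java.util.Arrays.toString(spread)
                + " (busy reducers: " + nonEmpty(spread) + ")");
    }
}
```

In the first case a single reducer receives every record of the type, which is the slow-task situation the report describes. In the second case the same records are divided among several reducers. How the actual patch chooses its partition key may differ from this sketch.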

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira