Posted to common-dev@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2014/07/17 23:36:06 UTC
[jira] [Resolved] (HADOOP-2960) A mapper should use some heuristics
to decide whether to run the combiner during spills
[ https://issues.apache.org/jira/browse/HADOOP-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved HADOOP-2960.
--------------------------------------
Resolution: Won't Fix
Closing as won't fix, given the -1.
> A mapper should use some heuristics to decide whether to run the combiner during spills
> ---------------------------------------------------------------------------------------
>
> Key: HADOOP-2960
> URL: https://issues.apache.org/jira/browse/HADOOP-2960
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Runping Qi
>
> Right now, the combiner, if set, will be called for each spill, no matter whether the combiner can actually reduce the number of values.
> The mapper should use some heuristics to decide whether to run the combiner during spills.
> One such heuristic is to check the ratio of the number of keys to the number of unique keys in the spill.
> The combiner will be called only if that ratio exceeds a certain threshold (say 2).
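
The proposed heuristic could look roughly like the following. This is only a sketch under stated assumptions, not code from Hadoop: the class and method names (SpillCombinerHeuristic, recordsPerUniqueKey, shouldRunCombiner) are hypothetical, and it assumes the keys of a spill are available as an already sorted list, so that equal keys sit next to each other.

```java
import java.util.List;

/** Hypothetical helper illustrating the proposal; not part of Hadoop's API. */
final class SpillCombinerHeuristic {
    private SpillCombinerHeuristic() {}

    /**
     * Returns (number of keys) / (number of unique keys) for one spill.
     * Assumes sortedKeys is sorted, so duplicates are adjacent.
     * An empty spill yields 0.0.
     */
    static <K> double recordsPerUniqueKey(List<K> sortedKeys) {
        if (sortedKeys.isEmpty()) {
            return 0.0;
        }
        int unique = 1; // the first key always starts a new group
        for (int i = 1; i < sortedKeys.size(); i++) {
            if (!sortedKeys.get(i).equals(sortedKeys.get(i - 1))) {
                unique++;
            }
        }
        return (double) sortedKeys.size() / unique;
    }

    /** Run the combiner only if the ratio strictly exceeds the threshold. */
    static <K> boolean shouldRunCombiner(List<K> sortedKeys, double threshold) {
        return recordsPerUniqueKey(sortedKeys) > threshold;
    }
}
```

With the suggested threshold of 2, a spill holding the keys a, a, a, b, b, b (ratio 3) would be combined, while a spill holding a, b, c (ratio 1) would be written out without calling the combiner.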
--
This message was sent by Atlassian JIRA
(v6.2#6252)