Posted to common-dev@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2008/09/24 06:13:44 UTC

[jira] Updated: (HADOOP-4246) Reduce task copy errors may not kill it eventually

     [ https://issues.apache.org/jira/browse/HADOOP-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu updated HADOOP-4246:
--------------------------------------------

             Priority: Blocker  (was: Critical)
    Affects Version/s: 0.19.0

Currently, only READ and CONNECT errors in the reducer's copyOutput are counted against failed fetches; other errors, such as the disk running out of space, are not taken into consideration. These errors can leave the reducer hanging indefinitely.
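
For illustration, here is a minimal Java sketch of the failure accounting described above. The names (ErrorType, recordCopyError, failuresPerMap) are hypothetical and only model the reported behavior; they are not ReduceTask's actual fields:

    import java.util.HashMap;
    import java.util.Map;

    class CopyFailureAccounting {
        // Hypothetical classification of errors seen while copying map output.
        enum ErrorType { CONNECT, READ, OTHER } // OTHER: e.g. disk out of space

        private final Map<String, Integer> failuresPerMap =
            new HashMap<String, Integer>();

        void recordCopyError(String mapId, ErrorType type) {
            // Only connection and read failures count toward the per-map
            // fetch-failure total; any other error (such as no space left
            // on device while writing the copied output) is not recorded,
            // so the copy is retried forever and the reducer can hang.
            if (type == ErrorType.CONNECT || type == ErrorType.READ) {
                Integer n = failuresPerMap.get(mapId);
                failuresPerMap.put(mapId, n == null ? 1 : n + 1);
            }
        }
    }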

> Reduce task copy errors may not kill it eventually
> --------------------------------------------------
>
>                 Key: HADOOP-4246
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4246
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.19.0
>
>
> maxFetchRetriesPerMap in the reduce task can sometimes be zero (when maxMapRunTime is less than 4 seconds or mapred.reduce.copy.backoff is less than 4). In that case, copy errors are never counted against the reduce task, so they cannot accumulate to kill it eventually.
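
To make the failure mode concrete, here is a rough, self-contained sketch of the retry-count arithmetic as the report describes it. BACKOFF_INIT, the rounding helper, and the exact expression are assumptions modeled on the description above, not the actual ReduceTask code:

    class FetchRetrySketch {
        static final int BACKOFF_INIT = 4000; // assumed initial backoff in ms

        // Approximates the power-of-two rounding the report implies;
        // the key property is that an input of 1 yields 0.
        static int closestPowerOf2(int value) {
            int exp = 0;
            while ((1 << (exp + 1)) <= value) {
                exp++;
            }
            return exp;
        }

        static int maxFetchRetriesPerMap(int maxMapRunTimeMs, int copyBackoffSec) {
            int capMs = Math.min(maxMapRunTimeMs, copyBackoffSec * 1000);
            // If maxMapRunTime < 4000 ms or mapred.reduce.copy.backoff < 4,
            // capMs / BACKOFF_INIT is 0 and the whole expression becomes
            // closestPowerOf2(1) == 0: with no retry budget, copy errors are
            // never counted and the task is never failed.
            return closestPowerOf2(capMs / BACKOFF_INIT + 1);
        }

        public static void main(String[] args) {
            System.out.println(maxFetchRetriesPerMap(3000, 300));   // 0 (fast maps)
            System.out.println(maxFetchRetriesPerMap(600000, 3));   // 0 (tiny backoff)
            System.out.println(maxFetchRetriesPerMap(600000, 300)); // > 0 (typical)
        }
    }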

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.