Posted to mapreduce-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2013/07/16 01:20:49 UTC

[jira] [Updated] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-5251:
----------------------------------

    Status: Open  (was: Patch Available)

Actually, on second thought, we really need a way to report the actual error being encountered.  Otherwise an error will occur and the logs won't contain any stack trace or message indicating the nature of the error.

reportLocalError should probably take a Throwable argument so we can preserve the error information when we report it.  reportLocalError can then just pass that exception on to the exception reporter.
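
A minimal sketch of that suggestion, not the actual patch. Only the name reportLocalError comes from the comment above. The ExceptionReporter interface, its reportException method, the LocalErrorSketch class, and the message text are assumptions made for illustration.

```java
import java.io.IOException;

/** Hypothetical stand-in for the exception reporter mentioned above. */
interface ExceptionReporter {
    void reportException(Throwable t);
}

/** Hypothetical holder class; only reportLocalError is named in the comment. */
class LocalErrorSketch {
    private final ExceptionReporter reporter;

    LocalErrorSketch(ExceptionReporter reporter) {
        this.reporter = reporter;
    }

    /**
     * Passes the failure on to the reporter. The original Throwable is kept
     * as the cause, so its message and stack trace remain available to
     * whatever logs the reported exception.
     */
    void reportLocalError(Throwable cause) {
        reporter.reportException(
            new IOException("Shuffle failed with a local error on the reducer", cause));
    }
}
```

Wrapping rather than replacing the exception is one possible design: the reporter sees a single exception type, while getCause() still returns the underlying failure (for example, an out-of-space error).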
                
> Reducer should not implicate map attempt if it has insufficient space to fetch map output
> -----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5251
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 2.0.4-alpha, 0.23.7
>            Reporter: Jason Lowe
>            Assignee: Ashwin Shankar
>         Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt
>
>
> A job can fail if a reducer happens to run on a node with insufficient space to hold a map attempt's output.  The reducer keeps reporting the map attempt as bad, and if the map attempt is re-launched too many times before the reducer decides that it may itself be the real problem, the job can fail.
> In that scenario it would be better to re-launch the reduce attempt, in the hope that it runs on another node with sufficient space to complete the shuffle.  Reporting the map attempt as bad and relaunching the map task doesn't change the fact that the reducer can't hold the output.
