Posted to mapreduce-issues@hadoop.apache.org by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org> on 2015/07/16 03:02:05 UTC

[jira] [Updated] (MAPREDUCE-6303) Read timeout when retrying a fetch error can be fatal to a reducer

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated MAPREDUCE-6303:
-----------------------------------------------
    Labels: 2.6.1-candidate  (was: )

> Read timeout when retrying a fetch error can be fatal to a reducer
> ------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6303
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6303
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>              Labels: 2.6.1-candidate
>             Fix For: 2.7.0
>
>         Attachments: MAPREDUCE-6303.001.patch
>
>
> If a reducer encounters an error while fetching from a node and then hits a read timeout while trying to re-establish the connection, the reducer can fail.  The read timeout exception can leak to the top of the Fetcher thread, which causes the reduce task to tear down.  This type of error can repeat across reducer attempts, causing jobs to fail because of a single bad node.
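
The failure mode described above can be illustrated with a small, self-contained Java sketch. It is not the Hadoop Fetcher code and not the attached patch: the class FetchRetrySketch, the Connector and Reader interfaces, the Outcome enum, and both method names are hypothetical stand-ins chosen for illustration. The sketch only shows the general pattern, assuming the retry path reconnects inside the handler for the first fetch error.

```java
import java.io.IOException;

// Hypothetical names throughout; this is an illustration, not Hadoop's Fetcher.
class FetchRetrySketch {

    /** Stand-in for opening a connection to the node serving map output. */
    interface Connector {
        void connect(String host) throws IOException;
    }

    /** Stand-in for reading map output over an established connection. */
    interface Reader {
        void read(String host) throws IOException;
    }

    enum Outcome { SUCCESS, HOST_FAILED }

    /**
     * Unguarded variant: the reconnect inside the catch block is not itself
     * protected, so a timeout raised there propagates out of this method.
     * In a fetcher thread, that would reach the top of the thread.
     */
    static Outcome fetchUnguarded(String host, Connector connector, Reader reader)
            throws IOException {
        connector.connect(host);
        try {
            reader.read(host);
        } catch (IOException fetchError) {
            // Retry path: re-establish the connection and read again.
            connector.connect(host); // may throw, e.g. SocketTimeoutException
            reader.read(host);
        }
        return Outcome.SUCCESS;
    }

    /**
     * Guarded variant: any I/O failure, including one raised while retrying,
     * is reported as a failure of this host instead of escaping to the caller.
     */
    static Outcome fetchGuarded(String host, Connector connector, Reader reader) {
        try {
            return fetchUnguarded(host, connector, reader);
        } catch (IOException e) {
            // The caller can penalize the host and move on to another one.
            return Outcome.HOST_FAILED;
        }
    }
}
```

With a Connector that succeeds on the first call and throws java.net.SocketTimeoutException on the second, and a Reader that always throws IOException, fetchUnguarded lets the timeout escape to its caller, while fetchGuarded returns Outcome.HOST_FAILED so the caller can treat the node as bad without tearing down the task.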



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)