Posted to mapreduce-issues@hadoop.apache.org by "Thomas Graves (JIRA)" <ji...@apache.org> on 2011/05/26 22:36:47 UTC

[jira] [Commented] (MAPREDUCE-2529) Recognize Jetty bug 1342 and handle it

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039909#comment-13039909 ] 

Thomas Graves commented on MAPREDUCE-2529:
------------------------------------------

I'm proposing to add a new metric to the shuffle output metrics and increment it whenever an IOException caught in the MapOutputServlet matches a configurable regex.  This metric can then be viewed by external systems or potentially by the health_check script (HADOOP-7144 should make that easier).  Making the regex configurable should make this more useful in the future if we see other Jetty/JVM exceptions or issues that need to be worked around.
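
To make the idea concrete, here is a rough sketch of the matching and counting logic, not the actual patch. The class name ShuffleExceptionTracker and its methods are hypothetical and do not exist in Hadoop; the sketch assumes the regex is read from configuration elsewhere and that the counter would be published through the shuffle output metrics.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not an existing Hadoop class): counts IOExceptions
 * whose text matches a configurable regex.
 */
final class ShuffleExceptionTracker {
    private final Pattern pattern;
    private final AtomicLong matchCount = new AtomicLong();

    /** @param regex the pattern to look for; assumed to come from configuration */
    ShuffleExceptionTracker(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    /**
     * Checks the exception's string form (class name plus message) against
     * the regex and increments the counter on a match.
     *
     * @return true if the exception matched
     */
    boolean record(IOException e) {
        if (pattern.matcher(e.toString()).find()) {
            matchCount.incrementAndGet();
            return true;
        }
        return false;
    }

    /** Current number of matching exceptions seen; this is the value a metric would report. */
    long getMatchCount() {
        return matchCount.get();
    }
}
```

The servlet's catch block for IOException would call record(e) before rethrowing or logging, and the metrics code would report getMatchCount(). An AtomicLong is used here because several servlet threads may hit the exception at the same time.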

> Recognize Jetty bug 1342 and handle it
> --------------------------------------
>
>                 Key: MAPREDUCE-2529
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2529
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.204.0, 0.23.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>
> We are seeing many instances of JETTY-1342 (http://jira.codehaus.org/browse/JETTY-1342). The bug doesn't cause Jetty to stop responding altogether: some fetches go through, but a lot of them throw exceptions and eventually fail. The only way we have found to get the TT out of this state is to restart it.  This jira is to catch this particular exception (or perhaps a configurable regex) and handle it in an automated way, either blacklisting or shutting down the TT after seeing the exception a configurable number of times.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira