Posted to mapreduce-issues@hadoop.apache.org by "Thomas Graves (JIRA)" <ji...@apache.org> on 2011/05/04 15:43:03 UTC

[jira] [Commented] (MAPREDUCE-2451) Log the reason string of healthcheck script

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028747#comment-13028747 ] 

Thomas Graves commented on MAPREDUCE-2451:
------------------------------------------

Here are the test-patch results for the 20.205 patch, which is based on branch-0.20-security.  The -1 results for javadoc and Eclipse classpath both exist on branch-0.20-security without my change.

 I didn't include any tests because this is a trivial change to the log message.  Manual test steps are below.

     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from the contents of the lib directories.
     [exec]
     [exec]

Manual test steps: 
- modify mapred-site.xml to set the mapred.healthChecker.script.path configuration property.  Have it point to a script like ~/health_check.
- modify ~/health_check to contain just something like:
#!/bin/bash
exit 0

- start the cluster and make sure everything is running fine.
- modify the ~/health_check script on a tasktracker and insert a line like: echo -n "ERROR new string"  before the exit 0 line.
- wait for tasktracker to send heartbeat back with updated health info
- look in the jobtracker log file and verify that the log line looks similar to the one below. The patch for this issue adds the "Reason details : ERROR new string" portion.
2011-05-04 13:30:31,926 INFO org.apache.hadoop.mapred.JobTracker: Blacklisting tracker : yourhost.com Reason for blacklisting is : NODE_UNHEALTHY Reason details : ERROR new string
- also verify the tasktracker got blacklisted.
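
For reference, the modified script and the log check from the steps above can be sketched roughly as follows. This is only an illustration, not part of the patch: health_check and has_reason_details are hypothetical function names, the sample log line is the one quoted above, and on a real cluster you would grep the actual jobtracker log file instead.

```shell
#!/bin/bash
# Hypothetical stand-in for ~/health_check after the modification:
# it prints the error string and still exits 0, as in the steps above.
health_check() {
  # printf '%s' emits the same text as: echo -n "ERROR new string"
  printf '%s' "ERROR new string"
  return 0
}

# Hypothetical helper: succeeds if the log line ($1) carries the
# "Reason details" portion for the given reason string ($2).
has_reason_details() {
  printf '%s\n' "$1" | grep -qF -- "Reason details : $2"
}

# Sample line copied from the comment above; a real check would read
# the jobtracker log file instead.
sample_log='2011-05-04 13:30:31,926 INFO org.apache.hadoop.mapred.JobTracker: Blacklisting tracker : yourhost.com Reason for blacklisting is : NODE_UNHEALTHY Reason details : ERROR new string'

output="$(health_check)"
if has_reason_details "$sample_log" "$output"; then
  echo "reason logged"
fi
```

Run against the sample line, this prints "reason logged".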

> Log the reason string of healthcheck script
> -------------------------------------------
>
>                 Key: MAPREDUCE-2451
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2451
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.20.204.0
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>            Priority: Trivial
>             Fix For: 0.20.205.0, 0.23.0
>
>         Attachments: MAPREDUCE-2451-20.205.patch, MAPREDUCE-2451-trunk.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The information on why a specific TaskTracker got blacklisted is not stored anywhere. The jobtracker web ui will show the detailed reason string until the TT gets unblacklisted.  After that it is lost.
