Posted to common-issues@hadoop.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2012/05/08 21:17:49 UTC

[jira] [Commented] (HADOOP-8353) hadoop-daemon.sh and yarn-daemon.sh can be misleading on stop

    [ https://issues.apache.org/jira/browse/HADOOP-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270733#comment-13270733 ] 

Hadoop QA commented on HADOOP-8353:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526026/HADOOP-8353.patch.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in hadoop-common-project/hadoop-common.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/960//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/960//console

This message is automatically generated.
                
> hadoop-daemon.sh and yarn-daemon.sh can be misleading on stop
> -------------------------------------------------------------
>
>                 Key: HADOOP-8353
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8353
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts
>    Affects Versions: 0.23.1
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>             Fix For: 2.0.0
>
>         Attachments: HADOOP-8353.patch.txt
>
>
> The stop action is implemented as a simple SIGTERM sent to the JVM. There is a delay between when the action is called and when the process actually exits. This can mislead callers of the *-daemon.sh scripts, since they expect the stop action to return only once the process has actually stopped.
> I suggest we augment the stop action with a time-delayed check of the process status, followed by a SIGKILL once the delay has expired.
> I understand that sending SIGKILL is a measure of last resort and is generally frowned upon among init.d script writers, but the excuse we have for Hadoop is that it is engineered to be a fault-tolerant system, and thus there is no danger of putting the system into an inconsistent state with a violent SIGKILL. Of course, the time delay will be long enough to make a SIGKILL a rare event.
> Finally, there's always the option of an exponential back-off type of solution if we decide that the SIGKILL timeout is too short.
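
The stop behaviour proposed in the description could look roughly like the sketch below. This is an illustration only, not the contents of HADOOP-8353.patch.txt: the function name stop_with_timeout, its arguments, and the 5-second default are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT the code from the attached patch.
# Usage: stop_with_timeout PID [TIMEOUT_SECONDS]
# Sends SIGTERM, polls once per second, and falls back to SIGKILL
# if the process is still present after TIMEOUT_SECONDS (assumed default: 5).
stop_with_timeout() {
  local pid=$1
  local timeout=${2:-5}
  local waited=0

  # Nothing to do if the process is already gone.
  if ! kill -0 "$pid" 2>/dev/null; then
    return 0
  fi

  # Polite request first: plain kill sends SIGTERM.
  kill "$pid" 2>/dev/null

  # kill -0 sends no signal; it only checks that the pid still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "process $pid did not stop within $timeout seconds, sending SIGKILL" >&2
      kill -9 "$pid" 2>/dev/null
      break
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A daemon script would presumably call this with the pid read from its pid file, for example: stop_with_timeout "$(cat "$pidfile")" 5, where $pidfile is again a placeholder. The fixed one-second poll is the simplest choice; the exponential back-off mentioned above would replace the constant sleep with a growing interval.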

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira