You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Aaron T. Myers (JIRA)" <ji...@apache.org> on 2017/09/08 18:24:00 UTC

[jira] [Commented] (HADOOP-14855) Hadoop scripts may errantly believe a daemon is still running, preventing it from starting

    [ https://issues.apache.org/jira/browse/HADOOP-14855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16159068#comment-16159068 ] 

Aaron T. Myers commented on HADOOP-14855:
-----------------------------------------

[~aw] - does this ring any bells for you? Any thoughts on how to make this check more robust?

> Hadoop scripts may errantly believe a daemon is still running, preventing it from starting
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14855
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14855
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: scripts
>    Affects Versions: 3.0.0-alpha4
>            Reporter: Aaron T. Myers
>
> I encountered a case recently where the NN wouldn't start, with the error message "namenode is running as process 16769.  Stop it first." In fact the NN was not running at all, but rather another long-running process was running with this pid.
> It looks to me like our scripts just check to see if _any_ process is running with the pid that the NN (or any Hadoop daemon) most recently ran with. This is clearly not a fool-proof way of checking to see if a particular type of daemon is now running, as some other process could start running with the same pid since the daemon in question was previously shut down.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org