Posted to common-issues@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2015/03/10 04:44:40 UTC

[jira] [Comment Edited] (HADOOP-9086) Enforce process singleton rules through an exclusive write lock on a file, not a pid file +kill -0,

    [ https://issues.apache.org/jira/browse/HADOOP-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354220#comment-14354220 ] 

Allen Wittenauer edited comment on HADOOP-9086 at 3/10/15 3:44 AM:
-------------------------------------------------------------------

I'm going to set this as won't fix.  Introducing more dependencies at this level sounds like a bad thing, especially given that every ops person has their own preferences as to what to use here.


was (Author: aw):
I'm going to set this as won't fix.  Introducing more dependencies at this level sounds like a bad thing, esp given that every ops person has their own preferences as to what to user here.

> Enforce process singleton rules through an exclusive write lock on a file, not a pid file +kill -0,
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9086
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9086
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: scripts, util
>    Affects Versions: 1.1.1, 2.0.3-alpha
>         Environment: Unix/Linux. 
>            Reporter: Steve Loughran
>
> The {{hadoop-daemon.sh}} script (and other liveness monitors) check whether a daemon is running by sending a {{kill -0}} to a process id read from a pid file.
> This is flawed:
> # Pid file locations may change between installations.
> # Linux and Unix recycle pids, leading to false positives: the scripts think the daemon is running when the pid actually belongs to a different process.
> # It doesn't work on Windows.
> Having the processes acquire an exclusive write lock on a known file would delegate lock management, and implicitly liveness detection, to the OS itself. When the process dies, the lock is released (on Unixes).
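
For illustration only, here is a minimal sketch of the lock-file idea described in the issue. It is not code from Hadoop or from this JIRA. It assumes a Unix system (it uses Python's fcntl module, so it does not address the Windows point), and the function name acquire_singleton_lock and the lock file path are hypothetical.

```python
import fcntl
import os


def acquire_singleton_lock(path):
    """Try to take an exclusive, non-blocking lock on the file at `path`.

    Returns the open file descriptor on success; the caller must keep it
    open for as long as it wants to hold the lock. Returns None if some
    other holder already has the lock.
    """
    # Create the lock file if it does not exist yet (hypothetical path).
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # The lock is already held elsewhere.
        os.close(fd)
        return None
    return fd
```

A daemon would call this once at startup and exit if it gets None. A monitor could use the same call as a liveness probe: if it manages to take the lock, no daemon is holding it. Because the kernel drops the lock when the holding process exits, no pid needs to be recorded or compared.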


