Posted to common-issues@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2015/03/10 04:44:40 UTC
[jira] [Comment Edited] (HADOOP-9086) Enforce process singleton
rules through an exclusive write lock on a file, not a pid file +kill -0,
[ https://issues.apache.org/jira/browse/HADOOP-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354220#comment-14354220 ]
Allen Wittenauer edited comment on HADOOP-9086 at 3/10/15 3:44 AM:
-------------------------------------------------------------------
I'm going to set this as won't fix. Introducing more dependencies at this level sounds like a bad thing, esp given that every ops person has their own preferences as to what to use here.
was (Author: aw):
I'm going to set this as won't fix. Introducing more dependencies at this level sounds like a bad thing, esp given that every ops person has their own preferences as to what to user here.
> Enforce process singleton rules through an exclusive write lock on a file, not a pid file +kill -0,
> ---------------------------------------------------------------------------------------------------
>
> Key: HADOOP-9086
> URL: https://issues.apache.org/jira/browse/HADOOP-9086
> Project: Hadoop Common
> Issue Type: Improvement
> Components: scripts, util
> Affects Versions: 1.1.1, 2.0.3-alpha
> Environment: Unix/Linux.
> Reporter: Steve Loughran
>
> The {{hadoop-daemon.sh}} script (and other liveness monitors) probe for a running daemon service with a {{kill -0}} of a process id picked up from a pid file.
> This is flawed:
> # pid file locations may change between installations.
> # Linux and Unix recycle pids, leading to false positives: the scripts think the daemon is running when a different process now holds its pid.
> # It doesn't work on Windows.
> Having the processes acquire an exclusive write lock on a known file would delegate lock management, and implicitly liveness, to the OS itself. When the process dies, the lock is released (on Unixes).
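
For illustration only, the lock-file idea in the description could look roughly like the sketch below. It is not taken from Hadoop or from any patch on this issue: the function name acquire_singleton_lock and the lock path are hypothetical, Python is used only to keep the example short, and it relies on the Unix-only fcntl module, so it says nothing about the Windows case.

```python
import fcntl


def acquire_singleton_lock(path):
    """Try to take an exclusive, non-blocking lock on the file at `path`.

    Returns the open file object on success; the caller must keep it
    open for the life of the process, because closing it (or the
    process dying) releases the lock. Returns None if the lock is
    already held through another open of the same file.
    """
    # "a+" creates the file if it is missing and never truncates it.
    lock_file = open(path, "a+")
    try:
        # LOCK_EX = exclusive lock, LOCK_NB = fail at once instead of waiting.
        fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Someone else holds the lock, so a daemon is presumably running.
        lock_file.close()
        return None
    return lock_file
```

A daemon would call this once at startup with an agreed path (an assumption here, e.g. somewhere under its run directory) and exit if it gets None back. A monitor can make the same call to probe liveness: success means no daemon holds the lock. Unlike the pid file check, a recycled pid cannot produce a false positive, since the kernel drops the lock when the owning process exits.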