Posted to dev@zookeeper.apache.org by "Brett Eisenberg (JIRA)" <ji...@apache.org> on 2009/07/24 07:44:14 UTC

[jira] Commented: (ZOOKEEPER-485) need ops documentation that details supervision of ZK server processes

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734911#action_12734911 ] 

Brett Eisenberg commented on ZOOKEEPER-485:
-------------------------------------------

FWIW, ZooKeeper works great under SMF, the Solaris Service Management Facility (http://en.wikipedia.org/wiki/Service_Management_Facility).

> need ops documentation that details supervision of ZK server processes
> ----------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-485
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-485
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: documentation, server
>            Reporter: Patrick Hunt
>             Fix For: 3.2.1, 3.3.0
>
>
> We need ops documentation detailing what to do if the ZK server VM fails. By "fails" I mean the JVM process
> exits, dies, crashes, etc.
> In general, a supervisor process should be used to start, stop, and restart the ZK server JVM.
> Something like daemontools (http://cr.yp.to/daemontools.html) could be used or, more simply, a wrapper script
> could monitor the status of the pid and restart the JVM if it fails. It's up to the operator: if this is not done
> automatically, it will have to be done manually, by the operator restarting the ZK server JVM.
> The inherent behavior of ZK with respect to failures (i.e., that it automatically recovers as long as quorum is
> maintained) fits into this nicely.
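
To illustrate the wrapper-script approach mentioned in the issue description, here is a minimal sketch in Python. It is not taken from the issue or from ZooKeeper itself. The function name supervise and its parameters (max_restarts, delay, sleep) are hypothetical, and a real deployment would more likely rely on daemontools, SMF, or a similar supervisor. The sketch starts the server command as a child process, waits for it to exit, and starts it again after a short delay.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=None, delay=1.0, sleep=time.sleep):
    """Run cmd in the foreground and restart it whenever it exits.

    cmd          -- argument list for the process to supervise
    max_restarts -- stop after this many restarts; None means restart forever
    delay        -- seconds to wait before each restart
    sleep        -- injectable sleep function (hypothetical hook, handy in tests)

    Returns the list of exit codes observed once max_restarts is reached.
    """
    exit_codes = []
    restarts = 0
    while True:
        # Start the child and block until it exits, dies, or crashes.
        child = subprocess.Popen(cmd)
        exit_codes.append(child.wait())

        # Give up once the restart budget is spent (never, if it is None).
        if max_restarts is not None and restarts >= max_restarts:
            return exit_codes

        restarts += 1
        # Pause briefly so a process that fails at startup does not spin.
        sleep(delay)
```

An operator would call supervise with whatever command line launches the ZK server JVM in the foreground at their site, for example a java invocation of org.apache.zookeeper.server.quorum.QuorumPeerMain with the local classpath and config file. The exact command is an assumption that depends on the installation. Because ZK recovers automatically as long as quorum is maintained, restarting a single failed server this way is usually all the supervisor needs to do.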
