Posted to dev@zookeeper.apache.org by "Patrick Hunt (JIRA)" <ji...@apache.org> on 2010/02/20 01:32:27 UTC
[jira] Updated: (ZOOKEEPER-485) need ops documentation that details supervision of ZK server processes
[ https://issues.apache.org/jira/browse/ZOOKEEPER-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt updated ZOOKEEPER-485:
-----------------------------------
Attachment: ZOOKEEPER-485.patch
This patch documents running the ZK server under a supervisory process (it also fills out the monitoring section).
> need ops documentation that details supervision of ZK server processes
> ----------------------------------------------------------------------
>
> Key: ZOOKEEPER-485
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-485
> Project: Zookeeper
> Issue Type: Bug
> Components: documentation, server
> Reporter: Patrick Hunt
> Fix For: 3.3.0
>
> Attachments: ZOOKEEPER-485.patch
>
>
> We need ops documentation detailing what to do if the ZK server VM fails - by "fails" I mean the JVM process
> exits/dies/crashes/etc.
> In general, a supervisor process should be used to start/stop/restart the ZK server JVM.
> Something like daemontools (http://cr.yp.to/daemontools.html) could be used, or more simply a wrapper script
> that monitors the status of the pid and restarts the JVM if it fails. It's up to the operator: if this is not done
> automatically, then it will have to be done manually, by the operator restarting the ZK server JVM.
> The inherent behavior of ZK with respect to failures - i.e., that it automatically recovers as long as quorum is
> maintained - fits into this nicely.
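For illustration only, the "wrapper script" option above could look roughly like the following Python sketch. It is not taken from the attached patch. The function name supervise, the max_starts and backoff_seconds parameters, and the idea of passing the server start command on the command line are all assumptions made for this example. It starts the child process, waits for it to exit, logs the exit status, and starts it again.

```python
import subprocess
import sys
import time


def supervise(cmd, max_starts=None, backoff_seconds=5.0, sleep=time.sleep):
    """Run cmd in the foreground and restart it each time it exits.

    Hypothetical helper for illustration. max_starts=None means restart
    forever; a finite value is mainly useful for testing. Returns the list
    of exit statuses observed.
    """
    exit_codes = []
    while max_starts is None or len(exit_codes) < max_starts:
        proc = subprocess.Popen(cmd)
        code = proc.wait()  # blocks until the child exits, dies, or crashes
        exit_codes.append(code)
        print(f"child pid {proc.pid} exited with status {code}", file=sys.stderr)
        sleep(backoff_seconds)  # pause so a broken server is not restarted in a tight loop
    return exit_codes


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to supervise is whatever follows the script name.
    supervise(sys.argv[1:])
```

The script would be invoked with whatever command runs the ZK server JVM in the foreground, for example a java command line whose main class is org.apache.zookeeper.server.quorum.QuorumPeerMain (classpath and config path depend on the installation). The supervised command must stay in the foreground; if it forks and returns immediately, the wrapper would see an exit and start a second copy. A tool such as daemontools covers the same ground with more care around logging and signal handling.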
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.