Posted to user@zookeeper.apache.org by Jens Rantil <je...@tink.se> on 2018/09/19 09:04:46 UTC

Healthcheck

Hello,

We need a shell command that we can run on a specific node to verify
that the node has come up and has synced with the ensemble.
Is there such a command? Currently we are doing

curl --silent --show-error --fail http://localhost:8080/commands/stat |
grep -qE '"server_state" : "(follower|leader)"'

but I suspect that this only takes the leader election into account and not
whether we've actually synced up.

Does anyone have a better solution? One idea would be to wait for
`initLimit*tickTime+someDelta` ms and make sure that the same Java
process is still running. I also noticed that the official Kubernetes Helm
chart simply uses the `ruok` four-letter command for its readiness and
liveness checks.
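
For illustration, the `ruok` probe and the `server_state` check could be
combined over the client port roughly as sketched below. This is only a
sketch under assumptions that are not part of this thread: the function
names are hypothetical, the client port is assumed to be 2181, `nc` is
assumed to be installed, and the four-letter commands `ruok` and `mntr`
are assumed to be enabled on the server. It does not by itself settle
whether "follower" implies the node has finished syncing.

```shell
# Hypothetical helpers; names, host and port defaults are assumptions.

# Read `mntr` output on stdin (tab-separated key/value lines) and print
# the value of zk_server_state, or nothing if the key is absent.
zk_state_from_mntr() {
    awk -F '\t' '$1 == "zk_server_state" { print $2 }'
}

# Succeed only if the server answers `ruok` with `imok` and `mntr`
# reports the server state as follower or leader.
zk_healthcheck() {
    host="${1:-localhost}"
    port="${2:-2181}"   # assumed client port

    # Liveness: the server should reply "imok".
    [ "$(echo ruok | nc -w 2 "$host" "$port")" = "imok" ] || return 1

    # Role: a standalone server reports "standalone", which fails here.
    state="$(echo mntr | nc -w 2 "$host" "$port" | zk_state_from_mntr)"
    case "$state" in
        follower|leader) return 0 ;;
        *) return 1 ;;
    esac
}
```

It would be called as `zk_healthcheck localhost 2181 && echo ready`. The
parsing is kept in its own function so it can be checked against captured
`mntr` output without a running server.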

Any input appreciated - thanks,
Jens
-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se


Re: Healthcheck

Posted by Patrick Hunt <ph...@apache.org>.
I created this a few years ago, afaik it still works:
https://github.com/phunt/zk-smoketest

Regards,

Patrick

