Posted to user@curator.apache.org by Jędrzej Dąbrowa <na...@gmail.com> on 2016/09/13 07:15:18 UTC

ConnectionStateListener not called on lost quorum

I connect through Curator to an ensemble of 3 ZK (testing) servers. Whenever
the ZK connection is lost, I would like to return an appropriate error code
to the user instead of calling ZK. I do this by monitoring the connection
state with a ConnectionStateListener. It works in various test scenarios,
but when 2 out of 3 servers are killed (and quorum is lost), Curator emits
no such events, and the first call to ZK after the quorum loss times out
with org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
ConnectionLoss. Is there a way to be notified by Curator about quorum loss
before executing any call, so that I can avoid the long timeout and fail
fast?

Thank you,
Jed
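
The listener-based gating described in this message could be sketched
roughly as follows. This is only an illustration under assumptions:
LinkState and ConnectionGate are hypothetical names, not part of Curator.
LinkState stands in for Curator's ConnectionState enum so that the snippet
compiles without the library on the classpath. Treating SUSPENDED and
READ_ONLY as "not usable" is a design choice of this sketch, and the gate
can only help when a state change is actually delivered to the listener.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for Curator's ConnectionState enum (assumption: same constant names).
enum LinkState { CONNECTED, SUSPENDED, RECONNECTED, LOST, READ_ONLY }

// Hypothetical helper: remembers the last reported state and answers
// "is it worth calling ZooKeeper right now?".
final class ConnectionGate {
    // Start pessimistic: nothing is usable until a CONNECTED event arrives.
    private final AtomicReference<LinkState> last = new AtomicReference<>(LinkState.LOST);

    // In real code this would be called from
    // ConnectionStateListener.stateChanged(client, newState).
    void onStateChanged(LinkState newState) {
        last.set(newState);
    }

    // Only CONNECTED and RECONNECTED count as usable. SUSPENDED is treated as
    // down so that callers fail fast instead of waiting for a later LOST event.
    boolean isUsable() {
        LinkState s = last.get();
        return s == LinkState.CONNECTED || s == LinkState.RECONNECTED;
    }
}
```

A request handler would check gate.isUsable() first and return the error
code immediately when it is false.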

Re: ConnectionStateListener not called on lost quorum

Posted by Henrik Nordvik <he...@gmail.com>.
What we did was run some cheap operation periodically in a separate
thread. This also acted as a small health check.

-
Henrik
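
A periodic health check along the lines described above might look like the
sketch below. It is an assumption-laden illustration, not the code Henrik
used: HealthProbe and its methods are hypothetical names, and the probe is
passed in as a plain Callable so the snippet compiles without Curator. In a
real setup the probe could be something like
() -> client.checkExists().forPath("/"), and the timeout would be chosen
well below the session timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: runs a cheap probe with a deadline and records the outcome.
final class HealthProbe {
    private final Callable<Object> probe;
    private final long timeoutMillis;
    private final AtomicBoolean healthy = new AtomicBoolean(false);
    // The worker runs the (possibly blocking) probe; the timer triggers it periodically.
    private final ExecutorService worker =
            Executors.newSingleThreadExecutor(HealthProbe::daemon);
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(HealthProbe::daemon);

    HealthProbe(Callable<Object> probe, long timeoutMillis) {
        this.probe = probe;
        this.timeoutMillis = timeoutMillis;
    }

    // Daemon threads so the checker never keeps the JVM alive on its own.
    private static Thread daemon(Runnable r) {
        Thread t = new Thread(r, "zk-health-probe");
        t.setDaemon(true);
        return t;
    }

    // Runs the probe once. An exception or a timeout marks the connection unhealthy.
    boolean probeOnce() {
        Future<Object> result = worker.submit(probe);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            healthy.set(true);
        } catch (Exception e) {
            result.cancel(true); // try to interrupt a probe that is still blocked
            healthy.set(false);
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
        }
        return healthy.get();
    }

    // Schedules probeOnce() repeatedly, waiting periodMillis between runs.
    void start(long periodMillis) {
        timer.scheduleWithFixedDelay(this::probeOnce, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Last known result; false until the first probe has succeeded.
    boolean isHealthy() {
        return healthy.get();
    }

    void stop() {
        timer.shutdownNow();
        worker.shutdownNow();
    }
}
```

Callers would consult isHealthy() before touching ZooKeeper, so a hung
ensemble costs at most one probe timeout rather than a full blocking call
on the request path.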

On Sep 13, 2016 09:41, "Jędrzej Dąbrowa" <na...@gmail.com> wrote:

> I use 2.10.0 because I need to connect to ZK 3.4.8. The retry loop does
> not seem to be executed: I've tested that with 'new RetryNTimes(0, 0)' and
> it still times out after about one minute, which I reckon is the default
> session timeout. That seems to be caused by ZK's CP property: in the face
> of quorum loss, instead of failing it just waits for the majority to be
> restored. So my exact question is: given the circumstances described, is
> there any way to be notified about quorum loss asynchronously?

Re: ConnectionStateListener not called on lost quorum

Posted by Jędrzej Dąbrowa <na...@gmail.com>.
I use 2.10.0 because I need to connect to ZK 3.4.8. The retry loop does
not seem to be executed: I've tested that with 'new RetryNTimes(0, 0)' and
it still times out after about one minute, which I reckon is the default
session timeout. That seems to be caused by ZK's CP property: in the face
of quorum loss, instead of failing it just waits for the majority to be
restored. So my exact question is: given the circumstances described, is
there any way to be notified about quorum loss asynchronously?

On Tuesday 13/09/16 09:24:59, Cameron McKenzie wrote:
>
> Which version of Curator are you using? In 2.x a LOST event will not
> occur until the retries specified by your retry policy have been
> exhausted. In 3.x the default behavior is to simulate the LOST state
> after being in a suspended state for longer than the session timeout.


Re: ConnectionStateListener not called on lost quorum

Posted by Cameron McKenzie <mc...@gmail.com>.
Which version of Curator are you using? In 2.x a LOST event will not occur
until the retries specified by your retry policy have been exhausted. In 3.x
the default behavior is to simulate the LOST state after being in a
suspended state for longer than the session timeout.
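
The idea of treating a connection as lost once it has been suspended for
longer than the session timeout can also be approximated by hand. The
sketch below is an illustration under assumptions, not Curator's own
implementation: SuspendWatch is a hypothetical class, its onSuspended and
onReconnected methods are assumed to be called from a
ConnectionStateListener, and the session timeout is whatever value the
caller passes in.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: reports "lost" once the connection has stayed
// suspended for longer than the session timeout.
final class SuspendWatch {
    private final long sessionTimeoutMillis;
    private final LongSupplier clockMillis; // injectable clock, e.g. System::currentTimeMillis
    private boolean suspended = false;
    private long suspendedAt = 0L;

    SuspendWatch(long sessionTimeoutMillis, LongSupplier clockMillis) {
        this.sessionTimeoutMillis = sessionTimeoutMillis;
        this.clockMillis = clockMillis;
    }

    // Record the moment the connection first became suspended.
    synchronized void onSuspended() {
        if (!suspended) {
            suspended = true;
            suspendedAt = clockMillis.getAsLong();
        }
    }

    // A reconnect clears the suspended state.
    synchronized void onReconnected() {
        suspended = false;
    }

    // True only while suspended AND strictly longer than the session timeout.
    synchronized boolean shouldTreatAsLost() {
        return suspended
                && clockMillis.getAsLong() - suspendedAt > sessionTimeoutMillis;
    }
}
```

Because the clock is injected, the timeout logic can be exercised in a unit
test without sleeping.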
