Posted to user@zookeeper.apache.org by Sergei Babovich <sb...@demandware.com> on 2012/09/27 22:34:35 UTC

Zookeeper delay to reconnect

Hi,
The ZooKeeper client implements a random delay of up to 1 second before trying to reconnect:

ClientCnxn$SendThread
         @Override
         public void run() {
             ...
             while (state.isAlive()) {
                 try {
                     if (!clientCnxnSocket.isConnected()) {
                         if(!isFirstConnect){
                             try {
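                                 // random pause of 0..999 ms before the next connect attempt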
                                 Thread.sleep(r.nextInt(1000));
                             } catch (InterruptedException e) {
                                 LOG.warn("Unexpected exception", e);
                             }

This creates "outages" of up to 1s (even with a simple retry on
ConnectionLoss) even when the cluster is perfectly healthy, e.g. during
a rolling restart. In our scenario it might be a problem under high
load, creating a spike in the number of requests waiting on a zk
operation.
Would it be a better strategy to attempt the reconnect immediately at
least once? Or is there more to it?
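
For context, this is roughly the retry pattern we rely on - a minimal
sketch with illustrative names, not our production code. Every
ConnectionLoss retry can stall behind that random reconnect delay:

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.ZooKeeper;

    public class RetryingReader {
        private final ZooKeeper zk;

        public RetryingReader(ZooKeeper zk) {
            this.zk = zk;
        }

        // Retries a read on ConnectionLoss. While SendThread is
        // reconnecting every attempt fails fast, so the caller is
        // effectively blocked for the duration of the random
        // (up to 1s) reconnect delay.
        public byte[] getDataWithRetry(String path, int maxRetries)
                throws KeeperException, InterruptedException {
            KeeperException lastError = null;
            for (int i = 0; i <= maxRetries; i++) {
                try {
                    return zk.getData(path, false, null);
                } catch (KeeperException.ConnectionLossException e) {
                    lastError = e;
                    Thread.sleep(50); // avoid spinning while the client reconnects
                }
            }
            throw lastError;
        }
    }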

Regards,
Sergei




Re: Zookeeper delay to reconnect

Posted by Sergei Babovich <sb...@demandware.com>.
Thanks, Patrick!
On 09/27/2012 07:55 PM, Patrick Hunt wrote:
> The random sleep was explicitly added to reduce herd effects and
> general "spinning client" problems iirc. Keep in mind that ZK
> generally trades off performance for availability.
That's exactly my concern - it is not about performance. From the
client's point of view, the reconnect delay makes the cluster
effectively unavailable for up to a second. In scenarios where you have
a relatively low number of sessions (so herding is not a concern), each
processing a lot of requests, such a strategy potentially causes
instability - there is no way to gracefully handle the intermittent
errors caused by normal operational procedures without risking the
client's stability.
> It wouldn't be a
> good idea to remove it in general. If anything we should have a more
> aggressive backoff policy in the case where clients are just spinning.
>
> Perhaps a plug-able approach here? Where the default is something like
> what we already have, but allow users to implement their own policy if
> they like. We could have a few implementations "out of the box"; 1)
> current, 2) no wait, 3) exponential backoff after trying each server
> in the ensemble, etc... This would also allow for experimentation.
Totally agree - a customizable strategy should be the answer to
facilitate different requirements.
Just curious: does the randomized delay make a real difference here?
Was it a real issue somebody hit? I'd expect that randomizing the
server address to reconnect to should be enough - the load would be
evenly distributed across the rest of the cluster nodes and should not
create a problem, assuming enough ZooKeeper cluster capacity.
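
To illustrate, the kind of randomized server selection I have in mind -
a toy sketch, not the actual client's host handling:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    // Toy host picker: shuffle once per client, then walk the list
    // round-robin. Reconnecting clients spread evenly over the
    // remaining servers without needing a random sleep.
    class ShuffledHostList {
        private final List<String> hosts;
        private int next = 0;

        ShuffledHostList(List<String> servers) {
            hosts = new ArrayList<>(servers);
            Collections.shuffle(hosts); // each client gets its own order
        }

        String nextHost() {
            String host = hosts.get(next);
            next = (next + 1) % hosts.size();
            return host;
        }
    }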
>
> Patrick
>
> On Thu, Sep 27, 2012 at 2:28 PM, Michi Mutsuzaki <mi...@cs.stanford.edu> wrote:
>> Hi Sergei,
>>
>> Your suggestion sounds reasonable to me. I think the sleep was added
>> so that the client doesn't spin when the entire zookeeper is down. The
>> client could try to connect to each server without sleep, and sleep
>> for 1 second only after failing to connect to all the servers in the
>> cluster.
>>
>> Thanks!
>> --Michi
>>
>> On Thu, Sep 27, 2012 at 1:34 PM, Sergei Babovich
>> <sb...@demandware.com> wrote:
>>> Hi,
>>> Zookeeper implements a delay of up to 1 second before trying to reconnect.
>>>
>>> ClientCnxn$SendThread
>>>          @Override
>>>          public void run() {
>>>              ...
>>>              while (state.isAlive()) {
>>>                  try {
>>>                      if (!clientCnxnSocket.isConnected()) {
>>>                          if(!isFirstConnect){
>>>                              try {
>>>                                  Thread.sleep(r.nextInt(1000));
>>>                              } catch (InterruptedException e) {
>>>                                  LOG.warn("Unexpected exception", e);
>>>                              }
>>>
>>> This creates "outages" (even with simple retry on ConnectionLoss) up to 1s
>>> even with perfectly healthy cluster like in scenario of rolling restart. In
>>> our scenario it might be a problem under high load creating a spike in a
>>> number of requests waiting on zk operation.
>>> Would it be a better strategy to perform reconnect attempt immediately at
>>> least one time? Or there is more to it?



Re: Zookeeper delay to reconnect

Posted by Brian Tarbox <br...@gmail.com>.
What I see is that ClientCnxn.processEvent hits the bottom catch block
and immediately retries, causing the spin.
I can make this happen by suspending an event thread...or it happens
randomly under conditions that I have not pinned down yet.

Brian

On Thu, Sep 27, 2012 at 8:07 PM, Patrick Hunt <ph...@apache.org> wrote:

> Hi Brian, well, in my proposal the default would be the current
> behavior. With the discretion of the zk operator to change, so it
> shouldn't be any worse.
>
> You've piqued my interest - a single client attempting to connect is
> responsible for bringing down the entire cluster? Could you provide
> more details?
>
> Patrick
>
> On Thu, Sep 27, 2012 at 4:58 PM, Brian Tarbox <br...@gmail.com>
> wrote:
> > I would lobby not to change this...I'm still occasionally dealing with
> > clients spinning trying to connect...which brings down the whole cluster
> > until that one client is killed.
> >
> > Brian
> >
> > On Thu, Sep 27, 2012 at 7:55 PM, Patrick Hunt <ph...@apache.org> wrote:
> >
> >> The random sleep was explicitly added to reduce herd effects and
> >> general "spinning client" problems iirc. Keep in mind that ZK
> >> generally trades off performance for availability. It wouldn't be a
> >> good idea to remove it in general. If anything we should have a more
> >> aggressive backoff policy in the case where clients are just spinning.
> >>
> >> Perhaps a plug-able approach here? Where the default is something like
> >> what we already have, but allow users to implement their own policy if
> >> they like. We could have a few implementations "out of the box"; 1)
> >> current, 2) no wait, 3) exponential backoff after trying each server
> >> in the ensemble, etc... This would also allow for experimentation.
> >>
> >> Patrick
> >>
> >> On Thu, Sep 27, 2012 at 2:28 PM, Michi Mutsuzaki <michi@cs.stanford.edu> wrote:
> >> > Hi Sergei,
> >> >
> >> > Your suggestion sounds reasonable to me. I think the sleep was added
> >> > so that the client doesn't spin when the entire zookeeper is down. The
> >> > client could try to connect to each server without sleep, and sleep
> >> > for 1 second only after failing to connect to all the servers in the
> >> > cluster.
> >> >
> >> > Thanks!
> >> > --Michi
> >> >
> >> > On Thu, Sep 27, 2012 at 1:34 PM, Sergei Babovich
> >> > <sb...@demandware.com> wrote:
> >> >> Hi,
> >> >> Zookeeper implements a delay of up to 1 second before trying to reconnect.
> >> >>
> >> >> ClientCnxn$SendThread
> >> >>         @Override
> >> >>         public void run() {
> >> >>             ...
> >> >>             while (state.isAlive()) {
> >> >>                 try {
> >> >>                     if (!clientCnxnSocket.isConnected()) {
> >> >>                         if(!isFirstConnect){
> >> >>                             try {
> >> >>                                 Thread.sleep(r.nextInt(1000));
> >> >>                             } catch (InterruptedException e) {
> >> >>                                 LOG.warn("Unexpected exception", e);
> >> >>                             }
> >> >>
> >> >> This creates "outages" (even with simple retry on ConnectionLoss) up to 1s
> >> >> even with perfectly healthy cluster like in scenario of rolling restart. In
> >> >> our scenario it might be a problem under high load creating a spike in a
> >> >> number of requests waiting on zk operation.
> >> >> Would it be a better strategy to perform reconnect attempt immediately at
> >> >> least one time? Or there is more to it?
> >>
> >
> >
> >
> > --
> > http://about.me/BrianTarbox
>



-- 
http://about.me/BrianTarbox

Re: Zookeeper delay to reconnect

Posted by Patrick Hunt <ph...@apache.org>.
Hi Brian, well, in my proposal the default would be the current
behavior. With the discretion of the zk operator to change, so it
shouldn't be any worse.

You've piqued my interest - a single client attempting to connect is
responsible for bringing down the entire cluster? Could you provide
more details?

Patrick

On Thu, Sep 27, 2012 at 4:58 PM, Brian Tarbox <br...@gmail.com> wrote:
> I would lobby not to change this...I'm still occasionally dealing with
> clients spinning trying to connect...which brings down the whole cluster
> until that one client is killed.
>
> Brian
>
> On Thu, Sep 27, 2012 at 7:55 PM, Patrick Hunt <ph...@apache.org> wrote:
>
>> The random sleep was explicitly added to reduce herd effects and
>> general "spinning client" problems iirc. Keep in mind that ZK
>> generally trades off performance for availability. It wouldn't be a
>> good idea to remove it in general. If anything we should have a more
>> aggressive backoff policy in the case where clients are just spinning.
>>
>> Perhaps a plug-able approach here? Where the default is something like
>> what we already have, but allow users to implement their own policy if
>> they like. We could have a few implementations "out of the box"; 1)
>> current, 2) no wait, 3) exponential backoff after trying each server
>> in the ensemble, etc... This would also allow for experimentation.
>>
>> Patrick
>>
>> On Thu, Sep 27, 2012 at 2:28 PM, Michi Mutsuzaki <mi...@cs.stanford.edu>
>> wrote:
>> > Hi Sergei,
>> >
>> > Your suggestion sounds reasonable to me. I think the sleep was added
>> > so that the client doesn't spin when the entire zookeeper is down. The
>> > client could try to connect to each server without sleep, and sleep
>> > for 1 second only after failing to connect to all the servers in the
>> > cluster.
>> >
>> > Thanks!
>> > --Michi
>> >
>> > On Thu, Sep 27, 2012 at 1:34 PM, Sergei Babovich
>> > <sb...@demandware.com> wrote:
>> >> Hi,
>> >> Zookeeper implements a delay of up to 1 second before trying to reconnect.
>> >>
>> >> ClientCnxn$SendThread
>> >>         @Override
>> >>         public void run() {
>> >>             ...
>> >>             while (state.isAlive()) {
>> >>                 try {
>> >>                     if (!clientCnxnSocket.isConnected()) {
>> >>                         if(!isFirstConnect){
>> >>                             try {
>> >>                                 Thread.sleep(r.nextInt(1000));
>> >>                             } catch (InterruptedException e) {
>> >>                                 LOG.warn("Unexpected exception", e);
>> >>                             }
>> >>
>> >> This creates "outages" (even with simple retry on ConnectionLoss) up to 1s
>> >> even with perfectly healthy cluster like in scenario of rolling restart. In
>> >> our scenario it might be a problem under high load creating a spike in a
>> >> number of requests waiting on zk operation.
>> >> Would it be a better strategy to perform reconnect attempt immediately at
>> >> least one time? Or there is more to it?
>>
>
>
>
> --
> http://about.me/BrianTarbox

Re: Zookeeper delay to reconnect

Posted by Brian Tarbox <br...@gmail.com>.
I would lobby not to change this...I'm still occasionally dealing with
clients spinning trying to connect...which brings down the whole cluster
until that one client is killed.

Brian

On Thu, Sep 27, 2012 at 7:55 PM, Patrick Hunt <ph...@apache.org> wrote:

> The random sleep was explicitly added to reduce herd effects and
> general "spinning client" problems iirc. Keep in mind that ZK
> generally trades off performance for availability. It wouldn't be a
> good idea to remove it in general. If anything we should have a more
> aggressive backoff policy in the case where clients are just spinning.
>
> Perhaps a plug-able approach here? Where the default is something like
> what we already have, but allow users to implement their own policy if
> they like. We could have a few implementations "out of the box"; 1)
> current, 2) no wait, 3) exponential backoff after trying each server
> in the ensemble, etc... This would also allow for experimentation.
>
> Patrick
>
> On Thu, Sep 27, 2012 at 2:28 PM, Michi Mutsuzaki <mi...@cs.stanford.edu>
> wrote:
> > Hi Sergei,
> >
> > Your suggestion sounds reasonable to me. I think the sleep was added
> > so that the client doesn't spin when the entire zookeeper is down. The
> > client could try to connect to each server without sleep, and sleep
> > for 1 second only after failing to connect to all the servers in the
> > cluster.
> >
> > Thanks!
> > --Michi
> >
> > On Thu, Sep 27, 2012 at 1:34 PM, Sergei Babovich
> > <sb...@demandware.com> wrote:
> >> Hi,
> >> Zookeeper implements a delay of up to 1 second before trying to reconnect.
> >>
> >> ClientCnxn$SendThread
> >>         @Override
> >>         public void run() {
> >>             ...
> >>             while (state.isAlive()) {
> >>                 try {
> >>                     if (!clientCnxnSocket.isConnected()) {
> >>                         if(!isFirstConnect){
> >>                             try {
> >>                                 Thread.sleep(r.nextInt(1000));
> >>                             } catch (InterruptedException e) {
> >>                                 LOG.warn("Unexpected exception", e);
> >>                             }
> >>
> >> This creates "outages" (even with simple retry on ConnectionLoss) up to 1s
> >> even with perfectly healthy cluster like in scenario of rolling restart. In
> >> our scenario it might be a problem under high load creating a spike in a
> >> number of requests waiting on zk operation.
> >> Would it be a better strategy to perform reconnect attempt immediately at
> >> least one time? Or there is more to it?
>



-- 
http://about.me/BrianTarbox


Re: Zookeeper delay to reconnect

Posted by Patrick Hunt <ph...@apache.org>.
The random sleep was explicitly added to reduce herd effects and
general "spinning client" problems iirc. Keep in mind that ZK
generally trades off performance for availability. It wouldn't be a
good idea to remove it in general. If anything we should have a more
aggressive backoff policy in the case where clients are just spinning.

Perhaps a plug-able approach here? Where the default is something like
what we already have, but allow users to implement their own policy if
they like. We could have a few implementations "out of the box"; 1)
current, 2) no wait, 3) exponential backoff after trying each server
in the ensemble, etc... This would also allow for experimentation.
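
To make that concrete, a rough sketch of what such a plug-in point
might look like - a hypothetical interface, nothing like this exists in
the client today:

    import java.util.Random;

    // Hypothetical plug-in point: how long to pause before the next
    // connect attempt, given how many attempts have failed in a row.
    interface ReconnectDelayPolicy {
        long delayMillis(int failedAttempts);
    }

    // 1) current behavior: random 0..999 ms before every attempt
    class RandomDelayPolicy implements ReconnectDelayPolicy {
        private final Random r = new Random();
        public long delayMillis(int failedAttempts) {
            return r.nextInt(1000);
        }
    }

    // 2) no wait: reconnect immediately
    class NoDelayPolicy implements ReconnectDelayPolicy {
        public long delayMillis(int failedAttempts) {
            return 0;
        }
    }

    // 3) back off exponentially once a full pass over the ensemble fails
    class ExponentialBackoffPolicy implements ReconnectDelayPolicy {
        private final int ensembleSize;
        ExponentialBackoffPolicy(int ensembleSize) {
            this.ensembleSize = ensembleSize;
        }
        public long delayMillis(int failedAttempts) {
            if (failedAttempts < ensembleSize) {
                return 0; // first pass: try every server immediately
            }
            int rounds = failedAttempts / ensembleSize;
            return Math.min(30_000L, (1L << Math.min(rounds, 5)) * 1000L);
        }
    }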

Patrick

On Thu, Sep 27, 2012 at 2:28 PM, Michi Mutsuzaki <mi...@cs.stanford.edu> wrote:
> Hi Sergei,
>
> Your suggestion sounds reasonable to me. I think the sleep was added
> so that the client doesn't spin when the entire zookeeper is down. The
> client could try to connect to each server without sleep, and sleep
> for 1 second only after failing to connect to all the servers in the
> cluster.
>
> Thanks!
> --Michi
>
> On Thu, Sep 27, 2012 at 1:34 PM, Sergei Babovich
> <sb...@demandware.com> wrote:
>> Hi,
>> Zookeeper implements a delay of up to 1 second before trying to reconnect.
>>
>> ClientCnxn$SendThread
>>         @Override
>>         public void run() {
>>             ...
>>             while (state.isAlive()) {
>>                 try {
>>                     if (!clientCnxnSocket.isConnected()) {
>>                         if(!isFirstConnect){
>>                             try {
>>                                 Thread.sleep(r.nextInt(1000));
>>                             } catch (InterruptedException e) {
>>                                 LOG.warn("Unexpected exception", e);
>>                             }
>>
>> This creates "outages" (even with simple retry on ConnectionLoss) up to 1s
>> even with perfectly healthy cluster like in scenario of rolling restart. In
>> our scenario it might be a problem under high load creating a spike in a
>> number of requests waiting on zk operation.
>> Would it be a better strategy to perform reconnect attempt immediately at
>> least one time? Or there is more to it?

Re: Zookeeper delay to reconnect

Posted by Ben Bangert <be...@groovie.org>.
On Sep 27, 2012, at 2:28 PM, Michi Mutsuzaki <mi...@cs.stanford.edu> wrote:

> Your suggestion sounds reasonable to me. I think the sleep was added
> so that the client doesn't spin when the entire zookeeper is down. The
> client could try to connect to each server without sleep, and sleep
> for 1 second only after failing to connect to all the servers in the
> cluster.

Just for comparison, the kazoo Python client uses an exponential back-off that's customizable and applied after it tries each server in the provided list. It also adds a small jitter to each attempt to prevent herd effects. This also applies to the background read-write server search that takes effect during read-only mode.
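
The shape of that policy, sketched in Java for comparison (constants
invented; kazoo's actual API differs):

    import java.util.Random;

    // Exponential back-off with jitter, meant to be applied only after
    // a full pass over the server list has failed.
    class JitteredBackoff {
        private static final long MAX_MS = 10_000;
        private final Random r = new Random();
        private long delayMs = 100; // initial back-off

        long nextDelayMillis() {
            long jitter = r.nextInt(100); // small random spread vs. herding
            long d = delayMs + jitter;
            delayMs = Math.min(delayMs * 2, MAX_MS);
            return d;
        }

        void reset() { // call after a successful connect
            delayMs = 100;
        }
    }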

- Ben

Re: Zookeeper delay to reconnect

Posted by Michi Mutsuzaki <mi...@cs.stanford.edu>.
Hi Sergei,

Your suggestion sounds reasonable to me. I think the sleep was added
so that the client doesn't spin when the entire ZooKeeper cluster is down. The
client could try to connect to each server without sleep, and sleep
for 1 second only after failing to connect to all the servers in the
cluster.
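
In rough Java, the loop shape would be something like this -
tryConnect/connected are hypothetical helpers, not the real SendThread
internals:

    import java.net.InetSocketAddress;
    import java.util.List;

    // Try every server with no delay; sleep only once the whole
    // cluster has been tried and found unreachable.
    abstract class EagerReconnectLoop {
        abstract boolean tryConnect(InetSocketAddress server); // hypothetical
        abstract boolean connected();

        void run(List<InetSocketAddress> servers) throws InterruptedException {
            while (!connected()) {
                boolean reachedAny = false;
                for (InetSocketAddress server : servers) {
                    if (tryConnect(server)) {
                        reachedAny = true;
                        break;
                    }
                }
                if (!reachedAny) {
                    Thread.sleep(1000); // entire ensemble unreachable
                }
            }
        }
    }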

Thanks!
--Michi

On Thu, Sep 27, 2012 at 1:34 PM, Sergei Babovich
<sb...@demandware.com> wrote:
> Hi,
> Zookeeper implements a delay of up to 1 second before trying to reconnect.
>
> ClientCnxn$SendThread
>         @Override
>         public void run() {
>             ...
>             while (state.isAlive()) {
>                 try {
>                     if (!clientCnxnSocket.isConnected()) {
>                         if(!isFirstConnect){
>                             try {
>                                 Thread.sleep(r.nextInt(1000));
>                             } catch (InterruptedException e) {
>                                 LOG.warn("Unexpected exception", e);
>                             }
>
> This creates "outages" (even with simple retry on ConnectionLoss) up to 1s
> even with perfectly healthy cluster like in scenario of rolling restart. In
> our scenario it might be a problem under high load creating a spike in a
> number of requests waiting on zk operation.
> Would it be a better strategy to perform reconnect attempt immediately at
> least one time? Or there is more to it?