Posted to dev@bookkeeper.apache.org by Venkateswara Rao Jujjuri <ju...@gmail.com> on 2016/06/06 16:42:26 UTC

BK Client connection loss with ZK

If a bookie loses its connection with ZK, the connection gets reestablished
and life goes on. How are we handling this in the client case? Should we
retry at the library level, or leave it up to the application? Any
discussion/thoughts on this?

-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Re: BK Client connection loss with ZK

Posted by Uma Maheswara Rao G <ha...@gmail.com>.
Hi Arun,

 As I remember, the fix in HDFS-3562 was pretty straightforward: it just
retries ZK ops on connection loss/disconnect.
 If we want to make that utility generic then yes, as you said, we need to
cover special cases like the one you mentioned. In such scenarios,
reconstructing the state may be one way, but at this point I am not sure
about that; it needs more thinking. Another scenario to consider: what if a
ZK op succeeds on the server side and the connection is then lost? A simple
retry would end up with "node already exists" kinds of issues, right? So we
may need to identify which ops are safe to simply retry.
Rakesh, do you have better thoughts on these scenarios, considering ZK
connection loss etc.?

Regards,
Uma
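
The retry hazard raised in this message (an op applied on the server whose
reply is lost, so a blind retry sees "node already exists") can be sketched
as follows. This is only an illustration under stated assumptions: FakeZk,
ConnectionLossException, NodeExistsException and createWithRetry are
hypothetical stand-ins written for this example. They are not the HDFS-3562
patch and not the real ZooKeeper client API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-ins for the two error conditions discussed; not the real client API. */
class ConnectionLossException extends Exception {}
class NodeExistsException extends Exception {}

/** Tiny in-memory "server" that can apply a create and then drop the reply once. */
class FakeZk {
    private final Map<String, String> nodes = new HashMap<>();
    private boolean dropNextReply;

    FakeZk(boolean dropNextReply) { this.dropNextReply = dropNextReply; }

    void create(String path, String data) throws ConnectionLossException, NodeExistsException {
        if (nodes.containsKey(path)) {
            throw new NodeExistsException();
        }
        nodes.put(path, data);                   // the write is applied on the server...
        if (dropNextReply) {
            dropNextReply = false;
            throw new ConnectionLossException(); // ...but the client never sees the reply
        }
    }

    String get(String path) { return nodes.get(path); }
}

class IdempotentCreateSketch {
    /**
     * Retries create on connection loss. A NodeExists error counts as success only
     * when an earlier attempt lost its reply AND the node holds the data we wrote.
     */
    static boolean createWithRetry(FakeZk zk, String path, String data, int maxRetries) {
        boolean sawConnectionLoss = false;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                zk.create(path, data);
                return true;
            } catch (ConnectionLossException e) {
                sawConnectionLoss = true; // outcome unknown: the op may or may not have been applied
            } catch (NodeExistsException e) {
                return sawConnectionLoss && data.equals(zk.get(path));
            }
        }
        return false; // retries exhausted
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk(true);
        // First create is applied but its reply is lost; the retry recognises its own node.
        System.out.println(createWithRetry(zk, "/ledgers/L1", "owner-a", 3));
        // A different writer hitting the same path gets a genuine conflict.
        System.out.println(createWithRetry(zk, "/ledgers/L1", "owner-b", 3));
    }
}
```

Comparing the stored data is just one way to disambiguate; ops such as
sequential creates or conditional sets would need their own checks, which is
the "which ops are safe to retry" question above.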

On Wed, Jun 8, 2016 at 12:46 AM, Arun M. Krishnakumar <ar...@gmail.com>
wrote:

> Thanks for the pointer, Uma Gangumalla.
>
> Could you please give an overview of the fix in HDFS-3562.
>
> In the case of Bookkeeper-client, the ReadOnlyLedgerHandle constructs a
> watcher on the relevant Zookeeper nodes. The interesting things are the
> watches created by the ReadOnlyLedgerHandle on the relevant zookeeper
> nodes. We would lose the notifications that happen during the timeout. What
> would be the best way to proceed in such scenarios ? Should we reconstruct
> the state ? Is there any other such state that needs to be considered ?
>
> Thanks,
> Arun

Re: BK Client connection loss with ZK

Posted by Sijie Guo <si...@apache.org>.
On Tuesday, June 14, 2016, Venkateswara Rao Jujjuri <ju...@gmail.com>
wrote:

> So Sijie, what happens in the following scenario ?
>
> - AddEntry failed. Timeout or something.
> - Client received the error and tried to get another bookie from ZK.
> - Now ZK connection failed.
>
> Ideally it should do the following to avoid write errors:
>
> - Try to renew lease with the ZK, if succeeds , get a new Bookie, update
> ensemble, proceed with write.


> - If it fails to renew ZK session lease, reestablish a new session with ZK,
> go through the recovery process of updating watchers,
>   get the list of new bookies, update ensemble, send write to new bookie,
> and then send success to client?


The ZooKeeper wrapper retries on session loss and session expiry. So if the
session expires during an ensemble change, it retries until it succeeds or
exhausts its retries. You can check ZooKeeperClient in the util package.



>
> Does this happen with Twitter code?
>
> Also while updating watchers, does it handle transient error conditions?
> Like the middle of establishing watchers,
> some client process may miss watch notifications etc.


Yes. The retry ends with a successful getData that sets the watcher, so no
notification is missed.

Sijie
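
The point about the retry ending in a successful getData can be illustrated
with a small sketch. It assumes a hypothetical in-memory WatchedNode with
one-shot watches and a hypothetical ReReadingListener; neither is
BookKeeper's or ZooKeeper's actual code. Because the read that re-arms the
watch also returns the current data, changes made while the watch was gone
are picked up by that read, although intermediate values may be skipped.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory node with one-shot watches, loosely modelled on ZooKeeper semantics. */
class WatchedNode {
    private String data;
    private final List<Runnable> watchers = new ArrayList<>();
    private boolean connected = true;

    WatchedNode(String data) { this.data = data; }

    /** Returns the current data and registers a one-shot watcher in the same call. */
    synchronized String getData(Runnable watcher) {
        if (!connected) {
            throw new IllegalStateException("connection loss");
        }
        watchers.add(watcher);
        return data;
    }

    /** Applies a write (e.g. from another client) and fires the registered watches once. */
    synchronized void setData(String newData) {
        data = newData;
        List<Runnable> toFire = new ArrayList<>(watchers);
        watchers.clear(); // one-shot: a fired watch must be re-registered
        toFire.forEach(Runnable::run);
    }

    /** Simulates session loss on the watching client: its watches are dropped. */
    synchronized void disconnect() { connected = false; watchers.clear(); }

    synchronized void reconnect() { connected = true; }
}

class ReReadingListener {
    private final WatchedNode node;
    private final int maxRetries;
    final List<String> observed = new ArrayList<>();

    ReReadingListener(WatchedNode node, int maxRetries) {
        this.node = node;
        this.maxRetries = maxRetries;
    }

    /** Reads the node and re-arms the watch; retried until the read succeeds or retries run out. */
    boolean readAndWatch() {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                String current = node.getData(() -> { readAndWatch(); });
                observed.add(current);
                return true;
            } catch (IllegalStateException connectionLoss) {
                // A real client would back off and wait for the reconnect here.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        WatchedNode node = new WatchedNode("v1");
        ReReadingListener listener = new ReReadingListener(node, 2);
        listener.readAndWatch();
        node.setData("v2");      // delivered through the watch
        node.disconnect();
        node.setData("v3");      // no watch is registered, so no notification
        node.setData("v4");
        node.reconnect();
        listener.readAndWatch(); // the re-read returns the latest state and re-arms the watch
        node.setData("v5");
        System.out.println(listener.observed);
    }
}
```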


Re: BK Client connection loss with ZK

Posted by Venkateswara Rao Jujjuri <ju...@gmail.com>.
So Sijie, what happens in the following scenario?

- AddEntry fails (a timeout or something similar).
- The client receives the error and tries to get another bookie from ZK.
- Now the ZK connection fails.

Ideally it should do the following to avoid write errors:

- Try to renew the lease with ZK. If that succeeds, get a new bookie, update
  the ensemble, and proceed with the write.

- If it fails to renew the ZK session lease, establish a new session with ZK,
  go through the recovery process of updating watchers, get the list of new
  bookies, update the ensemble, send the write to the new bookie, and then
  report success to the client?

Does this happen with the Twitter code?

Also, while updating watchers, does it handle transient error conditions?
For example, in the middle of establishing watchers, a client process may
miss watch notifications.
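
The recovery flow proposed above (look up bookies, re-establish the session
if it has expired, then swap the failed bookie out of the ensemble) might
look roughly like the sketch below. All names here (BookieDirectory,
FlakyDirectory, replaceBookie, SessionExpiredException) are hypothetical and
exist only for this example. In particular, picking the first unused bookie
is an assumption made to keep the sketch short; it is not how BookKeeper
selects replacements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a session-expired error from the metadata store. */
class SessionExpiredException extends Exception {}

/** Hypothetical view of the metadata store used by the writer. */
interface BookieDirectory {
    List<String> availableBookies() throws SessionExpiredException;
    void reestablishSession(); // new session plus re-registration of watchers
}

/** Test double whose first lookup fails with an expired session. */
class FlakyDirectory implements BookieDirectory {
    private boolean expired = true;
    int sessionsCreated = 0;

    public List<String> availableBookies() throws SessionExpiredException {
        if (expired) {
            throw new SessionExpiredException();
        }
        return Arrays.asList("b1", "b2", "b3", "b4");
    }

    public void reestablishSession() {
        expired = false;
        sessionsCreated++;
    }
}

class EnsembleChangeSketch {
    /**
     * Replaces failedBookie with an available bookie that is not already in the ensemble.
     * On session expiry, re-establishes the session and retries, up to maxRetries times.
     * Returns the new ensemble, or null if no replacement could be obtained.
     */
    static List<String> replaceBookie(List<String> ensemble, String failedBookie,
                                      BookieDirectory dir, int maxRetries) {
        if (!ensemble.contains(failedBookie)) {
            throw new IllegalArgumentException(failedBookie + " is not in the ensemble");
        }
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                for (String candidate : dir.availableBookies()) {
                    if (!ensemble.contains(candidate)) {
                        List<String> updated = new ArrayList<>(ensemble);
                        updated.set(updated.indexOf(failedBookie), candidate);
                        return updated; // the caller would now resend the write to the new bookie
                    }
                }
                return null; // no spare bookie: the write has to fail
            } catch (SessionExpiredException e) {
                dir.reestablishSession();
            }
        }
        return null; // retries exhausted
    }

    public static void main(String[] args) {
        List<String> ensemble = Arrays.asList("b1", "b2", "b3");
        System.out.println(replaceBookie(ensemble, "b2", new FlakyDirectory(), 2));
    }
}
```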


On Tue, Jun 14, 2016 at 3:21 PM, Sijie Guo <si...@apache.org> wrote:

> On Tue, Jun 14, 2016 at 2:30 PM, Arun M. Krishnakumar <ar...@gmail.com>
> wrote:
>
> > Hi Sijie,
> >
> > I believe the ZooKeeperClient class handles the server connections and we
> > haven't faced issues with that. Could you please confirm ?
> >
>
> Yes. It handles session expires and recreates the connections.
>
>
> >
> > The issue was with the client connection in the AbstractZkLedgerManager
> > class as you mentioned above. The twitter branch fix seems to recreate the
> > listeners and reestablish state. Could you please push it to the community?
> >
>
> Yes. I will do.
>

-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi

Re: BK Client connection loss with ZK

Posted by Sijie Guo <si...@apache.org>.
On Tue, Jun 14, 2016 at 2:30 PM, Arun M. Krishnakumar <ar...@gmail.com>
wrote:

> Hi Sijie,
>
> I believe the ZooKeeperClient class handles the server connections and we
> haven't faced issues with that. Could you please confirm ?
>

Yes. It handles session expiry and recreates the connections.


>
> The issue was with the client connection in the AbstractZkLedgerManager
> class as you mentioned above. The twitter branch fix seems to recreate the
> listeners and reestablish state. Could you please push it to the community
> ?
>

Yes, I will do that.


>
> Thanks,
> Arun

Re: BK Client connection loss with ZK

Posted by "Arun M. Krishnakumar" <ar...@gmail.com>.
Hi Sijie,

I believe the ZooKeeperClient class handles the server connections, and we
haven't faced issues with that. Could you please confirm?

The issue was with the client connection in the AbstractZkLedgerManager
class, as you mentioned above. The Twitter branch fix seems to recreate the
listeners and reestablish state. Could you please push it to the community?

Thanks,
Arun

On Tue, Jun 14, 2016 at 2:17 PM, Sijie Guo <si...@apache.org> wrote:

> Arun, what did you observe?
>
> I think we already handle session expires and zookeeper connection
> recreation on ZooKeeperClient wrapper:
>
> https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/zookeeper/ZooKeeperClient.java
>
>
> We need to uncomment the code in Line 168.
>
>
> https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/AbstractZkLedgerManager.java#L168
>
> (The change in twitter's branch does that retries:
>
> https://github.com/twitter/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/AbstractZkLedgerManager.java#L211
> )
>
> - Sijie

Re: BK Client connection loss with ZK

Posted by Sijie Guo <si...@apache.org>.
Arun, what did you observe?

I think we already handle session expiry and ZooKeeper connection
re-creation in the ZooKeeperClient wrapper:
https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/zookeeper/ZooKeeperClient.java


We need to uncomment the code at line 168:

https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/AbstractZkLedgerManager.java#L168

(The change in Twitter's branch does those retries:
https://github.com/twitter/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/meta/AbstractZkLedgerManager.java#L211
)

- Sijie



On Wed, Jun 8, 2016 at 12:46 AM, Arun M. Krishnakumar <ar...@gmail.com>
wrote:

> Thanks for the pointer, Uma Gangumalla.
>
> Could you please give an overview of the fix in HDFS-3562.
>
> In the case of Bookkeeper-client, the ReadOnlyLedgerHandle constructs a
> watcher on the relevant Zookeeper nodes. The interesting things are the
> watches created by the ReadOnlyLedgerHandle on the relevant zookeeper
> nodes. We would lose the notifications that happen during the timeout. What
> would be the best way to proceed in such scenarios ? Should we reconstruct
> the state ? Is there any other such state that needs to be considered ?
>
> Thanks,
> Arun

Re: BK Client connection loss with ZK

Posted by "Arun M. Krishnakumar" <ar...@gmail.com>.
Thanks for the pointer, Uma Gangumalla.

Could you please give an overview of the fix in HDFS-3562?

In the case of the BookKeeper client, the ReadOnlyLedgerHandle sets
watches on the relevant ZooKeeper nodes. Those watches are the interesting
part: we would lose any notifications that fire during the timeout. What
would be the best way to proceed in such scenarios? Should we reconstruct
the state? Is there any other such state that needs to be considered?

Thanks,
Arun

On Tue, Jun 7, 2016 at 3:40 PM, Uma gangumalla <um...@apache.org> wrote:

> Good point, Venkateswara Rao.
>
> Some time ago, we worked on this scenarios. Here is a patch
> available. HDFS-3562
> Here we just tried to keep at application side. But as a long term solution
> this could be placed at BK side as utility module? So that all applications
> can benefit.
>
>
> Note: As I remember RetryableZookeeper idea was taken from HBase.
>
> Regards,
> Uma

Re: BK Client connection loss with ZK

Posted by Uma gangumalla <um...@apache.org>.
Good point, Venkateswara Rao.

Some time ago, we worked on this scenario, and a patch is available in
HDFS-3562. There we just kept the handling on the application side. But as a
long-term solution, could this be placed on the BK side as a utility module,
so that all applications can benefit?


Note: As I remember, the RetryableZookeeper idea was taken from HBase.

Regards,
Uma

On Mon, Jun 6, 2016 at 9:42 AM, Venkateswara Rao Jujjuri <ju...@gmail.com>
wrote:

> If a bookie loses its connection with ZK, the connection gets reestablished
> and life goes on. How are we handling this in the client case? Should we
> retry at the library level, or leave it up to the application? Any
> discussion/thoughts on this?
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi
>