
Posted to user@zookeeper.apache.org by Grant Ingersoll <gs...@apache.org> on 2009/06/06 05:56:49 UTC

Newbie Questions

What's the overhead of connecting to a Server?  In other words, if I'm  
in a multi-threaded web-app server environment, should I cache my  
ZooKeeper instance and set a larger timeout value or should I just  
construct them as I need them?



Thanks,
Grant

Re: Newbie Questions

Posted by Ted Dunning <te...@gmail.com>.

This is a serious question and a real problem.  It isn't, however, imposed
by ZK.  You have to have a clean strategy for loss of coordination in your
application if you want it to be robust.

I have seen various strategies work reasonably well.  The approach used in
Katta is to build a listener abstraction over the top of ZK.  The central
object maintains a ZK reference and sends change events to a list of
listeners.  This central object can handle the reconnection after expiration
and the dispatch of notifications to listeners.  This is essentially what
you suggest, except that the ZK reference is hidden away a bit.  This can be
a bit dangerous if you hide too much of the ZK semantics, but it is not all
that bad if you leave some of those semantics intact.  In particular, you
must not indiscriminately retry on failure, since that can lead you into
really deep and murky waters very quickly.
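
To make the idea concrete, here is a minimal sketch of such a central object.
It is not Katta's code. The class name SessionManager, the type parameter C
(standing in for the real ZooKeeper client type), and the method names are
all hypothetical. It only shows the shape: one object owns the client,
replaces it after expiration, and tells registered listeners about the
replacement. It deliberately does not retry failed operations.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical wrapper: C stands in for the real client type (e.g. ZooKeeper).
final class SessionManager<C> {
    private final Supplier<C> factory;   // builds a fresh client on demand
    private final List<Consumer<C>> listeners = new CopyOnWriteArrayList<>();
    private volatile C client;

    SessionManager(Supplier<C> factory) {
        this.factory = factory;
        this.client = factory.get();
    }

    // Callers fetch the client on each use instead of caching a reference.
    C client() {
        return client;
    }

    // Listeners are told whenever a replacement client has been installed.
    void addListener(Consumer<C> listener) {
        listeners.add(listener);
    }

    // Assumption: the application's session watcher calls this when it
    // sees the Expired state. No operations are retried here.
    synchronized void onSessionExpired() {
        client = factory.get();
        for (Consumer<C> listener : listeners) {
            listener.accept(client);
        }
    }
}
```

Each listener would then redo whatever depended on the old session, such as
setting watches again or recreating ephemeral nodes, so the objects using ZK
never need to be tracked anywhere except in this listener list.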

On Sun, Jun 7, 2009 at 4:58 PM, Satish Bhatti <ct...@gmail.com> wrote:

> However, if I receive
> a Watcher.Event.KeeperState.Expired event, a fresh ZooKeeper client instance
> has to be created.  If I were sharing the ZooKeeper instance, then somehow
> my objects would have to be notified that they should switch to using the
> new ZooKeeper instance.  That means somewhere in my app I would need to
> maintain a list of all objects using ZooKeeper.  Is this the recommended
> approach?  Is there some other more elegant way to do this?
>

-- 
Ted Dunning, CTO
DeepDyve

Re: Newbie Questions

Posted by Ted Dunning <te...@gmail.com>.

Injecting the ZK instance is just plain good practice because it allows you
to inject a mock ZooKeeper during testing.  Without doing that, it is almost
impossible to test various failure scenarios.
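
As a rough illustration of why injection helps, here is a small sketch. The
interface CounterStore, the class IdGenerator, the fake FakeCounterStore, and
the path "/ids/counter" are all made up for this example and are not part of
ZooKeeper. The point is only that a class which receives its dependency can
be handed a test double that fails on command.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical narrow interface covering only what the class needs.
interface CounterStore {
    long increment(String path);
}

// Receives its store through the constructor instead of creating a client.
final class IdGenerator {
    private final CounterStore store;

    IdGenerator(CounterStore store) {
        this.store = store;
    }

    String nextId() {
        return "id-" + store.increment("/ids/counter");
    }
}

// Test double: counts in memory and can simulate one lost connection.
final class FakeCounterStore implements CounterStore {
    private final Map<String, Long> counters = new HashMap<>();
    boolean failNext = false;

    @Override
    public long increment(String path) {
        if (failNext) {
            failNext = false;
            throw new IllegalStateException("simulated connection loss");
        }
        return counters.merge(path, 1L, Long::sum);
    }
}
```

In production the same IdGenerator would be given an implementation backed by
the real client, while a test sets failNext to check how the caller copes
with a failure in the middle of its work.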

On Sun, Jun 7, 2009 at 4:58 PM, Satish Bhatti <ct...@gmail.com> wrote:

> In principle I could create a single ZooKeeper
> client object and pass it in to the objects, so then I would only have a
> single ZooKeeper instance.
>

-- 
Ted Dunning, CTO
DeepDyve

Re: Newbie Questions

Posted by Mahadev Konar <ma...@yahoo-inc.com>.

Hi Satish,
   My suggestion in the last email was just to prevent unnecessary
instantiation of ZooKeeper client objects. Creating a new ZooKeeper client
for different modules doing different things should be fine. I don't have
any recommended approach for either of your suggestions. It really depends
on your encapsulation, application APIs and your usage model.

thanks
mahadev


Re: Newbie Questions

Posted by Satish Bhatti <ct...@gmail.com>.

Hey Mahadev,
I had a question about that.  In my application I am using ZooKeeper for
several different unrelated purposes, e.g. to generate unique IDs, for
distributed locks, and as a property store.  I have implemented generic
black box classes that use ZooKeeper to provide that functionality, and when
an object of one of those classes is instantiated it internally creates a
ZooKeeper instance for its personal use.  So, for example, one of my apps
has a property store and an id generator, and so it ends up using 2
ZooKeeper client objects.  In principle I could create a single ZooKeeper
client object and pass it in to the objects, so then I would only have a
single ZooKeeper instance.  However, if I receive
a Watcher.Event.KeeperState.Expired event, a fresh ZooKeeper client instance
has to be created.  If I were sharing the ZooKeeper instance, then somehow
my objects would have to be notified that they should switch to using the
new ZooKeeper instance.  That means somewhere in my app I would need to
maintain a list of all objects using ZooKeeper.  Is this the recommended
approach?  Is there some other more elegant way to do this?

Satish

Re: Newbie Questions

Posted by Mahadev Konar <ma...@yahoo-inc.com>.

Hi Grant,
  I agree with Ted, but just to elaborate a little more.

  It's good to have a single ZooKeeper instance connected to the server.
ZooKeeper clients are supposed to be long-lived, and the expected idiom is
to have a single long-lived ZooKeeper client per application instance. Most
of the ZooKeeper recipes rely on ZooKeeper session capabilities, so in that
case it becomes necessary to have just a single client per app instance.
Even if you don't plan to use ZooKeeper session capabilities (like ephemeral
nodes and watches), it would be good to just use a single ZooKeeper instance.

A ZooKeeper client that is not being used by an application just sends a
ping every one third of the timeout value you set. We are working on an
optimization in 3.3 wherein we won't even send these pings if the client
does not use ephemeral nodes and watches. ZOOKEEPER-321 is the JIRA if you
want to track that. Hope this helps.
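
To put a number on the idle cost, here is a tiny sketch of the arithmetic. It
relies only on the "one third of the timeout" figure above; the class and
method names are hypothetical, and the exact behaviour of a given client
version may differ.

```java
// Hypothetical helper: arithmetic only, no ZooKeeper calls.
final class PingInterval {
    // Interval between idle pings, taken as one third of the session timeout.
    static long pingIntervalMillis(long sessionTimeoutMillis) {
        return sessionTimeoutMillis / 3;
    }

    // Number of pings an idle client would send per hour at that interval.
    static long idlePingsPerHour(long sessionTimeoutMillis) {
        return 3_600_000L / pingIntervalMillis(sessionTimeoutMillis);
    }
}
```

With a 30-second timeout (an example value), that is a ping every 10 seconds,
or 360 pings per hour, so a cached idle client costs very little.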

mahadev

Re: Newbie Questions

Posted by Ted Dunning <te...@gmail.com>.

It is a common idiom to have a single ZooKeeper instance.  One reason for
this is that it can be hard to keep track of which instance has which
watches if you have lots of them around.

Instantiating several ZooKeeper instances and then discarding them also
eliminates the utility of ephemeral nodes, since an ephemeral node goes away
when the session that created it ends.

Watches and ephemerals are two of the key characteristics of ZK, so they are
quite a loss.

That said, keeping a single ZooKeeper as a static in a single class isn't
such a strange thing to do, especially if you can't imagine closing the ZK
instance.  That gives you some scope but can keep the existence and use of
ZK a secret.

You do have to worry a bit about how to initialize the ZK.  For that reason
and for mocking purposes, it is pretty good practice to always inject the ZK
instance into your classes à la Spring.
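
For the web-app case in the question, a sketch of the "one cached instance"
idea might look like the following. SharedClientHolder and its type parameter
C are hypothetical names; C stands in for the real client type, and the
factory is passed in so that initialization stays outside the holder.

```java
import java.util.function.Supplier;

// Hypothetical holder: creates the client on first use and then always
// returns that same instance, so every request thread shares one session.
final class SharedClientHolder<C> {
    private final Supplier<C> factory;
    private C instance;   // guarded by "this"

    SharedClientHolder(Supplier<C> factory) {
        this.factory = factory;
    }

    // Synchronized so that concurrent first calls still build only one client.
    synchronized C get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }
}
```

Request handlers would call get() rather than constructing a client per
request, and a test can pass a factory that returns a mock.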

-- 
Ted Dunning, CTO
DeepDyve