You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@zookeeper.apache.org by Martin Waite <wa...@googlemail.com> on 2010/02/24 12:40:19 UTC

is there a good pattern for leases ?

Hi,

Is there a good model for implementing leases in Zookeeper ?

What I want to achieve is for a client to create a lock, and for that lock
to disappear two minutes later - regardless of whether the client is still
connected to zk.   Like ephemeral nodes - but with a time delay.

regards,
Martin

Re: is there a good pattern for leases ?

Posted by Patrick Hunt <ph...@apache.org>.
Not sure if there is a jira on this but I have long been lobbying the 
team to add a new API/feature that allows clients to get a view of the 
"global clock". I think this would be a very useful (although admittedly 
potentially damaging) feature to add, esp around leases. If we think 
leases are something that we want to provide in future we may want to 
think about how we could more easily support this type of recipe.

Patrick

Henry Robinson wrote:
> A cautionary note with this problem - who says when 2 minutes is up? Clocks
> will go forward at different rates and with different offsets. You cannot
> rely on two machines having the same perception of what 2 minutes means. In
> general, in distributed systems, it's a good design principle to minimise
> any dependence on a common notion of real time.
> 
> That said the best way is to pick some machine, like Mahadev says, to retire
> old locks by polling every N seconds, where N is the slop you can afford.
> 
> What problem are you actually trying to solve?
> 
> cheers,
> Henry
> 
> On 24 February 2010 03:40, Martin Waite <wa...@googlemail.com> wrote:
> 
>> Hi,
>>
>> Is there a good model for implementing leases in Zookeeper ?
>>
>> What I want to achieve is for a client to create a lock, and for that lock
>> to disappear two minutes later - regardless of whether the client is still
>> connected to zk.   Like ephemeral nodes - but with a time delay.
>>
>> regards,
>> Martin
>>
> 
> 
> 

Re: is there a good pattern for leases ?

Posted by Ted Dunning <te...@gmail.com>.
That is one of the strengths of ZK.  Your client would do this:

1) create node, if success client has lock
2) get current node (you get the current version when you do this), if lease
is current and ours, we have the lock, if lease is current and not ours, we
have failed to get the lock
3) try to overwrite node+version from step 2 with our lease claim, if
success, client has the lock
4) if failure in step (3), somebody else jumped ahead of us and updated the
document between steps 2 and 3.  They therefore have the lease or it was the
GC who did it.
5) client has failed to get the lock.  We could repeat step 1 here if we
suspect the GC caused our lossage, but I think that would be of vanishingly
small benefit.

Steps 1, 2 and 3 are not atomic, but they are guaranteed to either get the
lock correctly or fail.  The only surprise waiting for you is in step 3 if
you get a connection loss event between sending the update and getting
confirmation of the update.  That can be recovered after the connection is
automatically restored by going back to step 2.

On Thu, Feb 25, 2010 at 12:07 AM, Martin Waite <wa...@googlemail.com>wrote:

> I cannot see how clients could safely inspect a node and overwrite the
> expiry time if it had expired.  That would involve multiple steps, and so
> not be atomic.
>

Re: is there a good pattern for leases ?

Posted by Martin Waite <wa...@googlemail.com>.
Hmm,

Thanks Ted.  I am going to have to do some more reading.

regards,
Martin

On 25 February 2010 19:11, Ted Dunning <te...@gmail.com> wrote:

> Not really.  You have ordering guarantees and you can avoid the whole mess
> by using the version numbers when you update the file.  See my other email.
>
> On Thu, Feb 25, 2010 at 2:50 AM, Martin Waite <waite.134@googlemail.com
> >wrote:
>
> > But to do this, would I need to call sync between steps 2 and 3 to ensure
> > the node "FN" was up-to-date - assuming I do not know if I am connected
> to
> > a
> > primary ZK instance ?   Would 10K sync calls within a 2 minute period be
> > excessive ?
> >
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>

Re: is there a good pattern for leases ?

Posted by Ted Dunning <te...@gmail.com>.
Not really.  You have ordering guarantees and you can avoid the whole mess
by using the version numbers when you update the file.  See my other email.

On Thu, Feb 25, 2010 at 2:50 AM, Martin Waite <wa...@googlemail.com>wrote:

> But to do this, would I need to call sync between steps 2 and 3 to ensure
> the node "FN" was up-to-date - assuming I do not know if I am connected to
> a
> primary ZK instance ?   Would 10K sync calls within a 2 minute period be
> excessive ?
>



-- 
Ted Dunning, CTO
DeepDyve

Re: is there a good pattern for leases ?

Posted by Martin Waite <wa...@googlemail.com>.
Hi,

Another possible approach:

   1. client generates hash code to be locked.  use the first N hex digits
   (eg. first 2 digits) as a filename "FN".
   2. attempt to create (ephemeral) node "FN.lock".  loop until this
   succeeds, or abort when time budget exhausted.
   3. read node "FN" containing list of hash-codes with associated expiry
   times.  if full hash-code does not exist, or has expired, then update list
   with full hash-code and expiry time.  Otherwise, you didn't get the lock.
   4. release "FN.lock"

For 10K locked cards, this would give 256 nodes containing approx. 400
hash-codes.  Obviously this can be scaled by increasing the number of digits
in "FN".

But to do this, would I need to call sync between steps 2 and 3 to ensure
the node "FN" was up-to-date - assuming I do not know if I am connected to a
primary ZK instance ?   Would 10K sync calls within a 2 minute period be
excessive ?

regards,
Martin


On 25 February 2010 08:07, Martin Waite <wa...@googlemail.com> wrote:

> Hi
>
> Usually, this would hold about 2k items, pushing to 10k peaks.
>
> My current understanding is that I cannot lock a node while I consider its
> contents, and so only the garbage remover would be allowed to remove locks
> (currently lock-clients can claim expired locks).
>
> The clients would simply do this:
>
>    1. try to create a node
>    2. if success, you have the lock, otherwise you don't
>
> The garbage remover would do this:
>
>    1. inspect node.
>    2. if expired, delete it.
>
> The length of time that locks are held then becomes a function of the
> length of time between garbage-removing sweeps. Perhaps that is OK.
>
> I cannot see how clients could safely inspect a node and overwrite the
> expiry time if it had expired.  That would involve multiple steps, and so
> not be atomic.
>
> regards,
> Martin
>
>
>
> On 24 February 2010 22:35, Ted Dunning <te...@gmail.com> wrote:
>
>> You can simply implement the current system if you like by keeping a file
>> per card in ZK that contains your lock expiration time.  The garbage
>> collector would work the same way.  In order to make the getchildren
>> operation in the garbage collector work well, I would recommend a
>> hierarchical naming scheme for card lock files.
>>
>> My question would be how many elements you expect to be in that card lock
>> table.  If it is less than 100K, ZK should work pretty well.
>>
>> If you need more than that, you might consider putting locks for many
>> cards
>> in a single file.
>>
>> On Wed, Feb 24, 2010 at 11:10 AM, Martin Waite <waite.134@googlemail.com
>> >wrote:
>>
>> >   2. card-locking - to reduce the risk of payments being taken twice in
>> >   quick succession from the same card, a timed lock is placed on a hash
>> of
>> > the
>> >   card number for a number of seconds (0, 30, 60, 120, as required).  No
>> > other
>> >   payment can be taken on that card while the lock is in place.
>> >
>> > Our current way of implementing (2) is to insert into a table a row
>> > containing the card-hash and the expiry time of the lock.  Another
>> process
>> > can overwrite the lock if the expiry has been exceeded.  A periodic
>> garbage
>> > remover process deletes all expired locks to keep the size of the lock
>> > table
>> > small.
>> >
>> > The trouble with managing these locks in a database is that the tables
>> are
>> > getting "hot" and becoming one of the main sources of contention.  Also,
>> > SQL
>> > is not necessarily fast for doing the required updates.
>> >
>>
>>
>>
>> --
>> Ted Dunning, CTO
>> DeepDyve
>>
>
>

Re: is there a good pattern for leases ?

Posted by Martin Waite <wa...@googlemail.com>.
Hi

Usually, this would hold about 2k items, pushing to 10k peaks.

My current understanding is that I cannot lock a node while I consider its
contents, and so only the garbage remover would be allowed to remove locks
(currently lock-clients can claim expired locks).

The clients would simply do this:

   1. try to create a node
   2. if success, you have the lock, otherwise you don't

The garbage remover would do this:

   1. inspect node.
   2. if expired, delete it.

The length of time that locks are held then becomes a function of the length
of time between garbage-removing sweeps. Perhaps that is OK.

I cannot see how clients could safely inspect a node and overwrite the
expiry time if it had expired.  That would involve multiple steps, and so
not be atomic.

regards,
Martin


On 24 February 2010 22:35, Ted Dunning <te...@gmail.com> wrote:

> You can simply implement the current system if you like by keeping a file
> per card in ZK that contains your lock expiration time.  The garbage
> collector would work the same way.  In order to make the getchildren
> operation in the garbage collector work well, I would recommend a
> hierarchical naming scheme for card lock files.
>
> My question would be how many elements you expect to be in that card lock
> table.  If it is less than 100K, ZK should work pretty well.
>
> If you need more than that, you might consider putting locks for many cards
> in a single file.
>
> On Wed, Feb 24, 2010 at 11:10 AM, Martin Waite <waite.134@googlemail.com
> >wrote:
>
> >   2. card-locking - to reduce the risk of payments being taken twice in
> >   quick succession from the same card, a timed lock is placed on a hash
> of
> > the
> >   card number for a number of seconds (0, 30, 60, 120, as required).  No
> > other
> >   payment can be taken on that card while the lock is in place.
> >
> > Our current way of implementing (2) is to insert into a table a row
> > containing the card-hash and the expiry time of the lock.  Another
> process
> > can overwrite the lock if the expiry has been exceeded.  A periodic
> garbage
> > remover process deletes all expired locks to keep the size of the lock
> > table
> > small.
> >
> > The trouble with managing these locks in a database is that the tables
> are
> > getting "hot" and becoming one of the main sources of contention.  Also,
> > SQL
> > is not necessarily fast for doing the required updates.
> >
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>

Re: is there a good pattern for leases ?

Posted by Ted Dunning <te...@gmail.com>.
You can simply implement the current system if you like by keeping a file
per card in ZK that contains your lock expiration time.  The garbage
collector would work the same way.  In order to make the getchildren
operation in the garbage collector work well, I would recommend a
hierarchical naming scheme for card lock files.

My question would be how many elements you expect to be in that card lock
table.  If it is less than 100K, ZK should work pretty well.

If you need more than that, you might consider putting locks for many cards
in a single file.

On Wed, Feb 24, 2010 at 11:10 AM, Martin Waite <wa...@googlemail.com>wrote:

>   2. card-locking - to reduce the risk of payments being taken twice in
>   quick succession from the same card, a timed lock is placed on a hash of
> the
>   card number for a number of seconds (0, 30, 60, 120, as required).  No
> other
>   payment can be taken on that card while the lock is in place.
>
> Our current way of implementing (2) is to insert into a table a row
> containing the card-hash and the expiry time of the lock.  Another process
> can overwrite the lock if the expiry has been exceeded.  A periodic garbage
> remover process deletes all expired locks to keep the size of the lock
> table
> small.
>
> The trouble with managing these locks in a database is that the tables are
> getting "hot" and becoming one of the main sources of contention.  Also,
> SQL
> is not necessarily fast for doing the required updates.
>



-- 
Ted Dunning, CTO
DeepDyve

Re: is there a good pattern for leases ?

Posted by Martin Waite <wa...@googlemail.com>.
Hi Henry,

I can appreciate your concerns about clocks, but there are no safety issues
with our system.  If locks are held for a few seconds too long, no-one will
be concerned.

I am currently looking at moving two locking mechanisms from within a SQL
database into a distributed locking mechanism of some sort.  The application
is a credit-card processing engine.  The two locks are:


   1. terminal locking - payment messages carrying terminal-ids (TIDs) are
   sent to acquiring banks.  The acquiring systems assume that terminals are
   physical devices and get confused if more than one message arrives from the
   same TID at the same time.   This is currently handled by updating a lock
   field in a table containing the TIDs. This was covered by my "how do I lock
   one-of-many" question yesterday.
   2. card-locking - to reduce the risk of payments being taken twice in
   quick succession from the same card, a timed lock is placed on a hash of the
   card number for a number of seconds (0, 30, 60, 120, as required).  No other
   payment can be taken on that card while the lock is in place.

Our current way of implementing (2) is to insert into a table a row
containing the card-hash and the expiry time of the lock.  Another process
can overwrite the lock if the expiry has been exceeded.  A periodic garbage
remover process deletes all expired locks to keep the size of the lock table
small.

The trouble with managing these locks in a database is that the tables are
getting "hot" and becoming one of the main sources of contention.  Also, SQL
is not necessarily fast for doing the required updates.

regards,
Martin

On 24 February 2010 18:21, Henry Robinson <he...@cloudera.com> wrote:

> A cautionary note with this problem - who says when 2 minutes is up? Clocks
> will go forward at different rates and with different offsets. You cannot
> rely on two machines having the same perception of what 2 minutes means. In
> general, in distributed systems, it's a good design principle to minimise
> any dependence on a common notion of real time.
>
> That said the best way is to pick some machine, like Mahadev says, to
> retire
> old locks by polling every N seconds, where N is the slop you can afford.
>
> What problem are you actually trying to solve?
>
> cheers,
> Henry
>
> On 24 February 2010 03:40, Martin Waite <wa...@googlemail.com> wrote:
>
> > Hi,
> >
> > Is there a good model for implementing leases in Zookeeper ?
> >
> > What I want to achieve is for a client to create a lock, and for that
> lock
> > to disappear two minutes later - regardless of whether the client is
> still
> > connected to zk.   Like ephemeral nodes - but with a time delay.
> >
> > regards,
> > Martin
> >
>
>
>
> --
> Henry Robinson
> Software Engineer
> Cloudera
> 415-994-6679
>

Re: is there a good pattern for leases ?

Posted by Henry Robinson <he...@cloudera.com>.
A cautionary note with this problem - who says when 2 minutes is up? Clocks
will go forward at different rates and with different offsets. You cannot
rely on two machines having the same perception of what 2 minutes means. In
general, in distributed systems, it's a good design principle to minimise
any dependence on a common notion of real time.

That said the best way is to pick some machine, like Mahadev says, to retire
old locks by polling every N seconds, where N is the slop you can afford.

What problem are you actually trying to solve?

cheers,
Henry

On 24 February 2010 03:40, Martin Waite <wa...@googlemail.com> wrote:

> Hi,
>
> Is there a good model for implementing leases in Zookeeper ?
>
> What I want to achieve is for a client to create a lock, and for that lock
> to disappear two minutes later - regardless of whether the client is still
> connected to zk.   Like ephemeral nodes - but with a time delay.
>
> regards,
> Martin
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

Re: is there a good pattern for leases ?

Posted by Mahadev Konar <ma...@yahoo-inc.com>.
I am not sure if I was clear enoguh in my last message.

What is suggested was this:

Create a client with a timeout of lets say 10 seconds!

Zookeeper zk = new ZooKeeper(10000); (for brevity ignoring other parameters)

Zk.create("/parent/ephemeral", data, EPEMERAL);

//create a another thread that triggeers at 120 seconds

On a trigger from this thread call zk.delete("/parent/ephemeral");

That's how lease can be done at the application side.

Obviously your lease expires on a session close and other events as well,
you need to be monitoring.

Thanks
mahadev


On 2/24/10 11:09 AM, "Martin Waite" <wa...@googlemail.com> wrote:

> Hi Mahadev,
> 
> That is interesting.  All I need to do is hold the connection for the
> required time of a session that created an ephemeral node.
> 
> Zookeeper is an interesting tool.
> 
> Thanks again,
> Martin
> 
> On 24 February 2010 17:00, Mahadev Konar <ma...@yahoo-inc.com> wrote:
> 
>> Hi Martin,
>>  There isnt an inherent model for leases in the zookeeper library itself.
>> To implement leases you will have to implement them at your application
>> side
>> with timeouts triggers (lease triggers) leading to session close at the
>> client.
>> 
>> 
>> Thanks
>> mahadev
>> 
>> 
>> On 2/24/10 3:40 AM, "Martin Waite" <wa...@googlemail.com> wrote:
>> 
>>> Hi,
>>> 
>>> Is there a good model for implementing leases in Zookeeper ?
>>> 
>>> What I want to achieve is for a client to create a lock, and for that
>> lock
>>> to disappear two minutes later - regardless of whether the client is
>> still
>>> connected to zk.   Like ephemeral nodes - but with a time delay.
>>> 
>>> regards,
>>> Martin
>> 
>> 


Re: is there a good pattern for leases ?

Posted by Martin Waite <wa...@googlemail.com>.
Hi Mahadev,

That is interesting.  All I need to do is hold the connection for the
required time of a session that created an ephemeral node.

Zookeeper is an interesting tool.

Thanks again,
Martin

On 24 February 2010 17:00, Mahadev Konar <ma...@yahoo-inc.com> wrote:

> Hi Martin,
>  There isnt an inherent model for leases in the zookeeper library itself.
> To implement leases you will have to implement them at your application
> side
> with timeouts triggers (lease triggers) leading to session close at the
> client.
>
>
> Thanks
> mahadev
>
>
> On 2/24/10 3:40 AM, "Martin Waite" <wa...@googlemail.com> wrote:
>
> > Hi,
> >
> > Is there a good model for implementing leases in Zookeeper ?
> >
> > What I want to achieve is for a client to create a lock, and for that
> lock
> > to disappear two minutes later - regardless of whether the client is
> still
> > connected to zk.   Like ephemeral nodes - but with a time delay.
> >
> > regards,
> > Martin
>
>

Re: is there a good pattern for leases ?

Posted by Mahadev Konar <ma...@yahoo-inc.com>.
Hi Martin,
  There isnt an inherent model for leases in the zookeeper library itself.
To implement leases you will have to implement them at your application side
with timeouts triggers (lease triggers) leading to session close at the
client.


Thanks
mahadev


On 2/24/10 3:40 AM, "Martin Waite" <wa...@googlemail.com> wrote:

> Hi,
> 
> Is there a good model for implementing leases in Zookeeper ?
> 
> What I want to achieve is for a client to create a lock, and for that lock
> to disappear two minutes later - regardless of whether the client is still
> connected to zk.   Like ephemeral nodes - but with a time delay.
> 
> regards,
> Martin