Posted to user@curator.apache.org by Foolish Ewe <fo...@hotmail.com> on 2017/01/25 23:03:42 UTC

Can Curator's recipes for synchronization be used when the releasing entity is not the locking entity?

Hello All:


I would like to use Curator to synchronize mutually exclusive access to a shared resource; however, the entity that wants to release a lock is distinct from the locking entity (i.e. they are in different JVMs on different machines). Such cases can occur in practice (e.g. producer/consumer synchronization, although this isn't quite my use case). Informally, I would like to have operations that behave like the following in a JVM-based language:

  1.  Strict requirements:
     *   acquire(resourceId, taskId) - Suspend the task waiting for the resource until it has mutually exclusive access (i.e. acquires the lock), or throw an exception if the request is somehow invalid (e.g. bad resource Id, bad task Id, internal error, etc).
     *   release(resourceId) - Given a resource, if there is an acquired lock, release that lock and wake up the next task (in FCFS order) waiting to acquire the lock, if one exists.
  2.  Nice to have (useful for maintenance, etc).
     *   status(resourceId) - Report whether the resource is locked, the taskId of the current acquirer if the lock is held, and the (potentially empty) FCFS list of tasks waiting to acquire the lock.
     *   releaseAll(resourceId) - Remove all pending locks on this resource.
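
To make the requested semantics concrete, here is a minimal single-process sketch in Java. It is an in-memory model only, not Curator or ZooKeeper code. The class name ResourceLockModel and the methods holder and waiting (which together stand in for status) are hypothetical. As simplifying assumptions, acquire returns true or false instead of suspending the caller, and releaseAll drops both the holder and every waiter. The point it illustrates is that release takes only a resourceId, so the caller need not be the task that acquired the lock.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory model of the requested operations; not a Curator API.
class ResourceLockModel {
    // Per resource: the head of the deque is the holder, the rest wait in FCFS order.
    private final Map<String, Deque<String>> queues = new HashMap<>();

    // Returns true if taskId holds the lock after this call, false if it is queued.
    // (Assumption: a real implementation would block instead of returning false.)
    synchronized boolean acquire(String resourceId, String taskId) {
        if (resourceId == null || resourceId.isEmpty() || taskId == null || taskId.isEmpty()) {
            throw new IllegalArgumentException("bad resource id or task id");
        }
        Deque<String> q = queues.computeIfAbsent(resourceId, k -> new ArrayDeque<>());
        if (!q.contains(taskId)) {
            q.addLast(taskId);
        }
        return taskId.equals(q.peekFirst());
    }

    // Releases by resource id only, so any caller may release. Returns the new holder, if any.
    synchronized Optional<String> release(String resourceId) {
        Deque<String> q = queues.get(resourceId);
        if (q == null || q.isEmpty()) {
            return Optional.<String>empty();
        }
        q.pollFirst();
        if (q.isEmpty()) {
            queues.remove(resourceId);
            return Optional.<String>empty();
        }
        return Optional.of(q.peekFirst());
    }

    // Part of "status": the current holder, if the resource is locked.
    synchronized Optional<String> holder(String resourceId) {
        Deque<String> q = queues.get(resourceId);
        if (q == null) {
            return Optional.<String>empty();
        }
        return Optional.ofNullable(q.peekFirst());
    }

    // Part of "status": the FCFS list of waiting tasks (possibly empty).
    synchronized List<String> waiting(String resourceId) {
        List<String> out = new ArrayList<>();
        Deque<String> q = queues.get(resourceId);
        if (q != null && !q.isEmpty()) {
            out.addAll(q);
            out.remove(0); // drop the holder, keep only the waiters
        }
        return out;
    }

    // Assumption: "remove all pending locks" means dropping the holder and all waiters.
    synchronized void releaseAll(String resourceId) {
        queues.remove(resourceId);
    }

    public static void main(String[] args) {
        ResourceLockModel locks = new ResourceLockModel();
        System.out.println(locks.acquire("shared", "workflow-1"));
        System.out.println(locks.acquire("shared", "workflow-2"));
        System.out.println(locks.release("shared"));
        System.out.println(locks.holder("shared"));
    }
}
```

In a distributed setting the same state would have to live in ZooKeeper rather than in a HashMap, and the model says nothing about what happens when a holder crashes.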

However, the semantics of the recipes I've looked at seem to indicate that the releasing entity must have a handle (either explicit or implicit) of the lease/lock, e.g.


  *   http://curator.apache.org/curator-recipes/shared-reentrant-lock.html states:

public void release()
Perform one release of the mutex if the calling thread is the same thread that acquired it. If the
thread had made multiple calls to acquire, the mutex will still be held when this method returns.



  *   http://curator.apache.org/curator-recipes/shared-semaphore.html states:

Lease instances can either be closed directly or you can use these convenience methods:

public void returnAll(Collection<Lease> leases)
public void returnLease(Lease lease)

So it appears, on the surface, that the same entity that acquires a mutex or a semaphore lease is expected to release the mutex or return the lease.
My questions are:

  1.  Am I misunderstanding how Curator works?
  2.  Is there a more appropriate abstraction in Curator for my use case?
  3.  Can I use one of the existing recipes?  Could a releasing entity return a lease if they had a serialized copy of the lease but weren't the entity acquiring the lease?
  4.  If I need to roll my own, should the Curator Framework be able to help here or should I work at the raw zookeeper level for this use case?

Thanks for your help with this:

Bill

Re: Can Curator's recipes for synchronization be used when the releasing entity is not the locking entity?

Posted by Vitalii Tymchyshyn <vi...@tym.im>.
Hi.

So, basically, you are saying you don't need failure recovery. At least, I did
not see any failure recovery scenarios.
In this case you need something much simpler than a lock. What you
need is a simple CAS operation with notifications, and this is basic ZooKeeper
functionality; you don't need Curator to do this (though it still has a nicer API to
use).
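
One way to read the suggestion above: ZooKeeper's create fails if the znode already exists, a znode can be deleted by a client other than the one that created it (permissions allowing), and watches notify waiters of the deletion. The following Java sketch mimics that pattern in memory with an AtomicReference. It is not ZooKeeper or Curator code. CasMarker and its method names are hypothetical, and unlike standard ZooKeeper watches, which fire once, the callbacks here stay registered. Consistent with the remark about failure recovery, nothing clears the marker if the acquiring job dies.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a single marker node; not ZooKeeper or Curator code.
class CasMarker {
    private final AtomicReference<String> owner = new AtomicReference<>(null);
    private final List<Runnable> watchers = new CopyOnWriteArrayList<>();

    // Analogous to "create the node if absent": succeeds for exactly one caller.
    boolean tryAcquire(String taskId) {
        return owner.compareAndSet(null, taskId);
    }

    // Analogous to "watch for deletion": the callback runs when the marker is cleared.
    void onRelease(Runnable callback) {
        watchers.add(callback);
    }

    // Analogous to "delete the node": any caller may clear the marker.
    // Returns the previous owner, or null if the marker was not set.
    String release() {
        String previous = owner.getAndSet(null);
        if (previous != null) {
            for (Runnable watcher : watchers) {
                watcher.run();
            }
        }
        return previous;
    }

    String currentOwner() {
        return owner.get();
    }

    public static void main(String[] args) {
        CasMarker marker = new CasMarker();
        marker.onRelease(() -> System.out.println("marker cleared, waiters may retry"));
        System.out.println(marker.tryAcquire("workflow-1"));
        System.out.println(marker.tryAcquire("workflow-2"));
        System.out.println(marker.release());
    }
}
```

A waiter that is notified still has to retry tryAcquire and may lose the race, so this sketch gives mutual exclusion but not FCFS ordering.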

Best regards, Vitalii Tymchyshyn

On Mon, Jan 30, 2017, 4:33 PM Foolish Ewe <fo...@hotmail.com> wrote:

> Hello Jordan:
> [...]

Re: Can Curator's recipes for synchronization be used when the releasing entity is not the locking entity?

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
There is something that is continually confusing to me about your explanations. What JVMs are acquiring the exclusive access? Is it a single thread? What does “exclusive access” mean here? If, as you assert, “this looks like a traditional mutual exclusion”, you’d have to identify which thread or threads in your system are getting the critical section.

> Curator's design assumes that locking entity should be in the same Java process as the unlocking entity
This is not merely Curator’s design - nothing else would make sense. A thread holds a lock. Who else could release that lock other than the thread holding it? These kinds of comments just confuse me I’m afraid.

> Step A) and Step D) are different jobs and share no JVMs (i.e. are distinct Java processes) and I was looking for an appropriate approach for the unlock/returnLease in Step D) given that constraint.
Then what does “lock” mean in this context? Who is the lock holder? Let’s be clear - threads hold locks - nothing else. If you’re looking for code that allows 1 thread to hold a lock then that’s a Curator InterProcessMutex. If you’re looking for something that allows multiple threads to hold the same lock then that’s a Curator InterProcessSemaphoreV2. However, you keep referring to locking the SharedResource and that confuses me.

It sounds like you’re asking if there is a single Curator class that does everything you want. It seems not. But, you might use a combination of Curator locks and caches, etc. Perhaps each process that wants to participate in this workflow could create a PersistentNode using EPHEMERAL_SEQUENTIAL. Any participant in the workflow can know that a given SharedResource is being operated on. Or you could use a LeaderSelector whereby one process leads the workflow but the others can know that they are participants. Or, maybe you could combine them: use a LeaderSelector to control the master in the workflow but also use a PersistentNode to denote that a given SharedResource is in process.

FWIW: I wrote a distributed task/workflow system a while back. Maybe you could look at it for some ideas? https://github.com/NirmataOSS/workflow

-Jordan

> On Jan 30, 2017, at 4:33 PM, Foolish Ewe <fo...@hotmail.com> wrote:
> 
> Hello Jordan:
> [...]


Re: Can Curator's recipes for synchronization be used when the releasing entity is not the locking entity?

Posted by Foolish Ewe <fo...@hotmail.com>.
Hello Jordan:


Thank you for your thoughtful reply, and thanks also to Vitalii Tymchyshyn, whose response may be addressing some of my questions. TL;DR: if I understand correctly, the Curator API design constrains the client Java process that unlocks or returns a lease to be the same client (and hence the same Java process) that acquired the lock/lease.


Let's consider the problem and try to develop some intuition and if needed formalism. First let's consider the problem outside the Curator context and then ask if we can express it in Curator/Zookeeper.


Suppose we have the following logic before we decorate it with synchronization/mutual exclusion: we are given a collection of parallel workflows that all do the following:



Step B) update SharedResource

Step C) read SharedResource (and other inputs) and Write Computed Results (to HDFS)

Step E) ProcessResults


It happens that, for our use case, the second step (Step C) takes considerable time, and if some workflow, say i, is in Step B) or Step C) while another workflow, say j, does Step B), then job i will either fail and stop (if we are lucky) or produce (potentially undetectably) corrupted output.

Thus we would like to guard the critical section, which is Step B) and Step C), with mutual exclusion/synchronization. Let w denote the workflow id; then the revised workflow would seem to look like the following:

Step A) Acquire exclusive access to the Shared resource for workflow w (reserve/lock the shared resource)

Step B) update SharedResource

Step C) read SharedResource (and other inputs) and Write Computed Results (to HDFS)

Step D) Release/unlock the reservation of the Shared Resource of workflow w making the Shared Resource available for access by other workflows
Step E) ProcessResults

Since we aren't asking all the workflows to reach a particular point in execution, it is unclear why I would try a synchronization barrier. To me, this looks like a traditional mutual exclusion problem (i.e. at most one workflow is active in the critical section of Step B or Step C).

The twist in my use case is that Step B) and Step C) are collections of one or more different jobs scheduled by YARN, so we don't currently support a continuously running client-side process that can host a listener. I was looking to see if the off-the-shelf recipes in Curator support this. My current understanding (if I understand Vitalii's remarks and the documentation) is that Curator's design assumes that the locking entity should be in the same Java process as the unlocking entity, and that the Curator design advocates for a client-side process running with a listener for correctness (e.g. recovery in the case of client failure, perhaps other cases too?). But in our current system, Step A) and Step D) are different jobs and share no JVMs (i.e. are distinct Java processes), and I was looking for an appropriate approach for the unlock/returnLease in Step D) given that constraint.

I looked at the following candidate approaches under the constraint of not having a continuously running Java process that both acquires and releases a lock (or acquires and returns a semaphore lease):


  *   Please correct me if I'm wrong, but my understanding is that, for revocation, the lock holder needs to be listening for revocation requests and then needs to release its lock. Revocation appears to be cooperative, so I would need a client-side listener in the locking entity's Java process, which would require some (potentially non-trivial) refactoring of the workflow in order to have correct revocation request detection followed by lock release.
  *   http://curator.apache.org/curator-recipes/shared-reentrant-lock.html - The unlock mechanism requires that the JVM has a valid InterProcessMutex that has already acquired the lock before doing a release() operation. So we have a chicken-and-egg situation here.
  *   http://curator.apache.org/curator-recipes/shared-semaphore.html - The Lease parameter (obtained via the acquire method) of the returnLease method appears, at first glance, to require that the same Java process perform both the locking and the unlocking (unless the lease can be serialized and transmitted by the locking entity, then received and deserialized by the unlocking entity). Although the lease provides a way to mitigate crashed locking entities, there appears to be a tradeoff: the lease improves recovery from crashed or failed clients, but it makes the Curator semaphores seem less expressive than the traditional semaphore definition, which has no analog of the lease. E.g. in producer/consumer problems, the unlocking entity is distinct from the locking entity (which is why I mentioned it as a motivating example).

This seems to imply that I need to look at the cost of modifying the workflow design and see if I can meet the constraint or consider other approaches.
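
For reference when weighing other approaches, the operations requested at the start of this thread (acquire, release, status, releaseAll) can be modelled in a single process as below. This is only a sketch of the intended semantics, assuming unique task ids per resource; it is not Curator code, and the class name is hypothetical. A distributed version would have to keep the queue somewhere shared and durable, and would have to decide what happens when the holder crashes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model only: one FCFS queue of task ids per resource.
// The task at the head of a queue is the current lock holder.
final class HandleFreeLockModel {
    private final Map<String, Deque<String>> queues = new HashMap<>();

    // Blocks until taskId is at the head of the queue for resourceId.
    public synchronized void acquire(String resourceId, String taskId)
            throws InterruptedException {
        if (resourceId == null || taskId == null) {
            throw new IllegalArgumentException("ids must be non-null");
        }
        Deque<String> q = queues.computeIfAbsent(resourceId, k -> new ArrayDeque<>());
        q.addLast(taskId);
        while (!taskId.equals(q.peekFirst())) {
            wait();
            if (!q.contains(taskId)) { // request was dropped by releaseAll
                throw new IllegalStateException("request cancelled");
            }
        }
    }

    // Any caller may release: no lock or lease handle is required.
    public synchronized void release(String resourceId) {
        Deque<String> q = queues.get(resourceId);
        if (q != null && !q.isEmpty()) {
            q.removeFirst();   // drop the current holder
            notifyAll();       // the new head (if any) now holds the lock
        }
    }

    // Holder first, then the waiters in FCFS order; empty if unlocked.
    public synchronized List<String> status(String resourceId) {
        Deque<String> q = queues.get(resourceId);
        return q == null ? new ArrayList<>() : new ArrayList<>(q);
    }

    // Drops the holder and every pending request for the resource.
    public synchronized void releaseAll(String resourceId) {
        Deque<String> q = queues.get(resourceId);
        if (q != null) {
            q.clear();
            notifyAll();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        HandleFreeLockModel locks = new HandleFreeLockModel();
        locks.acquire("resource-1", "job-A");            // job-A holds the lock
        Thread waiter = new Thread(() -> {
            try {
                locks.acquire("resource-1", "job-B");    // blocks behind job-A
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        while (locks.status("resource-1").size() < 2) {
            Thread.sleep(5);                             // wait until job-B is queued
        }
        locks.release("resource-1");                     // released without a handle
        waiter.join();
        System.out.println(locks.status("resource-1"));
    }
}
```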

With best regards:

Bill


________________________________
From: Jordan Zimmerman <jo...@jordanzimmerman.com>
Sent: Thursday, January 26, 2017 5:05 AM
To: user@curator.apache.org
Subject: Re: Can Curator's recipes for synchronization be used when the releasing entity is not the locking entity?

I read the description several times and, sadly, don’t understand. Maybe someone else? At first blush it almost sounds like a barrier or double barrier: http://curator.apache.org/curator-recipes/barrier.html or http://curator.apache.org/curator-recipes/double-barrier.html. But, then, I don’t totally understand. Another thing: Curator InterProcessMutex can be revoked from another process. See http://curator.apache.org/curator-recipes/shared-reentrant-lock.html “Revoking” - maybe that’s what you want? Other than that, maybe you can restate the problem or give more details.

-Jordan

On Jan 25, 2017, at 6:03 PM, Foolish Ewe <fo...@hotmail.com> wrote:

Hello All:

I would like to use Curator to synchronize mutually exclusive access to a shared resource, however the entity that wants to release a lock is distinct from the locking entity (i.e. they are in different JVMS on different machines).    Such cases can occur in practice (e.g. producer/consumer synchronization, but this isn't quite my use case).   Informally I would like to have operations that behave like the following in a JVM based language:

  1.  Strict requirements:
     *   acquire(resourceId, taskId) - Have the task waiting for the resource suspend until it has mutually exclusive access (i.e. acquires the lock) or throw an exception if the request is somehow invalid (i.e. bad resource Id, bad task Id, internal error, etc).
     *   release(resourceId) - Given a resource, if there is an acquired lock, release that lock and wake up the next task (in FCFS order) waiting to acquire the lock if it exists
  2.  Nice to have (useful for maintenance, etc).
     *   status(resourceId) - Report if the resource is locked, the current taskId of the acquirer if the lock is acquired and the (potentially empty)  FCFS list of tasks waiting to acquire the lock.
     *   releaseAll(resourceId)  - remove all pending locks on this resource

However, the semantics of the recipes I've looked at seem to indicate that the releasing entity must have a handle (either explicit or implicit) of the lease/lock, e.g.


  *   http://curator.apache.org/curator-recipes/shared-reentrant-lock.html states:

public void release()
Perform one release of the mutex if the calling thread is the same thread that acquired it. If the
thread had made multiple calls to acquire, the mutex will still be held when this method returns.



  *   http://curator.apache.org/curator-recipes/shared-semaphore.html states:
  *   Lease instances can either be closed directly or you can use these convenience methods:

public void returnAll(Collection<Lease> leases)
public void returnLease(Lease lease)

So it appears on the surface that the expectation is that the same entity that acquires a mutex or a semaphore lease is expected to release the mutex or return the lease.
My questions are:

  1.  Am I misunderstanding how Curator works?
  2.  Is there a more appropriate abstraction in Curator for my use case?
  3.  Can I use one of the existing recipes?  Could a releasing entity return a lease if they had a serialized copy of the lease but weren't the entity acquiring the lease?
  4.  If I need to roll my own, should the Curator Framework be able to help here or should I work at the raw zookeeper level for this use case?

Thanks for your help with this:

Bill


Re: Can Curator's recipes for synchronization be used when the releasing entity is not the locking entity?

Posted by Jordan Zimmerman <jo...@jordanzimmerman.com>.
I read the description several times and, sadly, don’t understand. Maybe someone else? At first blush it almost sounds like a barrier or double barrier: http://curator.apache.org/curator-recipes/barrier.html or http://curator.apache.org/curator-recipes/double-barrier.html. But, then, I don’t totally understand. Another thing: Curator InterProcessMutex can be revoked from another process. See http://curator.apache.org/curator-recipes/shared-reentrant-lock.html “Revoking” - maybe that’s what you want? Other than that, maybe you can restate the problem or give more details.

-Jordan

> On Jan 25, 2017, at 6:03 PM, Foolish Ewe <fo...@hotmail.com> wrote:
> 
> Hello All:
> 
> I would like to use Curator to synchronize mutually exclusive access to a shared resource, however the entity that wants to release a lock is distinct from the locking entity (i.e. they are in different JVMS on different machines).    Such cases can occur in practice (e.g. producer/consumer synchronization, but this isn't quite my use case).   Informally I would like to have operations that behave like the following in a JVM based language:
>   1.  Strict requirements:
>      *   acquire(resourceId, taskId) - Have the task waiting for the resource suspend until it has mutually exclusive access (i.e. acquires the lock) or throw an exception if the request is somehow invalid (i.e. bad resource Id, bad task Id, internal error, etc).
>      *   release(resourceId) - Given a resource, if there is an acquired lock, release that lock and wake up the next task (in FCFS order) waiting to acquire the lock if it exists
>   2.  Nice to have (useful for maintenance, etc).
>      *   status(resourceId) - Report if the resource is locked, the current taskId of the acquirer if the lock is acquired and the (potentially empty)  FCFS list of tasks waiting to acquire the lock.
>      *   releaseAll(resourceId)  - remove all pending locks on this resource
> However, the semantics of the recipes I've looked at seem to indicate that the releasing entity must have a handle (either explicit or implicit) of the lease/lock, e.g.
> 
> http://curator.apache.org/curator-recipes/shared-reentrant-lock.html states:
> public void release()
> Perform one release of the mutex if the calling thread is the same thread that acquired it. If the
> thread had made multiple calls to acquire, the mutex will still be held when this method returns.
> 
> http://curator.apache.org/curator-recipes/shared-semaphore.html states:
> Lease instances can either be closed directly or you can use these convenience methods:
> 
> public void returnAll(Collection<Lease> leases)
> public void returnLease(Lease lease)
> So it appears on the surface that the expectation is that the same entity that acquires a mutex or a semaphore lease is expected to release the mutex or return the lease.
> My questions are:
>   1.  Am I misunderstanding how Curator works?
>   2.  Is there a more appropriate abstraction in Curator for my use case?
>   3.  Can I use one of the existing recipes?  Could a releasing entity return a lease if they had a serialized copy of the lease but weren't the entity acquiring the lease?
>   4.  If I need to roll my own, should the Curator Framework be able to help here or should I work at the raw zookeeper level for this use case?
> Thanks for your help with this:
> 
> Bill


Re: Can Curator's recipes for synchronization be used when the releasing entity is not the locking entity?

Posted by Vitalii Tymchyshyn <vi...@tym.im>.
Hi.

One of the main functions of Zookeeper and Curator is to handle client
failures properly with Ephemeral nodes. In your example it's not clear what
the requirements for failure handling are, as the lock owner is now not
tied to the client.


Best regards, Vitalii Tymchyshyn

On Wed, Jan 25, 2017, 6:04 PM Foolish Ewe <fo...@hotmail.com> wrote:

Hello All:


I would like to use Curator to synchronize mutually exclusive access to a
shared resource, however the entity that wants to release a lock is
distinct from the locking entity (i.e. they are in different JVMS on
different machines).    Such cases can occur in practice (e.g.
producer/consumer synchronization, but this isn't quite my use case).
Informally I would like to have operations that behave like the following
in a JVM based language:


   1. Strict requirements:
      1. acquire(resourceId, taskId) - Have the task waiting for the
      resource suspend until it has mutually exclusive access (i.e. acquires
      the lock) or throw an exception if the request is somehow invalid (i.e.
      bad resource Id, bad task Id, internal error, etc).
      2. release(resourceId) - Given a resource, if there is an acquired
      lock, release that lock and wake up the next task (in FCFS order) waiting
      to acquire the lock if it exists
   2. Nice to have (useful for maintenance, etc).
      1. status(resourceId) - Report if the resource is locked, the current
      taskId of the acquirer if the lock is acquired and the (potentially
      empty) FCFS list of tasks waiting to acquire the lock.
      2. releaseAll(resourceId)  - remove all pending locks on this resource

However, the semantics of the recipes I've looked at seem to indicate that
the releasing entity must have a handle (either explicit or implicit) of
the lease/lock, e.g.



   - http://curator.apache.org/curator-recipes/shared-reentrant-lock.html
     states:

   public void release()
   Perform one release of the mutex if the calling thread is the same
   thread that acquired it. If the thread had made multiple calls to
   acquire, the mutex will still be held when this method returns.




   - http://curator.apache.org/curator-recipes/shared-semaphore.html states:

   Lease instances can either be closed directly or you can use these
   convenience methods:

   public void returnAll(Collection<Lease> leases)
   public void returnLease(Lease lease)


So it appears on the surface that the expectation is that the same entity
that acquires a mutex or a semaphore lease is expected to release the mutex
or return the lease.
My questions are:

   1. Am I misunderstanding how Curator works?
   2. Is there a more appropriate abstraction in Curator for my use case?
   3. Can I use one of the existing recipes?  Could a releasing entity
   return a lease if they had a serialized copy of the lease but weren't the
   entity acquiring the lease?
   4. If I need to roll my own, should the Curator Framework be able to
   help here or should I work at the raw zookeeper level for this use case?

Thanks for your help with this:

Bill