You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@river.apache.org by Patricia Shanahan <pa...@acm.org> on 2013/04/02 17:01:06 UTC

Re: test failure repeatability - TaskManager

My concern with RetryTask is related to your point about "If a task 
completes before another task which it's supposed to runAfter but isn't 
present in the queue; that could explain some issues."

A RetryTask puts itself back on the end of the queue when it needs to 
retry. Suppose taskA is a RetryTask, taskB is supposed to runAfter 
taskA, and originally appears on the queue after taskA. The first 
attempt at taskA fails, and it puts itself back on the TaskManager 
queue, using the normal add call. Now taskA is after taskB.

Patricia

On 4/2/2013 12:38 AM, Peter Firmstone wrote:
> The formatting didn't work out, I'll create a Jira issue to discuss.
>
> Patricia's done a great job detailing the dependencies and issues with
> TaskManager's Task implementations.
>
> I recall a list discussion from the original Sun developers who had
> intended to replace TaskManager, the runAfter method has issues.
>
> Being so prevalent, it's quite possible that TaskManager is causing
> issues and it might also explain why as performance improves more issues
> arise.
>
> If a task completes before another task which it's supposed to runAfter
> but isn't present in the queue; that could explain some issues.
>
> I much prefer idempotent code myself.
>
> This could take some effort to fix, any volunteers?
>
> Dennis are you able to continue with your 2.2.1 branch release?
>
> Regards,
>
> Peter.
>
> On 2/04/2013 5:17 PM, Peter Firmstone wrote:
>> I've appended Patricia's notes in html so we don't lose the table
>> formatting, hopefully it will be accepted by the mailer.
>>
>> On 2/04/2013 1:38 PM, Patricia Shanahan wrote:
>>> I've sent Peter some notes that I hope he can make available - I
>>> don't think I can send attachments to the list.
>>>
>>> Rereading my notes has reminded me that I had special concerns with
>>> RetryTask. Is that still used? If so, I'll explain the problem.
>>>
>>>
>> *TaskManager notes*
>>
>>
>>  Classes That Reference TaskManager
>>
>> Class
>>
>>
>>
>> Package
>>
>>
>>
>> Notes
>>
>> AbortJob
>>
>>
>>
>> com.sun.jini.mahalo
>>
>>
>>
>> Subclass of Job. Passed a TaskManager as parameter. Uses
>> ParticipantTask, no dependencies.
>>
>> CommitJob
>>
>>
>>
>> com.sun.jini.mahalo
>>
>>
>>
>> Subclass of Job. Passed a TaskManager as parameter. Uses
>> ParticipantTask, no dependencies.
>>
>> EventType
>>
>>
>>
>> com.sun.jini.norm.event
>>
>>
>>
>> Task type SendTask, subclass of RetryTask, no dependencies.
>>
>> EventTypeGenerator
>>
>>
>>
>> com.sun.jini.norm.event
>>
>>
>>
>> Supplies a TaskManager for use by the EventType objects it generates.
>>
>> FiddlerImpl
>>
>>
>>
>> com.sun.jini.fiddler
>>
>>
>>
>> Extensive use of TaskManager, with many different Task subtypes. No
>> dependencies.
>>
>> Job
>>
>>
>>
>> com.sun.jini.mahalo
>>
>>
>>
>> Manage performance of a job as a set of tasks all of which need to be
>> created by the Job subclass. There is some dubious code in performWork
>> that silently throws away an exception that would indicate internal
>> inconsistency.
>>
>> JoinManager
>>
>>
>>
>> net.jini.lookup
>>
>>
>>
>> Uses ProxyRegTask, which extends RetryTask. Special problem - making
>> sure a service gets exactly one ID. If the ID has already been
>> allocated, no dependencies. If not, runAfter any ProxyRegTask with
>> lower sequence number, ensuring that only the lowest sequence number
>> ProxyRegTask in the TaskManager can run. Safe if, and only if, tasks
>> are submitted in sequence number order, and there are no retries.
>>
>>
>> LeaseRenewalManager
>>
>>
>>
>> net.jini.lease
>>
>>
>>
>> Uses QueuerTask and RenewTask. No dependencies.
>>
>> LookupDiscovery
>>
>>
>>
>> net.jini.discovery
>>
>>
>>
>> Uses DecodeAnnouncementTask and UnicastDiscoveryTask. No dependencies.
>>
>> LookupLocatorDiscovery
>>
>>
>>
>> net.jini.discovery
>>
>>
>>
>> Uses DiscoveryTask. No dependencies.
>>
>> MailboxImpl
>>
>>
>>
>> com.sun.jini.mercury
>>
>>
>>
>> Uses a NotifyTask, subclass of RetryTask, no dependencies.
>>
>> Notifier
>>
>>
>>
>> com.sun.jini.outrigger
>>
>>
>>
>> Uses its own NotifyTask, subclass of RetryTask. Dependency based on
>> EventSender runAfter test. EventSender has two implementations. An
>> EventRegistrationWatcher.BasicEventSender waits for any
>> BasicEventSender belonging to the same EventRegistrationWatcher.
>> VisibilityEventSender has no dependencies.
>>
>> ParticipantTask
>>
>>
>>
>> com.sun.jini.mahalo
>>
>>
>>
>> No dependencies.
>>
>> PrepareAndCommitJob
>>
>>
>>
>> com.sun.jini.mahalo
>>
>>
>>
>> Subclass of Job. Passed a TaskManager as parameter. Uses
>> ParticipantTask, no dependencies.
>>
>> PrepareJob
>>
>>
>>
>> com.sun.jini.mahalo
>>
>>
>>
>> Subclass of Job. Passed a TaskManager as parameter. Uses
>> ParticipantTask, no dependencies.
>>
>> RegistrarImpl
>>
>>
>>
>> com.sun.jini.reggie
>>
>>
>>
>> Uses multiple Task types: AddressTask - no dependencies;
>> DecodeRequestTask - no dependencies; EventTask - run after EventTask
>> for same listener, "Keep events going to the same listener ordered";
>> SocketTask - no dependencies.
>>
>> RetryTask
>>
>>
>>
>> com.sun.jini.thread
>>
>>
>>
>> Abstract class implementing Task. It provides for automatic retry of
>> failed attempts, where an attempt is a call to tryOnce.
>>
>> ServiceDiscoveryManager
>>
>>
>>
>> net.jini.lookup
>>
>>
>>
>> Uses CacheTask - no dependencies; ServiceIdTask - run after
>> ServiceIdTask with same ServiceId and lower sequence number. Its
>> subclasses NewOldServiceTask and UnmapProxyTask inherit runAfter.
>> ServiceIdTask's subclass NotifyEventTask runs after
>> RegisterListenerTask or LookupTask with same ProxyReg and lower
>> sequence, and also calls the ServiceId runAfter. Bug ID 6291851.
>> Comment suggests the writer thought it was necessary to do a sequence
>> number check to find the queue order: " and if those tasks were queued
>> prior to this task (have lower sequence numbers)".
>>
>>
>> /** Whenever a ServiceIdTask is created in this cache, it is assigned
>>
>> * a unique sequence number to allow such tasks associated with the
>>
>> * same ServiceID to be executed in the order in which they were
>>
>> * queued in the TaskManager. This field contains the value of
>>
>> * the sequence number assigned to the most recently created
>>
>> * ServiceIdTask.
>>
>> */
>>
>> *private**long*taskSeqN= 0;
>>
>>
>> Synchronization window needs fixing. taskSeqN is protected by
>> serviceIdMap synchronization, but it is released before calling
>> cacheTaskMgr.add in addProxyReg
>>
>>
>> SettlerTask
>>
>>
>>
>> com.sun.jini.mahalo
>>
>>
>>
>> Subclass of RetryTask. No dependencies. Used in TxnManagerImpl.
>>
>> TxnManagerImpl
>>
>>
>>
>> com.sun.jini.mahalo
>>
>>
>>
>> Uses SettlerTask and ParticipantTask. No dependencies.
>>
>> TxnManagerTransaction
>>
>>
>>
>> com.sun.jini.mahalo
>>
>>
>>
>> Creates a TaskManager, threadpool, and passes it around to e.g. Job
>> and AbortJob.
>>
>> TxnMonitor
>>
>>
>>
>> com.sun.jini.outrigger
>>
>>
>>
>> Uses TxnMonitorTask.
>>
>> TxnMonitorTask
>>
>>
>>
>> com.sun.jini.outrigger
>>
>>
>>
>> Subclass of RetryTask. No dependencies.
>>
>>
>>  Issues
>>
>>
>>    RetryTask
>>
>> RetryTask is a Task implementation whose run method tries a subclass
>> supplied method with a boolean result. If the method returns false,
>> indicating failure, the RetryTask's run method schedules another try
>> in the future, using a WakeupManager supplied to the RetryTask
>> constructor.
>>
>> During the time between a failed attempt and its retry, there does not
>> seem to be any control to prevent conflicting tasks from entering the
>> same TaskManager. Some of those tasks would have waited for the task
>> being retried, if it had been in the TaskManager at their time of
>> arrival. Delayed retry and dependence on sequence number seem
>> incompatible. Notifier.NotifyTask and JoinManager.ProxyRegTask both
>> extend RetryTask and have dependencies. JoinManager.ProxyRegTask uses
>> a sequence number, but probably does not need to, and should not. The
>> intent seems to be to run tasks for a given service one-at-a-time
>> until its ServiceId has been set.
>>
>>
>>    ServiceDiscoveryManager.CacheTask
>>
>> Most subclasses inherit a "return false;" runAfter. The exceptions are
>> ServiceIdTask, its subclasses, and LookupTask. Both have sequence
>> number dependencies. It is not yet clear whether
>> ServiceDiscoveryManager is ensuring that tasks enter the TaskManager
>> in sequence number order. If it does, the code is correct, but wastes
>> time with a trivially true check. If not, the code is incorrect
>> relative to the comments, which seem to expect order.
>>
>>
>>
>>
>>
>>
>

Re: test failure repeatability - TaskManager

Posted by Peter <ji...@zeus.net.au>.

----- Original message -----
> One possibility is that some cases may just be "these tasks must not run
> in parallel" rather than an actual ordering.
>
> Also, I'm not sure all ordering constraints that are needed are
> necessarily implemented. The whole thing feels messy to me.
>

Agreed, we should probably consider each case individually, I noticed there's a configuration property that allows a TaskManager instance to be injected on a number of occasions too, which suggests there might be some sharing.

Peter.

> On 4/3/2013 2:04 PM, Dan Creswell wrote:
> > I'm with you. My first step was going to be reviewing where runAfter is
> > used, how often etc. I'd first like to be convinced that all the ordering
> > constraints are actually required and can't be circumvented/dropped.
> >
> > On 3 April 2013 22:01, Patricia Shanahan <pa...@acm.org> wrote:
> >
> > > I agree with the idea of understanding the use cases before designing the
> > > solution, and with using standard API classes as much as possible. The
> > > table I sent you was intended as a first step towards that.
> > >
> > > I'm not convinced that the right solution is a single TaskManager
> > > successor. Different TaskManager instances may have different use cases,
> > > and separating them may lead to several simpler solutions, each of which as
> > > a narrower set of requirements.
>

Re: test failure repeatability - TaskManager

Posted by Patricia Shanahan <pa...@acm.org>.

One possibility is that some cases may just be "these tasks must not run 
in parallel" rather than an actual ordering.

Also, I'm not sure all ordering constraints that are needed are 
necessarily implemented. The whole thing feels messy to me.

On 4/3/2013 2:04 PM, Dan Creswell wrote:
> I'm with you. My first step was going to be reviewing where runAfter is
> used, how often etc. I'd first like to be convinced that all the ordering
> constraints are actually required and can't be circumvented/dropped.
>
> On 3 April 2013 22:01, Patricia Shanahan <pa...@acm.org> wrote:
>
>> I agree with the idea of understanding the use cases before designing the
>> solution, and with using standard API classes as much as possible. The
>> table I sent you was intended as a first step towards that.
>>
>> I'm not convinced that the right solution is a single TaskManager
>> successor. Different TaskManager instances may have different use cases,
>> and separating them may lead to several simpler solutions, each of which as
>> a narrower set of requirements.

Re: test failure repeatability - TaskManager

Posted by Dan Creswell <da...@gmail.com>.

I'm with you. My first step was going to be reviewing where runAfter is
used, how often etc. I'd first like to be convinced that all the ordering
constraints are actually required and can't be circumvented/dropped.

On 3 April 2013 22:01, Patricia Shanahan <pa...@acm.org> wrote:

> I agree with the idea of understanding the use cases before designing the
> solution, and with using standard API classes as much as possible. The
> table I sent you was intended as a first step towards that.
>
> I'm not convinced that the right solution is a single TaskManager
> successor. Different TaskManager instances may have different use cases,
> and separating them may lead to several simpler solutions, each of which as
> a narrower set of requirements.
>
> Patricia
>
>
>
> On 4/3/2013 1:55 PM, Peter wrote:
>
>> Gut feeling suggests the solution will be executor based, so you're
>> asking good questions, I think we need to understand the use cases
>> better and probably redesign dependant code too.
>>
>> One example of retry, the task will continue attemtping to retry for
>> an entire day.
>>
>> We might need some kind of delay queue, where dependant tasks can
>> signal to following tasks when it's ok to execute.
>>
>> ----- Original message -----
>>
>>> I am not clear on the semantics for runAfter, but maybe this can
>>> be achieved by wrapping a Runnable within another Runnable such
>>> that the 2nd runnable is automatically scheduled after the first
>>> has succeeded? Likewise, it is possible to wrap a Runnable in order
>>> to automatically retry if it throws an exception.
>>>
>>> There are people who are experts at these patterns, but an example
>>> is given (below my signature) for an Executor that wraps an
>>> ExecutorService and queues Runnable instances with limited
>>> parallelism.  It hooks the Runnable in its own run() method.
>>>
>>> If you use a ScheduledExecutorService, you can queue a task to run
>>> with an initial and repeated delay (or at a repeated interval).
>>> The task will be rescheduled *unless* it throws an exception.  This
>>> could be reused to periodically run-try a task after a timeout if
>>> we convert an error thrown in the task into "no error" (hence run
>>> after a fixed delay) and throw out a known exception if there is no
>>> error (to terminate the retry of the task).  A bit of a hack, but
>>> it leverages existing code for re-running a task with a fixed
>>> delay.
>>>
>>> Thanks, Bryan
>>>
>>> package com.bigdata.util.concurrent;
>>>
>>> import java.util.concurrent.**BlockingQueue; import
>>> java.util.concurrent.Callable; import
>>> java.util.concurrent.Executor; import
>>> java.util.concurrent.**ExecutorService; import
>>> java.util.concurrent.Future; import
>>> java.util.concurrent.**FutureTask; import
>>> java.util.concurrent.**LinkedBlockingDeque; import
>>> java.util.concurrent.**RejectedExecutionException; import
>>> java.util.concurrent.**Semaphore;
>>>
>>> import org.apache.log4j.Logger;
>>>
>>> /** * A fly weight helper class that runs tasks either sequentially
>>> or with limited * parallelism against some thread pool. Deadlock
>>> can arise when limited * parallelism is applied if there are
>>> dependencies among the tasks. Limited * parallelism is enforced
>>> using a counting {@link Semaphore}. New tasks can * start iff the
>>> latch is non-zero. The maximum parallelism is the minimum of * the
>>> value specified to the constructor and the potential parallelism
>>> of the * delegate service. * <p> * Note: The pattern for running
>>> tasks on this service is generally to * {@link #execute(Runnable)}
>>> a {@link Runnable} and to make that * {@link Runnable} a {@link
>>> FutureTask} if you want to await the {@link Future} * of a {@link
>>> Runnable} or {@link Callable} or otherwise manage its execution. *
>>> <p> * Note: This class can NOT be trivially wrapped as an {@link
>>> ExecutorService} * since the resulting delegation pattern for
>>> submit() winds up invoking * execute() on the delegate {@link
>>> ExecutorService} rather than on this class. * * @author <a
>>> href="mailto:thompsonbry@**users.sourceforge.net<th...@users.sourceforge.net>">Bryan
>>> Thompson</a>
>>> * @version $Id: LatchedExecutor.java 6749 2012-12-03 14:42:48Z
>>> thompsonbry $ */ public class LatchedExecutor implements Executor
>>> {
>>>
>>> private static final transient Logger log = Logger
>>> .getLogger(LatchedExecutor.**class);
>>>
>>> /** * The delegate executor. */ private final Executor executor;
>>>
>>> /** * This is used to limit the concurrency with which tasks
>>> submitted to this * class may execute on the delegate {@link
>>> #executor}. */ private final Semaphore semaphore;
>>>
>>> /** * A thread-safe blocking queue of pending tasks. * * @todo The
>>> capacity of this queue does not of necessity need to be *
>>> unbounded. */ private final BlockingQueue<Runnable> queue = new
>>> LinkedBlockingDeque<Runnable>(**/*unbounded*/);
>>>
>>> private final int nparallel;
>>>
>>> /** * Return the maximum parallelism allowed by this {@link
>>> Executor}. */ public int getNParallel() {
>>>
>>> return nparallel;
>>>
>>> }
>>>
>>> public LatchedExecutor(final Executor executor, final int
>>> nparallel) {
>>>
>>> if (executor == null) throw new IllegalArgumentException();
>>>
>>> if (nparallel < 1) throw new IllegalArgumentException();
>>>
>>> this.executor = executor;
>>>
>>> this.nparallel = nparallel;
>>>
>>> this.semaphore = new Semaphore(nparallel);
>>>
>>> }
>>>
>>> public void execute(final Runnable r) { if (!queue.offer(new
>>> Runnable() { /* * Wrap the Runnable in a class that will start the
>>> next Runnable * from the queue when it completes. */ public void
>>> run() { try { r.run(); } finally { scheduleNext(); } } })) { // The
>>> queue is full. throw new RejectedExecutionException(); } if
>>> (semaphore.tryAcquire()) { // We were able to obtain a permit, so
>>> start another task. scheduleNext(); } }
>>>
>>> /** * Schedule the next task if one is available (non-blocking). *
>>> <p> * Pre-condition: The caller has a permit. */ private void
>>> scheduleNext() { while (true) { Runnable next = null; if ((next =
>>> queue.poll()) != null) { try { executor.execute(next); return; }
>>> catch (RejectedExecutionException ex) { // log error and poll the
>>> queue again. log.error(ex, ex); continue; } } else {
>>> semaphore.release(); return; } } }
>>>
>>> }
>>>
>>>
>>>
>>
>>
>

Re: test failure repeatability - TaskManager

Posted by Patricia Shanahan <pa...@acm.org>.

I agree with the idea of understanding the use cases before designing 
the solution, and with using standard API classes as much as possible. 
The table I sent you was intended as a first step towards that.

I'm not convinced that the right solution is a single TaskManager 
successor. Different TaskManager instances may have different use cases, 
and separating them may lead to several simpler solutions, each of which 
as a narrower set of requirements.

Patricia


On 4/3/2013 1:55 PM, Peter wrote:
> Gut feeling suggests the solution will be executor based, so you're
> asking good questions, I think we need to understand the use cases
> better and probably redesign dependant code too.
>
> One example of retry, the task will continue attemtping to retry for
> an entire day.
>
> We might need some kind of delay queue, where dependant tasks can
> signal to following tasks when it's ok to execute.
>
> ----- Original message -----
>> I am not clear on the semantics for runAfter, but maybe this can
>> be achieved by wrapping a Runnable within another Runnable such
>> that the 2nd runnable is automatically scheduled after the first
>> has succeeded? Likewise, it is possible to wrap a Runnable in order
>> to automatically retry if it throws an exception.
>>
>> There are people who are experts at these patterns, but an example
>> is given (below my signature) for an Executor that wraps an
>> ExecutorService and queues Runnable instances with limited
>> parallelism.  It hooks the Runnable in its own run() method.
>>
>> If you use a ScheduledExecutorService, you can queue a task to run
>> with an initial and repeated delay (or at a repeated interval).
>> The task will be rescheduled *unless* it throws an exception.  This
>> could be reused to periodically run-try a task after a timeout if
>> we convert an error thrown in the task into "no error" (hence run
>> after a fixed delay) and throw out a known exception if there is no
>> error (to terminate the retry of the task).  A bit of a hack, but
>> it leverages existing code for re-running a task with a fixed
>> delay.
>>
>> Thanks, Bryan
>>
>> package com.bigdata.util.concurrent;
>>
>> import java.util.concurrent.BlockingQueue; import
>> java.util.concurrent.Callable; import
>> java.util.concurrent.Executor; import
>> java.util.concurrent.ExecutorService; import
>> java.util.concurrent.Future; import
>> java.util.concurrent.FutureTask; import
>> java.util.concurrent.LinkedBlockingDeque; import
>> java.util.concurrent.RejectedExecutionException; import
>> java.util.concurrent.Semaphore;
>>
>> import org.apache.log4j.Logger;
>>
>> /** * A fly weight helper class that runs tasks either sequentially
>> or with limited * parallelism against some thread pool. Deadlock
>> can arise when limited * parallelism is applied if there are
>> dependencies among the tasks. Limited * parallelism is enforced
>> using a counting {@link Semaphore}. New tasks can * start iff the
>> latch is non-zero. The maximum parallelism is the minimum of * the
>> value specified to the constructor and the potential parallelism
>> of the * delegate service. * <p> * Note: The pattern for running
>> tasks on this service is generally to * {@link #execute(Runnable)}
>> a {@link Runnable} and to make that * {@link Runnable} a {@link
>> FutureTask} if you want to await the {@link Future} * of a {@link
>> Runnable} or {@link Callable} or otherwise manage its execution. *
>> <p> * Note: This class can NOT be trivially wrapped as an {@link
>> ExecutorService} * since the resulting delegation pattern for
>> submit() winds up invoking * execute() on the delegate {@link
>> ExecutorService} rather than on this class. * * @author <a
>> href="mailto:thompsonbry@users.sourceforge.net">Bryan Thompson</a>
>> * @version $Id: LatchedExecutor.java 6749 2012-12-03 14:42:48Z
>> thompsonbry $ */ public class LatchedExecutor implements Executor
>> {
>>
>> private static final transient Logger log = Logger
>> .getLogger(LatchedExecutor.class);
>>
>> /** * The delegate executor. */ private final Executor executor;
>>
>> /** * This is used to limit the concurrency with which tasks
>> submitted to this * class may execute on the delegate {@link
>> #executor}. */ private final Semaphore semaphore;
>>
>> /** * A thread-safe blocking queue of pending tasks. * * @todo The
>> capacity of this queue does not of necessity need to be *
>> unbounded. */ private final BlockingQueue<Runnable> queue = new
>> LinkedBlockingDeque<Runnable>(/*unbounded*/);
>>
>> private final int nparallel;
>>
>> /** * Return the maximum parallelism allowed by this {@link
>> Executor}. */ public int getNParallel() {
>>
>> return nparallel;
>>
>> }
>>
>> public LatchedExecutor(final Executor executor, final int
>> nparallel) {
>>
>> if (executor == null) throw new IllegalArgumentException();
>>
>> if (nparallel < 1) throw new IllegalArgumentException();
>>
>> this.executor = executor;
>>
>> this.nparallel = nparallel;
>>
>> this.semaphore = new Semaphore(nparallel);
>>
>> }
>>
>> public void execute(final Runnable r) { if (!queue.offer(new
>> Runnable() { /* * Wrap the Runnable in a class that will start the
>> next Runnable * from the queue when it completes. */ public void
>> run() { try { r.run(); } finally { scheduleNext(); } } })) { // The
>> queue is full. throw new RejectedExecutionException(); } if
>> (semaphore.tryAcquire()) { // We were able to obtain a permit, so
>> start another task. scheduleNext(); } }
>>
>> /** * Schedule the next task if one is available (non-blocking). *
>> <p> * Pre-condition: The caller has a permit. */ private void
>> scheduleNext() { while (true) { Runnable next = null; if ((next =
>> queue.poll()) != null) { try { executor.execute(next); return; }
>> catch (RejectedExecutionException ex) { // log error and poll the
>> queue again. log.error(ex, ex); continue; } } else {
>> semaphore.release(); return; } } }
>>
>> }
>>
>>
>
>

Re: test failure repeatability - TaskManager

Posted by Peter <ji...@zeus.net.au>.

Gut feeling suggests the solution will be executor based, so you're asking good questions, I think we need to understand the use cases better and probably redesign dependant code too.

One example of retry, the task will continue attemtping to retry for an entire day.

We might need some kind of delay queue, where dependant tasks can signal to following tasks when it's ok to execute.

----- Original message -----
> I am not clear on the semantics for runAfter, but maybe this can be
> achieved by wrapping a Runnable within another Runnable such that the 2nd
> runnable is automatically scheduled after the first has succeeded?
> Likewise, it is possible to wrap a Runnable in order to automatically
> retry if it throws an exception.
>
> There are people who are experts at these patterns, but an example is
> given (below my signature) for an Executor that wraps an ExecutorService
> and queues Runnable instances with limited parallelism.  It hooks the
> Runnable in its own run() method.
>
> If you use a ScheduledExecutorService, you can queue a task to run with an
> initial and repeated delay (or at a repeated interval).  The task will be
> rescheduled *unless* it throws an exception.  This could be reused to
> periodically run-try a task after a timeout if we convert an error thrown
> in the task into "no error" (hence run after a fixed delay) and throw out
> a known exception if there is no error (to terminate the retry of the
> task).  A bit of a hack, but it leverages existing code for re-running a
> task with a fixed delay.
>
> Thanks,
> Bryan
>
> package com.bigdata.util.concurrent;
>
> import java.util.concurrent.BlockingQueue;
> import java.util.concurrent.Callable;
> import java.util.concurrent.Executor;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Future;
> import java.util.concurrent.FutureTask;
> import java.util.concurrent.LinkedBlockingDeque;
> import java.util.concurrent.RejectedExecutionException;
> import java.util.concurrent.Semaphore;
>
> import org.apache.log4j.Logger;
>
> /**
>  * A fly weight helper class that runs tasks either sequentially or with
> limited
>  * parallelism against some thread pool. Deadlock can arise when limited
>  * parallelism is applied if there are dependencies among the tasks.
> Limited
>  * parallelism is enforced using a counting {@link Semaphore}. New tasks
> can
>  * start iff the latch is non-zero. The maximum parallelism is the minimum
> of
>  * the value specified to the constructor and the potential parallelism of
> the
>  * delegate service.
>  * <p>
>  * Note: The pattern for running tasks on this service is generally to
>  * {@link #execute(Runnable)} a {@link Runnable} and to make that
>  * {@link Runnable} a {@link FutureTask} if you want to await the {@link
> Future}
>  * of a {@link Runnable} or {@link Callable} or otherwise manage its
> execution.
>  * <p>
>  * Note: This class can NOT be trivially wrapped as an {@link
> ExecutorService}
>  * since the resulting delegation pattern for submit() winds up invoking
>  * execute() on the delegate {@link ExecutorService} rather than on this
> class.
>  *
>  * @author <a href="mailto:thompsonbry@users.sourceforge.net">Bryan
> Thompson</a>
>  * @version $Id: LatchedExecutor.java 6749 2012-12-03 14:42:48Z
> thompsonbry $
> */
> public class LatchedExecutor implements Executor {
>
>        private static final transient Logger log = Logger
>                        .getLogger(LatchedExecutor.class);
>       
>        /**
>          * The delegate executor.
>          */
>        private final Executor executor;
>       
>        /**
>          * This is used to limit the concurrency with which tasks submitted to
> this
>          * class may execute on the delegate {@link #executor}.
>          */
>        private final Semaphore semaphore;
>       
>        /**
>          * A thread-safe blocking queue of pending tasks.
>          *
>          * @todo The capacity of this queue does not of necessity need to be
>          *            unbounded.
>          */
>        private final BlockingQueue<Runnable> queue = new
> LinkedBlockingDeque<Runnable>(/*unbounded*/);
>
>        private final int nparallel;
>       
>        /**
>          * Return the maximum parallelism allowed by this {@link Executor}.
>          */
>        public int getNParallel() {
>           
>            return nparallel;
>           
>        }
>       
>        public LatchedExecutor(final Executor executor, final int nparallel) {
>
>                if (executor == null)
>                        throw new IllegalArgumentException();
>
>                if (nparallel < 1)
>                        throw new IllegalArgumentException();
>
>                this.executor = executor;
>
>                this.nparallel = nparallel;
>               
>                this.semaphore = new Semaphore(nparallel);
>
>        }
>
>        public void execute(final Runnable r) {
>                if (!queue.offer(new Runnable() {
>                        /*
>                          * Wrap the Runnable in a class that will start the next
> Runnable
>                          * from the queue when it completes.
>                          */
>                        public void run() {
>                                try {
>                                        r.run();
>                                } finally {
>                                        scheduleNext();
>                                }
>                        }
>                })) {
>                        // The queue is full.
>                        throw new RejectedExecutionException();
>                }
>                if (semaphore.tryAcquire()) {
>                        // We were able to obtain a permit, so start another task.
>                        scheduleNext();
>                }
>        }
>
>        /**
>          * Schedule the next task if one is available (non-blocking).
>          * <p>
>          * Pre-condition: The caller has a permit.
>          */
>        private void scheduleNext() {
>                while (true) {
>                        Runnable next = null;
>                        if ((next = queue.poll()) != null) {
>                                try {
>                                        executor.execute(next);
>                                        return;
>                                } catch (RejectedExecutionException ex) {
>                                        // log error and poll the queue again.
>                                        log.error(ex, ex);
>                                        continue;
>                                }
>                        } else {
>                                semaphore.release();
>                                return;
>                        }
>                }
>        }
>
> }
>
>

Re: test failure repeatability - TaskManager

Posted by Patricia Shanahan <pa...@acm.org>.

runAfter is a method in the TaskManager.Task interface, implemented by
each of its tasks:

/**
* Return true if this task must be run after at least one task
* in the given task list with an index less than size (size may be
* less then tasks.size()).  Using List.get will be more efficient
* than List.iterator.
*
* @param tasks the tasks to consider.  A read-only List, with all
* elements instanceof Task.
* @param size elements with index less than size should be considered
*

The notes I sent to Peter were part of an effort on my part to improve
performance. This has O(N^2) tendencies, because whenever a task
finishes the TaskManager has to ask each waiting task whether it still
needs to wait for any older task. I wanted to change it so that
TaskManager would know which task was being waited for. It could then
associate with one task a list of tasks that need to be reconsidered
when it finishes.

I think runAfter has two possible uses, and I'm not sure which cases are 
for which purpose:

1. Mutual exclusion - two tasks should not be running at the same time.
That could be implemented by the younger returning true for a task list
containing the older. In this case the sort of overtaking I described
below does not matter.

2. Order preservation - A task needs a state change to have happened
that will not happen until after an older task has run.

Patricia

On 4/2/2013 2:17 PM, Bryan Thompson wrote:
> I am not clear on the semantics for runAfter, but maybe this can be
> achieved by wrapping a Runnable within another Runnable such that the 2nd
> runnable is automatically scheduled after the first has succeeded?
> Likewise, it is possible to wrap a Runnable in order to automatically
> retry if it throws an exception.
>
> There are people who are experts at these patterns, but an example is
> given (below my signature) for an Executor that wraps an ExecutorService
> and queues Runnable instances with limited parallelism.  It hooks the
> Runnable in its own run() method.
>
> If you use a ScheduledExecutorService, you can queue a task to run with an
> initial and repeated delay (or at a repeated interval).  The task will be
> rescheduled *unless* it throws an exception.  This could be reused to
> periodically run-try a task after a timeout if we convert an error thrown
> in the task into "no error" (hence run after a fixed delay) and throw out
> a known exception if there is no error (to terminate the retry of the
> task).  A bit of a hack, but it leverages existing code for re-running a
> task with a fixed delay.
>
> Thanks,
> Bryan
>
> package com.bigdata.util.concurrent;
>
> import java.util.concurrent.BlockingQueue;
> import java.util.concurrent.Callable;
> import java.util.concurrent.Executor;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Future;
> import java.util.concurrent.FutureTask;
> import java.util.concurrent.LinkedBlockingDeque;
> import java.util.concurrent.RejectedExecutionException;
> import java.util.concurrent.Semaphore;
>
> import org.apache.log4j.Logger;
>
> /**
>   * A fly weight helper class that runs tasks either sequentially or with
> limited
>   * parallelism against some thread pool. Deadlock can arise when limited
>   * parallelism is applied if there are dependencies among the tasks.
> Limited
>   * parallelism is enforced using a counting {@link Semaphore}. New tasks
> can
>   * start iff the latch is non-zero. The maximum parallelism is the minimum
> of
>   * the value specified to the constructor and the potential parallelism of
> the
>   * delegate service.
>   * <p>
>   * Note: The pattern for running tasks on this service is generally to
>   * {@link #execute(Runnable)} a {@link Runnable} and to make that
>   * {@link Runnable} a {@link FutureTask} if you want to await the {@link
> Future}
>   * of a {@link Runnable} or {@link Callable} or otherwise manage its
> execution.
>   * <p>
>   * Note: This class can NOT be trivially wrapped as an {@link
> ExecutorService}
>   * since the resulting delegation pattern for submit() winds up invoking
>   * execute() on the delegate {@link ExecutorService} rather than on this
> class.
>   *
>   * @author <a href="mailto:thompsonbry@users.sourceforge.net">Bryan
> Thompson</a>
>   * @version $Id: LatchedExecutor.java 6749 2012-12-03 14:42:48Z
> thompsonbry $
> */
> public class LatchedExecutor implements Executor {
>
>      private static final transient Logger log = Logger
>              .getLogger(LatchedExecutor.class);
>
>      /**
>       * The delegate executor.
>       */
>      private final Executor executor;
>
>      /**
>       * This is used to limit the concurrency with which tasks submitted to
> this
>       * class may execute on the delegate {@link #executor}.
>       */
>      private final Semaphore semaphore;
>
>      /**
>       * A thread-safe blocking queue of pending tasks.
>       *
>       * @todo The capacity of this queue does not of necessity need to be
>       *       unbounded.
>       */
>      private final BlockingQueue<Runnable> queue = new
> LinkedBlockingDeque<Runnable>(/*unbounded*/);
>
>      private final int nparallel;
>
>      /**
>       * Return the maximum parallelism allowed by this {@link Executor}.
>       */
>      public int getNParallel() {
>      	
>      	return nparallel;
>      	
>      }
>
>      public LatchedExecutor(final Executor executor, final int nparallel) {
>
>          if (executor == null)
>              throw new IllegalArgumentException();
>
>          if (nparallel < 1)
>              throw new IllegalArgumentException();
>
>          this.executor = executor;
>
>          this.nparallel = nparallel;
>
>          this.semaphore = new Semaphore(nparallel);
>
>      }
>
>      public void execute(final Runnable r) {
>          if (!queue.offer(new Runnable() {
>              /*
>               * Wrap the Runnable in a class that will start the next
> Runnable
>               * from the queue when it completes.
>               */
>              public void run() {
>                  try {
>                      r.run();
>                  } finally {
>                      scheduleNext();
>                  }
>              }
>          })) {
>              // The queue is full.
>              throw new RejectedExecutionException();
>          }
>          if (semaphore.tryAcquire()) {
>              // We were able to obtain a permit, so start another task.
>              scheduleNext();
>          }
>      }
>
>      /**
>       * Schedule the next task if one is available (non-blocking).
>       * <p>
>       * Pre-condition: The caller has a permit.
>       */
>      private void scheduleNext() {
>          while (true) {
>              Runnable next = null;
>              if ((next = queue.poll()) != null) {
>                  try {
>                      executor.execute(next);
>                      return;
>                  } catch (RejectedExecutionException ex) {
>                      // log error and poll the queue again.
>                      log.error(ex, ex);
>                      continue;
>                  }
>              } else {
>                  semaphore.release();
>                  return;
>              }
>          }
>      }
>
> }
>
>

Re: test failure repeatability - TaskManager

Posted by Bryan Thompson <br...@systap.com>.

I am not clear on the semantics for runAfter, but maybe this can be
achieved by wrapping a Runnable within another Runnable such that the 2nd
runnable is automatically scheduled after the first has succeeded?
Likewise, it is possible to wrap a Runnable in order to automatically
retry if it throws an exception.

There are people who are experts at these patterns, but an example is
given (below my signature) for an Executor that wraps an ExecutorService
and queues Runnable instances with limited parallelism.  It hooks the
Runnable in its own run() method.

If you use a ScheduledExecutorService, you can queue a task to run with an
initial and repeated delay (or at a repeated interval).  The task will be
rescheduled *unless* it throws an exception.  This could be reused to
periodically run-try a task after a timeout if we convert an error thrown
in the task into "no error" (hence run after a fixed delay) and throw out
a known exception if there is no error (to terminate the retry of the
task).  A bit of a hack, but it leverages existing code for re-running a
task with a fixed delay.

Thanks,
Bryan

package com.bigdata.util.concurrent;

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

import org.apache.log4j.Logger;

/**
 * A fly weight helper class that runs tasks either sequentially or with
limited
 * parallelism against some thread pool. Deadlock can arise when limited
 * parallelism is applied if there are dependencies among the tasks.
Limited
 * parallelism is enforced using a counting {@link Semaphore}. New tasks
can
 * start iff the latch is non-zero. The maximum parallelism is the minimum
of
 * the value specified to the constructor and the potential parallelism of
the
 * delegate service.
 * <p>
 * Note: The pattern for running tasks on this service is generally to
 * {@link #execute(Runnable)} a {@link Runnable} and to make that
 * {@link Runnable} a {@link FutureTask} if you want to await the {@link
Future}
 * of a {@link Runnable} or {@link Callable} or otherwise manage its
execution.
 * <p>
 * Note: This class can NOT be trivially wrapped as an {@link
ExecutorService}
 * since the resulting delegation pattern for submit() winds up invoking
 * execute() on the delegate {@link ExecutorService} rather than on this
class.
 * 
 * @author <a href="mailto:thompsonbry@users.sourceforge.net">Bryan
Thompson</a>
 * @version $Id: LatchedExecutor.java 6749 2012-12-03 14:42:48Z
thompsonbry $
*/
public class LatchedExecutor implements Executor {

    private static final transient Logger log = Logger
            .getLogger(LatchedExecutor.class);
    
    /**
     * The delegate executor.
     */
    private final Executor executor;
    
    /**
     * This is used to limit the concurrency with which tasks submitted to
this
     * class may execute on the delegate {@link #executor}.
     */
    private final Semaphore semaphore;
    
    /**
     * A thread-safe blocking queue of pending tasks.
     * 
     * @todo The capacity of this queue does not of necessity need to be
     *       unbounded.
     */
    private final BlockingQueue<Runnable> queue = new
LinkedBlockingDeque<Runnable>(/*unbounded*/);

    private final int nparallel;
    
    /**
     * Return the maximum parallelism allowed by this {@link Executor}.
     */
    public int getNParallel() {
    	
    	return nparallel;
    	
    }
    
    public LatchedExecutor(final Executor executor, final int nparallel) {

        if (executor == null)
            throw new IllegalArgumentException();

        if (nparallel < 1)
            throw new IllegalArgumentException();

        this.executor = executor;

        this.nparallel = nparallel;
        
        this.semaphore = new Semaphore(nparallel);

    }

    public void execute(final Runnable r) {
        if (!queue.offer(new Runnable() {
            /*
             * Wrap the Runnable in a class that will start the next
Runnable
             * from the queue when it completes.
             */
            public void run() {
                try {
                    r.run();
                } finally {
                    scheduleNext();
                }
            }
        })) {
            // The queue is full.
            throw new RejectedExecutionException();
        }
        if (semaphore.tryAcquire()) {
            // We were able to obtain a permit, so start another task.
            scheduleNext();
        }
    }

    /**
     * Schedule the next task if one is available (non-blocking).
     * <p>
     * Pre-condition: The caller has a permit.
     */
    private void scheduleNext() {
        while (true) {
            Runnable next = null;
            if ((next = queue.poll()) != null) {
                try {
                    executor.execute(next);
                    return;
                } catch (RejectedExecutionException ex) {
                    // log error and poll the queue again.
                    log.error(ex, ex);
                    continue;
                }
            } else {
                semaphore.release();
                return;
            }
        }
    }

}

Re: test failure repeatability - TaskManager

Posted by Peter Firmstone <ji...@zeus.net.au>.

So there are some fundamental design flaws, not compatible with a 
distributed network environment we need to reconsider.

The Notifier used in Outrigger (JavaSpaces) and ProxyRegTask in 
JoinManager, both use retry and runAfter.

Implementations of RetryTask that don't use runAfter are ok.  So in 
other words RetryTask really needs to be an ordinary Runnable which is 
retried until it passes and it really needs to be idempotent.

Other tasks that require ordering probably require a sequence number and 
an executor that can re-request  a task with a particular sequence 
number, it could give up waiting after a reasonable amount of time.

Anyone have time to assist investigating a solution?

Regards,

Peter.

On 3/04/2013 1:01 AM, Patricia Shanahan wrote:
> My concern with RetryTask is related to your point about "If a task 
> completes before another task which it's supposed to runAfter but 
> isn't present in the queue; that could explain some issues."
>
> A RetryTask puts itself back on the end of the queue when it needs to 
> retry. Suppose taskA is a RetryTask, taskB is supposed to runAfter 
> taskA, and originally appears on the queue after taskA. The first 
> attempt at taskA fails, and it puts itself back on the TaskManager 
> queue, using the normal add call. Now taskA is after taskB.
>
> Patricia
>
> On 4/2/2013 12:38 AM, Peter Firmstone wrote:
>> The formatting didn't work out, I'll create a Jira issue to discuss.
>>
>> Patricia's done a great job detailing the dependencies and issues with
>> TaskManager's Task implementations.
>>
>> I recall a list discussion from the original Sun developers who had
>> intended to replace TaskManager, the runAfter method has issues.
>>
>> Being so prevalent, it's quite possible that TaskManager is causing
>> issues and it might also explain why as performance improves more issues
>> arise.
>>
>> If a task completes before another task which it's supposed to runAfter
>> but isn't present in the queue; that could explain some issues.
>>
>> I much prefer idempotent code myself.
>>
>> This could take some effort to fix, any volunteers?
>>
>> Dennis are you able to continue with your 2.2.1 branch release?
>>
>> Regards,
>>
>> Peter.
>>
>> On 2/04/2013 5:17 PM, Peter Firmstone wrote:
>>> I've appended Patricia's notes in html so we don't lose the table
>>> formatting, hopefully it will be accepted by the mailer.
>>>
>>> On 2/04/2013 1:38 PM, Patricia Shanahan wrote:
>>>> I've sent Peter some notes that I hope he can make available - I
>>>> don't think I can send attachments to the list.
>>>>
>>>> Rereading my notes has reminded me that I had special concerns with
>>>> RetryTask. Is that still used? If so, I'll explain the problem.
>>>>
>>>>
>>> *TaskManager notes*
>>>
>>>
>>>  Classes That Reference TaskManager
>>>
>>> Class
>>>
>>>
>>>
>>> Package
>>>
>>>
>>>
>>> Notes
>>>
>>> AbortJob
>>>
>>>
>>>
>>> com.sun.jini.mahalo
>>>
>>>
>>>
>>> Subclass of Job. Passed a TaskManager as parameter. Uses
>>> ParticipantTask, no dependencies.
>>>
>>> CommitJob
>>>
>>>
>>>
>>> com.sun.jini.mahalo
>>>
>>>
>>>
>>> Subclass of Job. Passed a TaskManager as parameter. Uses
>>> ParticipantTask, no dependencies.
>>>
>>> EventType
>>>
>>>
>>>
>>> com.sun.jini.norm.event
>>>
>>>
>>>
>>> Task type SendTask, subclass of RetryTask, no dependencies.
>>>
>>> EventTypeGenerator
>>>
>>>
>>>
>>> com.sun.jini.norm.event
>>>
>>>
>>>
>>> Supplies a TaskManager for use by the EventType objects it generates.
>>>
>>> FiddlerImpl
>>>
>>>
>>>
>>> com.sun.jini.fiddler
>>>
>>>
>>>
>>> Extensive use of TaskManager, with many different Task subtypes. No
>>> dependencies.
>>>
>>> Job
>>>
>>>
>>>
>>> com.sun.jini.mahalo
>>>
>>>
>>>
>>> Manage performance of a job as a set of tasks all of which need to be
>>> created by the Job subclass. There is some dubious code in performWork
>>> that silently throws away an exception that would indicate internal
>>> inconsistency.
>>>
>>> JoinManager
>>>
>>>
>>>
>>> net.jini.lookup
>>>
>>>
>>>
>>> Uses ProxyRegTask, which extends RetryTask. Special problem - making
>>> sure a service gets exactly one ID. If the ID has already been
>>> allocated, no dependencies. If not, runAfter any ProxyRegTask with
>>> lower sequence number, ensuring that only the lowest sequence number
>>> ProxyRegTask in the TaskManager can run. Safe if, and only if, tasks
>>> are submitted in sequence number order, and there are no retries.
>>>
>>>
>>> LeaseRenewalManager
>>>
>>>
>>>
>>> net.jini.lease
>>>
>>>
>>>
>>> Uses QueuerTask and RenewTask. No dependencies.
>>>
>>> LookupDiscovery
>>>
>>>
>>>
>>> net.jini.discovery
>>>
>>>
>>>
>>> Uses DecodeAnnouncementTask and UnicastDiscoveryTask. No dependencies.
>>>
>>> LookupLocatorDiscovery
>>>
>>>
>>>
>>> net.jini.discovery
>>>
>>>
>>>
>>> Uses DiscoveryTask. No dependencies.
>>>
>>> MailboxImpl
>>>
>>>
>>>
>>> com.sun.jini.mercury
>>>
>>>
>>>
>>> Uses a NotifyTask, subclass of RetryTask, no dependencies.
>>>
>>> Notifier
>>>
>>>
>>>
>>> com.sun.jini.outrigger
>>>
>>>
>>>
>>> Uses its own NotifyTask, subclass of RetryTask. Dependency based on
>>> EventSender runAfter test. EventSender has two implementations. An
>>> EventRegistrationWatcher.BasicEventSender waits for any
>>> BasicEventSender belonging to the same EventRegistrationWatcher.
>>> VisibilityEventSender has no dependencies.
>>>
>>> ParticipantTask
>>>
>>>
>>>
>>> com.sun.jini.mahalo
>>>
>>>
>>>
>>> No dependencies.
>>>
>>> PrepareAndCommitJob
>>>
>>>
>>>
>>> com.sun.jini.mahalo
>>>
>>>
>>>
>>> Subclass of Job. Passed a TaskManager as parameter. Uses
>>> ParticipantTask, no dependencies.
>>>
>>> PrepareJob
>>>
>>>
>>>
>>> com.sun.jini.mahalo
>>>
>>>
>>>
>>> Subclass of Job. Passed a TaskManager as parameter. Uses
>>> ParticipantTask, no dependencies.
>>>
>>> RegistrarImpl
>>>
>>>
>>>
>>> com.sun.jini.reggie
>>>
>>>
>>>
>>> Uses multiple Task types: AddressTask - no dependencies;
>>> DecodeRequestTask - no dependencies; EventTask - run after EventTask
>>> for same listener, "Keep events going to the same listener ordered";
>>> SocketTask - no dependencies.
>>>
>>> RetryTask
>>>
>>>
>>>
>>> com.sun.jini.thread
>>>
>>>
>>>
>>> Abstract class implementing Task. It provides for automatic retry of
>>> failed attempts, where an attempt is a call to tryOnce.
>>>
>>> ServiceDiscoveryManager
>>>
>>>
>>>
>>> net.jini.lookup
>>>
>>>
>>>
>>> Uses CacheTask - no dependencies; ServiceIdTask - run after
>>> ServiceIdTask with same ServiceId and lower sequence number. Its
>>> subclasses NewOldServiceTask and UnmapProxyTask inherit runAfter.
>>> ServiceIdTask's subclass NotifyEventTask runs after
>>> RegisterListenerTask or LookupTask with same ProxyReg and lower
>>> sequence, and also calls the ServiceId runAfter. Bug ID 6291851.
>>> Comment suggests the writer thought it was necessary to do a sequence
>>> number check to find the queue order: " and if those tasks were queued
>>> prior to this task (have lower sequence numbers)".
>>>
>>>
>>> /** Whenever a ServiceIdTask is created in this cache, it is assigned
>>>
>>> * a unique sequence number to allow such tasks associated with the
>>>
>>> * same ServiceID to be executed in the order in which they were
>>>
>>> * queued in the TaskManager. This field contains the value of
>>>
>>> * the sequence number assigned to the most recently created
>>>
>>> * ServiceIdTask.
>>>
>>> */
>>>
>>> *private**long*taskSeqN= 0;
>>>
>>>
>>> Synchronization window needs fixing. taskSeqN is protected by
>>> serviceIdMap synchronization, but it is released before calling
>>> cacheTaskMgr.add in addProxyReg
>>>
>>>
>>> SettlerTask
>>>
>>>
>>>
>>> com.sun.jini.mahalo
>>>
>>>
>>>
>>> Subclass of RetryTask. No dependencies. Used in TxnManagerImpl.
>>>
>>> TxnManagerImpl
>>>
>>>
>>>
>>> com.sun.jini.mahalo
>>>
>>>
>>>
>>> Uses SettlerTask and ParticipantTask. No dependencies.
>>>
>>> TxnManagerTransaction
>>>
>>>
>>>
>>> com.sun.jini.mahalo
>>>
>>>
>>>
>>> Creates a TaskManager, threadpool, and passes it around to e.g. Job
>>> and AbortJob.
>>>
>>> TxnMonitor
>>>
>>>
>>>
>>> com.sun.jini.outrigger
>>>
>>>
>>>
>>> Uses TxnMonitorTask.
>>>
>>> TxnMonitorTask
>>>
>>>
>>>
>>> com.sun.jini.outrigger
>>>
>>>
>>>
>>> Subclass of RetryTask. No dependencies.
>>>
>>>
>>>  Issues
>>>
>>>
>>>    RetryTask
>>>
>>> RetryTask is a Task implementation whose run method tries a subclass
>>> supplied method with a boolean result. If the method returns false,
>>> indicating failure, the RetryTask's run method schedules another try
>>> in the future, using a WakeupManager supplied to the RetryTask
>>> constructor.
>>>
>>> During the time between a failed attempt and its retry, there does not
>>> seem to be any control to prevent conflicting tasks from entering the
>>> same TaskManager. Some of those tasks would have waited for the task
>>> being retried, if it had been in the TaskManager at their time of
>>> arrival. Delayed retry and dependence on sequence number seem
>>> incompatible. Notifier.NotifyTask and JoinManager.ProxyRegTask both
>>> extend RetryTask and have dependencies. JoinManager.ProxyRegTask uses
>>> a sequence number, but probably does not need to, and should not. The
>>> intent seems to be to run tasks for a given service one-at-a-time
>>> until its ServiceId has been set.
>>>
>>>
>>>    ServiceDiscoveryManager.CacheTask
>>>
>>> Most subclasses inherit a "return false;" runAfter. The exceptions are
>>> ServiceIdTask, its subclasses, and LookupTask. Both have sequence
>>> number dependencies. It is not yet clear whether
>>> ServiceDiscoveryManager is ensuring that tasks enter the TaskManager
>>> in sequence number order. If it does, the code is correct, but wastes
>>> time with a trivially true check. If not, the code is incorrect
>>> relative to the comments, which seem to expect order.
>>>
>>>
>>>
>>>
>>>
>>>
>>
>