Posted to dev@activemq.apache.org by adam <ad...@gmail.com> on 2010/12/14 02:07:40 UTC

Proposal to parallelize subscription recovery and retroactive playback

There is a pretty severe concurrency issue right now with retroactive
subscriptions and durable subscription playback.  Both operations lock
the topic such that it cannot accept new messages while a playback 
is occurring.  They also prevent multiple playbacks from happening
simultaneously.

This is a big problem if playback takes any significant amount of time.
In my application, stopping other consumers from seeing messages for 
more than a few seconds is unacceptable, and I can easily queue up many
minutes' worth of replay.

Additionally, if the recovery action requires locks or complex activity
that conflict with other connection- or subscription-oriented locks,
then deadlocks become likely.  See AMQ-3070 for an example.

So, I want to fix this, and I am willing to do the work, but I need
some guidance from you guys on the right thing to do.

Basically, I want to:

1) Remove dispatchValve from broker/region/Topic and move it to
the Subscription object instead in the form of a recovery specific
lock.  Topic.dispatch would ignore subscriptions in a recovery state.

2) Create an independent thread for running
SubscriptionRecoveryPolicy.recover

3) Create an independent thread for running Store.recoverSubscription

4) Implement the correct synchronization in #2 and #3 to allow the topic
to continue dispatching messages and allow recovery to not drop those
messages while recovering and to stop recovering at the right time.
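
A minimal sketch of points 1 and 4, in plain Java rather than against the
real broker classes. SketchSubscription, recoveryLock, heldBack and the
method names are all hypothetical; this is not ActiveMQ's Subscription
interface. It assumes live messages that arrive during a playback can be
held back in memory and flushed once the backlog has been replayed. It does
not address duplicates between the backlog and live traffic, nor deciding
when recovery should stop.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a subscription; not ActiveMQ's Subscription. */
class SketchSubscription {
    // Assumed per-subscription "recovery specific lock" (replaces a topic-wide valve).
    private final Object recoveryLock = new Object();
    private boolean recovering;
    private final List<String> heldBack = new ArrayList<>();  // live messages seen mid-recovery
    private final List<String> delivered = new ArrayList<>(); // what the consumer would see

    void beginRecovery() {
        synchronized (recoveryLock) { recovering = true; }
    }

    /** Topic side: waits only for this short critical section, never for a whole playback. */
    void dispatch(String msg) {
        synchronized (recoveryLock) {
            (recovering ? heldBack : delivered).add(msg);
        }
    }

    /** Recovery side: one replayed message per call, so the lock is held briefly. */
    void replay(String msg) {
        synchronized (recoveryLock) { delivered.add(msg); }
    }

    /** Flush held-back live messages after the backlog, then resume normal dispatch. */
    void endRecovery() {
        synchronized (recoveryLock) {
            delivered.addAll(heldBack);
            heldBack.clear();
            recovering = false;
        }
    }

    /** Snapshot of the delivery order, for inspection. */
    List<String> delivered() {
        synchronized (recoveryLock) { return new ArrayList<>(delivered); }
    }
}
```

In this sketch the recovery thread would call beginRecovery(), then replay()
for each stored message, then endRecovery(), while the topic keeps calling
dispatch() for this and every other subscription throughout.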

#4 is of course the hard part.  The difficulties there explain the current
design.  I can see pretty well how to do this for retroactives, but I am
a little more fuzzy on durable recovery from the message store.

I would like to believe that I can simply synchronize Topic.dispatch
with a read lock check on the subscription's 'in recovery' lock
and call it done, but I am not sure about that.
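
To make that doubt concrete, here is a rough sketch of the read lock check
on its own, using java.util.concurrent.locks.ReentrantReadWriteLock.
GatedSubscription, NaiveGateDemo and their methods are hypothetical names,
and the sketch assumes the recovery thread holds the write lock for the
whole playback. A dispatch attempted from another thread during that time
is skipped rather than blocked, so the message still has to be put
somewhere or it is lost for this subscription.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical subscription guarded by an 'in recovery' read/write lock. */
class GatedSubscription {
    private final ReentrantReadWriteLock inRecovery = new ReentrantReadWriteLock();
    private final List<String> delivered = new ArrayList<>();

    /** Recovery thread: holds the write lock for the entire playback. */
    void recover(List<String> backlog, Runnable whilePlayingBack) {
        inRecovery.writeLock().lock();
        try {
            synchronized (delivered) { delivered.addAll(backlog); }
            whilePlayingBack.run(); // stands in for time spent replaying
        } finally {
            inRecovery.writeLock().unlock();
        }
    }

    /** Topic side: non-blocking check; returns false if a recovery is running. */
    boolean tryDispatch(String msg) {
        if (!inRecovery.readLock().tryLock()) {
            return false; // subscription ignored; caller must decide what happens to msg
        }
        try {
            synchronized (delivered) { delivered.add(msg); }
            return true;
        } finally {
            inRecovery.readLock().unlock();
        }
    }
}

class NaiveGateDemo {
    /** Dispatches one live message from a second thread while recovery is in progress. */
    static boolean liveMessageAccepted() {
        GatedSubscription sub = new GatedSubscription();
        boolean[] accepted = new boolean[1];
        sub.recover(List.of("old-1"), () -> {
            Thread topic = new Thread(() -> accepted[0] = sub.tryDispatch("live-1"));
            topic.start();
            try {
                topic.join(); // join makes the result visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        return accepted[0];
    }
}
```

Here liveMessageAccepted() reports that the live message was not accepted,
which is exactly the case where something beyond the lock check is needed.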

If there are any thoughts you guys have on doing this properly, please let
me know.  I want to get it right and give you guys a patch.

thanks,

-adam

-- 
View this message in context: http://activemq.2283324.n4.nabble.com/Proposal-to-parallelize-subscription-recovery-and-retroactive-playback-tp3086330p3086330.html
Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.

Re: Proposal to parallelize subscription recovery and retroactive playback

Posted by Gary Tully <ga...@gmail.com>.
I think you have identified the difficult bit in #4, ensuring that new
messages do not get dropped by the recovering subscription. For
durables I think this will be possible so long as the cursor cache is
disabled. For subs where the pending list is still in memory, using the
existing inline dispatch is probably preferred, but it may be more
difficult to support both mechanisms.

One bit of advice: do test-first development. Build the failing test
cases that you want to fix first, as this will solidify the requirement.
There are a bunch of existing unit tests for topic subs that should
ensure you don't break anything else, although they are a little
scattered through the test suite. In activemq-core, 'mvn clean test' is
your friend, even if it does take a long time.
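
As an illustration of what such a failing-first test might check, here is a
rough sketch in plain Java. ToyTopic, TopicWideLockTopic and
PlaybackBlockingCheck are hypothetical names, not part of the ActiveMQ test
suite; a real test would drive an actual broker. The toy topic only mimics
the reported behaviour, where one lock covers both playback and dispatch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical minimal topic surface for the check; not an ActiveMQ interface. */
interface ToyTopic {
    void playback(CountDownLatch started, CountDownLatch release) throws InterruptedException;
    void send(String msg);
}

/** Mimics the reported behaviour: a single lock guards playback and dispatch. */
class TopicWideLockTopic implements ToyTopic {
    public synchronized void playback(CountDownLatch started, CountDownLatch release)
            throws InterruptedException {
        started.countDown();
        release.await(); // a long replay, held open until the check releases it
    }

    public synchronized void send(String msg) {
        // dispatch to consumers would happen here
    }
}

class PlaybackBlockingCheck {
    /** Returns true if send() completes while a playback is still in progress. */
    static boolean sendCompletesDuringPlayback(ToyTopic topic) {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch sent = new CountDownLatch(1);
        Thread player = new Thread(() -> {
            try {
                topic.playback(started, release);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread producer = new Thread(() -> {
            topic.send("live");
            sent.countDown();
        });
        player.setDaemon(true);
        producer.setDaemon(true);
        try {
            player.start();
            started.await();  // playback is now running
            producer.start();
            return sent.await(200, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown(); // let the playback finish so both threads exit
        }
    }
}
```

The check fails against TopicWideLockTopic and should pass once playback no
longer holds a lock that send() needs.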

-- 
http://blog.garytully.com
http://fusesource.com