Posted to dev@qpid.apache.org by "Rob Springer (Commented) (JIRA)" <ji...@apache.org> on 2012/01/13 16:41:39 UTC

[jira] [Commented] (QPID-3757) Difficult recovery on broker death

    [ https://issues.apache.org/jira/browse/QPID-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185637#comment-13185637 ] 

Rob Springer commented on QPID-3757:
------------------------------------

FINALLY - Apologies, but I'm too unfamiliar with this area of the code to be able to suggest a possible fix.
                
> Difficult recovery on broker death
> ----------------------------------
>
>                 Key: QPID-3757
>                 URL: https://issues.apache.org/jira/browse/QPID-3757
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Client
>    Affects Versions: 0.12
>         Environment: RHEL 4.7 and RHEL 6.2
>            Reporter: Rob Springer
>            Priority: Minor
>         Attachments: backtrace, restart_example.cpp
>
>
> When using the old API (which might make this bug invalid, if the old API is completely deprecated), it is not possible to reuse Subscription and LocalQueue variables after the broker dies unless you follow a precise workaround procedure.
> The problem is:
>    If the broker dies and is then respawned, and one attempts to reconnect to the new broker while reusing the old Session instead of creating a new one, bad things happen (since Session doesn't yet support resume(), I assume that's expected behavior).
>    If, however, one tries to create new Session, new SubscriptionManager, and new Subscription objects, an assertion failure is generated (backtrace attached).
>    After reading the backtrace, I believe the following is happening:
> 1) In recovery, we attempt to assign a new Subscription to the previous Subscription variable (i.e., "sub = subMgr->subscribe()")
> 2) That causes the refcount for the old Subscription to fall to 0, causing it to be cleaned up.
> 3) As part of that cleanup, the associated SubscriptionImpl object goes to destroy its (std::auto_ptr<ScopedDivert>) demuxRule member.
> 4) That demuxRule member maintains a reference to a Demux object, demuxer, which exists inside the Session object. Since the Session object has been re-created, that old reference is invalid & results in the assertion.
> Thus, we have a fatal circle - we need to create a new Session object to be able to proceed, but when we do so, we render ourselves unable to re-use Subscription variables.
> Gordon proposed a workaround which does solve the problem for me in practice: assign "null" Subscription and LocalQueue objects to those variables before re-creating the Session object. Unfortunately, this won't be obvious to new users, so anyone still using the old API is likely to run into the problem.
> I'll attach an example showing the problem and the fix as well as snippets from my backtrace shortly.
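
The ordering problem described in steps 1-4, and the workaround, can be illustrated with a small self-contained model. This is only a sketch: every type below (Demux, Session, ScopedDivert, Subscription) is a hypothetical stand-in written for this illustration, not the qpid::client class of the same name, and the "stale teardown" counter merely takes the place of the assertion seen in the backtrace. It assumes the failure comes purely from the old subscription being torn down after its Session's demuxer is gone.

```cpp
#include <memory>
#include <set>

// Hypothetical stand-ins for illustration only; NOT the qpid::client classes.
namespace sketch {

// Ids of Demux objects that are currently alive.
std::set<int>& liveDemuxers() { static std::set<int> ids; return ids; }

// Counts teardowns that found their Demux already destroyed
// (this models the assertion failure from the backtrace).
int& staleTeardowns() { static int n = 0; return n; }

// Stand-in for the Demux object that lives inside a Session.
struct Demux {
    int id;
    Demux() : id(nextId()++) { liveDemuxers().insert(id); }
    ~Demux() { liveDemuxers().erase(id); }
    Demux(const Demux&) = delete;
    Demux& operator=(const Demux&) = delete;
    static int& nextId() { static int n = 0; return n; }
};

// Stand-in for a Session that owns its Demux.
struct Session {
    Demux demux;
};

// Stand-in for the demuxRule member: it refers back to the Session's
// Demux and needs that Demux to still exist when it is destroyed.
struct ScopedDivert {
    int demuxId;
    explicit ScopedDivert(int id) : demuxId(id) {}
    ~ScopedDivert() {
        if (liveDemuxers().count(demuxId) == 0) ++staleTeardowns();
    }
};

// Stand-in for the reference-counted Subscription handle;
// an empty shared_ptr plays the role of a "null" Subscription.
using Subscription = std::shared_ptr<ScopedDivert>;

Subscription subscribe(const Session& s) {
    return std::make_shared<ScopedDivert>(s.demux.id);
}

// Re-create the Session first, then overwrite the old handle.
int recoverNaively() {
    staleTeardowns() = 0;
    std::unique_ptr<Session> session(new Session);
    Subscription sub = subscribe(*session);
    session.reset(new Session);   // old Session (and its Demux) is destroyed
    sub = subscribe(*session);    // old handle released AFTER its Demux died
    return staleTeardowns();
}

// Workaround: null out the handle before re-creating the Session.
int recoverWithWorkaround() {
    staleTeardowns() = 0;
    std::unique_ptr<Session> session(new Session);
    Subscription sub = subscribe(*session);
    sub.reset();                  // teardown happens while the Demux is alive
    session.reset(new Session);
    sub = subscribe(*session);
    return staleTeardowns();
}

} // namespace sketch
```

In this model, sketch::recoverNaively() reports one stale teardown, while sketch::recoverWithWorkaround() reports none. The only difference between the two is whether the old handle is released before or after the Session that it refers to is replaced.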

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org