Posted to dev@qpid.apache.org by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org> on 2011/06/22 20:39:49 UTC

[jira] [Commented] (QPID-3121) Cluster management inconsistency when using persistent store.

    [ https://issues.apache.org/jira/browse/QPID-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053387#comment-13053387 ] 

jiraposter@reviews.apache.org commented on QPID-3121:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/943/
-----------------------------------------------------------

Review request for Gordon Sim and Kenneth Giusti.


Summary
-------

QPID-3121: Bug 682815 - Cluster management inconsistency when using persistent store.

With the recent changes to asynchronous completion, a message can be
completed either in a journal thread or in the connection thread. If
it is completed in the connection thread then completeRcvMsg is called
immediately in the connection thread.  Otherwise completeRcvMsg is
called via requestIOProcessing as an IO callback.

This makes the ordering of management events generated during
completeRcvMsg unpredictable and causes an inconsistency error when
completeRcvMsg updates connection stats.

The fix is to mark completeRcvMsg as a cluster-unsafe scope so no management
messages will be generated regardless of how it is called.
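
As a rough illustration of the idea (not the actual patch, which is in the
diff linked below), a cluster-unsafe scope can be modelled as an RAII guard
over a per-thread flag. Everything in the sketch is a hypothetical stand-in:
the namespace, UnsafeScope, countReceived, ConnectionStats and the emitted
log are invented for illustration and are not the Qpid classes. It also
assumes that a management message is simply skipped while the flag is
cleared.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not the Qpid sources.
namespace sketch {

// Per-thread flag: true while it is acceptable to generate management messages.
thread_local bool clusterSafe = true;

// RAII guard: marks the enclosing scope cluster-unsafe and restores the
// previous state when the scope exits, even if an exception is thrown.
class UnsafeScope {
  public:
    UnsafeScope() : saved(clusterSafe) { clusterSafe = false; }
    ~UnsafeScope() { clusterSafe = saved; }
    UnsafeScope(const UnsafeScope&) = delete;
    UnsafeScope& operator=(const UnsafeScope&) = delete;
  private:
    bool saved;
};

// Stand-in for the management output that brokers compare in their logs.
std::vector<std::string> emitted;

struct ConnectionStats {
    unsigned long msgsReceived = 0;
};

// Update a statistic; announce the change only from a cluster-safe scope.
void countReceived(ConnectionStats& stats) {
    ++stats.msgsReceived;
    if (clusterSafe) emitted.push_back("Changed statistics");
}

// Stand-in for the completion handler. It may run directly on the
// connection thread or later as an IO callback; the guard makes both
// paths behave the same with respect to management output.
void completeRcvMsg(ConnectionStats& stats) {
    UnsafeScope guard;
    countReceived(stats);
}

} // namespace sketch
```

In this sketch the statistic is still updated inside completeRcvMsg, but
nothing is appended to the emitted log there, so the output no longer depends
on which thread completed the message.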


This addresses bug QPID-3121.
    https://issues.apache.org/jira/browse/QPID-3121


Diffs
-----

  /trunk/qpid/cpp/src/qpid/broker/SessionState.cpp 1138296 

Diff: https://reviews.apache.org/r/943/diff


Testing
-------


Thanks,

Alan



> Cluster management inconsistency when using persistent store.
> -------------------------------------------------------------
>
>                 Key: QPID-3121
>                 URL: https://issues.apache.org/jira/browse/QPID-3121
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Clustering
>    Affects Versions: 0.9
>            Reporter: Alan Conway
>            Assignee: Alan Conway
>             Fix For: 0.9
>
>         Attachments: durable-test-mgmt.patch
>
>
> If the test_management test in cluster_tests.py is modified to enable durable messages, it fails: the log comparison shows messages like these on one broker but not the others:
> trace Changed V1 statistics org.apache.qpid.broker:connection:127.0.0.1:52742-127.0.0.1:44104 len=NN
> trace Changed V2 statistics org.apache.qpid.broker:connection:127.0.0.1:52742-127.0.0.1:44104
> To date this has not been seen to cause a cluster crash, but in principle it could.
> To reproduce, build the message store at: http://anonsvn.jboss.org/repos/rhmessaging/store/
> In the tests/cluster directory, run this in a loop:
> make check TESTS=run_python_cluster_tests CLUSTER_TESTS='*.test_management* -DDURATION=2'
> It will fail, usually on the first iteration, showing the log files that don't match. Use diff or a similar tool to confirm that the mismatched lines are as above. The files may also contain other mismatches showing a different number of stats in a periodic update; those are a consequence of the above.


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org