Posted to dev@qpid.apache.org by "Alan Conway (JIRA)" <ji...@apache.org> on 2014/12/19 04:21:13 UTC

[jira] [Resolved] (QPID-6278) HA broker abort in TXN soak test

     [ https://issues.apache.org/jira/browse/QPID-6278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Conway resolved QPID-6278.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 0.31

>  HA broker abort in TXN soak test
> ---------------------------------
>
>                 Key: QPID-6278
>                 URL: https://issues.apache.org/jira/browse/QPID-6278
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Clustering
>    Affects Versions: 0.30
>            Reporter: Alan Conway
>            Assignee: Alan Conway
>             Fix For: 0.31
>
>         Attachments: ha-tx-race.diff
>
>
> See also https://bugzilla.redhat.com/show_bug.cgi?id=1145386
> I have a repeatable crash in the primary HA broker, triggered by running a soak test on TXNs.
> This is with trunk code, current as of an hour ago:
>   
> URL: https://svn.apache.org/repos/asf/qpid/trunk/qpid/cpp
> Repository Root: https://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 1626916
> Node Kind: directory
> Schedule: normal
> Last Changed Author: aconway
> Last Changed Rev: 1626887
> I did a standard build, first of proton and then of qpidd, except that I had them install themselves in /usr instead of /usr/local.
> Here are the scripts I use.
> Script 1: start the HA cluster
> {
> #! /bin/bash
> export PYTHONPATH=/home/mick/trunk/qpid/python
> QPIDD=/usr/sbin/qpidd
> QPID_HA=/home/mick/trunk/qpid/tools/src/py/qpid-ha
> # This is where I put the log files.
> rm -rf /tmp/mick
> mkdir /tmp/mick
> for N in 1 2 3
> do
>   $QPIDD                                                          \
>     --auth=no                                                     \
>     --no-module-dir                                               \
>     --load-module /usr/lib64/qpid/daemon/ha.so                    \
>     --log-enable debug+:ha::                                      \
>     --ha-cluster=yes                                              \
>     --ha-replicate=all                                            \
>     --ha-brokers-url=localhost:5801,localhost:5802,localhost:5803 \
>     --ha-public-url=localhost:5801,localhost:5802,localhost:5803  \
>     -p 580$N                                                      \
>     --data-dir /tmp/mick/data_$N                                  \
>     --log-to-file /tmp/mick/qpidd_$N.log                          \
>     --mgmt-enable=yes                                             \
>     -d
>   echo "============================================"
>   echo "started broker $N from $QPIDD"
>   echo "============================================"
>   sleep 1
> done
> # Now promote one broker to primary.
> echo "Promoting broker 5801..."
> ${QPID_HA} promote --cluster-manager -b localhost:5801
> echo "done."
> }
> Script 2: create the tx queues, and load the first one with 1000 messages
> {
> #! /bin/bash
> TXTEST2=/usr/libexec/qpid/tests/qpid-txtest2
> echo "Loading data to queues..."
> ${TXTEST2} --init=yes --transfer=no --check=no                           \
>            --port 5801                                                   \
>            --total-messages 1000 --connection-options '{reconnect:true}' \
>            --messages-per-tx 10 --tx-count 100                           \
>            --queue-base-name=tx --fetch-timeout=1
> }
> Script 3: now beat the heck out of the TXN code
> {
> #! /bin/bash
> TXTEST2=/usr/libexec/qpid/tests/qpid-txtest2
> echo "starting transfers..."
> ${TXTEST2} --init=no --transfer=yes --check=no                              \
>            --port 5801                                                      \
>            --total-messages 5000000 --connection-options '{reconnect:true}' \
>            --messages-per-tx 10 --tx-count 500000                           \
>            --queue-base-name=tx --fetch-timeout=1
> }
> I do *not* do any failovers.  I just let the TXN-exercising script run until the primary broker dies.
> It took quite a while.  In my most recent test, it got through roughly 300,000 transactions (10 messages each) before the broker became brokest.
> I then ran the same test on a standalone broker, and it ran all the way through without a crash.
> Here is the traceback:
> #0  0x0000003186a328a5 in raise () from /lib64/libc.so.6
> #1  0x0000003186a34085 in abort () from /lib64/libc.so.6
> #2  0x0000003186a2ba1e in __assert_fail_base () from /lib64/libc.so.6
> #3  0x0000003186a2bae0 in __assert_fail () from /lib64/libc.so.6
> #4  0x00007f6bb72b4f16 in operator-> (this=0x7f6b48378060, sync=<value optimized out>)
>     at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:166
> #5  qpid::broker::SessionState::IncompleteIngressMsgXfer::completed (this=0x7f6b48378060, 
>     sync=<value optimized out>) at /home/mick/trunk/qpid/cpp/src/qpid/broker/SessionState.cpp:409
> #6  0x00007f6bb726d670 in invokeCallback (this=<value optimized out>, msg=<value optimized out>)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/AsyncCompletion.h:117
> #7  finishCompleter (this=<value optimized out>, msg=<value optimized out>)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/AsyncCompletion.h:158
> #8  enqueueComplete (this=<value optimized out>, msg=<value optimized out>)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/PersistableMessage.h:76
> #9  qpid::broker::NullMessageStore::enqueue (this=<value optimized out>, msg=<value optimized out>)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/NullMessageStore.cpp:97
> #10 0x00007f6bb71f4578 in qpid::broker::Queue::enqueue (this=0x7f6b4801ef90, ctxt=0x7f6b6821bdf0, msg=...)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/Queue.cpp:910
> #11 0x00007f6bb71f46db in qpid::broker::Queue::TxPublish::prepare (this=0x7f6b48435c70, 
>     ctxt=<value optimized out>) at /home/mick/trunk/qpid/cpp/src/qpid/broker/Queue.cpp:159
> #12 0x00007f6bb72c8b3d in qpid::broker::TxBuffer::prepare (this=0x7f6b68549120, ctxt=0x7f6b6821bdf0)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/TxBuffer.cpp:42
> #13 0x00007f6bb72c9dbe in qpid::broker::TxBuffer::startCommit (this=0x7f6b68549120, 
>     store=<value optimized out>) at /home/mick/trunk/qpid/cpp/src/qpid/broker/TxBuffer.cpp:73
> #14 0x00007f6bb7298c74 in qpid::broker::SemanticState::commit (this=0x7f6b6c567fb8, store=0x2460440)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/SemanticState.cpp:198
> #15 0x00007f6bb6c5886e in invoke<qpid::framing::AMQP_ServerOperations::TxHandler> (this=0x7f6b8bffd4a0, 
>     body=<value optimized out>) at /home/mick/trunk/qpid/cpp/build/src/qpid/framing/TxCommitBody.h:53
> #16 qpid::framing::AMQP_ServerOperations::TxHandler::Invoker::visit (this=0x7f6b8bffd4a0, 
>     body=<value optimized out>) at /home/mick/trunk/qpid/cpp/build/src/qpid/framing/ServerInvoker.cpp:582
> #17 0x00007f6bb6c5cc41 in qpid::framing::AMQP_ServerOperations::Invoker::visit (this=0x7f6b8bffd670, body=...)
>     at /home/mick/trunk/qpid/cpp/build/src/qpid/framing/ServerInvoker.cpp:278
> #18 0x00007f6bb72b504c in invoke<qpid::broker::SessionAdapter> (this=<value optimized out>, 
>     method=0x7f6b68130790) at /home/mick/trunk/qpid/cpp/src/qpid/framing/Invoker.h:67
> #19 qpid::broker::SessionState::handleCommand (this=<value optimized out>, method=0x7f6b68130790)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/SessionState.cpp:198
> #20 0x00007f6bb72b6235 in qpid::broker::SessionState::handleIn (this=0x7f6b6c567df0, frame=...)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/SessionState.cpp:295
> #21 0x00007f6bb6cd5291 in qpid::amqp_0_10::SessionHandler::handleIn (this=0x7f6b6c4e2120, f=...)
>     at /home/mick/trunk/qpid/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:93
> #22 0x00007f6bb722692b in operator() (this=0x7f6b500ab840, frame=...)
>     at /home/mick/trunk/qpid/cpp/src/qpid/framing/Handler.h:39
> #23 qpid::broker::ConnectionHandler::handle (this=0x7f6b500ab840, frame=...)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/ConnectionHandler.cpp:94
> #24 0x00007f6bb7221ba8 in qpid::broker::amqp_0_10::Connection::received (this=0x7f6b500ab660, frame=...)
>     at /home/mick/trunk/qpid/cpp/src/qpid/broker/amqp_0_10/Connection.cpp:198
> #25 0x00007f6bb71aea4d in qpid::amqp_0_10::Connection::decode (this=0x7f6b5005d770, 
>     buffer=<value optimized out>, size=<value optimized out>)
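
Since the soak run in the report above takes a long time to hit the abort, it may help to record when the primary actually dies. The following is a minimal sketch and not part of the original scripts: wait_for_exit is a hypothetical helper name, and how the broker's PID is obtained is left as an assumption. It polls with "kill -0", which sends no signal and only tests whether the process still exists, so it should be run as the same user that started the broker.

```shell
#!/bin/bash
# Hypothetical helper (not from the original report): block until the
# process with the given PID has exited, then print how long we waited.
# Usage: wait_for_exit PID [POLL_INTERVAL_SECONDS]
wait_for_exit() {
  wfe_pid=$1
  wfe_interval=${2:-1}
  wfe_start=$(date +%s)
  # kill -0 delivers no signal; it fails once the PID no longer exists.
  while kill -0 "$wfe_pid" 2>/dev/null; do
    sleep "$wfe_interval"
  done
  echo "process $wfe_pid exited after $(( $(date +%s) - wfe_start ))s"
}
```

One possible use is to start script 3 in the background, call wait_for_exit with the PID of the broker listening on port 5801, and note the elapsed time next to the last lines of /tmp/mick/qpidd_1.log.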


