Posted to dev@qpid.apache.org by "Alan Conway (JIRA)" <ji...@apache.org> on 2014/12/19 04:21:13 UTC
[jira] [Resolved] (QPID-6278) HA broker abort in TXN soak test
[ https://issues.apache.org/jira/browse/QPID-6278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Conway resolved QPID-6278.
-------------------------------
Resolution: Fixed
Fix Version/s: 0.31
> HA broker abort in TXN soak test
> ---------------------------------
>
> Key: QPID-6278
> URL: https://issues.apache.org/jira/browse/QPID-6278
> Project: Qpid
> Issue Type: Bug
> Components: C++ Clustering
> Affects Versions: 0.30
> Reporter: Alan Conway
> Assignee: Alan Conway
> Fix For: 0.31
>
> Attachments: ha-tx-race.diff
>
>
> See also https://bugzilla.redhat.com/show_bug.cgi?id=1145386
> I have a repeatable crash in the primary HA broker, triggered by a soak test on TXNs.
> This is with trunk code that was current as of an hour ago:
>
> URL: https://svn.apache.org/repos/asf/qpid/trunk/qpid/cpp
> Repository Root: https://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 1626916
> Node Kind: directory
> Schedule: normal
> Last Changed Author: aconway
> Last Changed Rev: 1626887
> I did a standard build, first of proton and then of qpidd, except that I had them install into /usr instead of /usr/local.
> Here are the scripts I use.
> Script 1: start the HA cluster
> {
> #! /bin/bash
> export PYTHONPATH=/home/mick/trunk/qpid/python
> QPIDD=/usr/sbin/qpidd
> QPID_HA=/home/mick/trunk/qpid/tools/src/py/qpid-ha
> # This is where I put the log files.
> rm -rf /tmp/mick
> mkdir /tmp/mick
> for N in 1 2 3
> do
>     $QPIDD \
>         --auth=no \
>         --no-module-dir \
>         --load-module /usr/lib64/qpid/daemon/ha.so \
>         --log-enable debug+:ha:: \
>         --ha-cluster=yes \
>         --ha-replicate=all \
>         --ha-brokers-url=localhost:5801,localhost:5802,localhost:5803 \
>         --ha-public-url=localhost:5801,localhost:5802,localhost:5803 \
>         -p 580$N \
>         --data-dir /tmp/mick/data_$N \
>         --log-to-file /tmp/mick/qpidd_$N.log \
>         --mgmt-enable=yes \
>         -d
>     echo "============================================"
>     echo "started broker $N from $QPIDD"
>     echo "============================================"
>     sleep 1
> done
> # Now promote one broker to primary.
> echo "Promoting broker 5801..."
> ${QPID_HA} promote --cluster-manager -b localhost:5801
> echo "done."
> }
> Script 2: create the tx queues, and load the first one with 1000 messages
> {
> #! /bin/bash
> TXTEST2=/usr/libexec/qpid/tests/qpid-txtest2
> echo "Loading data to queues..."
> ${TXTEST2} --init=yes --transfer=no --check=no \
> --port 5801 \
> --total-messages 1000 --connection-options '{reconnect:true}' \
> --messages-per-tx 10 --tx-count 100 \
> --queue-base-name=tx --fetch-timeout=1
> }
> Script 3: now beat the heck out of the TXN code
> {
> #! /bin/bash
> TXTEST2=/usr/libexec/qpid/tests/qpid-txtest2
> echo "starting transfers..."
> ${TXTEST2} --init=no --transfer=yes --check=no \
> --port 5801 \
> --total-messages 5000000 --connection-options '{reconnect:true}' \
> --messages-per-tx 10 --tx-count 500000 \
> --queue-base-name=tx --fetch-timeout=1
> }
> I do *not* trigger any failovers. I just let the TXN-exercising script run until the primary broker dies.
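When running a soak test like this, it can help to record when the broker process actually goes away. The following is only a minimal sketch: `is_alive` and `wait_for_exit` are hypothetical helper names, not part of qpidd or the scripts in this report, and the broker PID is assumed to be supplied by the caller.

```shell
#!/bin/bash
# Hypothetical helpers for watching a long-running process.
# Nothing here is specific to qpidd; the PID is passed in by the caller.

is_alive() {
    # kill -0 sends no signal; it only reports whether the PID exists
    # and whether we are permitted to signal it.
    kill -0 "$1" 2>/dev/null
}

wait_for_exit() {
    # Poll every $interval seconds (default 5) until the process is gone,
    # then print a timestamped line.
    local pid=$1 interval=${2:-5}
    while is_alive "$pid"; do
        sleep "$interval"
    done
    echo "process $pid exited at $(date)"
}

# Usage (BROKER_PID is an assumption; obtain it however suits your setup):
#   wait_for_exit "$BROKER_PID" 5
```

Polling with `kill -0` keeps the watcher independent of the broker itself, so it keeps working even when the broker aborts rather than shutting down cleanly.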
> It took quite a while. In my most recent test, I got through something like 300,000 transactions (10 messages each) before the broker became brokest.
> I then ran the same test on a standalone broker and it ran to completion.
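To put the figures above in perspective, here is a small sketch of the arithmetic. The 300,000 figure is the reporter's rough estimate, and `progress` is a hypothetical helper name used only for illustration.

```shell
#!/bin/bash
# Figures from the report: about 300000 transactions completed (an estimate),
# 10 messages per transaction, 500000 transactions requested in script 3.
tx_completed=300000
msgs_per_tx=10
tx_requested=500000

progress() {
    # Prints "<messages transferred> <percent of requested transactions>"
    # using integer arithmetic.
    local completed=$1 per_tx=$2 requested=$3
    echo "$(( completed * per_tx )) $(( completed * 100 / requested ))"
}

progress "$tx_completed" "$msgs_per_tx" "$tx_requested"
# prints: 3000000 60
```

So the crash appeared after roughly 3 million transactional message transfers, about 60% of the way through the requested run.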
> Here is the traceback:
> #0 0x0000003186a328a5 in raise () from /lib64/libc.so.6
> #1 0x0000003186a34085 in abort () from /lib64/libc.so.6
> #2 0x0000003186a2ba1e in __assert_fail_base () from /lib64/libc.so.6
> #3 0x0000003186a2bae0 in __assert_fail () from /lib64/libc.so.6
> #4 0x00007f6bb72b4f16 in operator-> (this=0x7f6b48378060, sync=<value optimized out>)
> at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:166
> #5 qpid::broker::SessionState::IncompleteIngressMsgXfer::completed (this=0x7f6b48378060,
> sync=<value optimized out>) at /home/mick/trunk/qpid/cpp/src/qpid/broker/SessionState.cpp:409
> #6 0x00007f6bb726d670 in invokeCallback (this=<value optimized out>, msg=<value optimized out>)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/AsyncCompletion.h:117
> #7 finishCompleter (this=<value optimized out>, msg=<value optimized out>)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/AsyncCompletion.h:158
> #8 enqueueComplete (this=<value optimized out>, msg=<value optimized out>)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/PersistableMessage.h:76
> #9 qpid::broker::NullMessageStore::enqueue (this=<value optimized out>, msg=<value optimized out>)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/NullMessageStore.cpp:97
> #10 0x00007f6bb71f4578 in qpid::broker::Queue::enqueue (this=0x7f6b4801ef90, ctxt=0x7f6b6821bdf0, msg=...)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/Queue.cpp:910
> #11 0x00007f6bb71f46db in qpid::broker::Queue::TxPublish::prepare (this=0x7f6b48435c70,
> ctxt=<value optimized out>) at /home/mick/trunk/qpid/cpp/src/qpid/broker/Queue.cpp:159
> #12 0x00007f6bb72c8b3d in qpid::broker::TxBuffer::prepare (this=0x7f6b68549120, ctxt=0x7f6b6821bdf0)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/TxBuffer.cpp:42
> #13 0x00007f6bb72c9dbe in qpid::broker::TxBuffer::startCommit (this=0x7f6b68549120,
> store=<value optimized out>) at /home/mick/trunk/qpid/cpp/src/qpid/broker/TxBuffer.cpp:73
> #14 0x00007f6bb7298c74 in qpid::broker::SemanticState::commit (this=0x7f6b6c567fb8, store=0x2460440)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/SemanticState.cpp:198
> #15 0x00007f6bb6c5886e in invoke<qpid::framing::AMQP_ServerOperations::TxHandler> (this=0x7f6b8bffd4a0,
> body=<value optimized out>) at /home/mick/trunk/qpid/cpp/build/src/qpid/framing/TxCommitBody.h:53
> #16 qpid::framing::AMQP_ServerOperations::TxHandler::Invoker::visit (this=0x7f6b8bffd4a0,
> body=<value optimized out>) at /home/mick/trunk/qpid/cpp/build/src/qpid/framing/ServerInvoker.cpp:582
> #17 0x00007f6bb6c5cc41 in qpid::framing::AMQP_ServerOperations::Invoker::visit (this=0x7f6b8bffd670, body=...)
> at /home/mick/trunk/qpid/cpp/build/src/qpid/framing/ServerInvoker.cpp:278
> #18 0x00007f6bb72b504c in invoke<qpid::broker::SessionAdapter> (this=<value optimized out>,
> method=0x7f6b68130790) at /home/mick/trunk/qpid/cpp/src/qpid/framing/Invoker.h:67
> #19 qpid::broker::SessionState::handleCommand (this=<value optimized out>, method=0x7f6b68130790)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/SessionState.cpp:198
> #20 0x00007f6bb72b6235 in qpid::broker::SessionState::handleIn (this=0x7f6b6c567df0, frame=...)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/SessionState.cpp:295
> #21 0x00007f6bb6cd5291 in qpid::amqp_0_10::SessionHandler::handleIn (this=0x7f6b6c4e2120, f=...)
> at /home/mick/trunk/qpid/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:93
> #22 0x00007f6bb722692b in operator() (this=0x7f6b500ab840, frame=...)
> at /home/mick/trunk/qpid/cpp/src/qpid/framing/Handler.h:39
> #23 qpid::broker::ConnectionHandler::handle (this=0x7f6b500ab840, frame=...)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/ConnectionHandler.cpp:94
> #24 0x00007f6bb7221ba8 in qpid::broker::amqp_0_10::Connection::received (this=0x7f6b500ab660, frame=...)
> at /home/mick/trunk/qpid/cpp/src/qpid/broker/amqp_0_10/Connection.cpp:198
> #25 0x00007f6bb71aea4d in qpid::amqp_0_10::Connection::decode (this=0x7f6b5005d770,
> buffer=<value optimized out>, size=<value optimized out>)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)