Posted to dev@qpid.apache.org by "Jason Dillaman (JIRA)" <ji...@apache.org> on 2012/08/27 17:00:07 UTC

[jira] [Created] (QPID-4256) HA failover caused by unresponsive broker during queue cleaner invocation

Jason Dillaman created QPID-4256:
------------------------------------

             Summary: HA failover caused by unresponsive broker during queue cleaner invocation
                 Key: QPID-4256
                 URL: https://issues.apache.org/jira/browse/QPID-4256
             Project: Qpid
          Issue Type: Bug
          Components: C++ Broker
    Affects Versions: 0.18
            Reporter: Jason Dillaman


With the ring queue policy and tens of thousands of messages in a queue, the HA primary broker can become unresponsive for long enough to cause a failover.  Because the queue cleaner holds the lock on the queue for the entire cleaning pass, every other worker thread that tries to consume from or deliver to that queue blocks until the pass finishes, which can effectively stall the broker.

Queue cleaner thread:

#0  0x00000039472135ff in std::deque<qpid::broker::QueuedMessage, std::allocator<qpid::broker::QueuedMessage> >::erase(std::_Deque_iterator<qpid::broker::QueuedMessage, qpid::broker::QueuedMessage&, qpid::broker::QueuedMessage*>) () from /usr/lib64/libqpidbroker.so.6
#1  0x000000394720ecdd in qpid::broker::RingQueuePolicy::find(qpid::broker::QueuedMessage const&, std::deque<qpid::broker::QueuedMessage, std::allocator<qpid::broker::QueuedMessage> >&, bool) () from /usr/lib64/libqpidbroker.so.6
#2  0x0000003947210b57 in qpid::broker::RingQueuePolicy::dequeued(qpid::broker::QueuedMessage const&) () from /usr/lib64/libqpidbroker.so.6
#3  0x00000039471f0325 in qpid::broker::Queue::dequeue(qpid::broker::TransactionContext*, qpid::broker::QueuedMessage const&) () from /usr/lib64/libqpidbroker.so.6
#4  0x00000039471f2095 in qpid::broker::Queue::dequeueIf(boost::function1<bool, qpid::broker::QueuedMessage&>, std::deque<qpid::broker::QueuedMessage, std::allocator<qpid::broker::QueuedMessage> >&) () from /usr/lib64/libqpidbroker.so.6
#5  0x00000039471f2a26 in qpid::broker::Queue::purgeExpired(qpid::sys::Duration) () from /usr/lib64/libqpidbroker.so.6
#6  0x0000003947202d58 in qpid::broker::QueueCleaner::fired() () from /usr/lib64/libqpidbroker.so.6
#7  0x0000003948226326 in qpid::sys::Timer::fire(boost::intrusive_ptr<qpid::sys::TimerTask>) () from /usr/lib64/libqpidcommon.so.6
#8  0x00000039482276a9 in qpid::sys::Timer::run() () from /usr/lib64/libqpidcommon.so.6

Other worker threads:

#0  0x00000036f420dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000036f4209343 in _L_lock_892 () from /lib64/libpthread.so.0
#2  0x00000036f4209227 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x000000394716097a in qpid::sys::Mutex::lock() () from /usr/lib64/libqpidbroker.so.6
#4  0x00000039471f59c6 in qpid::broker::Queue::consumeNextMessage(qpid::broker::QueuedMessage&, boost::shared_ptr<qpid::broker::Consumer>&) () from /usr/lib64/libqpidbroker.so.6
#5  0x00000039471f67ec in qpid::broker::Queue::getNextMessage(qpid::broker::QueuedMessage&, boost::shared_ptr<qpid::broker::Consumer>&) () from /usr/lib64/libqpidbroker.so.6
#6  0x00000039471f687e in qpid::broker::Queue::dispatch(boost::shared_ptr<qpid::broker::Consumer>) () from /usr/lib64/libqpidbroker.so.6
#7  0x000000394722c507 in qpid::broker::SemanticState::ConsumerImpl::doDispatch() () from /usr/lib64/libqpidbroker.so.6
#8  0x0000003947228941 in qpid::broker::SemanticState::ConsumerImpl::doOutput() () from /usr/lib64/libqpidbroker.so.6
#9  0x00000039482195f2 in qpid::sys::AggregateOutput::doOutput() () from /usr/lib64/libqpidcommon.so.6
#10 0x000000394718db49 in qpid::broker::Connection::doOutput() () from /usr/lib64/libqpidbroker.so.6
#11 0x000000394715ce39 in qpid::amqp_0_10::Connection::encode(char const*, unsigned long) () from /usr/lib64/libqpidbroker.so.6
#12 0x000000394821c377 in qpid::sys::AsynchIOHandler::idle(qpid::sys::AsynchIO&) () from /usr/lib64/libqpidcommon.so.6
#13 0x000000394813eba6 in qpid::sys::posix::AsynchIO::writeable(qpid::sys::DispatchHandle&) () from /usr/lib64/libqpidcommon.so.6
#14 0x00000039482225f3 in boost::function1<void, qpid::sys::DispatchHandle&>::operator()(qpid::sys::DispatchHandle&) const () from /usr/lib64/libqpidcommon.so.6
#15 0x000000394821f4be in qpid::sys::DispatchHandle::processEvent(qpid::sys::Poller::EventType) () from /usr/lib64/libqpidcommon.so.6
#16 0x000000394814b08d in qpid::sys::Poller::run() () from /usr/lib64/libqpidcommon.so.6
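
To illustrate why the cleaner can hold the lock for so long, here is a minimal C++ sketch of the pattern the traces suggest. It is not the Qpid source: Msg, SketchQueue, fill, purgeHoldingLock and purgeSinglePass are hypothetical names, and the sketch assumes that the policy lookup is a linear scan of its own deque. Under that assumption each expired message costs O(n) inside the lock, so a full purge is roughly O(n^2). The second function shows one possible way to shorten the hold time with a single pass over each container.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for a queued message; not the Qpid type.
struct Msg {
    std::size_t position;  // sequence position in the queue
    bool expired;          // true once the message TTL has elapsed
};

// Hypothetical stand-in for a queue plus the ring policy's own bookkeeping.
struct SketchQueue {
    std::mutex lock;
    std::deque<Msg> messages;     // messages held by the queue
    std::deque<Msg> policyOrder;  // the policy's record of the same messages
};

// Fill both containers with n messages; every even position is expired.
void fill(SketchQueue& q, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        Msg m{i, i % 2 == 0};
        q.messages.push_back(m);
        q.policyOrder.push_back(m);
    }
}

// Pattern suggested by the cleaner trace: one lock for the whole purge, and
// a linear search plus a deque erase in the policy for every expired message.
std::size_t purgeHoldingLock(SketchQueue& q) {
    std::lock_guard<std::mutex> guard(q.lock);
    assert(q.messages.size() == q.policyOrder.size());
    std::size_t removed = 0;
    for (auto it = q.messages.begin(); it != q.messages.end();) {
        if (!it->expired) { ++it; continue; }
        const std::size_t pos = it->position;
        auto p = std::find_if(q.policyOrder.begin(), q.policyOrder.end(),
                              [pos](const Msg& m) { return m.position == pos; });
        if (p != q.policyOrder.end()) q.policyOrder.erase(p);  // O(n) per message
        it = q.messages.erase(it);                             // O(n) per message
        ++removed;
    }
    return removed;
}

// Possible alternative: still one lock, but each container is compacted in a
// single pass, so the time spent holding the lock grows linearly with n.
std::size_t purgeSinglePass(SketchQueue& q) {
    std::lock_guard<std::mutex> guard(q.lock);
    assert(q.messages.size() == q.policyOrder.size());
    const std::size_t before = q.messages.size();
    auto isExpired = [](const Msg& m) { return m.expired; };
    q.messages.erase(
        std::remove_if(q.messages.begin(), q.messages.end(), isExpired),
        q.messages.end());
    q.policyOrder.erase(
        std::remove_if(q.policyOrder.begin(), q.policyOrder.end(), isExpired),
        q.policyOrder.end());
    return before - q.messages.size();
}
```

Both functions leave the containers in the same state; they differ only in how long the lock is held. Releasing the lock between bounded batches would be another option, at the cost of revalidating the scan position after each release.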


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org


[jira] [Assigned] (QPID-4256) HA failover caused by unresponsive broker during queue cleaner invocation

Posted by "Ken Giusti (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/QPID-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ken Giusti reassigned QPID-4256:
--------------------------------

    Assignee: Ken Giusti
    
