You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by "Alan Conway (JIRA)" <ji...@apache.org> on 2011/04/01 16:25:05 UTC

[jira] [Commented] (QPID-3129) cluster_tests.LongTests.test_failover hangs

    [ https://issues.apache.org/jira/browse/QPID-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014573#comment-13014573 ] 

Alan Conway commented on QPID-3129:
-----------------------------------

Revision 1087052 fixes a race condition in posix/Socket.cpp Socket::connect that was causing the hangs with the same local/remote port. However there is another issue causing a hang (less frequent). In this case either sender or receiver hangs with a stack trace that seems to indicate an exception thrown in the FailoverUpdates thread (below). The main thread is hung in join() waiting for the FailoverUpdates thread to exit. It appears that the thread is  hung or looping in gcc exception handling code. The traces are not always identical but they always have some exception-related functions being called in the FailoverUpdate thread.

Example:

#0  0x0000003507c0aa77 in ?? () from /lib64/libgcc_s.so.1
#1  0x0000003505108a07 in dl_iterate_phdr () from /lib64/libc.so.6
#2  0x0000003507c0a6b6 in _Unwind_Find_FDE () from /lib64/libgcc_s.so.1
#3  0x0000003507c07625 in ?? () from /lib64/libgcc_s.so.1
#4  0x0000003507c08f79 in _Unwind_RaiseException () from /lib64/libgcc_s.so.1
#5  0x00000035094bcf1d in __cxa_throw () from /usr/lib64/libstdc++.so.6
#6  0x00002ba6b5b306cd in qpid::sys::ExceptionHolder::Wrapper<qpid::TransportFailure>::raise (this=<value optimized out>) at ../../qpid/cpp/include/qpid/sys/ExceptionHolder.h:62
#7  0x00002ba6b5b53381 in raise (this=<value optimized out>) at ../../qpid/cpp/include/qpid/sys/ExceptionHolder.h:53
#8  qpid::client::SessionImpl::check (this=<value optimized out>) at ../../qpid/cpp/src/qpid/client/SessionImpl.cpp:775
#9  0x00002ba6b5b57ed7 in qpid::client::SessionImpl::checkOpen (this=0x257) at ../../qpid/cpp/src/qpid/client/SessionImpl.cpp:780
#10 0x00002ba6b5b5af18 in qpid::client::SessionImpl::sendCommand (this=0x19f91710, command=..., content=0x0) at ../../qpid/cpp/src/qpid/client/SessionImpl.cpp:383
#11 0x00002ba6b5b5b1ea in qpid::client::SessionImpl::send (this=0x274, command=...) at ../../qpid/cpp/src/qpid/client/SessionImpl.cpp:301
#12 0x00002ba6b5b0ae49 in qpid::client::no_keyword::AsyncSession_0_10::messageSetFlowMode (this=<value optimized out>, destination=<value optimized out>, flowMode=0 '\000',
    sync=false) at ./qpid/client/no_keyword/AsyncSession_0_10.cpp:111
#13 0x00002ba6b588bd6c in messageSetFlowMode_with_named_params<boost::parameter::aux::arg_list<boost::parameter::aux::tagged_argument<qpid::client::arg::keyword_tags::destination, std\
::basic_string<char, std::char_traits<char>, std::allocator<char> > const>, boost::parameter::aux::arg_list<boost::parameter::aux::tagged_argument<qpid::client::arg::keyword_tags::flo\
wMode, qpid::framing::message::FlowMode const>, boost::parameter::aux::empty_arg_list> > > (this=0x19ee4220, message=..., timeout=...)
    at ../include/qpid/client/AsyncSession_0_10.h:366
#14 messageSetFlowMode<std::string, qpid::framing::message::FlowMode> (this=0x19ee4220, message=..., timeout=...) at ../include/qpid/client/AsyncSession_0_10.h:362
#15 qpid::client::amqp0_10::ReceiverImpl::fetchImpl (this=0x19ee4220, message=..., timeout=...) at ../../qpid/cpp/src/qpid/client/amqp0_10/ReceiverImpl.cpp:168
#16 0x00002ba6b588df14 in operator() (this=0x19f2af00, f=...) at ../../qpid/cpp/src/qpid/client/amqp0_10/ReceiverImpl.h:118
#17 qpid::client::amqp0_10::SessionImpl::execute<qpid::client::amqp0_10::ReceiverImpl::Fetch> (this=0x19f2af00, f=...) at ../../qpid/cpp/src/qpid/client/amqp0_10/SessionImpl.h:99
#18 0x00002ba6b588c3e8 in qpid::client::amqp0_10::ReceiverImpl::fetch (this=0x19ee4220, message=<value optimized out>, timeout=...)
    at ../../qpid/cpp/src/qpid/client/amqp0_10/ReceiverImpl.cpp:71
#19 0x00002ba6b5866225 in qpid::messaging::Receiver::fetch (this=<value optimized out>, message=..., timeout=...) at ../../qpid/cpp/src/qpid/messaging/Receiver.cpp:38
#20 0x00002ba6b5867937 in qpid::messaging::FailoverUpdatesImpl::run (this=0x19f2a6d0) at ../../qpid/cpp/src/qpid/messaging/FailoverUpdates.cpp:63
#21 0x00002ba6b5ecac2a in qpid::sys::(anonymous namespace)::runRunnable (p=0x257) at ../../qpid/cpp/src/qpid/sys/posix/Thread.cpp:35
#22 0x0000003505c0673d in start_thread () from /lib64/libpthread.so.0
#23 0x00000035050d40cd in clone () from /lib64/libc.so.6


> cluster_tests.LongTests.test_failover hangs 
> --------------------------------------------
>
>                 Key: QPID-3129
>                 URL: https://issues.apache.org/jira/browse/QPID-3129
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Clustering
>    Affects Versions: 0.9
>            Reporter: Alan Conway
>            Assignee: Alan Conway
>            Priority: Minor
>             Fix For: 0.9
>
>
> The test cluster_tests.LongTests.test_failover hangs if run with duration of a minute or more.
> To reproduce:
>   make check TESTS=run_cluster_tests CLUSTER_TESTS='cluster_tests.LongTests.test_failover -DDURATION=1'
> Should pass after about 1 minute, but hangs indefinitely.
> This is probably a test bug.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org