Posted to dev@qpid.apache.org by rs...@mail.etinternational.com on 2009/06/30 22:23:31 UTC

C++ Broker: segfault in TopicExchange::isBound()

All,
We've been using Qpid M4 and 0.5 internally, both with C++, and have been
seeing occasional segfaults within the broker.  They all seem to occur
inside TopicExchange::isBound() (the one with 3 parameters, not 2).

Looking at the access patterns in this function, and judging by the
core dumps, it appears that multiple threads may access and modify
the class member "bindings" simultaneously, which seems like it
has the potential to cause a segfault.

Since this is a race condition (if we're right), it's hard to
say for sure that the problem has been resolved, but since doing
the below, we've not had a segfault (roughly 5 hours so far,
rather than every 2-3 hours or so).
 - We added locks to all accesses of bindings throughout the
   file.  I believe the new ones were limited to isBound.
 - This may not be necessary at all, but to be safe (while
   we were investigating the problem), we changed all lock
   types from RWlock::ScopedLocks to Mutex::ScopedLocks, to avoid
   being tripped up by pthread (read/write) mutex semantics.
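
To illustrate the first change in the list above (this is a sketch, not the actual patch): the idea is that every access to the shared map, including the read-only loop in isBound(), takes the same lock. The class GuardedBindings, the helper hammer(), and the use of std::mutex in place of Qpid's own Mutex::ScopedLock are all assumptions made for the example.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the exchange's binding table; not Qpid's real class.
class GuardedBindings {
public:
    void bind(const std::string& pattern, const std::string& queue) {
        std::lock_guard<std::mutex> guard(lock_);
        bindings_[pattern].push_back(queue);
    }

    void unbind(const std::string& pattern) {
        std::lock_guard<std::mutex> guard(lock_);
        bindings_.erase(pattern);
    }

    // Iterates over the whole map, so the lock is held for the entire loop.
    // Without it, a concurrent erase() can invalidate the iterator mid-walk.
    bool isBound(const std::string& queue) const {
        std::lock_guard<std::mutex> guard(lock_);
        for (const auto& entry : bindings_) {
            for (const auto& q : entry.second) {
                if (q == queue) return true;
            }
        }
        return false;
    }

private:
    mutable std::mutex lock_;
    std::map<std::string, std::vector<std::string>> bindings_;
};

// Stress helper: a writer thread churns temporary bindings while the calling
// thread polls isBound(). Returns how many polls found the stable queue.
int hammer(int iterations) {
    GuardedBindings ex;
    ex.bind("stable.#", "q-stable");
    assert(!ex.isBound("q-missing"));

    std::thread writer([&] {
        for (int i = 0; i < iterations; ++i) {
            const std::string key = "tmp." + std::to_string(i % 8);
            ex.bind(key, "q-tmp");
            ex.unbind(key);
        }
    });

    int hits = 0;
    for (int i = 0; i < iterations; ++i) {
        if (ex.isBound("q-stable")) ++hits;
    }
    writer.join();
    return hits;
}
```

A plain mutex serializes readers as well as writers, which is simpler to reason about but may cost throughput if isBound() is called heavily.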

We wanted to get the opinion of the experienced Qpid developers
on this list before opening a bug, but again, so far it seems to
be much more stable (we're hammering on Session::exchangeBound()
in our testing, which may be why we're seeing it in the first place).

I'm not sure if the other exchange types have the same problem or not.

Thanks!
-Rob



---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org


Re: C++ Broker: segfault in TopicExchange::isBound()

Posted by Gordon Sim <gs...@redhat.com>.
rspringer@mail.etinternational.com wrote:
> All,
> We've been using Qpid M4 and 0.5 internally, both with C++, and have been
> seeing occasional segfaults within the broker.  They all seem to occur
> inside TopicExchange::isBound() (the one with 3 parameters, not 2).
> 
> Looking at the access patterns in this function, and judging by the
> core dumps, it appears that multiple threads may access and modify
> the class member "bindings" simultaneously, which seems like it
> has the potential to cause a segfault.

Thanks very much for the report! I think you are exactly right: the
topic exchange has no locking on the isBound() check and is therefore
unsafe.

I raised a JIRA: https://issues.apache.org/jira/browse/QPID-1963

> Since this is a race condition (if we're right), it's hard to
> say for sure that the problem has been resolved, but since doing
> the below, we've not had a segfault (roughly 5 hours so far,
> rather than every 2-3 hours or so).
>  - We added locks to all accesses of bindings throughout the
>    file.  I believe the new ones were limited to isBound.
>  - This may not be necessary at all, but to be safe (while
>    we were investigating the problem), we changed all lock
>    types from RWlock::ScopedLocks to Mutex::ScopedLocks, to avoid
>    being tripped up by pthread (read/write) mutex semantics.

My suggested fix would be to simply hold a read lock across the 
isBound() method (as per attached patch). I have a reproducer that 
causes a crash within a few minutes and will try this fix out with it 
(so far so good).
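
For readers without the attachment, a rough sketch of what "hold a read lock across isBound()" can look like follows. It is not the committed patch: TopicBindings and concurrentReaders() are hypothetical names, and C++17's std::shared_mutex stands in for the broker's own read/write lock class.

```cpp
#include <atomic>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the topic exchange's binding table.
class TopicBindings {
public:
    void bind(const std::string& pattern, const std::string& queue) {
        std::unique_lock<std::shared_mutex> writeLock(lock_);  // exclusive
        bindings_[pattern].push_back(queue);
    }

    void unbind(const std::string& pattern) {
        std::unique_lock<std::shared_mutex> writeLock(lock_);  // exclusive
        bindings_.erase(pattern);
    }

    // The read lock is taken once and held across the whole method, so the
    // map cannot be modified while the iteration is in progress. Several
    // readers may still run this method at the same time.
    bool isBound(const std::string& queue) const {
        std::shared_lock<std::shared_mutex> readLock(lock_);   // shared
        for (const auto& entry : bindings_) {
            for (const auto& q : entry.second) {
                if (q == queue) return true;
            }
        }
        return false;
    }

private:
    mutable std::shared_mutex lock_;
    std::map<std::string, std::vector<std::string>> bindings_;
};

// Runs several reader threads against one writer and returns the number of
// isBound() calls that found the permanently bound queue.
int concurrentReaders(int readers, int iterations) {
    TopicBindings ex;
    ex.bind("news.#", "q-news");

    std::atomic<int> found{0};
    std::vector<std::thread> pool;
    for (int r = 0; r < readers; ++r) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                if (ex.isBound("q-news")) ++found;
            }
        });
    }

    // Writer: repeatedly add and remove an unrelated binding.
    for (int i = 0; i < iterations; ++i) {
        ex.bind("tmp.key", "q-tmp");
        ex.unbind("tmp.key");
    }

    for (auto& t : pool) t.join();
    return found.load();
}
```

Compared with a plain mutex, the shared lock lets concurrent isBound() calls proceed in parallel and only blocks them while a bind or unbind is in flight.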

> We wanted to get the opinion of the experienced Qpid developers
> on this list before opening a bug, but again, so far it seems to
> be much more stable (we're hammering on Session::exchangeBound()
> in our testing, which may be why we're seeing it in the first place).
> 
> I'm not sure if the other exchange types have the same problem or not.

From a brief examination, the other types do appear to have locking for
that method. I'll run my test against them as well, however.

> Thanks!
> -Rob

Thanks again for raising the issue and for all the investigation you 
have done on this so far!

--Gordon.

Re: C++ Broker: segfault in TopicExchange::isBound()

Posted by Gordon Sim <gs...@redhat.com>.
rspringer wrote:
> Wow - that's awesome turnaround!  I'm testing now, but Gordon, we were
> wondering if there may also be need for a lock around what I believe is
> now line 228...another "for" loop over bindings (near a test on
> fedOpReorigin).  I've not known it to cause a problem, but I've got this
> hammer and now everything looks like a nail. :)

Yes, you are absolutely right again! That would be an issue where the
dynamic federation routing feature is used (I'm not sure exactly what
the reorigin op does).

I've added a separate JIRA for that one and we'll get a fix committed 
shortly: https://issues.apache.org/jira/browse/QPID-1971




Re: C++ Broker: segfault in TopicExchange::isBound()

Posted by rspringer <rs...@etinternational.com>.
Wow - that's awesome turnaround!  I'm testing now, but Gordon, we were
wondering if there may also be need for a lock around what I believe is
now line 228...another "for" loop over bindings (near a test on
fedOpReorigin).  I've not known it to cause a problem, but I've got this
hammer and now everything looks like a nail. :)

Again, thanks for everything - I'll test a fresh SVN checkout and I'll
only reply if I have a problem for some reason; otherwise, just assume
everything's working great.

Thanks again, guys,
-rob

On Mon, 2009-07-06 at 05:59 -0700, Carl Trieloff (via Nabble) wrote:
> 
> Gordon put a fix in for isBound() 
> 
> http://svn.apache.org/viewvc?rev=790164&view=rev
> 
> 
> I'll look at the core, but you can also test with a rev later than the
> above and see if that works for you
> 
> Carl. 




Re: C++ Broker: segfault in TopicExchange::isBound()

Posted by Carl Trieloff <cc...@redhat.com>.
Gordon put a fix in for isBound()

http://svn.apache.org/viewvc?rev=790164&view=rev


I'll look at the core, but you can also test with a rev later than the 
above and see if that works for you

Carl.



rspringer wrote:
> Carl - Sorry for the delay, I was out of the office for most of last week.  I
> have what I THINK is an accurate core (my binary has been re-created since I
> generated it, but looking over it, the relevant sections appear accurate)
> and the associated dumps.  If you'd like / need, let me know and I can
> revert my changes and re-generate.
>
> Below are the backtraces - thanks for checking this out (thread 1 is the
> interesting one)!
> -Rob
>
> #0  0x0000003ebb25c213 in std::_Rb_tree_increment () from
> /usr/lib64/libstdc++.so.6      
> (gdb) thread apply all bt
>
> Thread 8 (process 20504):
> #0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
> #1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
> timeout={nanosecs = 9223372036854775807})
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432                                    
> #2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
> #3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
> #4  0x0000002a9573cd05 in qpid::broker::Broker::run (this=0x534160) at
> ../../../src/qpid-0.5/cpp/src/qpid/broker/Broker.cpp:319     
> #5  0x00000000004102fb in QpiddBroker::execute (this=0x7fbffff3b7,
> options=0x52bfa0) at ../../../src/qpid-0.5/cpp/src/posix/QpiddBroker.cpp:165
> #6  0x000000000040d9af in main (argc=3, argv=0x7fbffff6f8) at
> ../../../src/qpid-0.5/cpp/src/qpidd.cpp:77                                       
>
> Thread 7 (process 20505):
> #0  0x0000002a9575cd76 in
> boost::intrusive_ptr<qpid::broker::Message>::operator-> (this=0x409efe80)
>     at /home/rspringer/opt/include/boost/smart_ptr/intrusive_ptr.hpp:149                           
> #1  0x0000002a95814ff1 in qpid::management::ManagementBroker::sendBuffer
> (this=0x2a96741010, buf=@0x409eff50, length=103, exchange=
>         {px = 0x538a20, pn = {pi_ = 0x538ce0}}, routingKey=                                                                        
>         {static npos = 18446744073709551615, _M_dataplus =
> {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data
> fields>}, <No data fields>}, _M_p = 0x2a976ada38
> "console.obj.1.0.org.apache.qpid.broker.binding"}})                                                               
>     at
> ../../../src/qpid-0.5/cpp/src/qpid/management/ManagementBroker.cpp:304                                                                      
> #2  0x0000002a95815a20 in
> qpid::management::ManagementBroker::periodicProcessing (this=0x2a96741010)                                               
>     at
> ../../../src/qpid-0.5/cpp/src/qpid/management/ManagementBroker.cpp:388                                                                      
> #3  0x0000002a95814b21 in Periodic (this=0x0, _broker=@0x2a96741148,
> _seconds=42)                                                                  
>     at
> ../../../src/qpid-0.5/cpp/src/qpid/management/ManagementBroker.cpp:245                                                                      
> #4  0x0000002a95804f74 in qpid::broker::Timer::run (this=0x2a96741140) at
> ../../../src/qpid-0.5/cpp/src/qpid/broker/Timer.cpp:67                   
> #5  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
> (p=0x2a96741140)                                                           
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35                                                                                  
> #6  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                                          
> #7  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                                       
> #8  0x0000000000000000 in ?? ()                                                                                                                    
>
> Thread 6 (process 20506):
> #0  0x0000003eb8c08d2f in pthread_cond_timedwait@@GLIBC_2.3.2 () from
> /lib64/tls/libpthread.so.0
> #1  0x0000002a958053d7 in qpid::sys::Condition::wait (this=0x534558,
> mutex=@0x534530, absoluteTime=@0x2a97ea5048)
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Condition.h:69                                               
> #2  0x0000002a9580538f in qpid::sys::Monitor::wait (this=0x534530,
> absoluteTime=@0x2a97ea5048)                   
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/Monitor.h:45                                                       
> #3  0x0000002a95804fb2 in qpid::broker::Timer::run (this=0x534528) at
> ../../../src/qpid-0.5/cpp/src/qpid/broker/Timer.cpp:69
> #4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
> (p=0x534528) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
> #5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                                         
> #6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                                      
> #7  0x0000000000000000 in ?? ()                                                                                                                   
>
> Thread 5 (process 20507):
> #0  0x0000003eb8c08d2f in pthread_cond_timedwait@@GLIBC_2.3.2 () from
> /lib64/tls/libpthread.so.0
> #1  0x0000002a958053d7 in qpid::sys::Condition::wait (this=0x534620,
> mutex=@0x5345f8, absoluteTime=@0x538d88)
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Condition.h:69                                           
> #2  0x0000002a9580538f in qpid::sys::Monitor::wait (this=0x5345f8,
> absoluteTime=@0x538d88) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/Monitor.h:45
> #3  0x0000002a95804fb2 in qpid::broker::Timer::run (this=0x5345f0) at
> ../../../src/qpid-0.5/cpp/src/qpid/broker/Timer.cpp:69                     
> #4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
> (p=0x5345f0) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
> #5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                                         
> #6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                                      
> #7  0x0000000000000000 in ?? ()                                                                                                                   
>
> Thread 4 (process 20508):
> #0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
> #1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
> timeout={nanosecs = 9223372036854775807})
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432                                    
> #2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
> #3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
> #4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
> (p=0x7fbffff0e0)                                            
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35                                                                   
> #5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                           
> #6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                        
> #7  0x0000000000000000 in ?? ()                                                                                                     
>
> Thread 3 (process 20509):
> #0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
> #1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
> timeout={nanosecs = 9223372036854775807})
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432                                    
> #2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
> #3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
> #4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
> (p=0x7fbffff0e0)                                            
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35                                                                   
> #5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                           
> #6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                        
> #7  0x0000000000000000 in ?? ()                                                                                                     
>
> Thread 2 (process 20511):
> #0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
> #1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
> timeout={nanosecs = 9223372036854775807})
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432                                    
> #2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
> #3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
> #4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
> (p=0x7fbffff0e0)                                            
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35                                                                   
> #5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                           
> #6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                        
> #7  0x0000000000000000 in ?? ()                                                                                                     
>
> Thread 1 (process 20510):
> #0  0x0000003ebb25c213 in std::_Rb_tree_increment () from
> /usr/lib64/libstdc++.so.6
> #1  0x0000002a9580a583 in
> std::_Rb_tree_iterator<std::pair<qpid::broker::TopicPattern const,
> qpid::broker::TopicExchange::BoundKey> >::operator++
>     (this=0x2a9580a583) at
> /usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../include/c++/3.4.6/bits/stl_tree.h:187                              
> #2  0x0000002a95808d58 in qpid::broker::TopicExchange::isBound
> (this=0x537ee0, queue={px = 0x2a979719f0, pn = {pi_ = 0x2a97971a18}},             
>     routingKey=0x2a9791eb90) at
> ../../../src/qpid-0.5/cpp/src/qpid/broker/TopicExchange.cpp:297                                                  
> #3  0x0000002a957eb4fd in
> qpid::broker::SessionAdapter::ExchangeHandlerImpl::bound (this=0xbe90f8,
> exchangeName=@0x2a9791eb80,                   
>     queueName=@0x2a9791eb88, key=@0x2a9791eb90, args=@0x2a9791eb98) at
> ../../../src/qpid-0.5/cpp/src/qpid/broker/SessionAdapter.cpp:256          
> #4  0x0000002a95c2f6ee in
> qpid::framing::ExchangeBoundBody::invoke<qpid::framing::AMQP_ServerOperations::ExchangeHandler>
> (this=0x2a9791eb70,    
>     invocable=@0xbe90f8) at gen/qpid/framing/ExchangeBoundBody.h:88                                                                              
> #5  0x0000002a95c2dce3 in
> qpid::framing::AMQP_ServerOperations::ExchangeHandler::Invoker::visit
> (this=0x43c03bc0, body=@0x2a9791eb70)            
>     at gen/qpid/framing/ServerInvoker.cpp:650                                                                                                    
> #6  0x0000002a95c0e278 in qpid::framing::ExchangeBoundBody::accept
> (this=0x2a9791eb70, v=@0x43c03bc0) at
> gen/qpid/framing/ExchangeBoundBody.h:92 
> #7  0x0000002a95c2c685 in
> qpid::framing::AMQP_ServerOperations::Invoker::visit (this=0x43c03c40,
> body=@0x2a9791eb70)                             
>     at gen/qpid/framing/ServerInvoker.cpp:363                                                                                                    
> #8  0x0000002a95c0e278 in qpid::framing::ExchangeBoundBody::accept
> (this=0x2a9791eb70, v=@0x43c03c40) at
> gen/qpid/framing/ExchangeBoundBody.h:92 
> #9  0x0000002a957fc3dc in
> qpid::framing::invoke<qpid::broker::SessionAdapter> (target=@0xbe90e0,
> body=@0x2a9791eb70)                             
>     at ../../../src/qpid-0.5/cpp/src/qpid/framing/Invoker.h:67                                                                                   
> #10 0x0000002a957f9a78 in qpid::broker::SessionState::handleCommand
> (this=0xbe8db0, method=0x2a9791eb70, id=@0x43c03ec0)                         
>     at ../../../src/qpid-0.5/cpp/src/qpid/broker/SessionState.cpp:189                                                                            
> #11 0x0000002a957faaf7 in qpid::broker::SessionState::handleIn
> (this=0xbe8db0, frame=@0x43c04420)                                                
>     at ../../../src/qpid-0.5/cpp/src/qpid/broker/SessionState.cpp:323                                                                            
> #12 0x0000002a957fda37 in
> qpid::framing::Handler<qpid::framing::AMQFrame&>::MemFunRef<qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface,
> &(qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface::handleIn(qpid::framing::AMQFrame&))>::handle
> (              
>     this=0xbe8f40, t=@0x43c04420) at
> ../../../src/qpid-0.5/cpp/src/qpid/framing/Handler.h:67                                                       
> #13 0x0000002a95c7ca57 in qpid::amqp_0_10::SessionHandler::handleIn
> (this=0xd55f20, f=@0x43c04420)                                                 
>     at ../../../src/qpid-0.5/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:86                                                                          
> #14 0x0000002a957fda37 in
> qpid::framing::Handler<qpid::framing::AMQFrame&>::MemFunRef<qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface,
> &(qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface::handleIn(qpid::framing::AMQFrame&))>::handle
> (              
>     this=0xd55f30, t=@0x43c04420) at
> ../../../src/qpid-0.5/cpp/src/qpid/framing/Handler.h:67                                                       
> #15 0x0000002a9577a10e in
> qpid::framing::Handler<qpid::framing::AMQFrame&>::operator() (this=0xd55f30,
> t=@0x43c04420)                              
>     at ../../../src/qpid-0.5/cpp/src/qpid/framing/Handler.h:42                                                                                     
> #16 0x0000002a95777854 in qpid::broker::Connection::received (this=0xca54d0,
> frame=@0x43c04420)                                                    
>     at ../../../src/qpid-0.5/cpp/src/qpid/broker/Connection.cpp:106                                                                                
> #17 0x0000002a95733199 in qpid::amqp_0_10::Connection::decode
> (this=0xb9a170, buffer=0xc61eb0 "\017\001", size=166)                                
>     at ../../../src/qpid-0.5/cpp/src/qpid/amqp_0_10/Connection.cpp:55                                                                              
> #18 0x0000002a957d9ecc in qpid::broker::SecureConnection::decode
> (this=0xd33100, buffer=0xc61eb0 "\017\001", size=166)                             
>     at ../../../src/qpid-0.5/cpp/src/qpid/broker/SecureConnection.cpp:42                                                                           
> #19 0x0000002a95cb0a7d in qpid::sys::AsynchIOHandler::readbuff
> (this=0xc9e520, buff=0xbee030)                                                      
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/AsynchIOHandler.cpp:104                                                                              
> #20 0x0000002a95825674 in
> boost::detail::function::functor_manager_common<boost::_bi::bind_t<bool,
> boost::_mfi::mf2<bool, qpid::sys::AsynchIOHandler, qpid::sys::AsynchIO&,
> qpid::sys::AsynchIOBufferBase*>,
> boost::_bi::list3<boost::_bi::value<qpid::sys::AsynchIOHandler*>,
> boost::arg<1> (*)(), boost::arg<2> (*)()> > >::manage_small
> (in_buffer=@0xc9e520, out_buffer=@0xbe5160,
> op=boost::detail::function::clone_functor_tag)
>     at /home/rspringer/opt/include/boost/function/function_base.hpp:307
> #21 0x0000002a95824dff in
> boost::detail::function::functor_manager<boost::_bi::bind_t<void,
> boost::_mfi::mf4<void, qpid::sys::AsynchIOProtocolFactory,
> boost::shared_ptr<qpid::sys::Poller>, qpid::sys::Socket const&,
> qpid::sys::ConnectionCodec::Factory*, bool>,
> boost::_bi::list5<boost::_bi::value<qpid::sys::AsynchIOProtocolFactory*>,
> boost::_bi::value<boost::shared_ptr<qpid::sys::Poller> >, boost::arg<1>
> (*)(), boost::_bi::value<qpid::sys::ConnectionCodec::Factory*>,
> boost::_bi::value<bool> > > >::manager (in_buffer=@0x3eb8c0d340,
> out_buffer=@0xbe5168,
>     op=boost::detail::function::clone_functor_tag) at
> /home/rspringer/opt/include/boost/function/function_base.hpp:395
> #22 0x0000002a95824995 in storage5 (this=0x43c04af8, a1={t_ = 0x3eb8c0d300},
> a2={t_ = {px = 0x2a96c00128, pn = {pi_ = 0x2a97c4df10}}},
>     a3=0x43c04af8, a4={t_ = 0xbe5160}, a5={t_ = false}) at
> /home/rspringer/opt/include/boost/bind/storage.hpp:227
> #23 0x0000002a9582450b in bind_t (this=0xbe5270, f={f_ = 0xbe5270, this
> adjustment 12509232}, l=@0x2a9582450b)
>     at /home/rspringer/opt/include/boost/bind/bind.hpp:859
> #24 0x0000002a95c4b756 in boost::function2<bool, qpid::sys::AsynchIO&,
> qpid::sys::AsynchIOBufferBase*>::operator() (this=0xbe5268, a0=@0xbe5160,
>     a1=0xbee030) at
> /home/rspringer/opt/include/boost/function/function_template.hpp:988
> #25 0x0000002a95c49548 in qpid::sys::posix::AsynchIO::readable
> (this=0xbe5160, h=@0xbe5168)
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/AsynchIO.cpp:446
> #26 0x0000002a95c4efd8 in boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO,
> qpid::sys::DispatchHandle&>::operator() (this=0xbe5180, p=0xbe5160,
>     a1=@0xbe5168) at
> /home/rspringer/opt/include/boost/bind/mem_fn_template.hpp:162
> #27 0x0000002a95c4e609 in
> boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>,
> boost::arg<1> (*)()>::operator()<boost::_mfi::mf1<void,
> qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>,
> boost::_bi::list1<qpid::sys::DispatchHandle&> > (this=0xbe5190, f=@0xbe5180,
>     a=@0x43c04e20) at /home/rspringer/opt/include/boost/bind/bind.hpp:306
> #28 0x0000002a95c4dcdd in boost::_bi::bind_t<void, boost::_mfi::mf1<void,
> qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>,
> boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>,
> boost::arg<1> (*)()> >::operator()<qpid::sys::DispatchHandle>
> (this=0xbe5180, a1=@0xbe5168)
>     at /home/rspringer/opt/include/boost/bind/bind_template.hpp:32
> #29 0x0000002a95c4d095 in
> boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void,
> boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO,
> qpid::sys::DispatchHandle&>,
> boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>,
> boost::arg<1> (*)()> >, void, qpid::sys::DispatchHandle&>::invoke
> (function_obj_ptr=@0xbe5180, a0=@0xbe5168) at
> /home/rspringer/opt/include/boost/function/function_template.hpp:152
> #30 0x0000002a95cb432e in boost::function1<void,
> qpid::sys::DispatchHandle&>::operator() (this=0xbe5178, a0=@0xbe5168)
>     at /home/rspringer/opt/include/boost/function/function_template.hpp:988
> #31 0x0000002a95cb3b96 in qpid::sys::DispatchHandle::processEvent
> (this=0xbe5168, type=qpid::sys::Poller::READABLE)
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/DispatchHandle.cpp:428
> #32 0x0000002a95c57d43 in qpid::sys::Poller::Event::process
> (this=0x43c05030) at ../../../src/qpid-0.5/cpp/src/qpid/sys/Poller.h:122
> #33 0x0000002a95c56a2b in qpid::sys::Poller::run (this=0x533250) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:402
> #34 0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
> ../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
> #35 0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
> (p=0x7fbffff0e0)
>     at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
> #36 0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0
> #37 0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6
> #38 0x0000000000000000 in ?? ()
>
>
>
>
> Carl Trieloff wrote:
>   
>> Rob,
>>
>> are you able to do a
>>
>> 'thread apply all bt'
>>
>> on the core. That will allow us to reason through your theory with you
>>
>> Carl.


Re: C++ Broker: segfault in TopicExchange::isBound()

Posted by rspringer <rs...@etinternational.com>.
Carl - Sorry for the delay, I was out of the office for most of last week.  I
have what I THINK is an accurate core (my binary has been re-created since I
generated it, but looking over it, the relevant sections appear accurate)
and the associated dumps.  If you'd like / need, let me know and I can
revert my changes and re-generate.

Below are the backtraces - thanks for checking this out (thread 1 is the
interesting one)!
-Rob

#0  0x0000003ebb25c213 in std::_Rb_tree_increment () from
/usr/lib64/libstdc++.so.6      
(gdb) thread apply all bt

Thread 8 (process 20504):
#0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
#1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
timeout={nanosecs = 9223372036854775807})
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432                                    
#2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
#3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#4  0x0000002a9573cd05 in qpid::broker::Broker::run (this=0x534160) at
../../../src/qpid-0.5/cpp/src/qpid/broker/Broker.cpp:319     
#5  0x00000000004102fb in QpiddBroker::execute (this=0x7fbffff3b7,
options=0x52bfa0) at ../../../src/qpid-0.5/cpp/src/posix/QpiddBroker.cpp:165
#6  0x000000000040d9af in main (argc=3, argv=0x7fbffff6f8) at
../../../src/qpid-0.5/cpp/src/qpidd.cpp:77                                       

Thread 7 (process 20505):
#0  0x0000002a9575cd76 in
boost::intrusive_ptr<qpid::broker::Message>::operator-> (this=0x409efe80)
    at /home/rspringer/opt/include/boost/smart_ptr/intrusive_ptr.hpp:149                           
#1  0x0000002a95814ff1 in qpid::management::ManagementBroker::sendBuffer
(this=0x2a96741010, buf=@0x409eff50, length=103, exchange=
        {px = 0x538a20, pn = {pi_ = 0x538ce0}}, routingKey=                                                                        
        {static npos = 18446744073709551615, _M_dataplus =
{<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data
fields>}, <No data fields>}, _M_p = 0x2a976ada38
"console.obj.1.0.org.apache.qpid.broker.binding"}})                                                               
    at
../../../src/qpid-0.5/cpp/src/qpid/management/ManagementBroker.cpp:304                                                                      
#2  0x0000002a95815a20 in
qpid::management::ManagementBroker::periodicProcessing (this=0x2a96741010)                                               
    at
../../../src/qpid-0.5/cpp/src/qpid/management/ManagementBroker.cpp:388                                                                      
#3  0x0000002a95814b21 in Periodic (this=0x0, _broker=@0x2a96741148,
_seconds=42)                                                                  
    at
../../../src/qpid-0.5/cpp/src/qpid/management/ManagementBroker.cpp:245                                                                      
#4  0x0000002a95804f74 in qpid::broker::Timer::run (this=0x2a96741140) at
../../../src/qpid-0.5/cpp/src/qpid/broker/Timer.cpp:67                   
#5  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x2a96741140)                                                           
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35                                                                                  
#6  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                                          
#7  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                                       
#8  0x0000000000000000 in ?? ()                                                                                                                    

Thread 6 (process 20506):
#0  0x0000003eb8c08d2f in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/tls/libpthread.so.0
#1  0x0000002a958053d7 in qpid::sys::Condition::wait (this=0x534558,
mutex=@0x534530, absoluteTime=@0x2a97ea5048)
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Condition.h:69                                               
#2  0x0000002a9580538f in qpid::sys::Monitor::wait (this=0x534530,
absoluteTime=@0x2a97ea5048)                   
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/Monitor.h:45                                                       
#3  0x0000002a95804fb2 in qpid::broker::Timer::run (this=0x534528) at
../../../src/qpid-0.5/cpp/src/qpid/broker/Timer.cpp:69
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x534528) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                                         
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                                      
#7  0x0000000000000000 in ?? ()                                                                                                                   

Thread 5 (process 20507):
#0  0x0000003eb8c08d2f in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/tls/libpthread.so.0
#1  0x0000002a958053d7 in qpid::sys::Condition::wait (this=0x534620,
mutex=@0x5345f8, absoluteTime=@0x538d88)
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Condition.h:69                                           
#2  0x0000002a9580538f in qpid::sys::Monitor::wait (this=0x5345f8,
absoluteTime=@0x538d88) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Monitor.h:45
#3  0x0000002a95804fb2 in qpid::broker::Timer::run (this=0x5345f0) at
../../../src/qpid-0.5/cpp/src/qpid/broker/Timer.cpp:69                     
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x5345f0) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                                         
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                                      
#7  0x0000000000000000 in ?? ()                                                                                                                   

Thread 4 (process 20508):
#0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
#1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
timeout={nanosecs = 9223372036854775807})
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432                                    
#2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
#3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x7fbffff0e0)                                            
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35                                                                   
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                           
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                        
#7  0x0000000000000000 in ?? ()                                                                                                     

Thread 3 (process 20509):
#0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
#1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
timeout={nanosecs = 9223372036854775807})
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432                                    
#2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
#3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x7fbffff0e0)                                            
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35                                                                   
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                           
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                        
#7  0x0000000000000000 in ?? ()                                                                                                     

Thread 2 (process 20511):
#0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
#1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
timeout={nanosecs = 9223372036854775807})
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432                                    
#2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
#3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x7fbffff0e0)                                            
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35                                                                   
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0                                                           
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6                                                                        
#7  0x0000000000000000 in ?? ()                                                                                                     

Thread 1 (process 20510):
#0  0x0000003ebb25c213 in std::_Rb_tree_increment () from
/usr/lib64/libstdc++.so.6
#1  0x0000002a9580a583 in
std::_Rb_tree_iterator<std::pair<qpid::broker::TopicPattern const,
qpid::broker::TopicExchange::BoundKey> >::operator++
    (this=0x2a9580a583) at
/usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../include/c++/3.4.6/bits/stl_tree.h:187                              
#2  0x0000002a95808d58 in qpid::broker::TopicExchange::isBound
(this=0x537ee0, queue={px = 0x2a979719f0, pn = {pi_ = 0x2a97971a18}},             
    routingKey=0x2a9791eb90) at
../../../src/qpid-0.5/cpp/src/qpid/broker/TopicExchange.cpp:297                                                  
#3  0x0000002a957eb4fd in
qpid::broker::SessionAdapter::ExchangeHandlerImpl::bound (this=0xbe90f8,
exchangeName=@0x2a9791eb80,                   
    queueName=@0x2a9791eb88, key=@0x2a9791eb90, args=@0x2a9791eb98) at
../../../src/qpid-0.5/cpp/src/qpid/broker/SessionAdapter.cpp:256          
#4  0x0000002a95c2f6ee in
qpid::framing::ExchangeBoundBody::invoke<qpid::framing::AMQP_ServerOperations::ExchangeHandler>
(this=0x2a9791eb70,    
    invocable=@0xbe90f8) at gen/qpid/framing/ExchangeBoundBody.h:88                                                                              
#5  0x0000002a95c2dce3 in
qpid::framing::AMQP_ServerOperations::ExchangeHandler::Invoker::visit
(this=0x43c03bc0, body=@0x2a9791eb70)            
    at gen/qpid/framing/ServerInvoker.cpp:650                                                                                                    
#6  0x0000002a95c0e278 in qpid::framing::ExchangeBoundBody::accept
(this=0x2a9791eb70, v=@0x43c03bc0) at
gen/qpid/framing/ExchangeBoundBody.h:92 
#7  0x0000002a95c2c685 in
qpid::framing::AMQP_ServerOperations::Invoker::visit (this=0x43c03c40,
body=@0x2a9791eb70)                             
    at gen/qpid/framing/ServerInvoker.cpp:363                                                                                                    
#8  0x0000002a95c0e278 in qpid::framing::ExchangeBoundBody::accept
(this=0x2a9791eb70, v=@0x43c03c40) at
gen/qpid/framing/ExchangeBoundBody.h:92 
#9  0x0000002a957fc3dc in
qpid::framing::invoke<qpid::broker::SessionAdapter> (target=@0xbe90e0,
body=@0x2a9791eb70)                             
    at ../../../src/qpid-0.5/cpp/src/qpid/framing/Invoker.h:67                                                                                   
#10 0x0000002a957f9a78 in qpid::broker::SessionState::handleCommand
(this=0xbe8db0, method=0x2a9791eb70, id=@0x43c03ec0)                         
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/SessionState.cpp:189                                                                            
#11 0x0000002a957faaf7 in qpid::broker::SessionState::handleIn
(this=0xbe8db0, frame=@0x43c04420)                                                
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/SessionState.cpp:323                                                                            
#12 0x0000002a957fda37 in
qpid::framing::Handler<qpid::framing::AMQFrame&>::MemFunRef<qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface,
&(qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface::handleIn(qpid::framing::AMQFrame&))>::handle
(              
    this=0xbe8f40, t=@0x43c04420) at
../../../src/qpid-0.5/cpp/src/qpid/framing/Handler.h:67                                                       
#13 0x0000002a95c7ca57 in qpid::amqp_0_10::SessionHandler::handleIn
(this=0xd55f20, f=@0x43c04420)                                                 
    at ../../../src/qpid-0.5/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:86                                                                          
#14 0x0000002a957fda37 in
qpid::framing::Handler<qpid::framing::AMQFrame&>::MemFunRef<qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface,
&(qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface::handleIn(qpid::framing::AMQFrame&))>::handle
(              
    this=0xd55f30, t=@0x43c04420) at
../../../src/qpid-0.5/cpp/src/qpid/framing/Handler.h:67                                                       
#15 0x0000002a9577a10e in
qpid::framing::Handler<qpid::framing::AMQFrame&>::operator() (this=0xd55f30,
t=@0x43c04420)                              
    at ../../../src/qpid-0.5/cpp/src/qpid/framing/Handler.h:42                                                                                     
#16 0x0000002a95777854 in qpid::broker::Connection::received (this=0xca54d0,
frame=@0x43c04420)                                                    
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/Connection.cpp:106                                                                                
#17 0x0000002a95733199 in qpid::amqp_0_10::Connection::decode
(this=0xb9a170, buffer=0xc61eb0 "\017\001", size=166)                                
    at ../../../src/qpid-0.5/cpp/src/qpid/amqp_0_10/Connection.cpp:55                                                                              
#18 0x0000002a957d9ecc in qpid::broker::SecureConnection::decode
(this=0xd33100, buffer=0xc61eb0 "\017\001", size=166)                             
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/SecureConnection.cpp:42                                                                           
#19 0x0000002a95cb0a7d in qpid::sys::AsynchIOHandler::readbuff
(this=0xc9e520, buff=0xbee030)                                                      
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/AsynchIOHandler.cpp:104                                                                              
#20 0x0000002a95825674 in
boost::detail::function::functor_manager_common<boost::_bi::bind_t<bool,
boost::_mfi::mf2<bool, qpid::sys::AsynchIOHandler, qpid::sys::AsynchIO&,
qpid::sys::AsynchIOBufferBase*>,
boost::_bi::list3<boost::_bi::value<qpid::sys::AsynchIOHandler*>,
boost::arg<1> (*)(), boost::arg<2> (*)()> > >::manage_small
(in_buffer=@0xc9e520, out_buffer=@0xbe5160,
op=boost::detail::function::clone_functor_tag)
    at /home/rspringer/opt/include/boost/function/function_base.hpp:307
#21 0x0000002a95824dff in
boost::detail::function::functor_manager<boost::_bi::bind_t<void,
boost::_mfi::mf4<void, qpid::sys::AsynchIOProtocolFactory,
boost::shared_ptr<qpid::sys::Poller>, qpid::sys::Socket const&,
qpid::sys::ConnectionCodec::Factory*, bool>,
boost::_bi::list5<boost::_bi::value<qpid::sys::AsynchIOProtocolFactory*>,
boost::_bi::value<boost::shared_ptr<qpid::sys::Poller> >, boost::arg<1>
(*)(), boost::_bi::value<qpid::sys::ConnectionCodec::Factory*>,
boost::_bi::value<bool> > > >::manager (in_buffer=@0x3eb8c0d340,
out_buffer=@0xbe5168,
    op=boost::detail::function::clone_functor_tag) at
/home/rspringer/opt/include/boost/function/function_base.hpp:395
#22 0x0000002a95824995 in storage5 (this=0x43c04af8, a1={t_ = 0x3eb8c0d300},
a2={t_ = {px = 0x2a96c00128, pn = {pi_ = 0x2a97c4df10}}},
    a3=0x43c04af8, a4={t_ = 0xbe5160}, a5={t_ = false}) at
/home/rspringer/opt/include/boost/bind/storage.hpp:227
#23 0x0000002a9582450b in bind_t (this=0xbe5270, f={f_ = 0xbe5270, this
adjustment 12509232}, l=@0x2a9582450b)
    at /home/rspringer/opt/include/boost/bind/bind.hpp:859
#24 0x0000002a95c4b756 in boost::function2<bool, qpid::sys::AsynchIO&,
qpid::sys::AsynchIOBufferBase*>::operator() (this=0xbe5268, a0=@0xbe5160,
    a1=0xbee030) at
/home/rspringer/opt/include/boost/function/function_template.hpp:988
#25 0x0000002a95c49548 in qpid::sys::posix::AsynchIO::readable
(this=0xbe5160, h=@0xbe5168)
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/AsynchIO.cpp:446
#26 0x0000002a95c4efd8 in boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO,
qpid::sys::DispatchHandle&>::operator() (this=0xbe5180, p=0xbe5160,
    a1=@0xbe5168) at
/home/rspringer/opt/include/boost/bind/mem_fn_template.hpp:162
#27 0x0000002a95c4e609 in
boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>,
boost::arg<1> (*)()>::operator()<boost::_mfi::mf1<void,
qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>,
boost::_bi::list1<qpid::sys::DispatchHandle&> > (this=0xbe5190, f=@0xbe5180,
    a=@0x43c04e20) at /home/rspringer/opt/include/boost/bind/bind.hpp:306
#28 0x0000002a95c4dcdd in boost::_bi::bind_t<void, boost::_mfi::mf1<void,
qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>,
boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>,
boost::arg<1> (*)()> >::operator()<qpid::sys::DispatchHandle>
(this=0xbe5180, a1=@0xbe5168)
    at /home/rspringer/opt/include/boost/bind/bind_template.hpp:32
#29 0x0000002a95c4d095 in
boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void,
boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO,
qpid::sys::DispatchHandle&>,
boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>,
boost::arg<1> (*)()> >, void, qpid::sys::DispatchHandle&>::invoke
(function_obj_ptr=@0xbe5180, a0=@0xbe5168) at
/home/rspringer/opt/include/boost/function/function_template.hpp:152
#30 0x0000002a95cb432e in boost::function1<void,
qpid::sys::DispatchHandle&>::operator() (this=0xbe5178, a0=@0xbe5168)
    at /home/rspringer/opt/include/boost/function/function_template.hpp:988
#31 0x0000002a95cb3b96 in qpid::sys::DispatchHandle::processEvent
(this=0xbe5168, type=qpid::sys::Poller::READABLE)
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/DispatchHandle.cpp:428
#32 0x0000002a95c57d43 in qpid::sys::Poller::Event::process
(this=0x43c05030) at ../../../src/qpid-0.5/cpp/src/qpid/sys/Poller.h:122
#33 0x0000002a95c56a2b in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:402
#34 0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#35 0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x7fbffff0e0)
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#36 0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0
#37 0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6
#38 0x0000000000000000 in ?? ()
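
To connect thread 1 to the theory: frames #0 through #2 show isBound()
advancing a tree iterator over "bindings" when it faulted, which would be
consistent with another thread having modified the container while this one
was walking it. Below is a minimal sketch of the guarded pattern, assuming
C++17. It is not qpid's code: BindingTable, hammer() and the use of
std::shared_mutex (as a stand-in for the RWlock) are all hypothetical names
chosen for illustration.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for an exchange's binding table; not qpid's real class.
class BindingTable {
public:
    void bind(const std::string& pattern, const std::string& queue) {
        std::unique_lock<std::shared_mutex> guard(lock_);  // writers take it exclusively
        bindings_.emplace(pattern, queue);
    }

    void unbind(const std::string& pattern, const std::string& queue) {
        std::unique_lock<std::shared_mutex> guard(lock_);
        auto range = bindings_.equal_range(pattern);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == queue) {
                bindings_.erase(it);  // return at once: 'it' is invalid after erase
                return;
            }
        }
    }

    // The read path must hold the lock too; iterating without it races
    // with bind()/unbind() rebalancing the tree.
    bool isBound(const std::string& pattern, const std::string& queue) const {
        std::shared_lock<std::shared_mutex> guard(lock_);  // readers share it
        for (auto it = bindings_.begin(); it != bindings_.end(); ++it) {
            if (it->first == pattern && it->second == queue) return true;
        }
        return false;
    }

private:
    mutable std::shared_mutex lock_;
    std::multimap<std::string, std::string> bindings_;
};

// Hypothetical stress helper: one thread binds/unbinds while the caller
// repeatedly queries, loosely mimicking hammering on exchangeBound().
bool hammer(int rounds) {
    assert(rounds > 0);
    BindingTable table;
    table.bind("a.b", "stable");
    std::thread writer([&] {
        for (int i = 0; i < rounds; ++i) {
            table.bind("x.y", "q");
            table.unbind("x.y", "q");
        }
    });
    bool alwaysFound = true;
    for (int i = 0; i < rounds; ++i) {
        if (!table.isBound("a.b", "stable")) alwaysFound = false;
    }
    writer.join();
    return alwaysFound && !table.isBound("x.y", "q");
}
```

With the locks in place, hammer(10000) should return true. If the
shared_lock in isBound() is dropped, the same loop is a data race (undefined
behaviour) and may fault inside the tree increment, much like frame #0 above.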




Carl Trieloff wrote:
> 
> 
> Rob,
> 
> are you able to do a
> 
> 'thread apply all bt'
> 
> on the core? That will allow us to reason through your theory with you.
> 
> Carl.
> 
> 




Re: C++ Broker: segfault in TopicExchange::isBound()

Posted by Carl Trieloff <cc...@redhat.com>.
Rob,

are you able to do a

'thread apply all bt'

on the core? That will allow us to reason through your theory with you.

Carl.



