You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by "Sridharan, Venkatesan" <Ve...@Squarepoint-Capital.com> on 2015/11/02 01:44:58 UTC

RE: Crash on inability to catch the internal exception thrown from qpid c++ API

Thanks! The original issue is that the session is closed by one thread, while it is being accessed by another. 
Thread 1: session.nextReceiver()
Thread 2: session.close()

If this is not a supported use case of qpid API, then the bug is in my code. I had a cursory look at the code and saw that SessionImpl had addRef()/release() and assumed the object will not be released when in use. 

But the session was closed (even when nextReceiver() was blocking in another thread). At some point when nextReceiver() continued execution, 
1) an exception was thrown
2)  ScopedUnlock tried to lock mutex again
3) the mutex was already gone, which threw an exception 
4) this crashed the entire program.

Stack trace on Thread 1:
#0  0x0000003d890328e5 in raise () from /lib64/libc.so.6
#1  0x0000003d890340c5 in abort () from /lib64/libc.so.6
#2  0x00007f7a429cd69b in __gnu_cxx::__verbose_terminate_handler() () from /home/replicate/dev/sentrylib/libsentry.so
#3  0x00007f7a4298cd37 in __cxxabiv1::__terminate(void (*)()) () from /home/replicate/dev/sentrylib/libsentry.so
#4  0x00007f7a4298ae5a in __cxa_call_terminate () from /home/replicate/dev/sentrylib/libsentry.so
#5  0x00007f7a4298a7c5 in __gxx_personality_v0 () from /home/replicate/dev/sentrylib/libsentry.so
#6  0x00007f7a429fa363 in _Unwind_RaiseException_Phase2 (exc=exc@entry=0x7f79f4000f80, context=context@entry=0x7f79fbffe370) at /apps/gcc-4.9.1/libgcc/unwind.inc:62
#7  0x00007f7a429faa0a in _Unwind_RaiseException (exc=0x7f79f4000f80) at /apps/gcc-4.9.1/libgcc/unwind.inc:131
#8  0x00007f7a4298bff2 in __cxa_throw () from /home/replicate/dev/sentrylib/libsentry.so
#9  0x00007f7a4266f27f in qpid::sys::Mutex::lock (this=<optimized out>) at /apps/qpid-cpp-0.34/src/qpid/sys/posix/Mutex.h:116
#10 0x00007f7a426aaadc in ~ScopedUnlock (this=<synthetic pointer>, __in_chrg=<optimized out>) at /apps/qpid-cpp-0.34/src/qpid/sys/Mutex.h:44
#11 qpid::client::amqp0_10::IncomingMessages::getNextDestination (this=this@entry=0x1f05690, destination=..., timeout=...)
    at /apps/qpid-cpp-0.34/src/qpid/client/amqp0_10/IncomingMessages.cpp:202
#12 0x00007f7a426b8032 in qpid::client::amqp0_10::SessionImpl::nextReceiver (this=0x1f05640, receiver=..., timeout=...)


Cant seem to find any qpid::RefCounted::addRef call in the entire project except in a couple of places:
/apps/qpid-cpp-0.34/src$ find . -type f|xargs grep addRef
./qpid/RefCounted.h:    void addRef() const { ++count; }
./qpid/RefCounted.h:inline void intrusive_ptr_add_ref(const RefCounted* p) { p->addRef(); }
./qpid/legacystore/TxnCtxt.cpp:        dtokp->addRef();
./qpid/legacystore/TxnCtxt.cpp:void TxnCtxt::incrDtokRef() { dtokp->addRef(); }
./qpid/legacystore/MessageStoreImpl.cpp:            dtokp->addRef();
./qpid/legacystore/MessageStoreImpl.cpp:    ddtokp->addRef();
./qpid/linearstore/TxnCtxt.cpp:        dtokp->addRef();
./qpid/linearstore/TxnCtxt.cpp:void TxnCtxt::incrDtokRef() { dtokp->addRef(); }
./qpid/linearstore/MessageStoreImpl.cpp:            dtokp->addRef();
./qpid/linearstore/MessageStoreImpl.cpp:    ddtokp->addRef();

We can probably use the addRef()/release() machinery if the use case I highlighted is indeed supported by qpid APi

> However in your patch, you use lambdas which though very neat would not work on older compilers.
Ok, I will convert it to use boost::function. As an aside, are there any plans to switch to c++11 in the future?

-----Original Message-----
From: Gordon Sim [mailto:gsim@redhat.com] 
Sent: Thursday, October 29, 2015 6:29 PM
To: dev@qpid.apache.org
Subject: Re: Crash on inability to catch the internal exception thrown from qpid c++ API

On 10/29/2015 06:14 AM, Sridharan, Venkatesan wrote:
> I have created the JIRA: 
> https://issues.apache.org/jira/browse/QPID-6811
> for the issue.
>
> I have attached a patch to the JIRA, am new to qpid development process, not sure how to go about it - can anybody provide some pointers on how to take this patch forward?
> Tested on Linux and Windows

First off, I apologise for not replying to earlier mails. I meant to try and reproduce myself, but didn't get the time to do so.

I tend to agree with you that the ScopedUnlock is a little unsafe. 
However in your patch, you use lambdas which though very neat would not work on older compilers.

One option (not as neat, admittedly) would be to use boost::function instead (which is already used elsewhere in the codebase).

In terms of the original issue, i.e. if the object to which the lock instance is scoped is being deleted while still in use by active threads, I think that may require some higher level fix.

As for process, attaching to a JIRA is fine. There is also a reviewboard instance which is quite nice, as it allows easy discussion of specific parts of the proposals: https://reviews.apache.org

> -----Original Message-----
> From: Sridharan, Venkatesan
> Sent: Friday, October 16, 2015 4:06 PM
> To: 'dev@qpid.apache.org' <de...@qpid.apache.org>
> Subject: RE: Crash on inability to catch the internal exception thrown 
> from qpid c++ API
>
> I finally traced the issue to throwing of an exception in ScopedUnlock destructor.
> Defined as:
> ~ScopedUnlock() { mutex.lock(); }
> Calls =>
> void Mutex::lock() {
>      QPID_POSIX_ASSERT_THROW_IF(pthread_mutex_lock(&mutex));
> }
>
> which throws and terminates the application.
>
> In my case, the ScopedUnlock dtor is called from 
> [qpid-cpp-0.34/src/qpid/client/amqp0_10/IncomingMessages.cpp:202]
>
> I am using the Session object from multiple threads and I suspect the one thread calls close() on session while another is operating on it. Is this scenario supported?
>
> Scoped unlock does not seem the correct approach to me. Instead of freeing a resource in the destructor whose failure is not catastrophic, ScopedUnlock tries to acquire a resource on exit which can fail and there is no way this can be communicated back to the code!
>
> I propose that the class be done away with and an explicit resource acquisition with proper failure handling semantics be used.
>
>
> -----Original Message-----
> From: Sridharan, Venkatesan
> Sent: Monday, October 12, 2015 10:54 AM
> To: 'dev@qpid.apache.org' <de...@qpid.apache.org>
> Subject: Crash on inability to catch the internal exception thrown 
> from qpid c++ API
>
> Hi Folks,
>
> While debugging an application crash in a client program using qpid c++ API, I traced the root cause of the error to the fact that an internal exception (qpid::Exception) which is not exposed in the API was thrown.  The exception was in Mutex.h:116 locking code. I had coded an error recovery  part of the code and looks like:
>
> // qpid::messaging::Connection m_conn; // qpid::messaging::Session 
> m_session try {
> 	if (m_session.hasError())
> 	{	
> 		m_session.close();
> 		m_conn.close();
> 		LOG(WARNING) <<  "Session has errors, recreating...";
> 		m_conn.open();
> 		m_session = m_conn.createSession();
> 	}
> }
> catch(std::exception const& ex)
> {
> ..
> }
> catch(...)
> {
> ..
> }
>
> Exception is not caught in both the catch() blocks.
>
> This is because there are places in the control flow where a Mutex object is locked but the internal exception thrown is not captured. This leads to leak of internal exception and since it is not exposed in the API, gcc cannot get its typeinfo for exception handling and terminate() gets called.  Since the issue is with the QPid API, I am powerless to prevent crashes in my application.
>
> Using two exception hierarchies and  leaking internal exceptions is bug prone and I think we should expose all the exception types (including internal) in the API so as to not crash the clients using the API.  What is the consensus here on exposing ALL the exceptions qpid client API might possibly throw? Or am I missing something?
>
> Cheers
> Venkatesan
>
>
>
>
> Confidentiality Note:  This e-mail and any attachments are 
> confidential and may be protected by legal privilege. If you are not 
> the intended recipient, be aware that any disclosure, copying, 
> distribution or use of this e-mail or any attachment is prohibited. If 
> you have received this e-mail in error, please notify us immediately 
> by returning it to the sender and delete this copy from your system. Thank you for your cooperation.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org For additional 
> commands, e-mail: dev-help@qpid.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org For additional commands, e-mail: dev-help@qpid.apache.org




Confidentiality Note:  This e-mail and any attachments are confidential and may
be protected by legal privilege. If you are not the intended recipient, be aware
that any disclosure, copying, distribution or use of this e-mail or any
attachment is prohibited. If you have received this e-mail in error, please
notify us immediately by returning it to the sender and delete this copy from
your system. Thank you for your cooperation.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org