Posted to dev@qpid.apache.org by Aaron Fabbri <aj...@gmail.com> on 2010/10/28 00:31:27 UTC
segfault in Rdma broker
I recently updated to the latest svn trunk and now my Rdma broker is crashing.
1. Is anyone else seeing this?
2. Should I open a bug? If so, please point me to the URL and
anything else I'd need to know.
I'm running from gdb with this script
# cat gdbscript
file /aafabbri/autotools_build/src/.libs/qpidd
# get libraries loaded
break main
run
del 1
define go
run \
--auth no \
--mgmt-enable no \
--load-module /aafabbri/autotools_build/src/.libs/rdma.so \
--transport rdma \
--worker-threads 4 \
--log-to-stdout yes
end
# LD_LIBRARY_PATH=/aafabbri/autotools_build/src/.libs gdb -x myscript
...
(gdb) go
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff5af0710 (LWP 2850)]
2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
2010-10-27 15:28:32 notice Listening on TCP port 5672
2010-10-27 15:28:32 notice Listening on TCP port 5672
2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
5672
2010-10-27 15:28:32 notice Broker running
2010-10-27 15:28:32 notice Broker running
[New Thread 0x7ffff4cd6710 (LWP 2851)]
[New Thread 0x7ffff42d5710 (LWP 2852)]
[New Thread 0x7ffff38d4710 (LWP 2853)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff38d4710 (LWP 2853)]
0x00007ffff67cacb2 in operator() (function_obj_ptr=<value optimized
out>, a0=...)
at /usr/include/boost/bind/mem_fn_template.hpp:49
49 BOOST_MEM_FN_RETURN (p->*f_)();
(gdb) thread apply all bt
Thread 5 (Thread 0x7ffff38d4710 (LWP 2853)):
#0 0x00007ffff67cacb2 in operator() (function_obj_ptr=<value
optimized out>, a0=...)
at /usr/include/boost/bind/mem_fn_template.hpp:49
#1 operator()<boost::_mfi::mf0<void, Rdma::AsynchIO>,
boost::_bi::list1<qpid::sys::DispatchHandle&> > (
function_obj_ptr=<value optimized out>, a0=...) at
/usr/include/boost/bind/bind.hpp:246
#2 operator()<qpid::sys::DispatchHandle> (function_obj_ptr=<value
optimized out>, a0=...)
at /usr/include/boost/bind/bind_template.hpp:32
#3 boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void,
boost::_mfi::mf0<void, Rdma::AsynchIO>,
boost::_bi::list1<boost::_bi::value<Rdma::AsynchIO*> > >, void,
qpid::sys::DispatchHandle&>::invoke (
function_obj_ptr=<value optimized out>, a0=...) at
/usr/include/boost/function/function_template.hpp:153
#4 0x00007ffff78f5298 in boost::function1<void,
qpid::sys::DispatchHandle&>::operator() (this=<value optimized out>,
a0=<value optimized out>) at
/usr/include/boost/function/function_template.hpp:1013
#5 0x00007ffff78f45b9 in qpid::sys::DispatchHandle::processEvent
(this=0x7fffe80021a0,
type=qpid::sys::Poller::READABLE) at qpid/sys/DispatchHandle.cpp:278
#6 0x00007ffff78450d2 in process (this=0x62fc50) at ./qpid/sys/Poller.h:131
#7 qpid::sys::Poller::run (this=0x62fc50) at qpid/sys/epoll/EpollPoller.cpp:519
#8 0x00007ffff783cf6a in qpid::sys::(anonymous
namespace)::runRunnable (p=<value optimized out>)
at qpid/sys/posix/Thread.cpp:35
#9 0x000000305a207761 in start_thread () from /lib64/libpthread.so.0
#10 0x0000003059ee14fd in clone () from /lib64/libc.so.6
---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project: http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org
Re: segfault in Rdma broker
Posted by Aaron Fabbri <aj...@gmail.com>.
On Thu, Oct 28, 2010 at 8:16 AM, Andrew Stitcher <as...@redhat.com> wrote:
> On Wed, 2010-10-27 at 15:31 -0700, Aaron Fabbri wrote:
>> I recently updated to svn trunk latest and now my Rdma broker is crashing.
>>
>> 1. Is anyone else seeing this?
>
> Is this happening before connections, during them, or after? What is your
> platform? What rdma hardware?
Before connections, on MT25208 DDR InfiniBand HCAs.
>>
>> 2. Should I open a bug.. and if so please give pointer to url and
>> anything else I'd need to know.
>
> If you open a bug, make sure you have a good way to replicate the bug (at
> least in your own testing).
I can reproduce it 100%.
>>
>> I'm running from gdb with this script
>>
>> # cat gdbscript
>> file /aafabbri/autotools_build/src/.libs/qpidd
>>
>> # get libraries loaded
>> break main
>> run
>> del 1
>>
>> define go
>> run \
>> --auth no \
>> --mgmt-enable no \
>> --load-module /aafabbri/autotools_build/src/.libs/rdma.so \
>> --transport rdma \
>> --worker-threads 4 \
>> --log-to-stdout yes
>
> Can you add --no-module-dir to rule out any other modules being
> loaded and confusing things?
Thanks! That was the problem. Perhaps someone I share the machines
with installed libs in the standard place. I confirmed with strace
that qpid was opening two different rdma.so's:
[root@localhost build]# grep rdma.so /tmp/strace.out
11920 open("/root/aafabbri/apache_qpid/trunk/qpid/cpp/build/src/.libs/rdma.so", O_RDONLY) = 5
11920 stat("/usr/local/lib/qpid/daemon/rdma.so", {st_mode=S_IFREG|0755, st_size=978468, ...}) = 0
11920 open("/usr/local/lib/qpid/daemon/rdma.so", O_RDONLY) = 6
When I add --no-module-dir, everything works again, and strace only
shows a single open of rdma.so:
[root@localhost build]# grep rdma.so /tmp/strace2.out
11955 open("/root/aafabbri/apache_qpid/trunk/qpid/cpp/build/src/.libs/rdma.so", O_RDONLY) = 5
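For anyone who wants to automate this check, here is a minimal Python sketch that scans strace output for shared objects that were successfully opened from more than one path. The function name duplicate_modules and the regular expression are illustrative assumptions, not part of qpid or strace; the sketch assumes each call appears in strace's usual open("path", flags) = fd form.

```python
import os
import re
from collections import defaultdict

# Successful open()/openat() calls as strace prints them, for example:
#   11920 open("/usr/local/lib/qpid/daemon/rdma.so", O_RDONLY) = 6
# Failed calls return -1 and are deliberately not matched.
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:\w+,\s*)?"([^"]+)"[^)]*\)\s*=\s*\d+')


def duplicate_modules(strace_text):
    """Map each shared-object file name to its paths if opened from 2+ places."""
    paths_by_name = defaultdict(list)
    for path in OPEN_RE.findall(strace_text):
        name = os.path.basename(path)
        # Only track shared objects, and record each distinct path once.
        if ".so" in name and path not in paths_by_name[name]:
            paths_by_name[name].append(path)
    return {name: paths for name, paths in paths_by_name.items() if len(paths) > 1}
```

Calling duplicate_modules(open("/tmp/strace.out").read()) on the first trace above should report rdma.so with both paths, and return an empty dict for the second trace.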
>
>> end
>>
>> # LD_LIBRARY_PATH=/aafabbri/autotools_build/src/.libs gdb -x myscript
>> ...
>> (gdb) go
>> [Thread debugging using libthread_db enabled]
>> [New Thread 0x7ffff5af0710 (LWP 2850)]
>> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
>> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
>> 2010-10-27 15:28:32 notice Listening on TCP port 5672
>> 2010-10-27 15:28:32 notice Listening on TCP port 5672
>> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
>> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
>> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
>> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
>> 5672
>> 2010-10-27 15:28:32 notice Broker running
>> 2010-10-27 15:28:32 notice Broker running
>> [New Thread 0x7ffff4cd6710 (LWP 2851)]
>> [New Thread 0x7ffff42d5710 (LWP 2852)]
>> [New Thread 0x7ffff38d4710 (LWP 2853)]
>
> I'm a little suspicious of the duplicated lines here - is this just an
> artifact of your cut and paste? If not it might indicate that multiple
> rdma modules are being loaded.
>
> Andrew
Re: segfault in Rdma broker
Posted by Andrew Stitcher <as...@redhat.com>.
On Wed, 2010-10-27 at 15:31 -0700, Aaron Fabbri wrote:
> I recently updated to svn trunk latest and now my Rdma broker is crashing.
>
> 1. Is anyone else seeing this?
Is this happening before connections, during them, or after? What is your
platform? What rdma hardware?
>
> 2. Should I open a bug.. and if so please give pointer to url and
> anything else I'd need to know.
If you open a bug, make sure you have a good way to replicate the bug (at
least in your own testing).
>
> I'm running from gdb with this script
>
> # cat gdbscript
> file /aafabbri/autotools_build/src/.libs/qpidd
>
> # get libraries loaded
> break main
> run
> del 1
>
> define go
> run \
> --auth no \
> --mgmt-enable no \
> --load-module /aafabbri/autotools_build/src/.libs/rdma.so \
> --transport rdma \
> --worker-threads 4 \
> --log-to-stdout yes
Can you add --no-module-dir to rule out any other modules being
loaded and confusing things?
> end
>
> # LD_LIBRARY_PATH=/aafabbri/autotools_build/src/.libs gdb -x myscript
> ...
> (gdb) go
> [Thread debugging using libthread_db enabled]
> [New Thread 0x7ffff5af0710 (LWP 2850)]
> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
> 2010-10-27 15:28:32 notice Listening on TCP port 5672
> 2010-10-27 15:28:32 notice Listening on TCP port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 5672
> 2010-10-27 15:28:32 notice Broker running
> 2010-10-27 15:28:32 notice Broker running
> [New Thread 0x7ffff4cd6710 (LWP 2851)]
> [New Thread 0x7ffff42d5710 (LWP 2852)]
> [New Thread 0x7ffff38d4710 (LWP 2853)]
I'm a little suspicious of the duplicated lines here - is this just an
artifact of your cut and paste? If not it might indicate that multiple
rdma modules are being loaded.
Andrew
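The duplicated-lines check can be sketched in Python: strip the timestamp and count how often each message occurs. The function name repeated_messages and the timestamp pattern are assumptions based on the log excerpt in this thread. A repeat is only a hint; whether it really means a module was loaded twice still has to be confirmed by other means.

```python
import re
from collections import Counter

# Leading "YYYY-MM-DD HH:MM:SS " timestamp, as seen in the log excerpt above.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")


def repeated_messages(log_text):
    """Return {message: count} for log messages that occur more than once."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        # Lines without a timestamp (gdb thread notices, stray output) are skipped.
        if TIMESTAMP_RE.match(line):
            counts[TIMESTAMP_RE.sub("", line)] += 1
    return {msg: n for msg, n in counts.items() if n > 1}
```

Quoted log lines that start with ">" would need the quote marker removed before being passed in.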
RE: segfault in Rdma broker
Posted by Steve Huston <sh...@riverace.com>.
Hi Aaron,
> I recently updated to svn trunk latest and now my Rdma broker
> is crashing.
>
> 1. Is anyone else seeing this?
I don't recall any other reports on this list.
> 2. Should I open a bug.. and if so please give pointer to url
> and anything else I'd need to know.
Yes, please. The URL is http://issues.apache.org/jira/browse/qpid
Hopefully Andrew will weigh in on your stack... He's the rdma wiz.
Thanks,
-Steve
> I'm running from gdb with this script
>
> # cat gdbscript
> file /aafabbri/autotools_build/src/.libs/qpidd
>
> # get libraries loaded
> break main
> run
> del 1
>
> define go
> run \
> --auth no \
> --mgmt-enable no \
> --load-module /aafabbri/autotools_build/src/.libs/rdma.so \
> --transport rdma \
> --worker-threads 4 \
> --log-to-stdout yes
> end
>
> # LD_LIBRARY_PATH=/aafabbri/autotools_build/src/.libs gdb -x myscript
> ...
> (gdb) go
> [Thread debugging using libthread_db enabled]
> [New Thread 0x7ffff5af0710 (LWP 2850)]
> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
> 2010-10-27 15:28:32 notice Listening on TCP port 5672
> 2010-10-27 15:28:32 notice Listening on TCP port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 5672
> 2010-10-27 15:28:32 notice Broker running
> 2010-10-27 15:28:32 notice Broker running
> [New Thread 0x7ffff4cd6710 (LWP 2851)]
> [New Thread 0x7ffff42d5710 (LWP 2852)]
> [New Thread 0x7ffff38d4710 (LWP 2853)]
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7ffff38d4710 (LWP 2853)]
> 0x00007ffff67cacb2 in operator() (function_obj_ptr=<value optimized
> out>, a0=...)
> at /usr/include/boost/bind/mem_fn_template.hpp:49
> 49 BOOST_MEM_FN_RETURN (p->*f_)();
> (gdb) thread apply all bt
>
> Thread 5 (Thread 0x7ffff38d4710 (LWP 2853)):
> #0 0x00007ffff67cacb2 in operator() (function_obj_ptr=<value
> optimized out>, a0=...)
> at /usr/include/boost/bind/mem_fn_template.hpp:49
> #1 operator()<boost::_mfi::mf0<void, Rdma::AsynchIO>,
> boost::_bi::list1<qpid::sys::DispatchHandle&> > (
> function_obj_ptr=<value optimized out>, a0=...) at
> /usr/include/boost/bind/bind.hpp:246
> #2 operator()<qpid::sys::DispatchHandle>
> (function_obj_ptr=<value optimized out>, a0=...)
> at /usr/include/boost/bind/bind_template.hpp:32
> #3 boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void,
> boost::_mfi::mf0<void, Rdma::AsynchIO>,
> boost::_bi::list1<boost::_bi::value<Rdma::AsynchIO*> > >,
> void, qpid::sys::DispatchHandle&>::invoke (
> function_obj_ptr=<value optimized out>, a0=...) at
> /usr/include/boost/function/function_template.hpp:153
> #4 0x00007ffff78f5298 in boost::function1<void,
> qpid::sys::DispatchHandle&>::operator() (this=<value optimized out>,
> a0=<value optimized out>) at
> /usr/include/boost/function/function_template.hpp:1013
> #5 0x00007ffff78f45b9 in qpid::sys::DispatchHandle::processEvent
> (this=0x7fffe80021a0,
> type=qpid::sys::Poller::READABLE) at qpid/sys/DispatchHandle.cpp:278
> #6 0x00007ffff78450d2 in process (this=0x62fc50) at ./qpid/sys/Poller.h:131
> #7 qpid::sys::Poller::run (this=0x62fc50) at qpid/sys/epoll/EpollPoller.cpp:519
> #8 0x00007ffff783cf6a in qpid::sys::(anonymous
> namespace)::runRunnable (p=<value optimized out>)
> at qpid/sys/posix/Thread.cpp:35
> #9 0x000000305a207761 in start_thread () from /lib64/libpthread.so.0
> #10 0x0000003059ee14fd in clone () from /lib64/libc.so.6