Posted to dev@qpid.apache.org by Aaron Fabbri <aj...@gmail.com> on 2010/10/28 00:31:27 UTC

segfault in Rdma broker

I recently updated to svn trunk latest and now my Rdma broker is crashing.

1. Is anyone else seeing this?

2. Should I open a bug? If so, please point me to the URL and
anything else I'd need to know.

I'm running from gdb with this script

# cat gdbscript
file /aafabbri/autotools_build/src/.libs/qpidd

# get libraries loaded
break main
run
del 1

define go
        run \
        --auth no \
        --mgmt-enable no \
        --load-module /aafabbri/autotools_build/src/.libs/rdma.so \
        --transport rdma \
        --worker-threads 4 \
        --log-to-stdout yes
end

# LD_LIBRARY_PATH=/aafabbri/autotools_build/src/.libs gdb -x myscript
...
(gdb) go
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff5af0710 (LWP 2850)]
2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
2010-10-27 15:28:32 notice Listening on TCP port 5672
2010-10-27 15:28:32 notice Listening on TCP port 5672
2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
5672
2010-10-27 15:28:32 notice Broker running
2010-10-27 15:28:32 notice Broker running
[New Thread 0x7ffff4cd6710 (LWP 2851)]
[New Thread 0x7ffff42d5710 (LWP 2852)]
[New Thread 0x7ffff38d4710 (LWP 2853)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff38d4710 (LWP 2853)]
0x00007ffff67cacb2 in operator() (function_obj_ptr=<value optimized
out>, a0=...)
    at /usr/include/boost/bind/mem_fn_template.hpp:49
49              BOOST_MEM_FN_RETURN (p->*f_)();
(gdb) thread apply all bt

Thread 5 (Thread 0x7ffff38d4710 (LWP 2853)):
#0  0x00007ffff67cacb2 in operator() (function_obj_ptr=<value
optimized out>, a0=...)
    at /usr/include/boost/bind/mem_fn_template.hpp:49
#1  operator()<boost::_mfi::mf0<void, Rdma::AsynchIO>,
boost::_bi::list1<qpid::sys::DispatchHandle&> > (
    function_obj_ptr=<value optimized out>, a0=...) at
/usr/include/boost/bind/bind.hpp:246
#2  operator()<qpid::sys::DispatchHandle> (function_obj_ptr=<value
optimized out>, a0=...)
    at /usr/include/boost/bind/bind_template.hpp:32
#3  boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void,
boost::_mfi::mf0<void, Rdma::AsynchIO>,
boost::_bi::list1<boost::_bi::value<Rdma::AsynchIO*> > >, void,
qpid::sys::DispatchHandle&>::invoke (
    function_obj_ptr=<value optimized out>, a0=...) at
/usr/include/boost/function/function_template.hpp:153
#4  0x00007ffff78f5298 in boost::function1<void,
qpid::sys::DispatchHandle&>::operator() (this=<value optimized out>,
    a0=<value optimized out>) at
/usr/include/boost/function/function_template.hpp:1013
#5  0x00007ffff78f45b9 in qpid::sys::DispatchHandle::processEvent
(this=0x7fffe80021a0,
    type=qpid::sys::Poller::READABLE) at qpid/sys/DispatchHandle.cpp:278
#6  0x00007ffff78450d2 in process (this=0x62fc50) at ./qpid/sys/Poller.h:131
#7  qpid::sys::Poller::run (this=0x62fc50) at qpid/sys/epoll/EpollPoller.cpp:519
#8  0x00007ffff783cf6a in qpid::sys::(anonymous
namespace)::runRunnable (p=<value optimized out>)
    at qpid/sys/posix/Thread.cpp:35
#9  0x000000305a207761 in start_thread () from /lib64/libpthread.so.0
#10 0x0000003059ee14fd in clone () from /lib64/libc.so.6

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org


Re: segfault in Rdma broker

Posted by Aaron Fabbri <aj...@gmail.com>.
On Thu, Oct 28, 2010 at 8:16 AM, Andrew Stitcher <as...@redhat.com> wrote:
> On Wed, 2010-10-27 at 15:31 -0700, Aaron Fabbri wrote:
>> I recently updated to svn trunk latest and now my Rdma broker is crashing.
>>
>> 1. Is anyone else seeing this?
>
> Is this happening before connections, during, after?? What is your
> platform? What rdma hardware?

Before connections, on MT25208 DDR InfiniBand HCAs.

>>
>> 2. Should I open a bug.. and if so please give pointer to url and
>> anything else I'd need to know.
>
> If you open a bug make sure you have a good way to replicate the bug (at
> least in your own testing).

I can reproduce it 100%.

>>
>> I'm running from gdb with this script
>>
>> # cat gdbscript
>> file /aafabbri/autotools_build/src/.libs/qpidd
>>
>> # get libraries loaded
>> break main
>> run
>> del 1
>>
>> define go
>>         run \
>>         --auth no \
>>         --mgmt-enable no \
>>         --load-module /aafabbri/autotools_build/src/.libs/rdma.so \
>>         --transport rdma \
>>         --worker-threads 4 \
>>         --log-to-stdout yes
>
> Can you add --no-module-dir to rule out any other modules from being
> loaded and confusing things?


Thanks!  That was the problem.  Perhaps someone I share the machines
with installed libs in the standard place.  I confirmed with strace
that qpid was opening two different rdma.so's:

[root@localhost build]# grep rdma.so /tmp/strace.out
11920 open("/root/aafabbri/apache_qpid/trunk/qpid/cpp/build/src/.libs/rdma.so",
O_RDONLY) = 5
11920 stat("/usr/local/lib/qpid/daemon/rdma.so",
{st_mode=S_IFREG|0755, st_size=978468, ...}) = 0
11920 open("/usr/local/lib/qpid/daemon/rdma.so", O_RDONLY) = 6

When I add --no-module-dir, everything works again, and strace only
shows a single open of rdma.so:

[root@localhost build]# grep rdma.so /tmp/strace2.out
11955 open("/root/aafabbri/apache_qpid/trunk/qpid/cpp/build/src/.libs/rdma.so",
O_RDONLY) = 5
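
For anyone hitting the same symptom, the strace check above can be sketched
as a small shell helper. This is only a sketch: the function name
list_opened_modules is made up for illustration, and it assumes each syscall
sits on one line in the trace file (as strace writes it, unlike the
mail-wrapped excerpts above).

```shell
#!/bin/sh
# list_opened_modules TRACE NAME   (hypothetical helper, not part of qpid)
# Print each distinct path ending in /NAME that the traced process
# opened successfully, one path per line.
list_opened_modules() {
    trace=$1
    name=$2
    # Keep open()/openat() calls only, drop failed calls (return value -1),
    # then pull out the quoted path and de-duplicate.
    grep -E 'open(at)?\(' "$trace" \
        | grep -v '= -1 ' \
        | grep -o "\"[^\"]*/$name\"" \
        | tr -d '"' \
        | sort -u
}
```

Calling list_opened_modules /tmp/strace.out rdma.so and getting two or more
paths would match the failing case above; a single path matches the run with
--no-module-dir. The trace file itself can be captured with strace's -f and
-o options.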


>
>> end
>>
>> # LD_LIBRARY_PATH=/aafabbri/autotools_build/src/.libs gdb -x myscript
>> ...
>> (gdb) go
>> [Thread debugging using libthread_db enabled]
>> [New Thread 0x7ffff5af0710 (LWP 2850)]
>> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
>> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
>> 2010-10-27 15:28:32 notice Listening on TCP port 5672
>> 2010-10-27 15:28:32 notice Listening on TCP port 5672
>> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
>> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
>> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
>> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
>> 5672
>> 2010-10-27 15:28:32 notice Broker running
>> 2010-10-27 15:28:32 notice Broker running
>> [New Thread 0x7ffff4cd6710 (LWP 2851)]
>> [New Thread 0x7ffff42d5710 (LWP 2852)]
>> [New Thread 0x7ffff38d4710 (LWP 2853)]
>
> I'm a little suspicious of the duplicated lines here - is this just an
> artifact of your cut and paste? If not it might indicate that multiple
> rdma modules are being loaded.
>
> Andrew


Re: segfault in Rdma broker

Posted by Andrew Stitcher <as...@redhat.com>.
On Wed, 2010-10-27 at 15:31 -0700, Aaron Fabbri wrote:
> I recently updated to svn trunk latest and now my Rdma broker is crashing.
> 
> 1. Is anyone else seeing this?

Is this happening before connections, during, after?? What is your
platform? What rdma hardware?

> 
> 2. Should I open a bug.. and if so please give pointer to url and
> anything else I'd need to know.

If you open a bug make sure you have a good way to replicate the bug (at
least in your own testing).

> 
> I'm running from gdb with this script
> 
> # cat gdbscript
> file /aafabbri/autotools_build/src/.libs/qpidd
> 
> # get libraries loaded
> break main
> run
> del 1
> 
> define go
>         run \
>         --auth no \
>         --mgmt-enable no \
>         --load-module /aafabbri/autotools_build/src/.libs/rdma.so \
>         --transport rdma \
>         --worker-threads 4 \
>         --log-to-stdout yes

Can you add --no-module-dir to rule out any other modules from being
loaded and confusing things?

> end
> 
> # LD_LIBRARY_PATH=/aafabbri/autotools_build/src/.libs gdb -x myscript
> ...
> (gdb) go
> [Thread debugging using libthread_db enabled]
> [New Thread 0x7ffff5af0710 (LWP 2850)]
> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
> 2010-10-27 15:28:32 notice Listening on TCP port 5672
> 2010-10-27 15:28:32 notice Listening on TCP port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 5672
> 2010-10-27 15:28:32 notice Broker running
> 2010-10-27 15:28:32 notice Broker running
> [New Thread 0x7ffff4cd6710 (LWP 2851)]
> [New Thread 0x7ffff42d5710 (LWP 2852)]
> [New Thread 0x7ffff38d4710 (LWP 2853)]

I'm a little suspicious of the duplicated lines here - is this just an
artifact of your cut and paste? If not it might indicate that multiple
rdma modules are being loaded.
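
One quick way to check whether such repeats are in the broker output itself
rather than a paste artifact is to count identical lines in a saved copy of
the log. A minimal sketch, assuming the output was saved to a file; the
function name duplicated_log_lines is hypothetical:

```shell
#!/bin/sh
# duplicated_log_lines LOG   (hypothetical helper, not part of qpid)
# Print every line that occurs more than once in LOG, prefixed by its count.
duplicated_log_lines() {
    # sort groups identical lines, uniq -c counts each group,
    # awk keeps only the groups seen more than once.
    sort "$1" | uniq -c | awk '$1 > 1'
}
```

On the output quoted above this would report the SASL, TCP, Rdma and
"Broker running" notices. Repeats are only a hint; they do not prove by
themselves that a module was loaded twice.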

Andrew





RE: segfault in Rdma broker

Posted by Steve Huston <sh...@riverace.com>.
Hi Aaron,

> I recently updated to svn trunk latest and now my Rdma broker 
> is crashing.
> 
> 1. Is anyone else seeing this?

I don't recall any other reports on this list.

> 2. Should I open a bug.. and if so please give pointer to url 
> and anything else I'd need to know.

Yes, please. The URL is http://issues.apache.org/jira/browse/qpid

Hopefully Andrew will weigh in on your stack... He's the rdma wiz.

Thanks,
-Steve

> I'm running from gdb with this script
> 
> # cat gdbscript
> file /aafabbri/autotools_build/src/.libs/qpidd
> 
> # get libraries loaded
> break main
> run
> del 1
> 
> define go
>         run \
>         --auth no \
>         --mgmt-enable no \
>         --load-module /aafabbri/autotools_build/src/.libs/rdma.so \
>         --transport rdma \
>         --worker-threads 4 \
>         --log-to-stdout yes
> end
> 
> # LD_LIBRARY_PATH=/aafabbri/autotools_build/src/.libs gdb -x myscript
> ...
> (gdb) go
> [Thread debugging using libthread_db enabled]
> [New Thread 0x7ffff5af0710 (LWP 2850)]
> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
> 2010-10-27 15:28:32 notice SASL disabled: No Authentication Performed
> 2010-10-27 15:28:32 notice Listening on TCP port 5672
> 2010-10-27 15:28:32 notice Listening on TCP port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 2010-10-27 15:28:32 notice Rdma: Listening on RDMA port 5672
> 5672
> 2010-10-27 15:28:32 notice Broker running
> 2010-10-27 15:28:32 notice Broker running
> [New Thread 0x7ffff4cd6710 (LWP 2851)]
> [New Thread 0x7ffff42d5710 (LWP 2852)]
> [New Thread 0x7ffff38d4710 (LWP 2853)]
> 
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7ffff38d4710 (LWP 2853)] 
> 0x00007ffff67cacb2 in operator() (function_obj_ptr=<value optimized
> out>, a0=...)
>     at /usr/include/boost/bind/mem_fn_template.hpp:49
> 49              BOOST_MEM_FN_RETURN (p->*f_)();
> (gdb) thread apply all bt
> 
> Thread 5 (Thread 0x7ffff38d4710 (LWP 2853)):
> #0  0x00007ffff67cacb2 in operator() (function_obj_ptr=<value 
> optimized out>, a0=...)
>     at /usr/include/boost/bind/mem_fn_template.hpp:49
> #1  operator()<boost::_mfi::mf0<void, Rdma::AsynchIO>, 
> boost::_bi::list1<qpid::sys::DispatchHandle&> > (
>     function_obj_ptr=<value optimized out>, a0=...) at 
> /usr/include/boost/bind/bind.hpp:246
> #2  operator()<qpid::sys::DispatchHandle> 
> (function_obj_ptr=<value optimized out>, a0=...)
>     at /usr/include/boost/bind/bind_template.hpp:32
> #3  boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void,
> boost::_mfi::mf0<void, Rdma::AsynchIO>,
> boost::_bi::list1<boost::_bi::value<Rdma::AsynchIO*> > >, void,
> qpid::sys::DispatchHandle&>::invoke (
>     function_obj_ptr=<value optimized out>, a0=...) at
> /usr/include/boost/function/function_template.hpp:153
> #4  0x00007ffff78f5298 in boost::function1<void,
> qpid::sys::DispatchHandle&>::operator() (this=<value optimized out>,
>     a0=<value optimized out>) at 
> /usr/include/boost/function/function_template.hpp:1013
> #5  0x00007ffff78f45b9 in qpid::sys::DispatchHandle::processEvent
> (this=0x7fffe80021a0,
>     type=qpid::sys::Poller::READABLE) at qpid/sys/DispatchHandle.cpp:278
> #6  0x00007ffff78450d2 in process (this=0x62fc50) at ./qpid/sys/Poller.h:131
> #7  qpid::sys::Poller::run (this=0x62fc50) at qpid/sys/epoll/EpollPoller.cpp:519
> #8  0x00007ffff783cf6a in qpid::sys::(anonymous 
> namespace)::runRunnable (p=<value optimized out>)
>     at qpid/sys/posix/Thread.cpp:35
> #9  0x000000305a207761 in start_thread () from /lib64/libpthread.so.0
> #10 0x0000003059ee14fd in clone () from /lib64/libc.so.6
> 