Posted to proton@qpid.apache.org by "Justin Ross (JIRA)" <ji...@apache.org> on 2016/01/08 00:22:40 UTC

[jira] [Updated] (PROTON-639) pn_messenger_recv hangs / spins on connection refused

     [ https://issues.apache.org/jira/browse/PROTON-639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Ross updated PROTON-639:
-------------------------------
    Labels: messenger  (was: )

> pn_messenger_recv hangs / spins on connection refused
> -----------------------------------------------------
>
>                 Key: PROTON-639
>                 URL: https://issues.apache.org/jira/browse/PROTON-639
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-c
>    Affects Versions: 0.7, 0.8
>         Environment: Red Hat Enterprise Linux 6.5
> kernel: 2.6.32-431.1.2.el6.x86_64
> qpid-proton 0.7 and commit 9939b8a990cd53c1b5e099c083bdcf61ad22232b (git-svn-id: https://svn.apache.org/repos/asf/qpid/proton/trunk@1613151 13f79535-47bb-0310-9956-ffa450edef68)
>            Reporter: Rohan McGovern
>              Labels: messenger
>
> If I try to connect to a closed port with a messenger, pn_messenger_recv prints error messages to stderr and then spins at high CPU usage instead of returning an error as expected.
> This seems to depend on the kernel version.  I have a RHEL 6.5 machine which reproduces the problem reliably with kernel 2.6.32-431.1.2.el6.x86_64 but not with 3.10.28-1.el6.elrepo.x86_64.
> This can be easily reproduced using the "recv" example in the qpid-proton sources.
> {noformat:title=kernel 2.6.32 - broken}
> $ build/examples/messenger/c/recv amqp://127.0.0.1:1
> recv: Connection refused
> [0x63d8e0]:ERROR amqp:connection:framing-error SASL header mismatch: ''
> CONNECTION ERROR connection aborted (remote)
> # hangs at this point with high CPU usage
> {noformat}
> Compare with the behavior on a later kernel version, which seems right:
> {noformat:title=kernel 3.10.28 - expected behavior}
> $ build/examples/messenger/c/recv amqp://127.0.0.1:1
> recv: Connection refused
> [0x15af8e0]:ERROR amqp:connection:framing-error SASL header mismatch: ''
> CONNECTION ERROR connection aborted (remote)
> send: Broken pipe
> /home/rmcgover/src/qpid-proton/examples/messenger/c/recv.c:132: no valid sources
> # exits with exit code 1
> {noformat}
> Here's a sample backtrace when the hang is occurring:
> {noformat}
> (gdb) bt
> #0  0x00007ffff7ffea11 in clock_gettime ()
> #1  0x0000003a51e03e46 in clock_gettime () from /lib64/librt.so.1
> #2  0x00007ffff7de6b5e in pn_i_now () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
> #3  0x00007ffff7de4c06 in pn_selector_select () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
> #4  0x00007ffff7ddf736 in pni_wait () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
> #5  0x00007ffff7ddf869 in pn_messenger_tsync () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
> #6  0x00007ffff7ddf8df in pn_messenger_sync () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
> #7  0x00007ffff7de1676 in pn_messenger_recv () from /home/rmcgover/src/qpid-proton/build/proton-c/libqpid-proton.so.2
> #8  0x00000000004014b2 in main ()
> {noformat}
> There's a while(true) loop in pn_messenger_tsync which never seems to exit.  strace also shows that the process is calling poll repeatedly.
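
For illustration, here is a minimal sketch of how a wait loop of the shape described above can be guarded so that it returns instead of spinning. This is not Proton's code: wait_state, step_fn, bounded_wait and the three sample step functions are hypothetical names invented for this example, and the 1 ms sleep merely stands in for a blocking poll. It assumes only that the loop can observe a "connection failed" flag and a deadline.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical state for the sketch; not a Proton structure. */
typedef struct {
    int remaining;  /* work items left before the wait is satisfied */
    bool failed;    /* set when a fatal error (e.g. connection refused) is seen */
} wait_state;

/* Hypothetical callback that performs one round of I/O processing. */
typedef void (*step_fn)(wait_state *st);

static int64_t now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Returns 0 when the wait is satisfied, -ECONNREFUSED when the failure
 * flag is set, and -ETIMEDOUT once timeout_ms has elapsed.
 * A negative timeout_ms means "no deadline". */
int bounded_wait(wait_state *st, step_fn step, int timeout_ms) {
    assert(st != NULL && step != NULL);
    int64_t deadline = now_ms() + timeout_ms;
    for (;;) {
        if (st->failed) return -ECONNREFUSED;      /* fail fast on error */
        if (st->remaining == 0) return 0;          /* condition met */
        if (timeout_ms >= 0 && now_ms() >= deadline) return -ETIMEDOUT;
        step(st);
        /* Stand-in for a blocking poll: sleep 1 ms so the loop cannot
         * burn a full CPU core while nothing changes. */
        struct timespec pause = { 0, 1000000L };
        nanosleep(&pause, NULL);
    }
}

/* Sample steps used to exercise the three exit paths. */
static void step_progress(wait_state *st) { if (st->remaining > 0) st->remaining--; }
static void step_refused(wait_state *st)  { st->failed = true; }
static void step_stuck(wait_state *st)    { (void)st; /* nothing ever changes */ }
```

With these helpers, a state of {3, false} driven by step_progress returns 0, step_refused makes the call return -ECONNREFUSED on the next iteration, and step_stuck with a 20 ms timeout returns -ETIMEDOUT. Whether the real loop in pn_messenger_tsync can be fixed this way depends on why the failure is not being observed on the 2.6.32 kernel, which the report does not establish.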



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)