Posted to proton@qpid.apache.org by "Gordon Sim (JIRA)" <ji...@apache.org> on 2015/07/01 19:43:04 UTC

[jira] [Updated] (PROTON-907) Qpid Proton Point to Point Hang on CentOS 6 pn_messenger_send

     [ https://issues.apache.org/jira/browse/PROTON-907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gordon Sim updated PROTON-907:
------------------------------
    Attachment: PROTON-907-workaround.patch

The issue appears to be that on the affected platforms, when unable to connect, the file descriptor is not marked as writeable.
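
One way to compare platforms on this point is to look at what poll() reports for a failed non-blocking connect. The following is only a minimal POSIX sketch, not code taken from proton: probe_result, probe_connect and probe_failed are hypothetical names, and the address and port are whatever the caller chooses. It records the events poll() returns and the pending SO_ERROR, so the output can be compared between, say, CentOS 6 and Fedora.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper type: what a non-blocking connect attempt looked like. */
typedef struct {
    int connect_errno; /* errno from socket()/connect(), 0 if none */
    short revents;     /* events reported by poll(), 0 if poll timed out */
    int so_error;      /* pending error read via SO_ERROR, 0 if none */
} probe_result;

/* Start a non-blocking connect to ip:port, then poll once for up to
 * timeout_ms and record which events are set and any pending error. */
probe_result probe_connect(const char *ip, unsigned short port, int timeout_ms)
{
    probe_result r = {0, 0, 0};
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        r.connect_errno = errno;
        return r;
    }
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, ip, &addr.sin_addr);

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0)
        r.connect_errno = errno;

    /* Only poll if the connect succeeded or is still in progress. */
    if (r.connect_errno == 0 || r.connect_errno == EINPROGRESS) {
        struct pollfd p;
        p.fd = fd;
        p.events = POLLIN | POLLOUT;
        p.revents = 0;
        if (poll(&p, 1, timeout_ms) > 0)
            r.revents = p.revents;
        socklen_t len = sizeof r.so_error;
        getsockopt(fd, SOL_SOCKET, SO_ERROR, &r.so_error, &len);
    }
    close(fd);
    return r;
}

/* True if the attempt failed, either immediately or as a pending error. */
int probe_failed(const probe_result *r)
{
    return (r->connect_errno != 0 && r->connect_errno != EINPROGRESS)
        || r->so_error != 0;
}
```

Calling probe_connect("127.0.0.1", port, 2000) with a port nothing is listening on and checking whether r.revents includes POLLOUT would show whether the descriptor is reported as writable on that platform.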

Although messenger hits the read error, it closes only the 'tail' of the transport as a result. The head is closed when an error is returned from send, but because the socket is never reported as writeable, send is never called.

I don't know what the real fix for this is; messenger is an area of the code I'm even less familiar with. FWIW, the attached patch works around the issue and passes all the existing tests. It works by explicitly closing the head of the transport if there is an error on reading from the socket and the connection has not been closed by the peer.
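
To make the logic concrete, here is a toy model of the behaviour described above. It is an illustrative sketch only: it is neither the attached patch nor proton code, and toy_transport, toy_process, toy_finished and toy_run are made-up names. It assumes a transport counts as finished once both head and tail are closed, and that the poller reports the socket as readable but never writable.

```c
#include <stdbool.h>

/* Toy stand-in for a transport; this is NOT proton's pn_transport_t. */
typedef struct {
    bool head_closed; /* output side: closed when send reports an error */
    bool tail_closed; /* input side: closed when read reports an error */
    bool peer_closed; /* true if the peer closed the connection itself */
} toy_transport;

/* One I/O pass. readable/writable mimic what the poller reports and
 * io_error says whether read or send would fail on this socket. */
void toy_process(toy_transport *t, bool readable, bool writable,
                 bool io_error, bool workaround)
{
    if (readable && !t->tail_closed && io_error) {
        t->tail_closed = true;     /* a read error closes the tail */
        if (workaround && !t->peer_closed)
            t->head_closed = true; /* workaround: close the head as well */
    }
    if (writable && !t->head_closed && io_error)
        t->head_closed = true;     /* a send error closes the head */
}

/* Assumption of this model: done only when both sides are closed. */
bool toy_finished(const toy_transport *t)
{
    return t->head_closed && t->tail_closed;
}

/* Run passes on a socket that is readable but never writable.
 * Returns the pass on which the transport finished, or -1 if it never
 * did within max_passes (the hang described in this issue). */
int toy_run(bool workaround, int max_passes)
{
    toy_transport t = { false, false, false };
    for (int pass = 1; pass <= max_passes; pass++) {
        toy_process(&t, true, false, true, workaround);
        if (toy_finished(&t))
            return pass;
    }
    return -1;
}
```

In this model toy_run(false, 1000) never finishes and returns -1, while toy_run(true, 1000) finishes on the first pass, which mirrors the difference the workaround makes.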

> Qpid Proton Point to Point Hang on CentOS 6 pn_messenger_send
> -------------------------------------------------------------
>
>                 Key: PROTON-907
>                 URL: https://issues.apache.org/jira/browse/PROTON-907
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-c
>    Affects Versions: 0.8, 0.9.1
>         Environment: CentOS 6 (both VM and native 64-bit) and RHEL 6
>            Reporter: Frank Quinn
>            Priority: Critical
>         Attachments: PROTON-907-workaround.patch
>
>
> See thread at http://qpid.2158936.n2.nabble.com/Strange-behaviour-for-pn-messenger-send-on-CentOS-6-td7625846.html.
> Key points:
> * pn_messenger_send will hang on CentOS 6 if the destination is not yet up
> * Works fine on Fedora 21 and 22 (by 'fine', I mean it will attempt to send, fail and move on)
> * Can be recreated by running the send.c application when recv.c is not yet running
> * Proton burns CPU as it hangs
> This effectively deadlocks our application. So far, I’ve tried compiling Qpid Proton C myself (both 0.8 and 0.9.1), setting pn_messenger_send timeout to 1 (it was previously -1), turning off iptables entirely, and disabling SELinux and rebooting, but no luck.