Posted to proton@qpid.apache.org by "Pavel Moravec (JIRA)" <ji...@apache.org> on 2015/09/21 22:48:04 UTC

[jira] [Commented] (PROTON-1000) Connection leak on heartbeat-timeouted connections

    [ https://issues.apache.org/jira/browse/PROTON-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901368#comment-14901368 ] 

Pavel Moravec commented on PROTON-1000:
---------------------------------------

I think I have a reproducer based on Proton Reactor (derived from what gofer does):

{code}
#!/usr/bin/python

from time import sleep
from uuid import uuid4

from proton import ConnectionException
# SSLDomain / SSLException are not used in this non-SSL variant.
from proton import SSLDomain, SSLException

from proton.utils import BlockingConnection

domain = None
# Heartbeats enabled with a 5 second interval.
conn = BlockingConnection("proton+amqp://localhost:5672", ssl_domain=domain, heartbeat=5)
rec = conn.create_receiver("some_address", name=str(uuid4()), dynamic=False, options=None)
try:
  # Stay idle for longer than the heartbeat interval so the connection times out.
  sleep(9)
  snd = conn.create_sender("another_address", name=str(uuid4()))
except ConnectionException:
  try:
    # Closing the timed-out connection raises again (see backtrace below).
    conn.close()
  except Exception as e:
    print(e)
# Keep the process alive so its sockets can be inspected.
_in = raw_input("Check for CLOSE_WAIT before pressing Enter: ")
{code}

Run that code and, at the prompt, check that the python process has a connection in CLOSE_WAIT state.

Backtrace of the caught exception "e" is:

{code}
  File "proton-1000.py", line 24, in <module>
    conn.close()
  File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 219, in close
    msg="Closing connection")
  File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 231, in wait
    self.container.process()
  File "/usr/lib64/python2.7/site-packages/proton/reactor.py", line 143, in process
    self._check_errors()
  File "/usr/lib64/python2.7/site-packages/proton/__init__.py", line 3737, in dispatch
    ev.dispatch(self.handler)
  File "/usr/lib64/python2.7/site-packages/proton/__init__.py", line 3662, in dispatch
    result = dispatch(handler, type.method, self)
  File "/usr/lib64/python2.7/site-packages/proton/__init__.py", line 3551, in dispatch
    return m(*args)
  File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 257, in on_transport_tail_closed
    self.on_transport_closed(event)
  File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 261, in on_transport_closed
    raise ConnectionException("Connection %s disconnected" % self.url);
{code}

It is worth trying this with SSL as well, where I noticed slightly different behaviour; adding SSL to the reproducer should be trivial, though.

> Connection leak on heartbeat-timeouted connections
> --------------------------------------------------
>
>                 Key: PROTON-1000
>                 URL: https://issues.apache.org/jira/browse/PROTON-1000
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: python-binding
>    Affects Versions: 0.9
>            Reporter: Pavel Moravec
>            Assignee: Gordon Sim
>
> Using gofer/katello-agent, which uses BlockingConnection from Proton Reactor with heartbeats enabled, if a connection times out due to heartbeats, Proton does not close the TCP connection. This leaks the TCP connection, even though gofer properly called BlockingConnection.close() and dropped every reference to that class instance.
> According to tcpdump, Proton simply ignores timed-out connections: it does not respond in any way to whatever the communication partner sends. (In some scenarios the peer sends an AMQP performative that Proton was expected to answer; in another scenario the peer dropped the TCP connection by sending a FIN+ACK packet, but Proton did not send a FIN packet back. The only traffic seen in tcpdump is ACKs on the TCP layer sent by the OS, not by Proton.) Proton also ignores an attempt by Proton reactor to close the connection/container, raising:
> Sep 21 15:02:35 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 263, in on_transport_closed
> Sep 21 15:02:35 my-capsule goferd: raise ConnectionException("Connection %s disconnected" % self.url);
> Sep 21 15:02:35 my-capsule goferd: ConnectionException: Connection amqps://satellite.example.com:5647 disconnected
> for SSL connections, and raising:
> Sep 21 14:56:28 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 259, in on_transport_tail_closed
> Sep 21 14:56:28 my-capsule goferd: self.on_transport_closed(event)
> Sep 21 14:56:28 my-capsule goferd: File "/usr/lib64/python2.7/site-packages/proton/utils.py", line 263, in on_transport_closed
> Sep 21 14:56:28 my-capsule goferd: raise ConnectionException("Connection %s disconnected" % self.url);
> Sep 21 14:56:28 my-capsule goferd: ConnectionException: Connection amqps://satellite.example.com:5647 disconnected
> (Some of the difference between SSL and non-SSL could come from the fact that in my case the server side, qdrouterd / Qpid Dispatch Router, sends a FIN+ACK packet for a non-SSL connection, while for an SSL connection it sends nothing and keeps sending empty AMQP frames forever because heartbeats are enabled.)
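
To illustrate the TCP side of the description above (peer sends FIN, the OS ACKs it, but no FIN goes back until the application closes its socket), here is a small sketch using only plain Python sockets on loopback. It does not involve Proton at all; the "router" and "client" names are just stand-ins I chose for the two ends, and the helper names are hypothetical:

```python
import socket


def peer_closed(sock, timeout=1.0):
    """Return True if the peer has sent FIN, i.e. recv() reports end of stream."""
    sock.settimeout(timeout)
    try:
        return sock.recv(1) == b""
    except socket.timeout:
        return False  # neither data nor FIN arrived within the timeout


def demo():
    # Listening socket on an ephemeral loopback port (stand-in for the router).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    # Stand-in for the client end of the connection.
    client = socket.create_connection(listener.getsockname())
    router_conn, _ = listener.accept()

    # The router end closes first: a FIN is sent and the client's OS ACKs it.
    # From here on the client socket sits in CLOSE_WAIT.
    router_conn.close()

    # The application only learns about the FIN by reading from the socket.
    saw_fin = peer_closed(client)

    # Only this close() sends the client's own FIN and leaves CLOSE_WAIT;
    # skipping it is what a leaked CLOSE_WAIT connection looks like.
    client.close()
    listener.close()
    return saw_fin
```

Calling demo() should report that the FIN was seen. Pausing the process just before client.close() and inspecting its sockets would show the CLOSE_WAIT entry, which is the state the reproducer ends up in.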


