Posted to dev@qpid.apache.org by "Pavel Moravec (JIRA)" <ji...@apache.org> on 2014/05/06 15:43:14 UTC

[jira] [Created] (QPID-5747) Federated link ends up in Connecting state forever after connecting to shutting down broker

Pavel Moravec created QPID-5747:
-----------------------------------

             Summary:  Federated link ends up in Connecting state forever after connecting to shutting down broker
                 Key: QPID-5747
                 URL: https://issues.apache.org/jira/browse/QPID-5747
             Project: Qpid
          Issue Type: Bug
          Components: C++ Broker
    Affects Versions: 0.26
            Reporter: Pavel Moravec


Description of problem:
Consider a federation link between source broker S and destination broker D, where D initiates the TCP connection and messages flow from S to D. If the link attempts to reconnect to S while S is shutting down, the link can get stuck in the Connecting state forever.


Version-Release number of selected component (if applicable):
0.18-11, 0.18-14, 0.18-20


How reproducible:
100% after some time


Steps to Reproduce:
1. Mimic broker S with a simple Python program (saved as server.py):

import socket
import sys

# Create a TCP/IP socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Bind the socket to the port
server_address = ('localhost', 10000)
sys.stderr.write('starting up on %s port %s\n' % server_address)
sock.bind(server_address)
# Listen for incoming connections
sock.listen(1)

# Wait for a connection
sys.stderr.write('waiting for a connection\n')
connection, client_address = sock.accept()

2. In one terminal, run it in a loop:
while true; do python server.py; done

2a. Optionally, for observation, run tcpdump on port 10000.

3. In another terminal, create a federation link to this "server":
qpid-route link add localhost:5672 localhost:10000

4. Wait a few seconds, then generate any traffic to the broker to keep it busy, e.g.:
qpid-send -a amq.fanout -m 1000000 --content-size=1000

5. Once tcpdump stops logging new traffic, run the following as many times as you wish:
qpid-route link list

Actual results:
Every time, and indefinitely, the link state is Connecting, for example:

Host            Port    Transport Durable  State             Last Error
=============================================================================
localhost       10000   tcp          N     Connecting        Closed by peer

(It is expected that the python "server" cannot bind to port 10000 for some time because of "Address already in use", since the previous TCP connection is still in a FIN_WAIT-like state. However, even once the "server" can bind to the port again, the broker does not attempt to reconnect.)
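
To check step 5 mechanically rather than by eye, the listing can be parsed and compared across several runs. The following is only a sketch: link_states and stuck_in_connecting are hypothetical helper names, and the parser assumes the columns are separated by two or more spaces, as in the sample output above. It does not call qpid-route itself; the captured text is passed in as a string.

```python
import re

# Sample listing copied from the "Actual results" output in this report.
SAMPLE = """\
Host            Port    Transport Durable  State             Last Error
=============================================================================
localhost       10000   tcp          N     Connecting        Closed by peer
"""


def link_states(listing):
    """Return (host, port, state) for each data row of a link listing.

    Assumption: data rows follow the '=====' separator line and their
    columns are separated by at least two spaces.
    """
    rows = []
    seen_separator = False
    for line in listing.splitlines():
        if line.startswith("="):
            seen_separator = True
            continue
        if not seen_separator or not line.strip():
            continue
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 5:
            host, port, _transport, _durable, state = fields[:5]
            rows.append((host, int(port), state))
    return rows


def stuck_in_connecting(listings):
    """True if every sampled listing has links and all are in Connecting."""
    parsed = [link_states(text) for text in listings]
    return all(
        rows and all(state == "Connecting" for _, _, state in rows)
        for rows in parsed
    )
```

Feeding it the output of several consecutive runs of the command from step 5, taken a few seconds apart, would distinguish a link that is stuck from one that alternates between Waiting and Connecting.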


Expected results:
The link state flaps between Waiting and Connecting until the server is ready again and the link becomes Operational (which will not happen in this scenario because of the "server.py" implementation).


Additional info:
The key is that the qpid broker cannot send the initial "AMQP 0-10" frame to the peer. That is, the bug appears if and only if:
- the TCP connection is fully established (3-way handshake), so that the qpid::broker::connect method returns success,
- but it is closed so quickly that Link::established is not invoked, i.e. the broker does not react to the connection establishment.

That is why putting the broker under load helps and speeds up the reproducer.
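
The TCP-level condition described above can be observed with plain sockets, independently of qpid. The sketch below is an illustration only, not broker code: connect_then_peer_closes is a hypothetical name, and the listening socket merely stands in for the shutting-down broker S. It shows that connect() succeeds as soon as the kernel completes the handshake, before the peer has even called accept(), and that the peer can then close the connection before the client has sent anything.

```python
import socket


def connect_then_peer_closes(host="127.0.0.1"):
    """Connect to a listener that closes immediately; return what recv() sees."""
    # Listening socket standing in for broker S; port 0 lets the OS pick a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))
    server.listen(1)
    port = server.getsockname()[1]

    # Client standing in for broker D. connect() returns once the kernel has
    # completed the handshake, even though accept() has not been called yet.
    client = socket.create_connection((host, port), timeout=5)

    # The peer accepts and closes right away, much like server.py exiting
    # immediately after accept().
    conn, _ = server.accept()
    conn.close()
    server.close()

    # The client never got to send its protocol header; it only sees
    # end-of-stream, which recv() reports as an empty bytes object.
    data = client.recv(8)
    client.close()
    return data


if __name__ == "__main__":
    print(repr(connect_then_peer_closes()))
```

From the client's point of view the connect step succeeded, so any recovery has to be triggered by the subsequent close rather than by a connect failure.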


