Posted to users@qpid.apache.org by Fraser Adams <fr...@blueyonder.co.uk> on 2012/05/22 19:59:02 UTC

Transport endpoint not connected error???

Hi All,
on one of our brokers my colleagues have started seeing errors of the form:

error could not accept socket: Transport endpoint is not connected 
(qpid/sys/posix/Socket.cpp 58)

Does anybody have a good idea what is likely to be causing this?

The broker appears to be functioning and there are only a modest number 
of connections to it. We'd even upped the default connection limit "just 
in case", but the number of connections was way lower than that (only a 
couple of dozen or so).

Because the error seemed to relate to "accept" I did wonder if there 
was a backlog issue, but we don't seem to have lots of things trying to 
connect at once, so it shouldn't be an issue. Nonetheless we tried 
upping the backlog "just in case"; again, it doesn't seem to have 
changed things.

We still appear to be able to connect to the broker, use the qpid 
tools and even create federated links, but I don't like seeing this 
error. I've got no idea what it relates to and can't see what we might 
be doing wrong.

This is in a qpid 0.8 C++ broker.

I had a look through Socket.cpp and the error seems to be kicked off in 
getName()

     int result = -1;
     if (local) {
         result = ::getsockname(fd, (::sockaddr*)&name, &namelen);
     } else {
         result = ::getpeername(fd, (::sockaddr*)&name, &namelen);
     }

     QPID_POSIX_CHECK(result);

The QPID_POSIX_CHECK is on line 58, which is the line mentioned in the 
error message.
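
For readers unfamiliar with this style of check, here is a minimal 
sketch of how such a macro could turn a failed POSIX call into a message 
tagged with the errno text and a source location, which is the shape of 
the error quoted above. SKETCH_POSIX_CHECK and messageForBadDescriptor 
are hypothetical names made up for illustration; the real 
QPID_POSIX_CHECK definition is not quoted in this thread and may well 
differ.

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>
#include <sys/socket.h>

// Hypothetical stand-in for QPID_POSIX_CHECK (the real definition is not
// shown in this thread). Throws if 'result' is negative, tagging the message
// with the errno text and the file and line where the check sits.
#define SKETCH_POSIX_CHECK(result)                                            \
    do {                                                                      \
        if ((result) < 0) {                                                   \
            const int savedErrno = errno;                                     \
            throw std::runtime_error(std::string(std::strerror(savedErrno)) + \
                " (" + __FILE__ + " " + std::to_string(__LINE__) + ")");      \
        }                                                                     \
    } while (0)

// Calls getsockname() on a deliberately invalid descriptor and returns the
// message the check raises, or an empty string if nothing was thrown.
std::string messageForBadDescriptor() {
    sockaddr_storage name{};
    socklen_t namelen = sizeof(name);
    try {
        int result = ::getsockname(-1, reinterpret_cast<sockaddr*>(&name), &namelen);
        SKETCH_POSIX_CHECK(result);
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return std::string();
}
```

The point of the sketch is only that the file and line in the message 
identify where the check lives, not which of the two system calls 
failed, so the errno text is the main clue.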


Looking through the man pages for getsockname and getpeername, 
getsockname doesn't seem to have any relevant errors, but getpeername 
mentions:

        ENOTCONN
               The socket is not connected.

Which looks to be a likely place for the error to have originated.
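
A small stand-alone check of that reasoning, as a sketch rather than 
anything taken from qpid: on a TCP socket that has never been connected, 
the local lookup is expected to succeed while the peer lookup is 
expected to fail with ENOTCONN. The helper names queryName and 
probeUnconnectedSocket are hypothetical; only the two system calls come 
from the excerpt above.

```cpp
#include <cassert>
#include <cerrno>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Returns 0 on success, otherwise the errno set by the failing call.
// 'local' mirrors the flag used in the getName() excerpt.
int queryName(int fd, bool local) {
    sockaddr_storage name{};
    socklen_t namelen = sizeof(name);
    int result = local
        ? ::getsockname(fd, reinterpret_cast<sockaddr*>(&name), &namelen)
        : ::getpeername(fd, reinterpret_cast<sockaddr*>(&name), &namelen);
    return result == 0 ? 0 : errno;
}

// Creates a TCP socket that is never connected and reports, through the two
// out-parameters, what each lookup returns for it.
void probeUnconnectedSocket(int& localErr, int& peerErr) {
    int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    assert(fd >= 0 && "socket() failed");
    localErr = queryName(fd, true);   // local name exists even before bind/connect
    peerErr  = queryName(fd, false);  // there is no peer, so this should fail
    ::close(fd);
}
```

Calling probeUnconnectedSocket(l, p) and comparing p against ENOTCONN 
is enough to confirm which branch of getName() can produce this error.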


So this looks like the result of a remote (to the broker) connection 
endpoint not being present, but I'm not at all sure what circumstances 
would cause this in a qpid broker. The brokers have been restarted and 
the host has been rebooted, and we're still seeing these messages 
appear regularly (every few seconds). We don't have anything 
(knowingly!!) persisted.

Some more digging suggests that Socket::getName is called by 
getPeername and getPeerAddress. Grepping for getPeername seems to 
indicate that nothing actually calls it; doing the same for 
getPeerAddress points to qpid/client/TCPConnector.cpp as the most likely 
caller. I'm not an expert in the code base, though, and this seems odd 
as the error is (I believe) in the broker logs.

I'm wondering if there's something fishy going on with a federated 
connection. I'm guessing that could potentially show up as a client 
error; presumably a source route might be implemented as if it were a 
client?

I'd love to know if anyone else has seen this.

regards,
Frase

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org


Re: Transport endpoint not connected error???

Posted by CLIVE <cl...@ckjltd.co.uk>.
Frase,

It sounds like your endpoint is terminating the connection before accept 
has had a chance to 'accept' the TCP connection and return a valid 
socket descriptor.
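
One way to try this out locally is sketched below: a client completes 
the handshake on loopback and then aborts the connection before the 
server calls accept(), after which the server looks up the peer name. 
This is an illustration under assumptions, not the broker's code: the 
function name peerLookupAfterEarlyReset is hypothetical, and the outcome 
depends on platform and timing (the lookup may fail with ENOTCONN, 
accept() itself may fail, or everything may succeed).

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Returns 0 if getpeername() succeeded on the accepted socket, otherwise the
// errno from whichever step failed first (ETIMEDOUT if nothing was queued).
int peerLookupAfterEarlyReset() {
    int listener = ::socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) return errno;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the kernel pick a free port
    socklen_t len = sizeof(addr);
    if (::bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        ::listen(listener, 1) < 0 ||
        ::getsockname(listener, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        int err = errno;
        ::close(listener);
        return err;
    }

    // Client: complete the handshake, then abort straight away. SO_LINGER with
    // a zero timeout asks close() to reset the connection rather than shut it
    // down in an orderly way.
    int client = ::socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0) { int err = errno; ::close(listener); return err; }
    if (::connect(client, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        int err = errno; ::close(client); ::close(listener); return err;
    }
    struct linger abortOnClose{};
    abortOnClose.l_onoff = 1;
    abortOnClose.l_linger = 0;
    ::setsockopt(client, SOL_SOCKET, SO_LINGER, &abortOnClose, sizeof(abortOnClose));
    ::close(client);

    // Server: only now take the connection off the backlog. Poll first so the
    // sketch cannot block forever if the aborted connection was discarded.
    pollfd waiter{};
    waiter.fd = listener;
    waiter.events = POLLIN;
    int ready = ::poll(&waiter, 1, 1000);
    if (ready <= 0) {
        int err = (ready == 0) ? ETIMEDOUT : errno;
        ::close(listener);
        return err;
    }
    int accepted = ::accept(listener, nullptr, nullptr);
    if (accepted < 0) { int err = errno; ::close(listener); return err; }

    sockaddr_storage peer{};
    socklen_t peerLen = sizeof(peer);
    int result = ::getpeername(accepted, reinterpret_cast<sockaddr*>(&peer), &peerLen);
    int err = (result == 0) ? 0 : errno;
    ::close(accepted);
    ::close(listener);
    return err;
}
```

If this returns ENOTCONN on the broker's platform, anything that opens 
and immediately drops connections to the broker port (a health check or 
port probe, say) would be a plausible candidate for the regular errors.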

Have you tried using tcpdump to see what TCP handshakes are going on 
(tcpdump dst port 5672)?

Clive


