You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@qpid.apache.org by Jen Andre <ja...@gmail.com> on 2010/02/09 23:28:37 UTC

Connection: read timed out

Hi all,

I get a connection "Read timed out" exception after a while.  I'm testing
using a modified version of a Java example (that just runs in a loop 5000
times: send 1 msg, sleep) running on Linux (connecting via localhost to the
0.5 c++ broker).  I'm using the java client libs from the 0.6 tree (though
it happens with 0.5 as well).  

After a while, a mysterious read time out appears.  Here's the error:

DEBUG   2010-02-09 18:11:11,505 org.apache.qpid.transport.Connection    main   
FLUSH: [conn:15311bd]
ERROR   2010-02-09 18:11:11,552
org.apache.qpid.client.AMQConnectionDelegate_0_10       IoReceiver -
localhost/127.0.0.1:5672   previous exception
org.apache.qpid.transport.ConnectionException: Read timed out
        at
org.apache.qpid.transport.Connection.exception(Connection.java:414)
        at
org.apache.qpid.transport.network.Assembler.exception(Assembler.java:108)
        at
org.apache.qpid.transport.network.InputHandler.exception(InputHandler.java:197)
        at
org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:145)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(Unknown Source)
        at
org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:123)
        ... 1 more
DEBUG   2010-02-09 18:11:11,554 org.apache.qpid.transport.Connection   
IoReceiver - localhost/127.0.0.1:5672   connection closed: conn:15311bd
DEBUG   2010-02-09 18:11:11,558 org.apache.qpid.jms.FailoverPolicy     
IoReceiver - localhost/127.0.0.1:5672   All failover methods exhausted
ERROR   2010-02-09 18:11:11,559
org.apache.qpid.client.AMQConnectionDelegate_0_10       IoReceiver -
localhost/127.0.0.1:5672   connection exception: conn:15311bd
org.apache.qpid.transport.ConnectionException: connection aborted
        at org.apache.qpid.transport.Connection.closed(Connection.java:437)
        at
org.apache.qpid.transport.network.Assembler.closed(Assembler.java:113)
        at
org.apache.qpid.transport.network.InputHandler.closed(InputHandler.java:202)
        at
org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:150)
        at java.lang.Thread.run(Unknown Source)


------------------

there appears to be nothing in the broker log file except the recorded
disconnect:

2010-feb-09 18:11:11 debug DISCONNECTED [127.0.0.1:58825]
2010-feb-09 18:11:11 debug guest@QPID.97865187-7ea0-498b-83e2-45aa59054241:
detached on broker.


I've attached my sample code, configuration file, and the output from the
client trace (client.log) and the broker (qpidd.log).

NOTE: this happens even when connecting to a broker outside of localhost. 
HOWEVER: if I run this program on an external machine and sends messages to
the broker, there is no timeout exception.   It also works fine with a
similar python test (loop and send 1 msg).  I think, therefore, the problem
is in the client lib (not the broker).

I'm using a linux VM with java version "1.6.0_18", if that is helpful.

Any advice on troubleshooting this would be appreciated.

http://n2.nabble.com/file/n4544370/debug.zip debug.zip 


-- 
View this message in context: http://n2.nabble.com/Connection-read-timed-out-tp4544370p4544370.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org


Re: Connection: read timed out

Posted by Rajith Attapattu <ra...@gmail.com>.
On Mon, Feb 15, 2010 at 7:50 PM, Jen Andre <ja...@gmail.com> wrote:
>
> Hello, just to let you know, I found out the problem was VMWare timekeeping
> issues with RHEL 5.2:
>
> When I applied the fix (add divider=10 and clocksource=acpi_pm to the kernel
> parameters) from the guidelines here:
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006427
> There was no longer an issue with setSoTimeout() and heartbeating.

Nice work !
Thanks for keeping us posted. I will keep this in mind.

> View this message in context: http://n2.nabble.com/Connection-read-timed-out-tp4544370p4578036.html
> Sent from the Apache Qpid users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>



-- 
Regards,

Rajith Attapattu
Red Hat
http://rajith.2rlabs.com/

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org


Re: Connection: read timed out

Posted by Jen Andre <ja...@gmail.com>.
Hello, just to let you know, I found out the problem was VMWare timekeeping
issues with RHEL 5.2:

When I applied the fix (add divider=10 and clocksource=acpi_pm to the kernel
parameters) from the guidelines here:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006427

There was no longer an issue with setSoTimeout() and heartbeating.
-- 
View this message in context: http://n2.nabble.com/Connection-read-timed-out-tp4544370p4578036.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org


Re: Connection: read timed out

Posted by Rajith Attapattu <ra...@gmail.com>.
On Thu, Feb 11, 2010 at 6:26 PM, Jen Andre <ja...@gmail.com> wrote:
>
> Thanks! I pulled the version from trunk, and indeed disabling qpid.heartbeat
> did indeed fix it.

Thanks, great to hear that !

> I also wrote a simple java TCP client that connects to a server (in my case,
> just netcat -l) and prints messages it gets and the current timestamp.  I
> ran it on the "bad server", then sent it some messages periodically. if I
> setSoTimeout() on the socket for 120000 ms, I get the
> "SocketTimeoutException: read timed out", even though the last message was
> received 13 seconds ago.  If I don't set the timeout, the program runs fine
> forever.
>
> So: something strange is happening on this VM, possibly.

I really appreciate you trying out this experiment !
It seems there is a possible bug either in the VM or the JVM impl.

>
>
> --
> View this message in context: http://n2.nabble.com/Connection-read-timed-out-tp4544370p4558127.html
> Sent from the Apache Qpid users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>



-- 
Regards,

Rajith Attapattu
Red Hat
http://rajith.2rlabs.com/

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org


Re: Connection: read timed out

Posted by Jen Andre <ja...@gmail.com>.
Thanks! I pulled the version from trunk, and indeed disabling qpid.heartbeat
did indeed fix it.

I also wrote a simple java TCP client that connects to a server (in my case,
just netcat -l) and prints messages it gets and the current timestamp.  I
ran it on the "bad server", then sent it some messages periodically. if I
setSoTimeout() on the socket for 120000 ms, I get the
"SocketTimeoutException: read timed out", even though the last message was
received 13 seconds ago.  If I don't set the timeout, the program runs fine
forever. 

So: something strange is happening on this VM, possibly.   



-- 
View this message in context: http://n2.nabble.com/Connection-read-timed-out-tp4544370p4558127.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org


Re: Connection: read timed out

Posted by Rajith Attapattu <ra...@gmail.com>.
On Thu, Feb 11, 2010 at 11:15 AM, Jen Andre <ja...@gmail.com> wrote:
>
> "The "Read timed out" is due to the java client not receiving anything
> from the broker for two consecutive heartbeats. "
>
> It seems from the inner exception (java.net.SocketTimeoutException: Read
> timed out) that the error is a TCP/IP socket timeout... I added some code to
> the that prints out SO_TIMEOUT and that seems to be set to 0 (unlimited) so
> it is indeed strange.

Yes it should be a SocketTimeoutException. That is how we handle the heartbeat.
I set the so_timeout to twice the value of the heartbeat interval.
The so_timeout should not be zero.
In your case, since the heartbeat interval is 120 secs, the so_timeout
should be 240 secs.
So the socket should only through a read timeout if it doesn't see
anything on the pipe for 240 secs.

> This happens reliably (every single time it runs) *only* on this one
> machine.
>
> "However you mentioned that if you run the java client from another
> machine it doesn't happen. What is the difference btw these two
> machines? "
>
> In reality, I'm not sure what the difference is between the machines... they
> are both RHEL Linux VM images that started out the same, but one is in our
> QA environment.  The JVM is the exact same.   I can investigate CPU and
> load, but I imagine they would not be too high since neither is production
> (a quick netstat and lsof doesn't indicate a bizarre # of sockets open).
> Running the code from a windows machine connecting to linux broker results
> in no timeouts either.

We've had other clients using the heartbeats successfully and it seems
that it works well in your other env.
So it could be related to the environment.

Would you like to disable the heartbeats on this env and try?
However you would need to use the trunk version as there is bug in 0.5
and 0.6 where heartbeats cannot be disabled :(

You can disable the heartbeats as follows

As Broker property in the connection URL
=====================================
amqp://guest:guest@client/test?brokerlist='tcp://localhost?heartbeat='0'

Or as a System property
=======================
-Dqpid.heartbeat=0

Lets us know how it goes.

Regards,

Rajith

>
> --
> View this message in context: http://n2.nabble.com/Connection-read-timed-out-tp4544370p4555517.html
> Sent from the Apache Qpid users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>



-- 
Regards,

Rajith Attapattu
Red Hat
http://rajith.2rlabs.com/

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org


Re: Connection: read timed out

Posted by Jen Andre <ja...@gmail.com>.
"The "Read timed out" is due to the java client not receiving anything
from the broker for two consecutive heartbeats. "

It seems from the inner exception (java.net.SocketTimeoutException: Read
timed out) that the error is a TCP/IP socket timeout... I added some code to
the that prints out SO_TIMEOUT and that seems to be set to 0 (unlimited) so
it is indeed strange.  

This happens reliably (every single time it runs) *only* on this one
machine.

"However you mentioned that if you run the java client from another
machine it doesn't happen. What is the difference btw these two
machines? "

In reality, I'm not sure what the difference is between the machines... they
are both RHEL Linux VM images that started out the same, but one is in our
QA environment.  The JVM is the exact same.   I can investigate CPU and
load, but I imagine they would not be too high since neither is production
(a quick netstat and lsof doesn't indicate a bizarre # of sockets open).  
Running the code from a windows machine connecting to linux broker results
in no timeouts either.  


-- 
View this message in context: http://n2.nabble.com/Connection-read-timed-out-tp4544370p4555517.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org


Re: Connection: read timed out

Posted by Rajith Attapattu <ra...@gmail.com>.
The "Read timed out" is due to the java client not receiving anything
from the broker for two consecutive heartbeats.
I looked at the logs and was able to figure out that
1). The heartbeat interval was 120 secs (the default)
2). The client "read timeout happened" at 18:11:11
3). The last response the client received from the broker was at 18:10:47
4). The last response the broker sent (as per the broker log) was at 18:10:50

I am not sure why a "read timeout" error was thrown at 18:11:11 when
the client logs indicate a command received at 18:10:47.
The time elapsed btw the two events is less than one heartbeart interval.

Could you reproduce this consistently? You mentioned if you connect to
a broker on another machine you still get the same result.
However you mentioned that if you run the java client from another
machine it doesn't happen. What is the difference btw these two
machines?
Is it the same Java JVM? What about CPU and load on these two machines?

Rajith

On Tue, Feb 9, 2010 at 5:28 PM, Jen Andre <ja...@gmail.com> wrote:
>
> Hi all,
>
> I get a connection "Read timed out" exception after a while.  I'm testing
> using a modified version of a Java example (that just runs in a loop 5000
> times: send 1 msg, sleep) running on Linux (connecting via localhost to the
> 0.5 c++ broker).  I'm using the java client libs from the 0.6 tree (though
> it happens with 0.5 as well).
>
> After a while, a mysterious read time out appears.  Here's the error:
>
> DEBUG   2010-02-09 18:11:11,505 org.apache.qpid.transport.Connection    main
> FLUSH: [conn:15311bd]
> ERROR   2010-02-09 18:11:11,552
> org.apache.qpid.client.AMQConnectionDelegate_0_10       IoReceiver -
> localhost/127.0.0.1:5672   previous exception
> org.apache.qpid.transport.ConnectionException: Read timed out
>        at
> org.apache.qpid.transport.Connection.exception(Connection.java:414)
>        at
> org.apache.qpid.transport.network.Assembler.exception(Assembler.java:108)
>        at
> org.apache.qpid.transport.network.InputHandler.exception(InputHandler.java:197)
>        at
> org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:145)
>        at java.lang.Thread.run(Unknown Source)
> Caused by: java.net.SocketTimeoutException: Read timed out
>        at java.net.SocketInputStream.socketRead0(Native Method)
>        at java.net.SocketInputStream.read(Unknown Source)
>        at
> org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:123)
>        ... 1 more
> DEBUG   2010-02-09 18:11:11,554 org.apache.qpid.transport.Connection
> IoReceiver - localhost/127.0.0.1:5672   connection closed: conn:15311bd
> DEBUG   2010-02-09 18:11:11,558 org.apache.qpid.jms.FailoverPolicy
> IoReceiver - localhost/127.0.0.1:5672   All failover methods exhausted
> ERROR   2010-02-09 18:11:11,559
> org.apache.qpid.client.AMQConnectionDelegate_0_10       IoReceiver -
> localhost/127.0.0.1:5672   connection exception: conn:15311bd
> org.apache.qpid.transport.ConnectionException: connection aborted
>        at org.apache.qpid.transport.Connection.closed(Connection.java:437)
>        at
> org.apache.qpid.transport.network.Assembler.closed(Assembler.java:113)
>        at
> org.apache.qpid.transport.network.InputHandler.closed(InputHandler.java:202)
>        at
> org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:150)
>        at java.lang.Thread.run(Unknown Source)
>
>
> ------------------
>
> there appears to be nothing in the broker log file except the recorded
> disconnect:
>
> 2010-feb-09 18:11:11 debug DISCONNECTED [127.0.0.1:58825]
> 2010-feb-09 18:11:11 debug guest@QPID.97865187-7ea0-498b-83e2-45aa59054241:
> detached on broker.
>
>
> I've attached my sample code, configuration file, and the output from the
> client trace (client.log) and the broker (qpidd.log).
>
> NOTE: this happens even when connecting to a broker outside of localhost.
> HOWEVER: if I run this program on an external machine and sends messages to
> the broker, there is no timeout exception.   It also works fine with a
> similar python test (loop and send 1 msg).  I think, therefore, the problem
> is in the client lib (not the broker).
>
> I'm using a linux VM with java version "1.6.0_18", if that is helpful.
>
> Any advice on troubleshooting this would be appreciated.
>
> http://n2.nabble.com/file/n4544370/debug.zip debug.zip
>
>
> --
> View this message in context: http://n2.nabble.com/Connection-read-timed-out-tp4544370p4544370.html
> Sent from the Apache Qpid users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>



-- 
Regards,

Rajith Attapattu
Red Hat
http://rajith.2rlabs.com/

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org