Posted to dev@qpid.apache.org by "Keith Wall (JIRA)" <ji...@apache.org> on 2012/05/22 17:40:42 UTC
[jira] [Comment Edited] (QPID-3912) Client failover fails to
reconnect if a previous attempted reconnection has failed 'late' in the
connection start process.
[ https://issues.apache.org/jira/browse/QPID-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281010#comment-13281010 ]
Keith Wall edited comment on QPID-3912 at 5/22/12 3:40 PM:
-----------------------------------------------------------
There is a second dimension to this defect which affects 0-8..0-9-1 only.
It becomes apparent if the failover parameters include a short connectdelay=x parameter. If x is short (<200ms on my box) and a connection fails, there is a race condition which means that the state transition from the previous connection (CLOSING_CONNECTION => CLOSED_CONNECTION) can occur whilst the main thread is trying to reconnect (AMQConnectionDelegate_8_0.makeBrokerConnection).
This problem manifests itself in a couple of ways:
1) "CRAM-MD5 authentication already completed". Here the above problem effectively allows a loop to form in the client, which emits a stream of ProtocolInitiation messages down the wire. The Broker replies to each with a ConnectionStart, and this goes on to confuse the SASL authentication on the client. It ends with the exception:
{code}
java.lang.IllegalStateException: CRAM-MD5 authentication already completed
at com.sun.security.sasl.CramMD5Client.evaluateChallenge(CramMD5Client.java:75)
at org.apache.qpid.client.handler.ConnectionSecureMethodHandler.methodReceived(ConnectionSecureMethodHandler.java:55)
at org.apache.qpid.client.handler.ClientMethodDispatcherImpl.dispatchConnectionSecure(ClientMethodDispatcherImpl.java:216)
at org.apache.qpid.framing.amqp_0_91.ConnectionSecureBodyImpl.execute(ConnectionSecureBodyImpl.java:110)
at org.apache.qpid.client.state.AMQStateManager.methodReceived(AMQStateManager.java:114)
at org.apache.qpid.client.protocol.AMQProtocolHandler.methodBodyReceived(AMQProtocolHandler.java:479)
at org.apache.qpid.client.protocol.AMQProtocolSession.methodFrameReceived(AMQProtocolSession.java:456)
at org.apache.qpid.framing.AMQMethodBodyImpl.handle(AMQMethodBodyImpl.java:97)
at org.apache.qpid.client.protocol.AMQProtocolHandler.received(AMQProtocolHandler.java:436)
at org.apache.qpid.client.protocol.AMQProtocolHandler.received(AMQProtocolHandler.java:121)
at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:152)
{code}
2) OutOfMemoryError/Unsupported frame type: 10
{code}
IoReceiver - localhost/127.0.0.1:10000 2012-05-21 16:11:59,551 DEBUG [apache.qpid.client.AMQConnection] exceptionReceived done by:IoReceiver - localhost/127.0.0.1:10000
java.lang.OutOfMemoryError: Java heap space
at org.apache.qpid.framing.EncodingUtils.readBytes(EncodingUtils.java:941)
at org.apache.qpid.framing.AMQMethodBodyImpl.readBytes(AMQMethodBodyImpl.java:186)
at org.apache.qpid.framing.amqp_0_91.ConnectionStartBodyImpl.<init>(ConnectionStartBodyImpl.java:77)
at org.apache.qpid.framing.amqp_0_91.ConnectionStartBodyImpl$1.newInstance(ConnectionStartBodyImpl.java:44)
at org.apache.qpid.framing.amqp_0_91.MethodRegistry_0_91.convertToBody(MethodRegistry_0_91.java:214)
at org.apache.qpid.framing.AMQMethodBodyFactory.createBody(AMQMethodBodyFactory.java:44)
at org.apache.qpid.framing.AMQFrame.<init>(AMQFrame.java:45)
at org.apache.qpid.framing.AMQDataBlockDecoder.createAndPopulateFrame(AMQDataBlockDecoder.java:99)
at org.apache.qpid.codec.AMQDecoder.decodeBuffer(AMQDecoder.java:250)
at org.apache.qpid.client.protocol.AMQProtocolHandler.received(AMQProtocolHandler.java:408)
at org.apache.qpid.client.protocol.AMQProtocolHandler.received(AMQProtocolHandler.java:121)
at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:152)
at java.lang.Thread.run(Thread.java:662)
IoReceiver - localhost/127.0.0.1:10000 2012-05-21 16:11:59,551 ERROR [qpid.client.protocol.AMQProtocolHandler] Exception processing frame
org.apache.qpid.framing.AMQFrameDecodingException: Unsupported frame type: 10
at org.apache.qpid.framing.AMQDataBlockDecoder.createAndPopulateFrame(AMQDataBlockDecoder.java:86)
at org.apache.qpid.codec.AMQDecoder.decodeBuffer(AMQDecoder.java:250)
at org.apache.qpid.client.protocol.AMQProtocolHandler.received(AMQProtocolHandler.java:408)
at org.apache.qpid.client.protocol.AMQProtocolHandler.received(AMQProtocolHandler.java:121)
at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:152)
at java.lang.Thread.run(Thread.java:662)
{code}
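For illustration only, here is one general way to stop a late state transition from a previous connection interfering with a new connection attempt: tag every attempt with a generation token and drop transitions that carry an old token. This is a minimal sketch, not the fix applied for QPID-3912. The class GenerationGuardedState and its methods are hypothetical and do not exist in the Qpid client, and the state names are simplified stand-ins.

```java
/**
 * Hypothetical sketch: not Qpid code. Each connection attempt gets a token;
 * transitions reported with an older token are treated as stale and ignored.
 */
class GenerationGuardedState {
    enum State { CONNECTING, OPEN, CLOSING, CLOSED }

    private long generation = 0;
    private State state = State.CLOSED;

    /** Called by the thread that starts a (re)connection attempt. */
    synchronized long beginConnect() {
        generation++;
        state = State.CONNECTING;
        return generation;
    }

    /**
     * Called by whichever thread observes a state change. Returns false,
     * leaving the state untouched, if the token belongs to an earlier attempt.
     */
    synchronized boolean transition(long token, State next) {
        if (token != generation) {
            return false; // stale event from a previous connection
        }
        state = next;
        return true;
    }

    synchronized State current() {
        return state;
    }
}
```

With this arrangement, a CLOSING => CLOSED transition that arrives from the old connection after the main thread has called beginConnect() again is rejected instead of overwriting the new attempt's state.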
was (Author: k-wall):
There is a second dimension to this defect which affects 0-8..0-9-1 only.
It becomes apparent if the failover parameters include a short connectdelay=x parameter. If x is short (<200ms on my box) and a connection fails, there is a race condition which means that the state transition from the previous connection (CLOSING_CONNECTION => CLOSED_CONNECTION) can occur whilst the main thread is trying to reconnect (AMQConnectionDelegate_8_0.makeBrokerConnection).
> Client failover fails to reconnect if a previous attempted reconnection has failed 'late' in the connection start process.
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: QPID-3912
> URL: https://issues.apache.org/jira/browse/QPID-3912
> Project: Qpid
> Issue Type: Bug
> Components: Java Client
> Affects Versions: 0.17
> Reporter: Keith Wall
> Assignee: Keith Wall
> Priority: Minor
> Fix For: 0.17
>
>
> A client uses failover so that it can reconnect to a second broker in the event of failure of the primary.
> There is a defect in the Qpid Java client's failover code that means if an attempted reconnection fails 'late' in the connection start process, then the AMQConnection _closed flag gets set permanently to true and this prevents all future use of the AMQConnection object, even after a successful reconnection. By 'late' I mean a failure after the TCP/IP connection has been successfully established - such as an authentication or authorisation problem that causes the Broker to decide to close the connection.
> The problem affects both 0-10 and 0-8..0-9-1 code paths.
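The shape of the defect described above can be sketched in a few lines. This is an illustration under stated assumptions, not the actual AMQConnection code or the patch for this issue: the class FailoverAwareConnection and its methods are hypothetical, and only the idea of a 'closed' flag that must be cleared after a successful reconnection is taken from the description.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch: not Qpid code. Models a 'closed' flag across failover. */
class FailoverAwareConnection {
    private final AtomicBoolean closed = new AtomicBoolean(false);

    /** A 'late' failure (e.g. the broker closes after authentication fails) sets the flag. */
    void onAttemptFailed() {
        closed.set(true);
    }

    /**
     * Assumed remedy for the sketch: a successful reconnection clears the flag.
     * Without this step the flag stays true and every later use fails.
     */
    void onReconnected() {
        closed.set(false);
    }

    /** Guard that operations on the connection would call before doing work. */
    void checkNotClosed() {
        if (closed.get()) {
            throw new IllegalStateException("Connection is closed");
        }
    }
}
```

If onReconnected() never clears the flag, checkNotClosed() keeps throwing even though the connection to the second broker is healthy, which matches the symptom reported here.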