You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by "Rajith Attapattu (JIRA)" <ji...@apache.org> on 2011/04/20 22:36:10 UTC

[jira] [Issue Comment Edited] (QPID-3214) Deadlock between the failover mutex (in AMQConnection.java) and the current_exception_lock (in AMQSession.java)

    [ https://issues.apache.org/jira/browse/QPID-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022341#comment-13022341 ] 

Rajith Attapattu edited comment on QPID-3214 at 4/20/11 8:34 PM:
-----------------------------------------------------------------

After looking at the code, I believe the exception handling is not exactly correct.

1. For starters we shouldn't really notify the connection's exception listener when there is a session level error. If we avoid that we can certainly get rid of the deadlock. The JMS API doc clearly states,
   "If a JMS provider detects a serious problem with a Connection  object, it informs the Connection object's ExceptionListener, if one has been registered. It does this by calling the listener's onException method, passing it a JMSException argument describing the problem. "

  So certainly execution exceptions which only invalidates the session should not be notified via the exception listener.

2. Going further if one looks at the AMQConnection.java exceptionReceived() method, it again looks wrong to me.
   It seems that an execution exception can in fact close the underlying connection (and the rest of the sessions) due to the following piece of logic.

   if (hardError(cause))
   {
       closer = (!_closed.getAndSet(true)) || closer;
       {
            _logger.info("Closing AMQConnection due to :" + cause);
       }
   }

   Digging further for any AMQException hardError method will return true.
   Therefore we need to rework that piece of code.

3. The 0-10 code client (if using the Java IO transport) has a separate code path to notify the Connection if a connection level error is there.
   Therefore we definitely do not need to notify anything from the session.

I believe the MINA transport notifies the IOException etc via the ProtocolSession, perhaps the reason for this code in the first place.
For now I believe we need to do two things.

1. Remove the following line from AMQSession_0_10.setCurrentException(SessionException se) method 
   _connection.exceptionReceived(amqe);

2. Fix the AMQConnection.isHardError method and the AMQException.isHardError method.



      was (Author: rajith):
    After looking at the code, I believe the exception handling is not exactly correct.

1. For starters we shouldn't really notify the connection's exception listener when there is a session level error. If we avoid that we can certainly get rid of the deadlock. The JMS API doc clearly states,
   "If a JMS provider detects a serious problem with a Connection  object, it informs the Connection object's ExceptionListener, if one has been registered. It does this by calling the listener's onException method, passing it a JMSException argument describing the problem. "

  So certainly execution exceptions which only invalidates the session should not be notified via the exception listener.

2. Going further if one looks at the AMQConnection.java exceptionReceived() method, it again looks wrong to me.
   It seems that an execution exception can in fact close the underlying connection (and the rest of the sessions) due to the following piece of logic.

   if (hardError(cause))
   {
       closer = (!_closed.getAndSet(true)) || closer;
       {
            _logger.info("Closing AMQConnection due to :" + cause);
       }
   }

   Digging further for any AMQException hardError method will return true.
   Therefore we need to rework that piece of code.

3. The 0-10 code client (if using the Java IO transport) has a separate code path to notify the Connection if a connection level error is there.
   Therefore we definitely do not need to notify anything from the session.

I believe the MINA transport notifies the IOException etc via the ProtocolSession, perhaps the reason for this code in the first place.
For now I believe we need to do two things.

1. Remove the following lines from AMQSession_0_10.setCurrentException(SessionException se) method 
   AMQException amqe = new AMQException(AMQConstant.getConstant(code), se.getMessage(), se.getCause());
   _connection.exceptionReceived(amqe);

2. Fix the AMQConnection.isHardError method and the AMQException.isHardError method.


  
> Deadlock between the failover mutex (in AMQConnection.java) and the current_exception_lock (in AMQSession.java)
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: QPID-3214
>                 URL: https://issues.apache.org/jira/browse/QPID-3214
>             Project: Qpid
>          Issue Type: Bug
>          Components: Java Client
>    Affects Versions: 0.10
>            Reporter: Rajith Attapattu
>            Assignee: Rajith Attapattu
>            Priority: Critical
>             Fix For: 0.11
>
>         Attachments: QPID-3214.patch
>
>
> As per the following thread dump you can clearly see the deadlock between the failover mutex in AMQConnection.java and the current_exception_lock in AMQSession.java
> This is a regression and was introduced in rev 985262
>   Found one Java-level deadlock:
>   =============================
>   "IoReceiver - localhost/127.0.0.1:15672":
>     waiting to lock monitor 0x0000002ac2ea3b70 (object 0x0000002ab70156b0, a java.lang.Object),
>     which is held by "main"
>   "main":
>     waiting to lock monitor 0x0000002ac28db1b8 (object 0x0000002ab7048d70, a java.lang.Object),
>     which is held by "IoReceiver - localhost/127.0.0.1:15672"
>   Java stack information for the threads listed above:
>   ===================================================
>   "IoReceiver - localhost/127.0.0.1:15672":
>      at org.apache.qpid.client.AMQConnection.exceptionReceived(AMQConnection.java:1297)
>      - waiting to lock<0x0000002ab70156b0>  (a java.lang.Object)
>      at org.apache.qpid.client.AMQSession_0_10.setCurrentException(AMQSession_0_10.java:1033)
>      - locked<0x0000002ab7048d70>  (a java.lang.Object)
>      at org.apache.qpid.client.AMQSession_0_10.exception(AMQSession_0_10.java:913)
>      at org.apache.qpid.transport.SessionDelegate.executionException(SessionDelegate.java:156)
>      at org.apache.qpid.transport.SessionDelegate.executionException(SessionDelegate.java:32)
>      at org.apache.qpid.transport.ExecutionException.dispatch(ExecutionException.java:112)
>      at org.apache.qpid.transport.SessionDelegate.command(SessionDelegate.java:50)
>      at org.apache.qpid.transport.SessionDelegate.command(SessionDelegate.java:32)
>      at org.apache.qpid.transport.Method.delegate(Method.java:159)
>      at org.apache.qpid.transport.Session.received(Session.java:528)
>      at org.apache.qpid.transport.Connection.dispatch(Connection.java:404)
>      at org.apache.qpid.transport.ConnectionDelegate.handle(ConnectionDelegate.java:64)
>      at org.apache.qpid.transport.ConnectionDelegate.handle(ConnectionDelegate.java:40)
>      at org.apache.qpid.transport.MethodDelegate.executionException(MethodDelegate.java:110)
>      at org.apache.qpid.transport.ExecutionException.dispatch(ExecutionException.java:112)
>      at org.apache.qpid.transport.ConnectionDelegate.command(ConnectionDelegate.java:54)
>      at org.apache.qpid.transport.ConnectionDelegate.command(ConnectionDelegate.java:40)
>      at org.apache.qpid.transport.Method.delegate(Method.java:159)
>      at org.apache.qpid.transport.Connection.received(Connection.java:369)
>      at org.apache.qpid.transport.Connection.received(Connection.java:59)
>      at org.apache.qpid.transport.network.Assembler.emit(Assembler.java:95)
>      at org.apache.qpid.transport.network.Assembler.assemble(Assembler.java:196)
>      at org.apache.qpid.transport.network.Assembler.frame(Assembler.java:129)
>      at org.apache.qpid.transport.network.Frame.delegate(Frame.java:133)
>      at org.apache.qpid.transport.network.Assembler.received(Assembler.java:100)
>      at org.apache.qpid.transport.network.Assembler.received(Assembler.java:42)
>      at org.apache.qpid.transport.network.InputHandler.next(InputHandler.java:187)
>      at org.apache.qpid.transport.network.InputHandler.received(InputHandler.java:103)
>      at org.apache.qpid.transport.network.InputHandler.received(InputHandler.java:42)
>      at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:128)
>      at java.lang.Thread.run(Thread.java:619)
>   "main":
>      at org.apache.qpid.client.AMQSession_0_10.setCurrentException(AMQSession_0_10.java:1025)
>      - waiting to lock<0x0000002ab7048d70>  (a java.lang.Object)
>      at org.apache.qpid.client.BasicMessageConsumer_0_10.sendCancel(BasicMessageConsumer_0_10.java:193)
>      at org.apache.qpid.client.BasicMessageConsumer.close(BasicMessageConsumer.java:573)
>      - locked<0x0000002ab70156b0>  (a java.lang.Object)
>      at org.apache.qpid.client.BasicMessageConsumer.close(BasicMessageConsumer.java:535)
>      at org.apache.qpid.client.AMQQueueBrowser.close(AMQQueueBrowser.java:102)
>      at org.apache.qpid.test.client.QueueBrowserAutoAckTest.testFailoverWithQueueBrowser(QueueBrowserAutoAckTest.java:501)
>      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>      at java.lang.reflect.Method.invoke(Method.java:597)
>      at junit.framework.TestCase.runTest(TestCase.java:154)
>      at junit.framework.TestCase.runBare(TestCase.java:127)
>      at org.apache.qpid.test.utils.QpidBrokerTestCase.runBare(QpidBrokerTestCase.java:234)
>      at junit.framework.TestResult$1.protect(TestResult.java:106)
>      at junit.framework.TestResult.runProtected(TestResult.java:124)
>      at junit.framework.TestResult.run(TestResult.java:109)
>      at junit.framework.TestCase.run(TestCase.java:118)
>      at org.apache.qpid.test.utils.QpidTestCase.run(QpidTestCase.java:120)
>      at junit.framework.TestSuite.runTest(TestSuite.java:208)
>      at junit.framework.TestSuite.run(TestSuite.java:203)
>      at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>      at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>      at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:546)
>   Found 1 deadlock

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org