Posted to dev@qpid.apache.org by "Alex Rudyy (JIRA)" <ji...@apache.org> on 2015/08/05 17:50:06 UTC
[jira] [Comment Edited] (QPID-3521) failover process for the 0-8 client does not clear the pre-dispatch queue
[ https://issues.apache.org/jira/browse/QPID-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653924#comment-14653924 ]
Alex Rudyy edited comment on QPID-3521 at 8/5/15 3:50 PM:
----------------------------------------------------------
It seems that the changes implemented in revision [r1693542|https://svn.apache.org/r1693542] might cause a deadlock on the 0-9 path when a Session is closed whilst failover is in progress. Here is a thread dump demonstrating the issue:
{noformat}
"Failover" prio=10 tid=0x00007fe0d804e000 nid=0x657c waiting on condition [0x00007fe0cf1f0000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000f41c2528> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:236)
at org.apache.qpid.client.AMQSession.drainDispatchQueue(AMQSession.java:2306)
at org.apache.qpid.client.AMQSession.drainDispatchQueueWithDispatcher(AMQSession.java:3697)
at org.apache.qpid.client.AMQSession_0_8.resubscribe(AMQSession_0_8.java:186)
at org.apache.qpid.client.AMQConnectionDelegate_8_0.resubscribeSessions(AMQConnectionDelegate_8_0.java:379)
at org.apache.qpid.client.AMQConnection.resubscribeSessions(AMQConnection.java:1387)
at org.apache.qpid.client.failover.FailoverHandler.run(FailoverHandler.java:221)
- locked <0x00000000f35412c0> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:745)
"Dispatcher-2-Conn-84" prio=10 tid=0x00007fe1341c0000 nid=0x657a waiting for monitor entry [0x00007fe0cf4f3000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.qpid.client.AMQSession$Dispatcher.dispatchMessage(AMQSession.java:3492)
- waiting to lock <0x00000000f35c8e78> (a java.lang.Object)
- locked <0x00000000f3cbea70> (a java.lang.Object)
at org.apache.qpid.client.AMQSession$Dispatcher.access$1000(AMQSession.java:3279)
at org.apache.qpid.client.AMQSession.dispatch(AMQSession.java:3272)
at org.apache.qpid.client.message.UnprocessedMessage.dispatch(UnprocessedMessage.java:54)
at org.apache.qpid.client.AMQSession$Dispatcher.run(AMQSession.java:3410)
- locked <0x00000000f3cbea70> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:745)
"main" prio=10 tid=0x00007fe134008800 nid=0x58f1 waiting for monitor entry [0x00007fe13d822000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.qpid.client.AMQSession.close(AMQSession.java:728)
- waiting to lock <0x00000000f35412c0> (a java.lang.Object)
- locked <0x00000000f35c8e78> (a java.lang.Object)
at org.apache.qpid.client.AMQSession.close(AMQSession.java:447)
at org.apache.qpid.client.failover.FailoverBehaviourTest.sessionCloseWhileFailoverImpl(FailoverBehaviourTest.java:1705)
at org.apache.qpid.client.failover.FailoverBehaviourTest.testClientAcknowledgedSessionCloseWhileFailover(FailoverBehaviourTest.java:702)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at junit.framework.TestCase.runTest(TestCase.java:176)
at org.apache.qpid.test.utils.QpidTestCase.runTest(QpidTestCase.java:171)
at junit.framework.TestCase.runBare(TestCase.java:141)
at org.apache.qpid.test.utils.QpidBrokerTestCase.runBare(QpidBrokerTestCase.java:332)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
at org.apache.qpid.test.utils.QpidTestCase.run(QpidTestCase.java:156)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
{noformat}
In the thread dump above, Session#close() is invoked from the "main" thread. As part of Session#close(), _messageDeliveryLock (<0x00000000f35c8e78>) is acquired, and the main thread then waits for the failover mutex (<0x00000000f35412c0>). That mutex is held by the "Failover" thread, which is itself waiting for the Dispatcher thread to drain the pre-dispatch queue. However, the Dispatcher thread requires _messageDeliveryLock to perform the clean-up, so it remains in the BLOCKED state and the application hangs.
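The wait cycle described above can be reproduced in isolation. The following is only a minimal sketch, not the Qpid client code: the class name, the method drainCompletes and the three helper latches are hypothetical, and the two plain Object monitors merely stand in for the failover mutex and _messageDeliveryLock. Unlike the real drainDispatchQueue() call in the dump, which awaits the latch without a limit, the sketch uses a timed await so that the demonstration terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class FailoverCloseDeadlockSketch {

    /** Returns true if the "drain" finished in time, false if the lock cycle stalled it. */
    static boolean drainCompletes(long timeoutMillis) throws InterruptedException {
        final Object failoverMutex = new Object();        // stand-in for the failover mutex
        final Object messageDeliveryLock = new Object();  // stand-in for _messageDeliveryLock
        final CountDownLatch failoverHoldsMutex = new CountDownLatch(1);
        final CountDownLatch closerHoldsDeliveryLock = new CountDownLatch(1);
        final CountDownLatch queueDrained = new CountDownLatch(1);
        final AtomicBoolean drained = new AtomicBoolean(false);

        // "Failover": holds the failover mutex while waiting for the dispatcher to drain.
        Thread failover = new Thread(() -> {
            synchronized (failoverMutex) {
                failoverHoldsMutex.countDown();
                try {
                    closerHoldsDeliveryLock.await();
                    // Timed wait (an assumption of this sketch) so the demo can finish.
                    drained.set(queueDrained.await(timeoutMillis, TimeUnit.MILLISECONDS));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "Failover");

        // "main" in the dump: takes the delivery lock, then needs the failover mutex.
        Thread closer = new Thread(() -> {
            try {
                failoverHoldsMutex.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (messageDeliveryLock) {
                closerHoldsDeliveryLock.countDown();
                synchronized (failoverMutex) {
                    // Session close would continue here.
                }
            }
        }, "Closer");

        // "Dispatcher": needs the delivery lock before it can signal the drain.
        Thread dispatcher = new Thread(() -> {
            try {
                closerHoldsDeliveryLock.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (messageDeliveryLock) {
                // Clean-up of the pre-dispatch queue would happen here.
            }
            queueDrained.countDown();
        }, "Dispatcher");

        failover.start();
        closer.start();
        dispatcher.start();
        failover.join();
        closer.join();
        dispatcher.join();
        return drained.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // prints "drain completed: false"
        System.out.println("drain completed: " + drainCompletes(500));
    }
}
```

In the sketch the drain can only complete after the "Failover" thread gives up and releases its mutex, which is why the timed wait always expires. With an unbounded await, as in the dump, none of the three threads can make progress.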
was (Author: alex.rufous):
It seems that the changes might cause a deadlock on the 0-9 path when a Session is closed whilst failover is in progress.
In the thread dump above, Session#close() is invoked from the "main" thread. As part of Session#close(), _messageDeliveryLock was acquired, and the main thread is waiting for the failover mutex, which is held by the "Failover" thread. The "Failover" thread, in turn, is waiting for a Dispatcher thread to drain the pre-dispatch queue. However, the Dispatcher thread requires _messageDeliveryLock to perform the clean-up. Thus, it is in the BLOCKED state, causing the application to hang.
> failover process for the 0-8 client does not clear the pre-dispatch queue
> -------------------------------------------------------------------------
>
> Key: QPID-3521
> URL: https://issues.apache.org/jira/browse/QPID-3521
> Project: Qpid
> Issue Type: Bug
> Components: Java Client
> Reporter: Robbie Gemmell
> Assignee: Keith Wall
> Labels: failover
> Attachments: clear-dispatch-queue-on-failover.diff
>
>
> The failover process for the 0-8 client does not clear the pre-dispatch queue, only the consumer receive queue.
> This is currently masked by an issue with the rollbackMark. The changes made in QPID-3546 to fix the 0-10 client path need to be applied to the 0-8/9/9-1 client path when this issue is resolved.