Posted to dev@qpid.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2016/04/05 12:59:25 UTC

[jira] [Commented] (QPID-7185) ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved fails sporadically on Apache CI

    [ https://issues.apache.org/jira/browse/QPID-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15226069#comment-15226069 ] 

ASF subversion and git services commented on QPID-7185:
-------------------------------------------------------

Commit 1737820 from [~k-wall] in branch 'java/trunk'
[ https://svn.apache.org/r1737820 ]

QPID-7185: [Java Tests] HA - Avoid possibility of a node restart during testReplicationGroupListenerHearsNodeRemoved

> ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved fails sporadically on Apache CI
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: QPID-7185
>                 URL: https://issues.apache.org/jira/browse/QPID-7185
>             Project: Qpid
>          Issue Type: Bug
>          Components: Java Tests
>            Reporter: Keith Wall
>            Priority: Minor
>             Fix For: qpid-java-6.1
>
>
> The test {{testReplicationGroupListenerHearsNodeRemoved}} failed in the following way on the Apache CI host:
> {noformat}
> org.apache.qpid.server.store.StoreException: Exception on node removal from group
> 	at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:426)
> 	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getException(ReplicationGroupAdmin.java:504)
> 	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.doMessageExchange(ReplicationGroupAdmin.java:474)
> 	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.removeMember(ReplicationGroupAdmin.java:245)
> 	at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.removeNodeFromGroup(ReplicatedEnvironmentFacade.java:1284)
> 	at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved(ReplicatedEnvironmentFacadeTest.java:377)
> {noformat}
> The underlying exception was as follows:
> {noformat}
> 2016-04-03 23:19:00,667 ERROR [main] o.a.q.s.u.ServerScopedRuntimeException Exception on node removal from group
> com.sleepycat.je.EnvironmentFailureException: (JE 5.0.104) (JE 5.0.104) Transaction -20 cannot execute write operations because this node is no longer a master UNEXPECTED_STATE: Unexpected internal state, may have side effects.
> 	at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:426) ~[je-5.0.104.jar:5.0.104]
> 	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.getException(ReplicationGroupAdmin.java:504) ~[je-5.0.104.jar:5.0.104]
> 	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.doMessageExchange(ReplicationGroupAdmin.java:474) ~[je-5.0.104.jar:5.0.104]
> 	at com.sleepycat.je.rep.util.ReplicationGroupAdmin.removeMember(ReplicationGroupAdmin.java:245) ~[je-5.0.104.jar:5.0.104]
> 	at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.removeNodeFromGroup(ReplicatedEnvironmentFacade.java:1284) ~[classes/:na]
> 	at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacadeTest.testReplicationGroupListenerHearsNodeRemoved(ReplicatedEnvironmentFacadeTest.java:377) [test-classes/:na]
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_80]
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_80]
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_80]
> 	at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_80]
> 	at junit.framework.TestCase.runTest(TestCase.java:176) [junit-4.11.jar:na]
> 	at org.apache.qpid.test.utils.QpidTestCase.runTest(QpidTestCase.java:171) [qpid-test-utils-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
> 	at junit.framework.TestCase.runBare(TestCase.java:141) [junit-4.11.jar:na]
> 	at junit.framework.TestResult$1.protect(TestResult.java:122) [junit-4.11.jar:na]
> 	at junit.framework.TestResult.runProtected(TestResult.java:142) [junit-4.11.jar:na]
> 	at junit.framework.TestResult.run(TestResult.java:125) [junit-4.11.jar:na]
> 	at junit.framework.TestCase.run(TestCase.java:129) [junit-4.11.jar:na]
> 	at org.apache.qpid.test.utils.QpidTestCase.run(QpidTestCase.java:156) [qpid-test-utils-6.1.0-SNAPSHOT.jar:6.1.0-SNAPSHOT]
> 	at junit.framework.TestSuite.runTest(TestSuite.java:255) [junit-4.11.jar:na]
> 	at junit.framework.TestSuite.run(TestSuite.java:250) [junit-4.11.jar:na]
> 	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84) [junit-4.11.jar:na]
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) [surefire-junit4-2.17.jar:2.17]
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) [surefire-junit4-2.17.jar:2.17]
> 	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) [surefire-junit4-2.17.jar:2.17]
> 	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) [surefire-booter-2.17.jar:2.17]
> 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) [surefire-booter-2.17.jar:2.17]
> 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) [surefire-booter-2.17.jar:2.17]
> {noformat}
> The node that was the target of the {{ReplicationGroupAdmin.removeMember}} call was, at that moment, being restarted because the majority had been lost. This seems to have provoked an unexpected exception from within JE.
> The test is concerned with ensuring the listener fires correctly in response to changes in group membership.  This test can avoid the possibility of a mastership loss simply by setting designated primary to true.
> As changing the consistency of a group whilst a production system is live would be an unusual thing to do, the chances of this manifesting in production are small. If it were to happen, a node restart would be required to restore service.
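
To illustrate why setting designated primary avoids the mastership loss described above, here is a minimal sketch. It is a toy model only, not Qpid or JE code: the class DesignatedPrimarySketch and the method canRemainMaster are hypothetical names introduced for illustration. It assumes the simplified rule that a master needs a strict majority of the group reachable (counting itself), and that the designated-primary setting (in JE, ReplicationMutableConfig.setDesignatedPrimary) only has an effect in two-node groups.

```java
public class DesignatedPrimarySketch {

    /**
     * Hypothetical, simplified model of the mastership rule.
     *
     * @param groupSize         number of electable nodes in the group
     * @param reachableNodes    nodes currently reachable, including this one
     * @param designatedPrimary whether this node is the designated primary
     * @return true if, under this toy model, the node may stay master
     */
    static boolean canRemainMaster(int groupSize, int reachableNodes, boolean designatedPrimary) {
        // A strict majority of the group must be reachable.
        boolean hasMajority = reachableNodes > groupSize / 2;
        // Assumption: designated primary only applies to two-node groups.
        boolean primaryOverride = designatedPrimary && groupSize == 2;
        return hasMajority || primaryOverride;
    }

    public static void main(String[] args) {
        // Two-node group, the other node has gone away.
        System.out.println("without designated primary: " + canRemainMaster(2, 1, false));
        System.out.println("with designated primary:    " + canRemainMaster(2, 1, true));
    }
}
```

Under this model, a master in a two-node group that loses sight of its peer gives up mastership unless it is the designated primary, which is the window the test change closes.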


