You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@activemq.apache.org by "Robbie Gemmell (Jira)" <ji...@apache.org> on 2021/08/20 15:37:00 UTC

[jira] [Commented] (ARTEMIS-3435) ActiveMQTestBase.checkLibaio waits can cause hours to be added to failing test suite

    [ https://issues.apache.org/jira/browse/ARTEMIS-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402275#comment-17402275 ] 

Robbie Gemmell commented on ARTEMIS-3435:
-----------------------------------------

The check is operating on a static valid that is incremented and decremented as the LibaioContext instances are created and closed. If a single instance fails to be cleaned up correct, all subsequent test classes using ActiveMQTestBase fail, after 30 seconds of waiting in the cleanup.

The fact it is static means it isnt easy to account around a prior failure as change may still happen after a delayed time, and it is hard to identify whether it happened in the detecting class or a previous one that perhaps doesnt use ActiveMQTestBase, so I think the simplest thing for now is to make the detail available the clearest it can be, and fail things as fast as possible instead of taking hours extra, helping reduce the vast amount of output and narrowing down a window when the issue occurred that could then be looked at more closely.

I have prepared a patch so the after class check will now record the failure if it is >0 after 1 wait, then future classes will fail immediately without running, reporting the class that detected the issue so its logs can be checked. Prior to this they will also log if the value was 0 before the class started (in case it is a non-ActiveMQTestBase class causing it).

> ActiveMQTestBase.checkLibaio waits can cause hours to be added to failing test suite
> ------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-3435
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3435
>             Project: ActiveMQ Artemis
>          Issue Type: Test
>          Components: Tests
>    Affects Versions: 2.18.0
>            Reporter: Robbie Gemmell
>            Priority: Major
>
> I recently observered a 'full test' run take over an extra 3 hours, essentially doubling the time to 6hrs 40mins (repeated, it took 3hrs 30mins), with nearly 450 failures in the integration-tests module mostly all due to the same 'test did not close all its files 4096' reason.
> Investigating, I found that all the failures appeared linked and were likely caused by a single prior test having done something awry, with the effect then continuing to fail all subsequent tests classes (after they had run their tests, successfully or otherwise) using the same ActiveMQTestBase parent class for the same reason. Before invoking the fail it waits 30 seconds for the ActiveMQTestBase.checkLibaio check to assert a value hits 0after the test class, which it repeatedly didnt. This added 400+ 30sec waits to the test run, failing those 400+ test classes.
> I wasnt able to get all the classes in the output summary, the continued test output filled the buffer, but I got a couple snippets:
> {noformat}
> [ERROR] Failures: 
> [ERROR] QuorumFailOverTest.testQuorumVotingLiveNotDead:143->ClusterTestBase.verifyReceiveRoundRobinInSomeOrder:1167->Assert.assertNotNull:713->Assert.assertTrue:42->Assert.fail:89 msg must exist
> [ERROR] PluggableQuorumReplicaTimeoutTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PluggableQuorumReplicatedDistributionTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PluggableQuorumReplicatedLargeMessageFailoverTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PluggableQuorumReplicatedLargeMessageWithDelayFailoverTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PluggableQuorumReplicatedPagingFailoverTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] FailoverWithSharedStoreTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] MultiThreadRandomReattachTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] NettyMultiThreadRandomReattachTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] OrderReattachTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] RandomReattachTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> ...lots missing...
> {noformat}
> {noformat}
> ...lots missing...
> [ERROR] FederationBrokerPluginTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] JvmMetricsTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] MetricsPluginTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] MqttPluginTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] OpenwirePluginTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ResourceBrokerPluginTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompPluginTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] XmlConfigPluginTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ActiveMQActivationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ActiveMQClusteredTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ActiveMQMessageHandlerSecurityTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ActiveMQMessageHandlerTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ActiveMQMessageHandlerXATest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] IgnoreJTATest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] JMSContextTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] OutgoingConnectionJTATest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] OutgoingConnectionNoJTATest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] OutgoingConnectionTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ResourceAdapterTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] BatchDelayTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] DestroyConsumerTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] DirectDeliverTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] NettyNetworkAddressTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] NettySynchronousCloseTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PingTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ReconnectTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SynchronousCloseTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] HQClientProtocolManagerTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PluggableQuorumReplicationFlowControlTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PluggableQuorumReplicationOrderTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PluggableQuorumReplicationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ReplicationOrderTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ReplicationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SharedNothingReplicationFlowControlTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SharedNothingReplicationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] RestDeserializationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] DelayedMessageTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] MultipliedDelayedMessageTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ScheduledMessageTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] BasicSecurityManagerFailoverTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] BasicSecurityManagerTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] JMSXUserIDPluginTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] NettySecurityClientTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PluggableQuorumBasicSecurityManagerFailoverTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] RejectValidatedUserTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SecurityPerAcceptorJmsTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SecurityPerAcceptorTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SecurityTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] AIOFileLockTimeoutTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ActivationCallbackTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ActivationFailureListenerTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] AddressQueryTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] AddressQueueDeleteDelayTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] AutoCreateDeadLetterResourcesTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] AutoCreateExpiryResourcesTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ConfigurationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ConnectionLimitTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ExpiryRunnerTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] GracefulShutdownTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] LVQRecoveryTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] LVQTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] MessageExpirationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] NIOFileLockTimeoutTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PotentialOOMELoggingTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] PredefinedQueueTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] QueuePeristPauseTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] QueueQueryTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ResourceLimitTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] RetroactiveAddressFailoverTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] RetroactiveAddressTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] RingQueueTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ScaleDown3NodeTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ScaleDownDirectTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ScaleDownRemoveSFTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ScaleDownTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SecureConfigurationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SimpleStartStopTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SuppliedThreadPoolTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] TempQueueNamespaceTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SpringIntegrationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] CoreClientOverOneWaySSLTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] CoreClientOverTwoWaySSLTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] DualAuthenticationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SSLProviderTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SSLProviderTwoWayTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] FQQNStompTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompConnectionCleanupTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompPropertiesInterceptorTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompWebSocketMaxFrameTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompWebSocketTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompWithClientIdValidationTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompWithInterceptorsTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompWithMessageIDTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompWithRejectingInterceptorTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompWithSecurityPerAcceptorTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompWithSecurityTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] ExtraStompTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompV11Test>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompV11Test.testSubscribeWithAutoAck:1946->Assert.assertNull:748->Assert.assertNull:738->Assert.failNotNull:756->Assert.fail:89 expected null, but was:<ActiveMQMessage[ID:8bff2aab-ffae-11eb-8bd9-38baf858f51b]:PERSISTENT/ClientMessageImpl[messageID=18, durable=true, address=testQueue,userID=8bff2aab-ffae-11eb-8bd9-38baf858f51b,properties=TypedProperties[__AMQ_CID=8bfe6757-ffae-11eb-8bd9-38baf858f51b,_AMQ_ROUTING_TYPE=1]]>
> [ERROR] StompV12Test>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] StompV12Test.testAckModeAuto:1235->Assert.assertNull:748->Assert.assertNull:738->Assert.failNotNull:756->Assert.fail:89 expected null, but was:<ActiveMQMessage[ID:e364ab94-ffae-11eb-8bd9-38baf858f51b]:PERSISTENT/ClientMessageImpl[messageID=83, durable=true, address=testQueue,userID=e364ab94-ffae-11eb-8bd9-38baf858f51b,properties=TypedProperties[__AMQ_CID=e3573edf-ffae-11eb-8bd9-38baf858f51b,_AMQ_ROUTING_TYPE=1]]>
> [ERROR] StompV12Test.testRedeliveryWithClientAck:1689->Assert.assertNotNull:723->Assert.assertNotNull:713->Assert.assertTrue:42->Assert.fail:87
> [ERROR] ActiveMQFrameDecoder2Test>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] NettyConnectorWithHTTPUpgradeTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] BasicXaRecoveryTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] BasicXaTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] SessionFailureXATest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [ERROR] XaTimeoutTest>ActiveMQTestBase.checkLibaio:2061->Assert.fail:89 test did not close all its files 4096
> [INFO] 
> [ERROR] Tests run: 9282, Failures: 446, Errors: 0, Skipped: 125
> [INFO] 
> [ERROR] There are test failures.
> {noformat}
>  
> There were also various instaces of 'AMQ229062: Acceptor with id 0 already registered' near the end that may or may not be related, e.g:
> {noformat}
> [Thread-0 (ActiveMQ-scheduled-threads)] 00:09:34,169 WARN [org.apache.activemq.artemis.core.server] AMQ222103: transaction with xid XidImpl (644867901 bq:120.97.54 formatID:1 gtxid:50.101.49.57.101.99.101.48.45.102.102.98.48.45.49.49.101.98.45.56.98.100.57.45.51.56.98.97.102.56.53.56.102.53.49.98 base64:eGE2MmUxOWVjZTAtZmZiMC0xMWViLThiZDktMzhiYWY4NThmNTFiAQAAAA== timed out
> [main] 00:09:34,334 ERROR [org.apache.activemq.artemis.core.server] AMQ224104: Error starting the Acceptor invm \{serverId=0}
> [main] 00:09:34,334 ERROR [org.apache.activemq.artemis.core.server] AMQ224000: Failure in initialisation: java.lang.IllegalArgumentException: AMQ229062: Acceptor with id 0 already registered
>  at org.apache.activemq.artemis.core.remoting.impl.invm.InVMRegistry.registerAcceptor(InVMRegistry.java:32)
>  at org.apache.activemq.artemis.core.remoting.impl.invm.InVMAcceptor.start(InVMAcceptor.java:139)
>  at org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl.startAcceptors(RemotingServiceImpl.java:318)
>  at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.completeActivation(ActiveMQServerImpl.java:3347)
>  at org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.run(LiveOnlyActivation.java:78)
>  at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.internalStart(ActiveMQServerImpl.java:642)
>  at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.start(ActiveMQServerImpl.java:561)
>  at org.apache.activemq.artemis.tests.integration.xa.XaTimeoutTest.setUp(XaTimeoutTest.java:111)
> ...etc...
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)