Posted to dev@sling.apache.org by "Carsten Ziegeler (JIRA)" <ji...@apache.org> on 2015/11/30 09:44:11 UTC

[jira] [Closed] (SLING-5310) MinEventDelayHandler should have a cancel method

     [ https://issues.apache.org/jira/browse/SLING-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carsten Ziegeler closed SLING-5310.
-----------------------------------

> MinEventDelayHandler should have a cancel method
> ------------------------------------------------
>
>                 Key: SLING-5310
>                 URL: https://issues.apache.org/jira/browse/SLING-5310
>             Project: Sling
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: Discovery Commons 1.0.4
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>            Priority: Minor
>             Fix For: Discovery Commons 1.0.6
>
>
> The {{ViewStateManagerImpl}} delegates to the {{MinEventDelayHandler}} the task of delaying a {{TOPOLOGY_CHANGED}} event by a few seconds, to avoid overly frequent switching when multiple instances come and go. However, when the {{ViewStateManagerImpl}} is stopped (via {{handleDeactivated}}), the {{MinEventDelayHandler}} does not notice. As a result, it may continue in the following loop: {{triggerAsyncDelaying}} schedules a runnable to be triggered after a delay (3 seconds by default). When the runnable is triggered, it checks the state of the view. If the view is not current (which is typically the case after deactivation), it reschedules itself, on the assumption that the view will eventually become current/stable again. That assumption normally holds and is a good way to guarantee that the view change is eventually announced. After deactivation, however, the view will likely not become current again, so the {{MinEventDelayHandler}} keeps spinning in this 3-second loop forever, or until the ViewStateManager is reactivated.
> For normal operation this behavior is not a problem at all (hence priority minor).
> For testing, however, it has the side effect that this loop spans into subsequent tests and can potentially interfere with them.
> One instance of such interference was noticed in the following failing test on Jenkins:
> https://builds.apache.org/job/sling-trunk-1.7/org.apache.sling$org.apache.sling.discovery.impl/2751/testReport/org.apache.sling.discovery.impl.common.heartbeat/HeartbeatTest/testPartitioning/
> {code}
> java.lang.AssertionError: expected:<TOPOLOGY_INIT> but was:<TOPOLOGY_CHANGED>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:144)
> 	at org.apache.sling.discovery.impl.common.heartbeat.HeartbeatTest.doTestPartitioning(HeartbeatTest.java:285)
> 	at org.apache.sling.discovery.impl.common.heartbeat.HeartbeatTest.testPartitioning(HeartbeatTest.java:143)
> {code}
> where one 'issue heartbeat' operation triggered from {{doTestPartitioning}} took over 5 seconds:
> {code}
> 17.11.2015 23:50:37.033 *DEBUG* [main] DiscoveryServiceImpl: updateProperties: done.
> 17.11.2015 23:50:37.033 *DEBUG* [main] HeartbeatHandler: issueClusterLocalHeartbeat: storing cluster-local heartbeat to repository for fe88cbb1-f967-48c5-a58d-30fd137909cc
> 17.11.2015 23:50:42.707 *DEBUG* [main] HeartbeatHandler: issueConnectorPings: not issuing remote heartbeat yet, startup not yet finished
> 17.11.2015 23:50:42.724 *DEBUG* [main] fe88cbb1-f967-48c5-a58d-30fd137909cc: analyzeVotings: start. slingId: fe88cbb1-f967-48c5-a58d-30fd137909cc
> 17.11.2015 23:50:43.081 *DEBUG* [main] VotingHelper: listVotings: votings found: 0
> 17.11.2015 23:50:43.081 *DEBUG* [main] fe88cbb1-f967-48c5-a58d-30fd137909cc: analyzeVotings: no ongoing votings at the moment. done.
> 17.11.2015 23:50:43.082 *DEBUG* [main] HeartbeatHandler: doCheckView: established view matches with expected.
> 17.11.2015 23:50:43.082 *DEBUG* [main] HeartbeatHandler: doCheckViewWith: no pending nor winning votes. view is fine. we're all happy.
> {code}
> The only explanation found so far is that the thread pool that normally processes background jobs was busy with all the scheduled jobs left over from previous tests.
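
The rescheduling loop described in the issue, and the effect of the requested cancel method, can be illustrated with a minimal Java sketch. This is not the Sling implementation: the class {{DelayHandlerSketch}}, its constructor parameters, and {{runCount()}} are hypothetical names chosen for illustration, and a plain {{ScheduledExecutorService}} stands in for whatever scheduler Discovery Commons actually uses. Only {{triggerAsyncDelaying}} and the idea of a cancel method are taken from the issue text.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a delay handler; NOT the actual Sling class. */
class DelayHandlerSketch {
    private final ScheduledExecutorService scheduler; // assumed scheduler
    private final BooleanSupplier viewIsCurrent;      // assumed view-state check
    private final Runnable announceChange;            // assumed event sender
    private final long delayMillis;
    private final AtomicBoolean cancelled = new AtomicBoolean(false);
    private final AtomicInteger runs = new AtomicInteger();

    DelayHandlerSketch(ScheduledExecutorService scheduler,
                       BooleanSupplier viewIsCurrent,
                       Runnable announceChange,
                       long delayMillis) {
        this.scheduler = scheduler;
        this.viewIsCurrent = viewIsCurrent;
        this.announceChange = announceChange;
        this.delayMillis = delayMillis;
    }

    /** Schedules the delayed check, unless the handler has been cancelled. */
    void triggerAsyncDelaying() {
        if (cancelled.get()) {
            return;
        }
        scheduler.schedule(this::onDelayElapsed, delayMillis, TimeUnit.MILLISECONDS);
    }

    private void onDelayElapsed() {
        if (cancelled.get()) {
            return; // the guard that breaks the loop after deactivation
        }
        runs.incrementAndGet();
        if (viewIsCurrent.getAsBoolean()) {
            announceChange.run();   // view is stable: announce the change
        } else {
            triggerAsyncDelaying(); // view not current: try again later
        }
    }

    /** Stops any further rescheduling; intended to be called on deactivation. */
    void cancel() {
        cancelled.set(true);
    }

    /** Number of delayed checks executed so far (for illustration only). */
    int runCount() {
        return runs.get();
    }
}
```

In this sketch, a handler whose view never becomes current keeps rescheduling itself until {{cancel()}} is called, after which a pending runnable returns without doing any work. Calling such a method from the deactivation path would keep leftover runnables from occupying the scheduler during subsequent tests.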


