You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "Geode Integration (Jira)" <ji...@apache.org> on 2022/06/03 20:59:00 UTC

[jira] [Commented] (GEODE-8589) make locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted test deterministic

    [ https://issues.apache.org/jira/browse/GEODE-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17547029#comment-17547029 ] 

Geode Integration commented on GEODE-8589:
------------------------------------------

Seen in [integration-test-openjdk11 #395|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/integration-test-openjdk11/builds/395] ... see [test results|http://files.apachegeode-ci.info/builds/apache-develop-main/1.16.0-build.0310/test-results/integrationTest/1654277786/] or download [artifacts|http://files.apachegeode-ci.info/builds/apache-develop-main/1.16.0-build.0310/test-artifacts/1654277786/integrationtestfiles-openjdk11-1.16.0-build.0310.tgz].

> make locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted test deterministic
> -----------------------------------------------------------------------------------
>
>                 Key: GEODE-8589
>                 URL: https://issues.apache.org/jira/browse/GEODE-8589
>             Project: Geode
>          Issue Type: Improvement
>          Components: membership
>            Reporter: Bill Burcham
>            Priority: Major
>              Labels: membership, pull-request-available
>
> A recent commit added a new test: {{MembershipIntegrationTest.locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()}}
> That's a good test. But it would be better if it ran faster and was less susceptible to timing issues. The problem is that the logic we are trying to test, {{GMSJoinLeave.leave()}} uses {{System.currentTimeMillis()}} to get the current time and it uses {{Thread.sleep()}} to sleep:
> {code:java}
>         long now = System.currentTimeMillis();
> ...
>             if (state.joinedMembersContacted <= 0 && ((now >= locatorGiveUpTime &&
>                 tries >= minimumRetriesBeforeBecomingCoordinator) ||
>                 state.locatorsContacted >= locators.size())) {
> ...
>             if (System.currentTimeMillis() > giveupTime) {
>               break;
>             }
> ...
>               Thread.sleep(retrySleep);
> ...
>               giveupTime = System.currentTimeMillis() + timeout;
> ...
>       if (!this.isJoined) {
>         logger.debug("giving up attempting to join the distributed system after "
>             + (System.currentTimeMillis() - startTime) + "ms");
>       }
> ...
>       if (!this.isJoined && state.hasContactedAJoinedLocator) {
>         throw new MemberStartupException("Unable to join the distributed system in "
>             + (System.currentTimeMillis() - startTime) + "ms");
>       }
> {code}
> The opportunity is to _inject_ objects that handle these two functions (time keeping and sleeping). This will enable us to artifically control these in our test! Sleeping doesn't have to take any time at all. And we can make time pass as quickly as we want to.
> For an example of how this can be done, see [PR #5422|https://github.com/apache/geode/pull/5422] for GEODE-6950.
> This particular test ({{locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()}}) is a little bit more involved than that one in that more objects are involved. In addition to {{GMSJoinLeave}}, other classes involved in the test also need current time and need to sleep.
> When this ticket is complete:
> * functional interfaces {{MillisecondProvider}} and {{Sleeper}}, currently defined inside {{PrimaryHandler}} will be moved higher in the (internal) hierarchy for use by other membership classes
> * {{GMSJoinLeave}} will take a {{MillisecondProvider}} and {{Sleeper}} in its constructor and will delegate to those instead of calling {{System.currentTimeMillis()}} and {{Thread.sleep()}} directly
> * TBD other classes may require injection of {{MillisecondProvider}} and {{Sleeper}}
> * TBD other changes may be necessary in cases where collaborating classes currently spin up threads or synchronize between threads e.g. calling {{wait()}}
> * {{locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()}} will no longer employ {{Thread.sleep()}} nor will it call {{CompletableFuture.get(long timeout, TimeUnit unit)}}—instead it will operate deterministically (ideally in the same thread but not necessarily) with respect to the unit under test.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)