You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@geode.apache.org by GitBox <gi...@apache.org> on 2020/06/05 16:50:32 UTC

[GitHub] [geode] kirklund commented on a change in pull request #5182: GEODE-7591: Fix for hang in ClusterDistributionManager

kirklund commented on a change in pull request #5182:
URL: https://github.com/apache/geode/pull/5182#discussion_r436033017



##########
File path: geode-core/src/distributedTest/java/org/apache/geode/distributed/internal/ClusterDistributionManagerDUnitTest.java
##########
@@ -370,6 +371,33 @@ public void testWaitForViewInstallation() {
         .untilAsserted(() -> assertThat(waitForViewInstallationDone.get()).isTrue());
   }
 
+  /**
+   * show that waitForViewInstallation works as expected when distribution manager is closed
+   * while waiting for the latest membership view to install
+   */
+  @Test
+  public void testWaitForViewInstallationDisconnectDS() {
+    InternalDistributedSystem system = getSystem();
+    ClusterDistributionManager dm = (ClusterDistributionManager) system.getDM();
+    MembershipView<InternalDistributedMember> view = dm.getDistribution().getView();
+
+    AtomicBoolean waitForViewInstallationDone = new AtomicBoolean();
+    executorService.submit(() -> {
+      try {
+        dm.waitForViewInstallation(view.getViewId() + 1);
+        waitForViewInstallationDone.set(true);
+      } catch (InterruptedException e) {
+        errorCollector.addError(e);
+      }
+    });
+
+    await().timeout(2000, TimeUnit.MILLISECONDS);

Review comment:
       I don't think await() without an until of some sort does anything. It's the until clauses that start the waiting. I recommend using a CyclicBarrier to sync up the actions of the two threads and a Future to wait for completion:
   ```
   CyclicBarrier cyclicBarrier = new CyclicBarrier(2);
   
   Future<Void> future = executorService.submit(() -> {
     try {
       cyclicBarrier.await(getTimeout().toMillis(), MILLISECONDS);
       dm.waitForViewInstallation(view.getViewId() + 1);
     } catch (InterruptedException e) {
       errorCollector.addError(e);
     }
   });
   
   cyclicBarrier.await(getTimeout().toMillis(), MILLISECONDS);
   system.disconnect();
   
   future.getTimeout().toMillis(), MILLISECONDS);
   ```
   There's still no guarantee that the executor thread will actually be deep inside dm.waitForViewInstallation when the main thread calls disconnect, but there will be some percentage of CI runs that are hitting this scenario as desired. So as long as the test remains passing 100% of the time, we're confident enough that the scenario is tested and passing.
   
   To really guarantee the exact combination of two threads being in specific places would require some crazy test hooks down in dm and/or system which would probably require some unnatural acts of coding (not recommended). Another approach is to write an IntegrationTest or UnitTest in which some of the classes involved are mocks or spies but even then it's nice to have the dunit coverage as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org