Posted to notifications@geode.apache.org by GitBox <gi...@apache.org> on 2022/04/20 00:00:26 UTC

[GitHub] [geode] boglesby opened a new pull request, #7610: GEODE-10250: Drop lock request if member departed

boglesby opened a new pull request, #7610:
URL: https://github.com/apache/geode/pull/7610

   If a member requests a distributed lock and then departs the distributed
   system, the grantor shouldn't grant that request. If it does, the grantor
   ends up in a state where the lock is held by a member that has departed.
   
   <!-- Thank you for submitting a contribution to Apache Geode. -->
   
   <!-- In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken: 
   -->
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
   
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   - [ ] Does `gradlew build` run cleanly?
   
   - [ ] Have you written or updated unit tests to verify your changes?
   
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   <!-- Note:
   Please ensure that once the PR is submitted, check Concourse for build issues and
   submit an update to your PR as soon as possible. If you need help, please send an
   email to dev@geode.apache.org.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [geode] metatype commented on pull request #7610: GEODE-10250: Drop lock request if member departed

Posted by GitBox <gi...@apache.org>.
metatype commented on PR #7610:
URL: https://github.com/apache/geode/pull/7610#issuecomment-1172786024

   Closing this out due to age.



[GitHub] [geode] metatype closed pull request #7610: GEODE-10250: Drop lock request if member departed

Posted by GitBox <gi...@apache.org>.
metatype closed pull request #7610: GEODE-10250: Drop lock request if member departed
URL: https://github.com/apache/geode/pull/7610



[GitHub] [geode] kirklund commented on pull request #7610: GEODE-10250: Drop lock request if member departed

Posted by GitBox <gi...@apache.org>.
kirklund commented on PR #7610:
URL: https://github.com/apache/geode/pull/7610#issuecomment-1113476682

   There might be some new or old corner case that prevents the memberDeparted event from clearing the lock.
   
   Originally, the dlock service would grant the lock and then check whether the member was still present in the local view, releasing the lock if not. If the member was still in the local view, it would own the lock. Afterwards, a memberDeparted event or a releaseLock message would release that lock.



[GitHub] [geode] agingade commented on pull request #7610: GEODE-10250: Drop lock request if member departed

Posted by GitBox <gi...@apache.org>.
agingade commented on PR #7610:
URL: https://github.com/apache/geode/pull/7610#issuecomment-1110035576

   Do we need unit tests, or do they already exist?



[GitHub] [geode] kirklund commented on a diff in pull request #7610: GEODE-10250: Drop lock request if member departed

Posted by GitBox <gi...@apache.org>.
kirklund commented on code in PR #7610:
URL: https://github.com/apache/geode/pull/7610#discussion_r857858872


##########
geode-core/src/distributedTest/java/org/apache/geode/distributed/RequestDistributedLockWhileClosingCacheDistributedTest.java:
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.distributed;
+
+import static org.apache.geode.distributed.ConfigurationProperties.LOG_LEVEL;
+import static org.assertj.core.api.Assertions.assertThat;
+
+import java.io.Serializable;
+import java.util.Properties;
+import java.util.concurrent.TimeoutException;
+import java.util.stream.Stream;
+
+import org.junit.Rule;
+import org.junit.Test;
+
+import org.apache.geode.distributed.internal.ClusterDistributionManager;
+import org.apache.geode.distributed.internal.DistributionManager;
+import org.apache.geode.distributed.internal.DistributionMessage;
+import org.apache.geode.distributed.internal.DistributionMessageObserver;
+import org.apache.geode.distributed.internal.MembershipListener;
+import org.apache.geode.distributed.internal.locks.DLockGrantor;
+import org.apache.geode.distributed.internal.locks.DLockRequestProcessor;
+import org.apache.geode.distributed.internal.locks.DLockService;
+import org.apache.geode.distributed.internal.membership.InternalDistributedMember;
+import org.apache.geode.test.awaitility.GeodeAwaitility;
+import org.apache.geode.test.dunit.AsyncInvocation;
+import org.apache.geode.test.dunit.rules.ClusterStartupRule;
+import org.apache.geode.test.dunit.rules.DistributedBlackboard;
+import org.apache.geode.test.dunit.rules.MemberVM;
+import org.apache.geode.test.junit.rules.serializable.SerializableTestName;
+
+public class RequestDistributedLockWhileClosingCacheDistributedTest implements Serializable {
+
+  @Rule
+  public ClusterStartupRule cluster = new ClusterStartupRule();
+
+  @Rule
+  public SerializableTestName testName = new SerializableTestName();
+
+  @Rule
+  public DistributedBlackboard blackboard = new DistributedBlackboard();
+
+  private static final String ABOUT_TO_PROCESS_LOCK_REQUEST = "ABOUT_TO_PROCESS_LOCK_REQUEST";
+
+  private static final String MEMBER_DEPARTED = "MEMBER_DEPARTED";
+
+  @Test
+  public void testRequestDistributedLockWhileClosingCache() throws InterruptedException {
+    // Init Blackboard
+    blackboard.initBlackboard();
+
+    // Start the locator
+    MemberVM locator = cluster.startLocatorVM(0);
+
+    // Start the grantor server
+    Properties properties = new Properties();
+    properties.setProperty(LOG_LEVEL, "fine");
+    MemberVM grantorServer = cluster.startServerVM(1, properties, locator.getPort());
+
+    // Become lock grantor for the DistributedLockService
+    grantorServer.invoke(this::becomeLockGrantor);
+
+    // Start the lock requesting server
+    MemberVM lockRequestingServer = cluster.startServerVM(2, properties, locator.getPort());
+
+    // Add the DistributionMessageObserver
+    String lockName = testName.getMethodName() + "_lockName";
+    Stream.of(grantorServer, lockRequestingServer)
+        .forEach(server -> server
+            .invoke(() -> addDistributionMessageObserverAndMembershipListener(lockName)));
+
+    // Asynchronously disconnect the distributed system in the lock requesting server
+    AsyncInvocation asyncDisconnectDistributedSystem =
+        lockRequestingServer.invokeAsync(this::disconnectDistributedSystem);
+
+    // Asynchronously request a lock in the lock requesting server
+    AsyncInvocation asyncGetLock =
+        lockRequestingServer.invokeAsync(() -> requestDistributedLock(lockName));
+
+    // Wait for both invocations to complete
+    asyncDisconnectDistributedSystem.await();
+    asyncGetLock.await();
+
+    // Verify the lock requesting server is disconnected
+    lockRequestingServer.invoke(this::verifyCacheIsClosed);
+
+    // Verify the grantor server holds no locks for the lock requesting server
+    grantorServer.invoke(() -> verifyLockServiceDoesNotHoldToken(lockName));
+  }
+
+  private void becomeLockGrantor() {
+    ClusterStartupRule.getCache().getPartitionedRegionLockService().becomeLockGrantor();
+  }
+
+  private void addDistributionMessageObserverAndMembershipListener(String lockName) {
+    TestDistributionMessageObserver observer = new TestDistributionMessageObserver(lockName);
+    DistributionMessageObserver.setInstance(observer);
+    DistributionManager distributionManager =
+        ClusterStartupRule.getCache().getInternalDistributedSystem().getDistributionManager();
+    distributionManager.addMembershipListener(observer);
+  }
+
+  private void disconnectDistributedSystem() throws InterruptedException, TimeoutException {
+    // Wait for the grantor to signal ABOUT_TO_PROCESS_LOCK_REQUEST
+    blackboard.waitForGate(ABOUT_TO_PROCESS_LOCK_REQUEST);
+
+    // Disconnect the distributed system
+    ClusterStartupRule.getCache().getInternalDistributedSystem().disconnect();
+  }
+
+  private void requestDistributedLock(String lockName) {
+    // Request a distributed lock from the partitioned region lock service
+    DistributedLockService service =
+        ClusterStartupRule.getCache().getPartitionedRegionLockService();
+    try {
+      service.lock(lockName, GeodeAwaitility.getTimeout().toMillis(), -1);
+    } catch (Exception e) {
+      /* ignore */
+    }
+  }
+
+  private void verifyCacheIsClosed() {
+    // Verify the cache is closed
+    assertThat(ClusterStartupRule.getCache().isClosed()).isTrue();
+  }
+
+  private void verifyLockServiceDoesNotHoldToken(Object lockName) {
+    // Verify the lock token is not held by the grantor server after the lock requesting server
+    // has departed
+    DLockService dLockService =
+        (DLockService) ClusterStartupRule.getCache().getPartitionedRegionLockService();
+    assertThat(dLockService.isLockGrantor()).isTrue();
+    DLockGrantor dLockGrantor = dLockService.getGrantor();
+    DLockGrantor.DLockGrantToken grantToken = dLockGrantor.getGrantToken(lockName);
+    // assertThat(grantToken).isNull();
+  }
+
+  class TestDistributionMessageObserver extends DistributionMessageObserver implements
+      MembershipListener {
+
+    private final String lockName;
+
+    public TestDistributionMessageObserver(String lockName) {
+      this.lockName = lockName;
+    }
+
+    public void memberDeparted(DistributionManager distributionManager,
+        InternalDistributedMember id, boolean crashed) {
+      // Signal the member has departed. This will cause the DLockRequestMessage to be processed.
+      blackboard.signalGate(MEMBER_DEPARTED);
+    }
+
+    public void beforeProcessMessage(ClusterDistributionManager dm, DistributionMessage message) {
+      if (message instanceof DLockRequestProcessor.DLockRequestMessage) {
+        DLockRequestProcessor.DLockRequestMessage dLockRequestMessage =
+            (DLockRequestProcessor.DLockRequestMessage) message;
+        if (dLockRequestMessage.getObjectName().equals(this.lockName)) {
+          // Signal the about to process lock request. This will cause the lock requesting server to
+          // disconnect its distributed system.
+          blackboard.signalGate(ABOUT_TO_PROCESS_LOCK_REQUEST);
+
+          // Wait for member departed before processing the DLockRequestMessage
+          try {
+            blackboard.waitForGate(MEMBER_DEPARTED);
+          } catch (Exception e) {
+            throw new RuntimeException(e);
+          }

Review Comment:
   Please use `DistributedErrorCollector` for any exceptions in an asynchronous context like this instead of just rethrowing them. It collects all exceptions from all DUnit VMs and reports them when the test finishes (including causing it to "FAIL" instead of "PASS"). Rethrowing might, or probably will, result in a failure depending on the surrounding product call that invokes this code, but the error collector is a lot cleaner.
   
   Non-dunit tests can use JUnit's `ErrorCollector`. 
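
The collector idea described in this comment can be illustrated with a minimal, stdlib-only sketch. `CollectorSketch` and its methods are hypothetical stand-ins, not Geode's `DistributedErrorCollector` or JUnit's `ErrorCollector`; the sketch only assumes that errors raised on other threads are recorded rather than rethrown, and that the test is failed once at the end if anything was recorded.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the collector pattern; not a Geode or JUnit class.
class CollectorSketch {
  // Thread-safe list so that asynchronous callbacks can record errors.
  private final List<Throwable> errors = new CopyOnWriteArrayList<>();

  // Record an error instead of rethrowing it from the asynchronous context.
  void addError(Throwable t) {
    errors.add(t);
  }

  int errorCount() {
    return errors.size();
  }

  // Called once at the end of the test: fail if anything was collected.
  void verify() {
    if (!errors.isEmpty()) {
      AssertionError failure = new AssertionError(errors.size() + " error(s) collected");
      errors.forEach(failure::addSuppressed);
      throw failure;
    }
  }

  // Simulates an observer callback failing on a thread other than the test thread.
  static int runDemo() {
    CollectorSketch collector = new CollectorSketch();
    Thread worker = new Thread(() -> {
      try {
        throw new IllegalStateException("gate wait failed");
      } catch (Exception e) {
        collector.addError(e); // collected, not rethrown
      }
    });
    worker.start();
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return collector.errorCount();
  }

  public static void main(String[] args) {
    System.out.println("collected errors: " + runDemo());
  }
}
```

The point of the design is that the failure no longer depends on what the calling code does with a rethrown exception: the worker thread finishes normally, and the recorded error surfaces when `verify()` runs.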




[GitHub] [geode] boglesby commented on pull request #7610: GEODE-10250: Drop lock request if member departed

Posted by GitBox <gi...@apache.org>.
boglesby commented on PR #7610:
URL: https://github.com/apache/geode/pull/7610#issuecomment-1113500376

   It's definitely a corner case.
   
   Here is the sequence:
   
   1. The lock requesting server requests a lock
   2. Before the grantor server processes that request, the lock requesting server leaves the distributed system
   3. The DistributionManager in the grantor server removes the member and logs the `Member gracefully left` message
   4. The DLockGrantor in the grantor server begins to process the lock request
   5. The DLockGrantor's membersDepartedTime map is updated with an entry for that member
   6. The grantAndRespondToRequest method calls isCurrentMember which returns true because the member is still in the view
   7. The DLockGrantor grants the lock
   8. The view is updated to remove the departed member
   
   At the end of this sequence, the lock is held by a member who has departed.
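
The sequence above can be sketched as a small, self-contained simulation. Everything in it is hypothetical: `GrantorRaceSketch` is not Geode's `DLockGrantor`, members are plain strings, and the "departure-aware" check is only one assumed way a request could be dropped, not necessarily what this PR does. It shows why a check against the view alone still grants the lock when the departure has been recorded but the view has not yet been updated.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the grantor; names only echo the discussion above.
class GrantorRaceSketch {
  // Membership view as last installed; it lags behind the departure event.
  private final Set<String> view = new HashSet<>();
  // Departed members and the time their departure was recorded.
  private final Map<String, Long> membersDepartedTime = new ConcurrentHashMap<>();
  // Lock name -> member currently holding it.
  private final Map<String, String> owners = new ConcurrentHashMap<>();

  GrantorRaceSketch(String... members) {
    view.addAll(Arrays.asList(members));
  }

  void recordMemberDeparted(String member) {
    membersDepartedTime.put(member, System.currentTimeMillis());
  }

  void installViewWithout(String member) {
    view.remove(member);
  }

  // Consults only the view, as in step 6 of the sequence.
  boolean grantCheckingViewOnly(String lock, String requester) {
    if (!view.contains(requester)) {
      return false;
    }
    owners.put(lock, requester);
    return true;
  }

  // Assumed alternative: also drop the request if a departure was recorded.
  boolean grantCheckingDepartures(String lock, String requester) {
    if (!view.contains(requester) || membersDepartedTime.containsKey(requester)) {
      return false;
    }
    owners.put(lock, requester);
    return true;
  }

  String ownerOf(String lock) {
    return owners.get(lock);
  }

  // Replays the sequence and returns the lock owner at the end (null if none).
  static String runSequence(boolean checkDepartures) {
    GrantorRaceSketch grantor = new GrantorRaceSketch("server-1", "server-2");
    grantor.recordMemberDeparted("server-2"); // steps 3 and 5: departure recorded
    if (checkDepartures) {
      grantor.grantCheckingDepartures("lockName", "server-2");
    } else {
      grantor.grantCheckingViewOnly("lockName", "server-2"); // steps 6 and 7
    }
    grantor.installViewWithout("server-2"); // step 8: view finally updated
    return grantor.ownerOf("lockName");
  }

  public static void main(String[] args) {
    System.out.println("view-only check owner: " + runSequence(false));
    System.out.println("departure-aware check owner: " + runSequence(true));
  }
}
```

With the view-only check the departed `server-2` ends up as the owner; with the assumed departure-aware check no owner is recorded.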
   
   Here is some logging that shows the sequence from the grantor server's point of view.
   
   Original view:
   ```
   [vm1] [warn 2022/04/28 12:57:27.594 PDT server-1 <unicast receiver,host-47928> tid=0x27] XXX MembershipView.<init> view=View[host(locator-0:46041:locator)<ec><v0>:41001|2] members: [host(locator-0:46041:locator)<ec><v0>:41001, host(server-1:46042)<v1>:41002{lead}, host(server-2:46043)<v2>:41003]
   ```
   Lock requesting server departs:
   ```
   [vm1] [debug 2022/04/28 12:57:29.201 PDT server-1 <Pooled High Priority Message Processor 3> tid=0x4f] DistributionManager: removing member <host(server-2:46043)<v2>:41003>; crashed false; reason = shutdown message received
   
   [vm1] [info 2022/04/28 12:57:29.201 PDT server-1 <Pooled High Priority Message Processor 3> tid=0x4f] Member at host(server-2:46043)<v2>:41003 gracefully left the distributed cache: shutdown message received
   ```
   The membersDepartedTime map is updated:
   ```
   [vm1] [info 2022/04/28 12:57:29.221 PDT server-1 <Pooled Waiting Message Processor 1> tid=0x2f] XXX DLockGrantor.recordMemberDepartedTime recorded membersDepartedTime owner=host(server-2:46043)<v2>:41003; currentTime=1651175849221; membersDepartedTimeSize=1; membersDepartedTime={host(server-2:46043)<v2>:41003=1651175849221}
   ```
   The grantAndRespondToRequest method calls isCurrentMember, which returns true:
   ```
   [vm1] [warn 2022/04/28 12:57:29.259 PDT server-1 <Pooled Waiting Message Processor 1> tid=0x2f] XXX DLockGrantToken.grantAndRespondToRequest currentMember=host(server-2:46043)<v2>:41003; isCurrentMember=true
   ```
   The DLockGrantor grants the lock:
   ```
   [vm1] [debug 2022/04/28 12:57:29.260 PDT server-1 <Pooled Waiting Message Processor 2> tid=0x4f] Sending (DLockRequestProcessor.DLockResponseMessage responding GRANT; serviceName=__PRLS(version 3); objectName=testRequestDistributedLockWhileClosingCache_lockName; responseCode=0; keyIfFailed=null; leaseExpireTime=9223372036854775807; processorId=21; lockId=21) to 1 peers ([host(server-2:46043)<v2>:41003]) via tcp/ip
   ```
   The view is updated:
   ```
   [vm1] [warn 2022/04/28 12:57:29.532 PDT server-1 <unicast receiver,host-47928> tid=0x27] XXX MembershipView.<init> view=View[host(locator-0:46041:locator)<ec><v0>:41001|3] members: [host(locator-0:46041:locator)<ec><v0>:41001, host(server-1:46042)<v1>:41002{lead}]  shutdown: [host(server-2:46043)<v2>:41003]
   ```



[GitHub] [geode] boglesby commented on pull request #7610: GEODE-10250: Drop lock request if member departed

Posted by GitBox <gi...@apache.org>.
boglesby commented on PR #7610:
URL: https://github.com/apache/geode/pull/7610#issuecomment-1110190707

   We definitely need unit tests, but this fix doesn't work yet. The distributed tests are timing out because of this change. I want it to work before I write any unit tests for it. 



[GitHub] [geode] boglesby commented on a diff in pull request #7610: GEODE-10250: Drop lock request if member departed

Posted by GitBox <gi...@apache.org>.
boglesby commented on code in PR #7610:
URL: https://github.com/apache/geode/pull/7610#discussion_r857929160


##########
geode-core/src/distributedTest/java/org/apache/geode/distributed/RequestDistributedLockWhileClosingCacheDistributedTest.java:
##########
@@ -0,0 +1,184 @@
+    public void beforeProcessMessage(ClusterDistributionManager dm, DistributionMessage message) {
+      if (message instanceof DLockRequestProcessor.DLockRequestMessage) {
+        DLockRequestProcessor.DLockRequestMessage dLockRequestMessage =
+            (DLockRequestProcessor.DLockRequestMessage) message;
+        if (dLockRequestMessage.getObjectName().equals(this.lockName)) {
+          // Signal the about to process lock request. This will cause the lock requesting server to
+          // disconnect its distributed system.
+          blackboard.signalGate(ABOUT_TO_PROCESS_LOCK_REQUEST);
+
+          // Wait for member departed before processing the DLockRequestMessage
+          try {
+            blackboard.waitForGate(MEMBER_DEPARTED);
+          } catch (Exception e) {
+            throw new RuntimeException(e);
+          }

Review Comment:
   Thanks a lot, Kirk. I didn't know we had this rule. I just add the exception to the DistributedErrorCollector like this, right?
   ```
   } catch (Exception e) {
     errorCollector.addError(e);
   }
   ```




[GitHub] [geode] kirklund commented on a diff in pull request #7610: GEODE-10250: Drop lock request if member departed

Posted by GitBox <gi...@apache.org>.
kirklund commented on code in PR #7610:
URL: https://github.com/apache/geode/pull/7610#discussion_r861935696


##########
geode-core/src/distributedTest/java/org/apache/geode/distributed/RequestDistributedLockWhileClosingCacheDistributedTest.java:
##########
@@ -0,0 +1,184 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.distributed;
+
+import static org.apache.geode.distributed.ConfigurationProperties.LOG_LEVEL;
+import static org.assertj.core.api.Assertions.assertThat;
+
+import java.io.Serializable;
+import java.util.Properties;
+import java.util.concurrent.TimeoutException;
+import java.util.stream.Stream;
+
+import org.junit.Rule;
+import org.junit.Test;
+
+import org.apache.geode.distributed.internal.ClusterDistributionManager;
+import org.apache.geode.distributed.internal.DistributionManager;
+import org.apache.geode.distributed.internal.DistributionMessage;
+import org.apache.geode.distributed.internal.DistributionMessageObserver;
+import org.apache.geode.distributed.internal.MembershipListener;
+import org.apache.geode.distributed.internal.locks.DLockGrantor;
+import org.apache.geode.distributed.internal.locks.DLockRequestProcessor;
+import org.apache.geode.distributed.internal.locks.DLockService;
+import org.apache.geode.distributed.internal.membership.InternalDistributedMember;
+import org.apache.geode.test.awaitility.GeodeAwaitility;
+import org.apache.geode.test.dunit.AsyncInvocation;
+import org.apache.geode.test.dunit.rules.ClusterStartupRule;
+import org.apache.geode.test.dunit.rules.DistributedBlackboard;
+import org.apache.geode.test.dunit.rules.MemberVM;
+import org.apache.geode.test.junit.rules.serializable.SerializableTestName;
+
+public class RequestDistributedLockWhileClosingCacheDistributedTest implements Serializable {
+
+  @Rule
+  public ClusterStartupRule cluster = new ClusterStartupRule();
+
+  @Rule
+  public SerializableTestName testName = new SerializableTestName();
+
+  @Rule
+  public DistributedBlackboard blackboard = new DistributedBlackboard();
+
+  private static final String ABOUT_TO_PROCESS_LOCK_REQUEST = "ABOUT_TO_PROCESS_LOCK_REQUEST";
+
+  private static final String MEMBER_DEPARTED = "MEMBER_DEPARTED";
+
+  @Test
+  public void testRequestDistributedLockWhileClosingCache() throws InterruptedException {
+    // Init Blackboard
+    blackboard.initBlackboard();
+
+    // Start the locator
+    MemberVM locator = cluster.startLocatorVM(0);
+
+    // Start the grantor server
+    Properties properties = new Properties();
+    properties.setProperty(LOG_LEVEL, "fine");
+    MemberVM grantorServer = cluster.startServerVM(1, properties, locator.getPort());
+
+    // Become lock grantor for the DistributedLockService
+    grantorServer.invoke(this::becomeLockGrantor);
+
+    // Start the lock requesting server
+    MemberVM lockRequestingServer = cluster.startServerVM(2, properties, locator.getPort());
+
+    // Add the DistributionMessageObserver
+    String lockName = testName.getMethodName() + "_lockName";
+    Stream.of(grantorServer, lockRequestingServer)
+        .forEach(server -> server
+            .invoke(() -> addDistributionMessageObserverAndMembershipListener(lockName)));
+
+    // Asynchronously disconnect the distributed system in the lock requesting server
+    AsyncInvocation asyncDisconnectDistributedSystem =
+        lockRequestingServer.invokeAsync(this::disconnectDistributedSystem);
+
+    // Asynchronously request a lock in the lock requesting server
+    AsyncInvocation asyncGetLock =
+        lockRequestingServer.invokeAsync(() -> requestDistributedLock(lockName));
+
+    // Wait for both invocations to complete
+    asyncDisconnectDistributedSystem.await();
+    asyncGetLock.await();
+
+    // Verify the lock requesting server is disconnected
+    lockRequestingServer.invoke(this::verifyCacheIsClosed);
+
+    // Verify the grantor server holds no locks for the lock requesting server
+    grantorServer.invoke(() -> verifyLockServiceDoesNotHoldToken(lockName));
+  }
+
+  private void becomeLockGrantor() {
+    ClusterStartupRule.getCache().getPartitionedRegionLockService().becomeLockGrantor();
+  }
+
+  private void addDistributionMessageObserverAndMembershipListener(String lockName) {
+    TestDistributionMessageObserver observer = new TestDistributionMessageObserver(lockName);
+    DistributionMessageObserver.setInstance(observer);
+    DistributionManager distributionManager =
+        ClusterStartupRule.getCache().getInternalDistributedSystem().getDistributionManager();
+    distributionManager.addMembershipListener(observer);
+  }
+
+  private void disconnectDistributedSystem() throws InterruptedException, TimeoutException {
+    // Wait for the grantor to signal ABOUT_TO_PROCESS_LOCK_REQUEST
+    blackboard.waitForGate(ABOUT_TO_PROCESS_LOCK_REQUEST);
+
+    // Disconnect the distributed system
+    ClusterStartupRule.getCache().getInternalDistributedSystem().disconnect();
+  }
+
+  private void requestDistributedLock(String lockName) {
+    // Request a distributed lock from the partitioned region lock service
+    DistributedLockService service =
+        ClusterStartupRule.getCache().getPartitionedRegionLockService();
+    try {
+      service.lock(lockName, GeodeAwaitility.getTimeout().toMillis(), -1);
+    } catch (Exception e) {
+      /* ignore */
+    }
+  }
+
+  private void verifyCacheIsClosed() {
+    // Verify the cache is closed
+    assertThat(ClusterStartupRule.getCache().isClosed()).isTrue();
+  }
+
+  private void verifyLockServiceDoesNotHoldToken(Object lockName) {
+    // Verify the lock token is not held by the grantor server after the lock requesting server
+    // has departed
+    DLockService dLockService =
+        (DLockService) ClusterStartupRule.getCache().getPartitionedRegionLockService();
+    assertThat(dLockService.isLockGrantor()).isTrue();
+    DLockGrantor dLockGrantor = dLockService.getGrantor();
+    DLockGrantor.DLockGrantToken grantToken = dLockGrantor.getGrantToken(lockName);
+    // assertThat(grantToken).isNull();
+  }
+
+  class TestDistributionMessageObserver extends DistributionMessageObserver implements
+      MembershipListener {
+
+    private final String lockName;
+
+    public TestDistributionMessageObserver(String lockName) {
+      this.lockName = lockName;
+    }
+
+    public void memberDeparted(DistributionManager distributionManager,
+        InternalDistributedMember id, boolean crashed) {
+      // Signal the member has departed. This will cause the DLockRequestMessage to be processed.
+      blackboard.signalGate(MEMBER_DEPARTED);
+    }
+
+    public void beforeProcessMessage(ClusterDistributionManager dm, DistributionMessage message) {
+      if (message instanceof DLockRequestProcessor.DLockRequestMessage) {
+        DLockRequestProcessor.DLockRequestMessage dLockRequestMessage =
+            (DLockRequestProcessor.DLockRequestMessage) message;
+        if (dLockRequestMessage.getObjectName().equals(this.lockName)) {
+          // Signal the about to process lock request. This will cause the lock requesting server to
+          // disconnect its distributed system.
+          blackboard.signalGate(ABOUT_TO_PROCESS_LOCK_REQUEST);
+
+          // Wait for member departed before processing the DLockRequestMessage
+          try {
+            blackboard.waitForGate(MEMBER_DEPARTED);
+          } catch (Exception e) {
+            throw new RuntimeException(e);
+          }

Review Comment:
   Yep! And if you need to, you can always rethrow the exception after adding it to errorCollector so that GemFire reacts to it:
   ```
   } catch (Exception e) {
     errorCollector.addError(e);
     throw e;
   }
   ```
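
For illustration, here is a minimal, self-contained sketch of the same record-then-rethrow pattern. Everything in it is an assumption made for the example: `SimpleErrorCollector` is a hypothetical stand-in for the test's `errorCollector` rule, and `waitForGate` is a hypothetical stand-in for the blackboard call. One caveat it tries to show: if the callback declares no checked exceptions (as `beforeProcessMessage` appears not to in the diff), a bare `throw e;` of a checked exception would not compile, so the sketch wraps it in a `RuntimeException` the way the existing code does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeoutException;

public class RecordAndRethrowSketch {

  // Hypothetical stand-in for an error collector rule; not Geode or JUnit code.
  static class SimpleErrorCollector {
    private final List<Throwable> errors = new ArrayList<>();

    synchronized void addError(Throwable t) {
      errors.add(t);
    }

    synchronized List<Throwable> getErrors() {
      return new ArrayList<>(errors);
    }
  }

  // Hypothetical stand-in for a gate wait that times out.
  static void waitForGate(String gate) throws TimeoutException {
    throw new TimeoutException("gate not signaled: " + gate);
  }

  // Mirrors a callback whose signature declares no checked exceptions.
  static void beforeProcessMessage(SimpleErrorCollector errorCollector) {
    try {
      waitForGate("MEMBER_DEPARTED");
    } catch (Exception e) {
      // Record the failure so the test itself can report it later.
      errorCollector.addError(e);
      // Rethrow (wrapped, since e is checked) so the caller still sees the failure.
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    SimpleErrorCollector errorCollector = new SimpleErrorCollector();
    try {
      beforeProcessMessage(errorCollector);
    } catch (RuntimeException e) {
      System.out.println("rethrown cause: " + e.getCause().getClass().getSimpleName());
    }
    System.out.println("recorded errors: " + errorCollector.getErrors().size());
  }
}
```

Recording first and rethrowing second means the failure is kept for the test report even if the code that invoked the callback swallows or only logs the rethrown exception.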
