You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@geode.apache.org by GitBox <gi...@apache.org> on 2020/07/29 07:24:18 UTC

[GitHub] [geode] bschuchardt commented on a change in pull request #5403: GEODE-8385: hang recovering from disk with cyclic dependencies

bschuchardt commented on a change in pull request #5403:
URL: https://github.com/apache/geode/pull/5403#discussion_r461613391



##########
File path: geode-membership/src/main/java/org/apache/geode/distributed/internal/membership/gms/GMSMemberData.java
##########
@@ -399,11 +399,11 @@ public int hashCode() {
   public String toString() {
     StringBuilder sb = new StringBuilder(100);
 
-    sb.append("MemberData[");
+    sb.append("GMSMember[");

Review comment:
       I think that's an artifact of my doing this work on an old SHA and then rebasing onto develop.  I'll revert that change

##########
File path: geode-core/src/main/java/org/apache/geode/distributed/internal/ClusterDistributionManager.java
##########
@@ -2366,10 +2375,10 @@ public void memberDeparted(InternalDistributedMember theId, boolean crashed, Str
           message.setReason(reason); // added for #37950
           handleIncomingDMsg(message);
         }
-        dm.handleManagerDeparture(theId, crashed, reason);
       } catch (DistributedSystemDisconnectedException se) {
         // let's not get huffy about it
       }
+      dm.handleManagerDeparture(theId, crashed, reason, true);

Review comment:
       that change was unnecessary - I'll revert it.

##########
File path: geode-core/src/main/java/org/apache/geode/distributed/internal/ClusterDistributionManager.java
##########
@@ -1784,19 +1784,15 @@ private String prettifyReason(String r) {
   /**
    * Returns true if id was removed. Returns false if it was not in the list of managers.
    */
-  private boolean removeManager(InternalDistributedMember theId, boolean crashed, String p_reason) {
-    String reason = p_reason;
-
-    reason = prettifyReason(reason);
+  private void removeManager(InternalDistributedMember theId, boolean crashed, String p_reason) {

Review comment:
       I've in-lined this method.  Its name no longer made sense and it was only used by handleManagerDeparture()

##########
File path: geode-core/src/main/java/org/apache/geode/distributed/internal/ClusterDistributionManager.java
##########
@@ -1866,39 +1848,56 @@ public void handleConsoleShutdown(InternalDistributedMember theId, boolean crash
   void shutdownMessageReceived(InternalDistributedMember theId, String reason) {
     removeHostedLocators(theId);
     distribution.shutdownMessageReceived(theId, reason);
+    handleManagerDeparture(theId, false, reason);
   }
 
+  /*
+   * handleManagerDeparted may be invoked multiple times for a member identifier.
+   * We allow this and inform listeners on each invocation, but only perform some
+   * actions (such as decrementing the node count) if the change came from a
+   * membership view.
+   */
   @Override
-  public void handleManagerDeparture(InternalDistributedMember theId, boolean p_crashed,
-      String p_reason) {
+  public void handleManagerDeparture(InternalDistributedMember theId, boolean memberCrashed,
+      String reason) {
+    handleManagerDeparture(theId, memberCrashed, reason, false);
+  }
 
+  private void handleManagerDeparture(InternalDistributedMember theId, boolean memberCrashed,
+      String reason, boolean fromViewChange) {
     alertingService.removeAlertListener(theId);
 
+    removeUnfinishedStartup(theId, true);
+
     int vmType = theId.getVmKind();
     if (vmType == ADMIN_ONLY_DM_TYPE) {
-      removeUnfinishedStartup(theId, true);
-      handleConsoleShutdown(theId, p_crashed, p_reason);
+      handleConsoleShutdown(theId, memberCrashed, reason);
       return;
     }
 
-    removeUnfinishedStartup(theId, true);
-
-    if (removeManager(theId, p_crashed, p_reason)) {
-      if (theId.getVmKind() != ClusterDistributionManager.LOCATOR_DM_TYPE) {
-        stats.incNodes(-1);
-      }
-      String msg;
-      if (p_crashed && !shouldInhibitMembershipWarnings()) {
-        msg =
-            "Member at {} unexpectedly left the distributed cache: {}";
-        addMemberEvent(new MemberCrashedEvent(theId, p_reason));
-      } else {
-        msg =
-            "Member at {} gracefully left the distributed cache: {}";
-        addMemberEvent(new MemberDepartedEvent(theId, p_reason));
-      }
-      logger.info(msg, new Object[] {theId, prettifyReason(p_reason)});
+    if (logger.isDebugEnabled()) {
+      logger.debug(
+          "DistributionManager: removing member <{}>; crashed {}; reason = {} fromView = {}", theId,
+          memberCrashed, prettifyReason(reason), fromViewChange);
+    }
+    removeHostedLocators(theId);
+    redundancyZones.remove(theId);
 
+    if (fromViewChange && theId.getVmKind() != ClusterDistributionManager.LOCATOR_DM_TYPE) {
+      stats.incNodes(-1);
+    }
+    String msg;
+    if (memberCrashed && !shouldInhibitMembershipWarnings()) {
+      msg =
+          "Member at {} unexpectedly left the distributed cache: {}";
+      addMemberEvent(new MemberCrashedEvent(theId, reason));
+    } else {
+      msg =
+          "Member at {} gracefully left the distributed cache: {}";
+      addMemberEvent(new MemberDepartedEvent(theId, reason));
+    }
+    if (fromViewChange) {
+      logger.info(msg, new Object[] {theId, prettifyReason(reason)});

Review comment:
       If I move it out of the `if` statement it will log the departure on every invocation of this method, and I don't want to do that.  I think it will be confusing.

##########
File path: geode-core/src/main/java/org/apache/geode/distributed/internal/ClusterDistributionManager.java
##########
@@ -1866,39 +1848,56 @@ public void handleConsoleShutdown(InternalDistributedMember theId, boolean crash
   void shutdownMessageReceived(InternalDistributedMember theId, String reason) {
     removeHostedLocators(theId);
     distribution.shutdownMessageReceived(theId, reason);
+    handleManagerDeparture(theId, false, reason);

Review comment:
       Sorry, I don't understand the question.  We want to decrement the node count once, but this method will be invoked multiple times for the same departing node.  If we restrict the decrement to view-changes it will only be decremented once because we only notify memberDeparted once for a member that's left the view.

##########
File path: geode-core/src/main/java/org/apache/geode/distributed/internal/ClusterDistributionManager.java
##########
@@ -1866,39 +1848,56 @@ public void handleConsoleShutdown(InternalDistributedMember theId, boolean crash
   void shutdownMessageReceived(InternalDistributedMember theId, String reason) {
     removeHostedLocators(theId);
     distribution.shutdownMessageReceived(theId, reason);
+    handleManagerDeparture(theId, false, reason);
   }
 
+  /*
+   * handleManagerDeparted may be invoked multiple times for a member identifier.
+   * We allow this and inform listeners on each invocation, but only perform some
+   * actions (such as decrementing the node count) if the change came from a
+   * membership view.
+   */
   @Override
-  public void handleManagerDeparture(InternalDistributedMember theId, boolean p_crashed,
-      String p_reason) {
+  public void handleManagerDeparture(InternalDistributedMember theId, boolean memberCrashed,
+      String reason) {
+    handleManagerDeparture(theId, memberCrashed, reason, false);
+  }
 
+  private void handleManagerDeparture(InternalDistributedMember theId, boolean memberCrashed,
+      String reason, boolean fromViewChange) {
     alertingService.removeAlertListener(theId);
 
+    removeUnfinishedStartup(theId, true);
+
     int vmType = theId.getVmKind();
     if (vmType == ADMIN_ONLY_DM_TYPE) {
-      removeUnfinishedStartup(theId, true);
-      handleConsoleShutdown(theId, p_crashed, p_reason);
+      handleConsoleShutdown(theId, memberCrashed, reason);
       return;
     }
 
-    removeUnfinishedStartup(theId, true);
-
-    if (removeManager(theId, p_crashed, p_reason)) {
-      if (theId.getVmKind() != ClusterDistributionManager.LOCATOR_DM_TYPE) {
-        stats.incNodes(-1);
-      }
-      String msg;
-      if (p_crashed && !shouldInhibitMembershipWarnings()) {
-        msg =
-            "Member at {} unexpectedly left the distributed cache: {}";
-        addMemberEvent(new MemberCrashedEvent(theId, p_reason));
-      } else {
-        msg =
-            "Member at {} gracefully left the distributed cache: {}";
-        addMemberEvent(new MemberDepartedEvent(theId, p_reason));
-      }
-      logger.info(msg, new Object[] {theId, prettifyReason(p_reason)});
+    if (logger.isDebugEnabled()) {
+      logger.debug(
+          "DistributionManager: removing member <{}>; crashed {}; reason = {} fromView = {}", theId,
+          memberCrashed, prettifyReason(reason), fromViewChange);
+    }
+    removeHostedLocators(theId);
+    redundancyZones.remove(theId);
 
+    if (fromViewChange && theId.getVmKind() != ClusterDistributionManager.LOCATOR_DM_TYPE) {
+      stats.incNodes(-1);
+    }
+    String msg;
+    if (memberCrashed && !shouldInhibitMembershipWarnings()) {
+      msg =
+          "Member at {} unexpectedly left the distributed cache: {}";
+      addMemberEvent(new MemberCrashedEvent(theId, reason));
+    } else {
+      msg =
+          "Member at {} gracefully left the distributed cache: {}";
+      addMemberEvent(new MemberDepartedEvent(theId, reason));
+    }
+    if (fromViewChange) {
+      logger.info(msg, new Object[] {theId, prettifyReason(reason)});

Review comment:
       I've simplified the method now, so it no longer has fromViewChange.  We'll be notifying listeners of departures when we get a shutdown message.  The subsequent view change won't trigger listener invocation.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org