You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/10/02 20:51:28 UTC

[GitHub] [helix] dasahcc commented on a change in pull request #1413: Feature: Offline Node Timeout During Maintenance Mode

dasahcc commented on a change in pull request #1413:
URL: https://github.com/apache/helix/pull/1413#discussion_r499044458



##########
File path: helix-core/src/main/java/org/apache/helix/controller/dataproviders/BaseControllerDataProvider.java
##########
@@ -728,6 +782,100 @@ private void updateDisabledInstances() {
     }
   }
 
+  /*
+   * Check if the instance is timed-out during maintenance mode. An instance is timed-out if it has
+   * been offline for longer than the user defined timeout window.
+   * @param timeOutWindow - the timeout window; guaranteed to be non-negative
+   */
+  private boolean isInstanceTimedOutDuringMaintenance(HelixDataAccessor accessor, String instance,
+      long timeOutWindow) {
+    ParticipantHistory history =
+        accessor.getProperty(accessor.keyBuilder().participantHistory(instance));
+    long currentTime = System.currentTimeMillis();
+
+    // Parse online timestamps from online list
+    List<Long> onlineTimestamps = new ArrayList<>();
+    List<String> historyOnlineList = history.getHistory();
+    if (historyOnlineList != null) {
+      long tailOnlineTime;
+      for (int i = historyOnlineList.size() - 1; i >= 0; i--) {
+        tailOnlineTime =
+            ParticipantHistory.extractTimeFromSessionHistoryString(historyOnlineList.get(i));
+        // Ignore bad format case
+        if (tailOnlineTime != -1) {
+          onlineTimestamps.add(0, tailOnlineTime);
+          // We only need one online timestamp before maintenance mode starts
+          if (tailOnlineTime <= _maintenanceSignal.getTimestamp()) {
+            break;
+          }
+        }
+      }
+    }
+    if (onlineTimestamps.isEmpty() || onlineTimestamps.get(0) > _maintenanceSignal.getTimestamp()) {
+      // Node is a new node in this maintenance mode, no need to time out since no partitions are
+      // assigned to it
+      return false;
+    }
+
+    // Parse offline timestamps from offline list
+    List<Long> offlineTimestamps = new ArrayList<>();
+    List<String> historyOfflineList = history.getOffline();
+    if (historyOfflineList != null) {
+      long tailOfflineTime;
+      for (int i = historyOfflineList.size() - 1; i >= 0; i--) {
+        tailOfflineTime =
+            ParticipantHistory.parseHistoryDateStringToLong(historyOfflineList.get(i));
+        // Ignore bad format case
+        if (tailOfflineTime != -1) {
+          if (tailOfflineTime <= onlineTimestamps.get(0)) {
+            break;
+          }
+          offlineTimestamps.add(0, tailOfflineTime);
+        }
+      }
+    }
+
+    // At this point, onlineTimestamps contains at least 1 timestamp that's before maintenance mode;
+    // offlineTimestamps contains 0+ timestamp that's > the first online timestamp.
+    if (!offlineTimestamps.isEmpty()
+        && offlineTimestamps.get(offlineTimestamps.size() - 1) > onlineTimestamps
+        .get(onlineTimestamps.size() - 1)) {
+      // Make sure for each offline timestamp, 2 online timestamps creates a range around it;
+      // this is usually true unless the timestamps are malformed (missing online timestamps)
+      onlineTimestamps.add(currentTime);
+    }
+
+    // Hop between each pair of online timestamps and find the maximum offline window
+    for (int onlineTimestampIndex = 1, offlineTimestampIndex = 0;
+        onlineTimestampIndex < onlineTimestamps.size() && offlineTimestampIndex < offlineTimestamps
+            .size(); onlineTimestampIndex++) {
+      long onlineTimestamp = onlineTimestamps.get(onlineTimestampIndex);
+      long offlineTimeStamp = offlineTimestamps.get(offlineTimestampIndex);
+
+      // If the offline timestamp isn't within this pair of online timestamp, continue
+      if (offlineTimeStamp > onlineTimestamp) {
+        continue;
+      }
+
+      // Check the largest offline window against the timeout window
+      if (onlineTimestamp - offlineTimeStamp > timeOutWindow) {

Review comment:
       It has the edge case for first interval. It is the interval online time just before maintenance mode, but somehow we dont check whether the offline time is before maintenance.
   
   Let's say if the instance is offline before maintenance we need the potential timeout = online time - Math.max(offlinetimestamp, maintenance entering time);




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org