Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2021/09/15 00:14:22 UTC

[GitHub] [solr] anshumg commented on a change in pull request #238: SOLR-15286 A brand new follower in the legacy mode should wait to replicate index before reporting healthy

anshumg commented on a change in pull request #238:
URL: https://github.com/apache/solr/pull/238#discussion_r708744873



##########
File path: solr/core/src/java/org/apache/solr/handler/admin/HealthCheckHandler.java
##########
@@ -135,6 +153,80 @@ public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throw
     rsp.add(STATUS, OK);
   }
 
+  private void healthCheckLegacyMode(SolrQueryRequest req, SolrQueryResponse rsp) {
+    Integer maxGenerationLag = req.getParams().getInt(HealthCheckRequest.PARAM_MAX_GENERATION_LAG);
+    List<String> laggingCoresInfo = new ArrayList<>();
+    boolean allCoresAreInSync = true;
+
+    // check only if max generation lag is specified
+    if(maxGenerationLag != null) {
+      for(SolrCore core : coreContainer.getCores()) {
+        ReplicationHandler replicationHandler =
+          (ReplicationHandler) core.getRequestHandler(ReplicationHandler.PATH);
+        // if maxGeneration lag is not specified don't check if follower is in sync.
+        if(replicationHandler.isFollower()) {
+          boolean isCoreInSync =
+            generationLagFromLeader(core, replicationHandler, maxGenerationLag, laggingCoresInfo);
+
+          allCoresAreInSync &= isCoreInSync;
+        }
+      }
+    }
+
+    if(allCoresAreInSync) {
+      rsp.add("message",
+        String.format(Locale.ROOT, "All the followers are in sync with leader (within maxGenerationLag: %d) " +
+          "or all the cores are acting as leader", maxGenerationLag));
+      rsp.add(STATUS, OK);
+    } else {
+      rsp.add("message",
+        String.format(Locale.ROOT,"Cores violating maxGenerationLag:%d.\n%s", maxGenerationLag,
+          String.join(",\n", laggingCoresInfo)));
+      rsp.add(STATUS, FAILURE);
+    }
+  }
+
+  private boolean generationLagFromLeader(final SolrCore core, ReplicationHandler replicationHandler,
+                                          int maxGenerationLag, List<String> laggingCoresInfo) {
+    IndexFetcher indexFetcher = null;
+    try {
+      // may not be the best way to get leader's replicableCommit
+      @SuppressWarnings({"rawtypes"})
+      NamedList follower =
+        ReplicationHandler.getObjectWithBackwardCompatibility(replicationHandler.getInitArgs(), "follower",
+                  "slave");
+      indexFetcher = new IndexFetcher(follower, replicationHandler, core);
+
+      @SuppressWarnings({"rawtypes"})
+      NamedList replicableCommitOnLeader = indexFetcher.getLatestVersion();
+      long leaderGeneration = (Long) replicableCommitOnLeader.get(GENERATION);
+
+      // Get our own commit and generation from the commit
+      IndexCommit commit = core.getDeletionPolicy().getLatestCommit();
+      if(commit != null) {
+        long followerGeneration = commit.getGeneration();
+        long generationDiff = leaderGeneration - followerGeneration;
+
+        // generationDiff should be within the threshold and generationDiff shouldn't be negative

Review comment:
       While I agree with you, @praste, could this check run during the small window when a leader switch happens? Perhaps it would be a good idea to log that case?
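
       To make the suggestion concrete, here is a minimal, self-contained sketch of the generation comparison with a log line for the negative case. It is not the code in this PR: the class name `GenerationLagCheck`, the method `isInSync`, the use of `java.util.logging`, and the message wording are all assumptions made for illustration. It also assumes a negative diff is reported as out of sync, as the diff comment above implies.

```java
import java.util.List;
import java.util.Locale;
import java.util.logging.Logger;

// Hypothetical standalone helper; not part of Solr's HealthCheckHandler.
final class GenerationLagCheck {
  private static final Logger log = Logger.getLogger(GenerationLagCheck.class.getName());

  /**
   * Returns true when the follower is at most maxGenerationLag generations
   * behind the leader. A negative diff (follower ahead of leader) is treated
   * as out of sync and is logged, since it may indicate that the check ran
   * while the leader was being switched.
   */
  static boolean isInSync(String coreName, long leaderGeneration, long followerGeneration,
                          int maxGenerationLag, List<String> laggingCoresInfo) {
    long generationDiff = leaderGeneration - followerGeneration;

    if (generationDiff < 0) {
      // Assumption: a warning is enough here; the real handler may prefer another level.
      log.warning(String.format(Locale.ROOT,
          "Core %s is ahead of its leader (leader generation: %d, follower generation: %d)",
          coreName, leaderGeneration, followerGeneration));
    }

    if (generationDiff < 0 || generationDiff > maxGenerationLag) {
      laggingCoresInfo.add(String.format(Locale.ROOT,
          "Core %s: generation diff %d (leader: %d, follower: %d)",
          coreName, generationDiff, leaderGeneration, followerGeneration));
      return false;
    }
    return true;
  }
}
```

       With this shape, a follower that is ahead of its leader still fails the health check, but the log records why, which should help when diagnosing a failure seen around a leader switch.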




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


