Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/02/12 14:56:22 UTC

[GitHub] [lucene-solr] chatman commented on a change in pull request #2318: SOLR-15138: PerReplicaStates does not scale to large collections as well as state.json

chatman commented on a change in pull request #2318:
URL: https://github.com/apache/lucene-solr/pull/2318#discussion_r575282512



##########
File path: solr/core/src/java/org/apache/solr/cloud/api/collections/CreateCollectionCmd.java
##########
@@ -256,6 +280,23 @@ public void call(ClusterState clusterState, ZkNodeProps message, @SuppressWarnin
       shardRequestTracker.processResponses(results, shardHandler, false, null, Collections.emptySet());
       @SuppressWarnings({"rawtypes"})
       boolean failure = results.get("failure") != null && ((SimpleOrderedMap)results.get("failure")).size() > 0;
+      if(isPrs) {
+        TimeOut timeout = new TimeOut(Integer.getInteger("solr.waitToSeeReplicasInStateTimeoutSeconds", 120), TimeUnit.SECONDS, timeSource); // could be a big cluster
+        PerReplicaStates prs = PerReplicaStates.fetch(collectionPath, ocmh.zkStateReader.getZkClient(), null);
+        while (!timeout.hasTimedOut()) {
+          if(prs.allActive()) break;
+          Thread.sleep(100);
+          prs = PerReplicaStates.fetch(collectionPath, ocmh.zkStateReader.getZkClient(), null);
+        }
+        if (prs.allActive()) {
+          // we have successfully found all replicas to be ACTIVE
+          // Now ask Overseer to fetch the latest state of collection
+          // from ZK
+          ocmh.overseer.submit(new RefreshCollectionMessage(collectionName));
+        } else {
+          failure = true;
+        }
+      }
       if (failure) {
         // Let's cleanup as we hit an exception
         // We shouldn't be passing 'results' here for the cleanup as the response would then contain 'success'

Review comment:
       I can confirm that this is no longer an issue after the latest commits. I ran CreateCollectionCleanupTest with the following patch [0] and it passed consistently. While it ran, I checked the coverage report to verify that these lines were covered.
   
   [0] - https://paste.centos.org/view/09e3434d
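
For readers following along, the wait loop in the hunk above is a poll-until-timeout pattern: check the condition, sleep briefly, re-check, and give up once a deadline passes. Below is a minimal standalone sketch of that pattern, not the code from the PR. The `PollUntil` class and `waitUntil` helper are hypothetical names, and it assumes plain `System.nanoTime()` in place of Solr's `TimeOut` and `TimeSource`.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; not part of Solr.
public class PollUntil {

  /**
   * Polls the condition until it returns true or the timeout elapses.
   * The condition is always evaluated at least once, and once more
   * after the final sleep, so a late success is still reported.
   */
  static boolean waitUntil(BooleanSupplier condition, long timeout, TimeUnit unit, long pollMillis)
      throws InterruptedException {
    long deadline = System.nanoTime() + unit.toNanos(timeout);
    while (true) {
      if (condition.getAsBoolean()) {
        return true; // condition met, stop polling
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // deadline passed without the condition holding
      }
      Thread.sleep(pollMillis);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // Stand-in for "all replicas are ACTIVE": becomes true on the third poll.
    AtomicInteger polls = new AtomicInteger();
    boolean ok = waitUntil(() -> polls.incrementAndGet() >= 3, 2, TimeUnit.SECONDS, 10);
    System.out.println(ok ? "condition met" : "timed out");
  }
}
```

In the PR's terms, the condition would correspond to re-fetching the per-replica states and calling `allActive()`, and a `false` result would correspond to setting `failure = true`.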



