You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/06/10 01:34:17 UTC

[GitHub] [lucene-solr] murblanc opened a new pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

murblanc opened a new pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561


   # Description
   
   Make sure Collection API tasks are processed in enqueue order by the Overseer and OverseerCollectionMessageHandler
   
   # Solution
   
   The LockTree.Session class already existed to address that need but there were a few missing pieces preventing it from doing so. Added these pieces.
   
   # Tests
   
   A previously failing test MultiThreadedOCPTest (SOLR-14524) no longer fails. Also ran local massive enqueue tests into Collection API ZK queue and instrumented OverseerTaskProcessor to verify dequeue is in order (requires heavy handed code hacks so this was only run locally to help diagnose and verify).
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [X] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [X] I have created a Jira issue and added the issue ID to my pull request title.
   - [X] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [X] I have developed this patch against the `master` branch.
   - [X] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc merged pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc merged pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#issuecomment-647793592


   I guess I didn't take the shortest route to rebase this PR, but at least the diff is easy to see now.
   I plan to merge this tomorrow, in case somebody wants to have a look by then.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r438970686



##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -95,16 +95,25 @@
 
   private volatile Stats stats;
 
-  // Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
-  // It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
-  // deleted from the work-queue as that is a batched operation.
+  /**
+   * Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
+   * It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
+   * deleted from the work-queue as that is a batched operation.
+   */
   final private Set<String> runningZKTasks;
-  // This map may contain tasks which are read from work queue but could not
-  // be executed because they are blocked or the execution queue is full
-  // This is an optimization to ensure that we do not read the same tasks
-  // again and again from ZK.
+
+  /**
+   * This map may contain tasks which are read from work queue but could not
+   * be executed because they are blocked or the execution queue is full
+   * This is an optimization to ensure that we do not read the same tasks
+   * again and again from ZK.
+   */
   final private Map<String, QueueEvent> blockedTasks = Collections.synchronizedMap(new LinkedHashMap<>());

Review comment:
       We'd need a concurrent linked hash map because we need iteration order == insert order...




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] madrob commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r439007892



##########
File path: solr/solrj/src/java/org/apache/solr/common/params/CollectionParams.java
##########
@@ -42,31 +42,30 @@
 
 
   enum LockLevel {
-    CLUSTER(0),
-    COLLECTION(1),
-    SHARD(2),
-    REPLICA(3),
-    NONE(10);
-
-    public final int level;
-
-    LockLevel(int i) {
-      this.level = i;
+    NONE(10, null),

Review comment:
       Didn't consider that; yea, that's a good reason for reordering.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] madrob commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r438907365



##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerMessageHandler.java
##########
@@ -50,7 +50,7 @@
   /**Try to provide an exclusive lock for this particular task
    * return null if locking is not possible. If locking is not necessary

Review comment:
       This javadoc includes a sentence fragment, can we complete the thought while we're improving documentation in this area?

##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -95,16 +95,25 @@
 
   private volatile Stats stats;
 
-  // Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
-  // It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
-  // deleted from the work-queue as that is a batched operation.
+  /**
+   * Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
+   * It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
+   * deleted from the work-queue as that is a batched operation.
+   */
   final private Set<String> runningZKTasks;

Review comment:
       Since there is so much synchronized access to this, should it be a `ConcurrentHashMap.newKeySet();`

##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -95,16 +95,25 @@
 
   private volatile Stats stats;
 
-  // Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
-  // It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
-  // deleted from the work-queue as that is a batched operation.
+  /**
+   * Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
+   * It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
+   * deleted from the work-queue as that is a batched operation.
+   */
   final private Set<String> runningZKTasks;
-  // This map may contain tasks which are read from work queue but could not
-  // be executed because they are blocked or the execution queue is full
-  // This is an optimization to ensure that we do not read the same tasks
-  // again and again from ZK.
+
+  /**
+   * This map may contain tasks which are read from work queue but could not
+   * be executed because they are blocked or the execution queue is full
+   * This is an optimization to ensure that we do not read the same tasks
+   * again and again from ZK.
+   */
   final private Map<String, QueueEvent> blockedTasks = Collections.synchronizedMap(new LinkedHashMap<>());
-  final private Predicate<String> excludedTasks = new Predicate<String>() {
+
+  /**
+   * Predicate used to filter out tasks from the Zookeeper queue that should not be returned for processing.
+   */
+  final private Predicate<String> excludedTasks = new Predicate<>() {
     @Override
     public boolean test(String s) {
       return runningTasks.contains(s) || blockedTasks.containsKey(s);

Review comment:
       This reference to runningTasks isn't synchronized. Is that an issue?

##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -95,16 +95,25 @@
 
   private volatile Stats stats;
 
-  // Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
-  // It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
-  // deleted from the work-queue as that is a batched operation.
+  /**
+   * Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
+   * It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
+   * deleted from the work-queue as that is a batched operation.
+   */
   final private Set<String> runningZKTasks;
-  // This map may contain tasks which are read from work queue but could not
-  // be executed because they are blocked or the execution queue is full
-  // This is an optimization to ensure that we do not read the same tasks
-  // again and again from ZK.
+
+  /**
+   * This map may contain tasks which are read from work queue but could not
+   * be executed because they are blocked or the execution queue is full
+   * This is an optimization to ensure that we do not read the same tasks
+   * again and again from ZK.
+   */
   final private Map<String, QueueEvent> blockedTasks = Collections.synchronizedMap(new LinkedHashMap<>());

Review comment:
       Similar here, can this be a ConcurrentHashMap instead of a synchronized map?

##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -253,20 +277,22 @@ public void run() {
             continue;
           }
 
-          blockedTasks.clear(); // clear it now; may get refilled below.
+          // clear the blocked tasks, may get refilled below. Given blockedTasks can only get entries from heads and heads
+          // has at most MAX_BLOCKED_TASKS tasks, blockedTasks will never exceed MAX_BLOCKED_TASKS entries.
+          // Note blockedTasks can't be cleared too early as it is used in the excludedTasks Predicate above.
+          blockedTasks.clear();
+
+          // Trigger the creation of a new Session used for locking when/if a lock is later acquired on the OverseerCollectionMessageHandler
+          batchSessionId++;
 
-          taskBatch.batchId++;
           boolean tooManyTasks = false;
           for (QueueEvent head : heads) {
             if (!tooManyTasks) {
-              synchronized (runningTasks) {
                 tooManyTasks = runningTasksSize() >= MAX_PARALLEL_TASKS;
-              }
             }
             if (tooManyTasks) {
               // Too many tasks are running, just shove the rest into the "blocked" queue.
-              if(blockedTasks.size() < MAX_BLOCKED_TASKS)
-                blockedTasks.put(head.getId(), head);
+              blockedTasks.put(head.getId(), head);

Review comment:
       Why do we no longer need to check against the upper limit of BLOCKED_TASKS?

##########
File path: solr/core/src/java/org/apache/solr/cloud/api/collections/OverseerCollectionMessageHandler.java
##########
@@ -867,26 +866,25 @@ public String getTaskKey(ZkNodeProps message) {
   }
 
 
+  // -1 is not a possible batchSessionId so -1 will force initialization of lockSession
   private long sessionId = -1;
   private LockTree.Session lockSession;
 
   @Override
-  public Lock lockTask(ZkNodeProps message, OverseerTaskProcessor.TaskBatch taskBatch) {
-    if (lockSession == null || sessionId != taskBatch.getId()) {
+  public Lock lockTask(ZkNodeProps message, long batchSessionId) {
+    if (sessionId != batchSessionId) {
       //this is always called in the same thread.
       //Each batch is supposed to have a new taskBatch
       //So if taskBatch changes we must create a new Session
-      // also check if the running tasks are empty. If yes, clear lockTree
-      // this will ensure that locks are not 'leaked'
-      if(taskBatch.getRunningTasks() == 0) lockTree.clear();

Review comment:
       This is the only usage of clear in the code, can we remove that method completely? Is this safe to not call clear?

##########
File path: solr/core/src/java/org/apache/solr/cloud/LockTree.java
##########
@@ -89,22 +98,29 @@ public Lock lock(CollectionParams.CollectionAction action, List<String> path) {
       this.level = level;
     }
 
-    void markBusy(List<String> path, int depth) {
-      if (path.size() == depth) {
+    /**
+     * Marks busy the SessionNode corresponding to lockLevel (node names coming from <code>path</code>).
+     * @param path size is at least <code>lockLevel.getHeight()</code>, to capture which node should be marked busy

Review comment:
       I don't understand what this comment means.

##########
File path: solr/solrj/src/java/org/apache/solr/common/params/CollectionParams.java
##########
@@ -42,31 +42,30 @@
 
 
   enum LockLevel {
-    CLUSTER(0),
-    COLLECTION(1),
-    SHARD(2),
-    REPLICA(3),
-    NONE(10);
-
-    public final int level;
-
-    LockLevel(int i) {
-      this.level = i;
+    NONE(10, null),

Review comment:
       I don't think it makes sense to reorder them here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r438966974



##########
File path: solr/core/src/java/org/apache/solr/cloud/api/collections/OverseerCollectionMessageHandler.java
##########
@@ -867,26 +866,25 @@ public String getTaskKey(ZkNodeProps message) {
   }
 
 
+  // -1 is not a possible batchSessionId so -1 will force initialization of lockSession
   private long sessionId = -1;
   private LockTree.Session lockSession;
 
   @Override
-  public Lock lockTask(ZkNodeProps message, OverseerTaskProcessor.TaskBatch taskBatch) {
-    if (lockSession == null || sessionId != taskBatch.getId()) {
+  public Lock lockTask(ZkNodeProps message, long batchSessionId) {
+    if (sessionId != batchSessionId) {
       //this is always called in the same thread.
       //Each batch is supposed to have a new taskBatch
       //So if taskBatch changes we must create a new Session
-      // also check if the running tasks are empty. If yes, clear lockTree
-      // this will ensure that locks are not 'leaked'
-      if(taskBatch.getRunningTasks() == 0) lockTree.clear();

Review comment:
       I hope (and think) it is... A lock can leak if an executor thread dies in a place where it shouldn't be dying (just before the try with the lock released in the finally.
   Clearing all locks is not a solution IMO. If we do end up with lock leaks we should address those in a more elegant way (fix the leak).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] madrob commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r439007539



##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -95,16 +95,25 @@
 
   private volatile Stats stats;
 
-  // Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
-  // It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
-  // deleted from the work-queue as that is a batched operation.
+  /**
+   * Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
+   * It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
+   * deleted from the work-queue as that is a batched operation.
+   */
   final private Set<String> runningZKTasks;
-  // This map may contain tasks which are read from work queue but could not
-  // be executed because they are blocked or the execution queue is full
-  // This is an optimization to ensure that we do not read the same tasks
-  // again and again from ZK.
+
+  /**
+   * This map may contain tasks which are read from work queue but could not
+   * be executed because they are blocked or the execution queue is full
+   * This is an optimization to ensure that we do not read the same tasks
+   * again and again from ZK.
+   */
   final private Map<String, QueueEvent> blockedTasks = Collections.synchronizedMap(new LinkedHashMap<>());

Review comment:
       Would a ConcurrentLinkedQueue work?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r438965089



##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -253,20 +277,22 @@ public void run() {
             continue;
           }
 
-          blockedTasks.clear(); // clear it now; may get refilled below.
+          // clear the blocked tasks, may get refilled below. Given blockedTasks can only get entries from heads and heads
+          // has at most MAX_BLOCKED_TASKS tasks, blockedTasks will never exceed MAX_BLOCKED_TASKS entries.
+          // Note blockedTasks can't be cleared too early as it is used in the excludedTasks Predicate above.
+          blockedTasks.clear();
+
+          // Trigger the creation of a new Session used for locking when/if a lock is later acquired on the OverseerCollectionMessageHandler
+          batchSessionId++;
 
-          taskBatch.batchId++;
           boolean tooManyTasks = false;
           for (QueueEvent head : heads) {
             if (!tooManyTasks) {
-              synchronized (runningTasks) {
                 tooManyTasks = runningTasksSize() >= MAX_PARALLEL_TASKS;
-              }
             }
             if (tooManyTasks) {
               // Too many tasks are running, just shove the rest into the "blocked" queue.
-              if(blockedTasks.size() < MAX_BLOCKED_TASKS)
-                blockedTasks.put(head.getId(), head);
+              blockedTasks.put(head.getId(), head);

Review comment:
       Commented line 280 above.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r438967387



##########
File path: solr/solrj/src/java/org/apache/solr/common/params/CollectionParams.java
##########
@@ -42,31 +42,30 @@
 
 
   enum LockLevel {
-    CLUSTER(0),
-    COLLECTION(1),
-    SHARD(2),
-    REPLICA(3),
-    NONE(10);
-
-    public final int level;
-
-    LockLevel(int i) {
-      this.level = i;
+    NONE(10, null),

Review comment:
       Compiler complained of forward reference when I didn't.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] madrob commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r439010217



##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -253,20 +277,22 @@ public void run() {
             continue;
           }
 
-          blockedTasks.clear(); // clear it now; may get refilled below.
+          // clear the blocked tasks, may get refilled below. Given blockedTasks can only get entries from heads and heads
+          // has at most MAX_BLOCKED_TASKS tasks, blockedTasks will never exceed MAX_BLOCKED_TASKS entries.
+          // Note blockedTasks can't be cleared too early as it is used in the excludedTasks Predicate above.
+          blockedTasks.clear();
+
+          // Trigger the creation of a new Session used for locking when/if a lock is later acquired on the OverseerCollectionMessageHandler
+          batchSessionId++;
 
-          taskBatch.batchId++;
           boolean tooManyTasks = false;
           for (QueueEvent head : heads) {
             if (!tooManyTasks) {
-              synchronized (runningTasks) {
                 tooManyTasks = runningTasksSize() >= MAX_PARALLEL_TASKS;
-              }
             }
             if (tooManyTasks) {
               // Too many tasks are running, just shove the rest into the "blocked" queue.
-              if(blockedTasks.size() < MAX_BLOCKED_TASKS)
-                blockedTasks.put(head.getId(), head);
+              blockedTasks.put(head.getId(), head);

Review comment:
       Ah, ok, I saw that but then missed the connection by the time I got to this method.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r439009885



##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -95,16 +95,25 @@
 
   private volatile Stats stats;
 
-  // Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
-  // It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
-  // deleted from the work-queue as that is a batched operation.
+  /**
+   * Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
+   * It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
+   * deleted from the work-queue as that is a batched operation.
+   */
   final private Set<String> runningZKTasks;
-  // This map may contain tasks which are read from work queue but could not
-  // be executed because they are blocked or the execution queue is full
-  // This is an optimization to ensure that we do not read the same tasks
-  // again and again from ZK.
+
+  /**
+   * This map may contain tasks which are read from work queue but could not
+   * be executed because they are blocked or the execution queue is full
+   * This is an optimization to ensure that we do not read the same tasks
+   * again and again from ZK.
+   */
   final private Map<String, QueueEvent> blockedTasks = Collections.synchronizedMap(new LinkedHashMap<>());

Review comment:
       We need a map for the predicate to check presence of an id (map keys also used for logs, but if it was the only use we could work around).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r438976552



##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -95,16 +95,25 @@
 
   private volatile Stats stats;
 
-  // Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
-  // It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
-  // deleted from the work-queue as that is a batched operation.
+  /**
+   * Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
+   * It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
+   * deleted from the work-queue as that is a batched operation.
+   */
   final private Set<String> runningZKTasks;
-  // This map may contain tasks which are read from work queue but could not
-  // be executed because they are blocked or the execution queue is full
-  // This is an optimization to ensure that we do not read the same tasks
-  // again and again from ZK.
+
+  /**
+   * This map may contain tasks which are read from work queue but could not
+   * be executed because they are blocked or the execution queue is full
+   * This is an optimization to ensure that we do not read the same tasks
+   * again and again from ZK.
+   */
   final private Map<String, QueueEvent> blockedTasks = Collections.synchronizedMap(new LinkedHashMap<>());
-  final private Predicate<String> excludedTasks = new Predicate<String>() {
+
+  /**
+   * Predicate used to filter out tasks from the Zookeeper queue that should not be returned for processing.
+   */
+  final private Predicate<String> excludedTasks = new Predicate<>() {
     @Override
     public boolean test(String s) {
       return runningTasks.contains(s) || blockedTasks.containsKey(s);

Review comment:
       Yes it is. Can likely change this one into a `ConcurrentHashMap.newKeySet()` as well.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#issuecomment-647690709


   Something went wrong here it seems when I tried to add just one commit for CHANGES.txt
   Need to sort this out, possibly cancelling this PR and starting fresh if I don't manage to clean up differently...


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on a change in pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on a change in pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#discussion_r438974719



##########
File path: solr/core/src/java/org/apache/solr/cloud/OverseerTaskProcessor.java
##########
@@ -95,16 +95,25 @@
 
   private volatile Stats stats;
 
-  // Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
-  // It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
-  // deleted from the work-queue as that is a batched operation.
+  /**
+   * Set of tasks that have been picked up for processing but not cleaned up from zk work-queue.
+   * It may contain tasks that have completed execution, have been entered into the completed/failed map in zk but not
+   * deleted from the work-queue as that is a batched operation.
+   */
   final private Set<String> runningZKTasks;

Review comment:
       Yes, will change that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] murblanc commented on pull request #1561: SOLR-14546: OverseerTaskProcessor can process messages out of order

Posted by GitBox <gi...@apache.org>.
murblanc commented on pull request #1561:
URL: https://github.com/apache/lucene-solr/pull/1561#issuecomment-647633475


   @madrob do you want to have another look or shall I merge it? (now that I can :)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org