You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/12/27 22:08:20 UTC

[GitHub] [druid] jasonk000 opened a new pull request #12099: Task queue unblock

jasonk000 opened a new pull request #12099:
URL: https://github.com/apache/druid/pull/12099


   ### Description
   
   Improves the stability of Overlord and all tasks in a cluster when there are large (1000+) task counts, by reducing contention between the management thread and the reception of status updates from the cluster.
   
   #### Introduce GuardedBy to TaskQueue
   
   .. and fix some existing missed spots.
   
   #### Introduce TaskQueueScaleTest
   
   To test scalability of starting and stopping 1000 tasks (set with a 60sec timeout), that currently fails and is fixed in the next commit.
   
   #### Reduce TaskQueue contention
   
   Reduce the duration of holding the `giant` critical lock, which increases responsiveness:
   - Break apart the TaskQueue-Manager manage loop to a critical (locked) section and a section that can be run concurrently with notifications (ie: sending any necessary shutdown requests).
     - This is the most important part of the change, since in the existing code the blocking shutdown requests are performed inside the loop. By moving the blocking calls outside the loop we make it possible for status notifications to be promptly processed.
   - Minimise duration that notifyStatus calls take holding the giant lock.
   - Move other logging etc outside the critical section where possible.
   
   ### Design decisions
   
   I chose a BlockingQueue implementation because it is easy to reason about the submission / poll / offer ordering. Other options would be Semaphore etc.
   
   There is potential future work:
   - Current behaviour, if a task `shutdown()` call is slow it slows down submission of tasks across the whole loop - this is not improved by the PR.
   - This could be mitigated by introducing an Executor, or by applying a decision that shutdown() call implementations are non-blocking.
   - If recommended, I'd suggest we do this as a separate PR.
   
   This follows the mailing list discussions here:
   https://lists.apache.org/thread/9jgdwrodwsfcg98so6kzfhdmn95gzyrj
   
   This PR has:
   - [x] been self-reviewed.
      - [x] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [x] added documentation for new or modified features or behaviors.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [x] been tested in a test Druid cluster. (as part of another block of changes).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ccaominh commented on a change in pull request #12099: Task queue unblock

Posted by GitBox <gi...@apache.org>.
ccaominh commented on a change in pull request #12099:
URL: https://github.com/apache/druid/pull/12099#discussion_r779292422



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskQueue.java
##########
@@ -93,7 +100,8 @@
   private final ServiceEmitter emitter;
 
   private final ReentrantLock giant = new ReentrantLock(true);
-  private final Condition managementMayBeNecessary = giant.newCondition();
+  //noinspection MismatchedCollectionQueryUpdate

Review comment:
       Can you try suppressing `MismatchedQueryAndUpdateOfCollection` instead?
   
   From the IntelliJ source code, that looks like the id of relevant inspection: https://github.com/JetBrains/intellij-community/blob/2fc98d149965e20e9810b8d94f51309923fa4969/plugins/InspectionGadgets/src/com/siyeh/ig/bugs/MismatchedCollectionQueryUpdateInspection.java#L128




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on pull request #12099: Task queue unblock

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #12099:
URL: https://github.com/apache/druid/pull/12099#issuecomment-1001796657


   This pull request **fixes 1 alert** when merging ef94f4fd8a5204ddd81cfb2465055604e2ba6ea8 into 476d0bf4be4199e97695bd568d165cda98523d37 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-ea32f6d110e47caa81444f8c62825d1232219f58)
   
   **fixed alerts:**
   
   * 1 for Useless comparison test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on pull request #12099: Task queue unblock

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #12099:
URL: https://github.com/apache/druid/pull/12099#issuecomment-1006833838


   This pull request **fixes 1 alert** when merging 1ee151d817b0aaf29e88e9aa7334412209caace7 into c28b2834a11e88c020fcb5411424a89be1030eef - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-e5966ac262b22a3ce3301c2de6ddec9f04b705bd)
   
   **fixed alerts:**
   
   * 1 for Useless comparison test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ccaominh commented on a change in pull request #12099: Task queue unblock

Posted by GitBox <gi...@apache.org>.
ccaominh commented on a change in pull request #12099:
URL: https://github.com/apache/druid/pull/12099#discussion_r779292422



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskQueue.java
##########
@@ -93,7 +100,8 @@
   private final ServiceEmitter emitter;
 
   private final ReentrantLock giant = new ReentrantLock(true);
-  private final Condition managementMayBeNecessary = giant.newCondition();
+  //noinspection MismatchedCollectionQueryUpdate

Review comment:
       Can you try suppressing `MismatchedQueryAndUpdateOfCollection` instead?
   
   From the IntelliJ source code, that looks like the id of the relevant inspection: https://github.com/JetBrains/intellij-community/blob/2fc98d149965e20e9810b8d94f51309923fa4969/plugins/InspectionGadgets/src/com/siyeh/ig/bugs/MismatchedCollectionQueryUpdateInspection.java#L128




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ccaominh commented on a change in pull request #12099: Task queue unblock

Posted by GitBox <gi...@apache.org>.
ccaominh commented on a change in pull request #12099:
URL: https://github.com/apache/druid/pull/12099#discussion_r780000865



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskQueue.java
##########
@@ -93,7 +100,8 @@
   private final ServiceEmitter emitter;
 
   private final ReentrantLock giant = new ReentrantLock(true);
-  private final Condition managementMayBeNecessary = giant.newCondition();
+  //noinspection MismatchedCollectionQueryUpdate

Review comment:
       Looks like the inspections CI job passes now 👍




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jasonk000 commented on a change in pull request #12099: Task queue unblock

Posted by GitBox <gi...@apache.org>.
jasonk000 commented on a change in pull request #12099:
URL: https://github.com/apache/druid/pull/12099#discussion_r779737340



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskQueue.java
##########
@@ -93,7 +100,8 @@
   private final ServiceEmitter emitter;
 
   private final ReentrantLock giant = new ReentrantLock(true);
-  private final Condition managementMayBeNecessary = giant.newCondition();
+  //noinspection MismatchedCollectionQueryUpdate

Review comment:
       Thanks, I used your suggested tag and switched from `noinspection` to `@SuppressWarnings` and it now passes on my local. Let's see if it passes on CI.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jasonk000 commented on a change in pull request #12099: Task queue unblock

Posted by GitBox <gi...@apache.org>.
jasonk000 commented on a change in pull request #12099:
URL: https://github.com/apache/druid/pull/12099#discussion_r779257359



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskQueue.java
##########
@@ -93,7 +100,8 @@
   private final ServiceEmitter emitter;
 
   private final ReentrantLock giant = new ReentrantLock(true);
-  private final Condition managementMayBeNecessary = giant.newCondition();
+  //noinspection MismatchedCollectionQueryUpdate

Review comment:
       @ccaominh it seems like the intellij inspector container is ignoring this directive, do you have an opinion on this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on pull request #12099: Task queue unblock

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #12099:
URL: https://github.com/apache/druid/pull/12099#issuecomment-1006229396


   This pull request **fixes 1 alert** when merging 22f633bfe6989827049795c0060d9c3c64be22fa into 6846622080f00270e0722bd77dbbecb15f060ec3 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-f59ad66c0a5ccda5a659d95da266b1d39ac2236b)
   
   **fixed alerts:**
   
   * 1 for Useless comparison test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org