You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/07/16 17:13:26 UTC

[GitHub] [druid] egor-ryashin opened a new pull request #11456: Overlord limit returned tasks by count

egor-ryashin opened a new pull request #11456:
URL: https://github.com/apache/druid/pull/11456


   <!-- Thanks for trying to help us make Apache Druid be the best it can be! Please fill out as much of the following information as is possible (where relevant, and remove it when irrelevant) to help make the intention and scope of this PR clear in order to ease review. -->
   
   <!-- Please read the doc for contribution (https://github.com/apache/druid/blob/master/CONTRIBUTING.md) before making this PR. Also, once you open a PR, please _avoid using force pushes and rebasing_ since these make it difficult for reviewers to see what you've changed in response to their reviews. See [the 'If your pull request shows conflicts with master' section](https://github.com/apache/druid/blob/master/CONTRIBUTING.md#if-your-pull-request-shows-conflicts-with-master) for more details. -->
   
   UI can produce a load on Overlord if there's a lot of tasks created instantly, the limit by time doesn't help as all
   tasks have the same timestamp or almost the same. This PR introduces a new configuration property that can limit returned tasks by count.
   
   <!-- Replace XXXX with the id of the issue fixed in this PR. Remove this section if there is no corresponding issue. Don't reference the issue in the title of this pull-request. -->
   
   <!-- If you are a committer, follow the PR action item checklist for committers:
   https://github.com/apache/druid/blob/master/dev/committer-instructions.md#pr-and-issue-action-item-checklist-for-committers. -->
   
   ### Description
   
   <!-- Describe the goal of this PR, what problem are you fixing. If there is a corresponding issue (referenced above), it's not necessary to repeat the description here, however, you may choose to keep one summary sentence. -->
   
   <!-- Describe your patch: what did you change in code? How did you fix the problem? -->
   
   <!-- If there are several relatively logically separate changes in this PR, create a mini-section for each of them. For example: -->
   
   #### Fixed the bug ...
   #### Renamed the class ...
   #### Added a forbidden-apis entry ...
   
   <!--
   In each section, please describe design decisions made, including:
    - Choice of algorithms
    - Behavioral aspects. What configuration values are acceptable? How are corner cases and error conditions handled, such as when there are insufficient resources?
    - Class organization and design (how the logic is split between classes, inheritance, composition, design patterns)
    - Method organization and design (how the logic is split between methods, parameters and return types)
    - Naming (class, method, API, configuration, HTTP endpoint, names of emitted metrics)
   -->
   
   
   <!-- It's good to describe an alternative design (or mention an alternative name) for every design (or naming) decision point and compare the alternatives with the designs that you've implemented (or the names you've chosen) to highlight the advantages of the chosen designs and names. -->
   
   <!-- If there was a discussion of the design of the feature implemented in this PR elsewhere (e. g. a "Proposal" issue, any other issue, or a thread in the development mailing list), link to that discussion from this PR description and explain what have changed in your final design compared to your original proposal or the consensus version in the end of the discussion. If something hasn't changed since the original discussion, you can omit a detailed discussion of those aspects of the design here, perhaps apart from brief mentioning for the sake of readability of this PR description. -->
   
   <!-- Some of the aspects mentioned above may be omitted for simple and small changes. -->
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `MyFoo`
    * `OurBar`
    * `TheirBaz`
   
   <hr>
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist below are strictly necessary, but it would be very helpful if you at least self-review the PR. -->
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-981630009


   Added unit tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-896945699


   @FrankChen021 Assuming in some cases UI gets 504 error then I suspect OverlordResource doesn't respond in time, I'll see if I can bring the limit to a lower level (SQL), I'd like to solve that problem too. Thanks for pointing that out.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-983506284


   Fixed, thank you @jihoonson


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
jihoonson commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-896492815


   > @egor-ryashin Thanks for opening this PR. If I understand correctly, this PR tries to addresses #11042, #11140 and #11567.
   > 
   > There was a discussion about this problem, and I left [some comments](https://lists.apache.org/thread.html/rc659b7c681499911fb6eff46b9edb5a3d983c11d0c253909afc905b1%40%3Cdev.druid.apache.org%3E) on the dev mailing thread.
   > 
   > One problem is that LIMIT/OFFSET is not used by the WebUI to limit the task count returned by the server. But the core problem I think is that the overlord retrieves all task records from the metadata storage to the SQL layer, the performance bottleneck might be here. So I guess limiting the task count might be less help as we expect.
   > 
   > But if want to limit the count at the SQL layer to make a try, I think it's better to do it at the WebUI side by using LIMIT/OFFSET clause instead of introducing a new configuration at the server side.
   
   @FrankChen021 I believe my suggestion in https://github.com/apache/druid/pull/11456#discussion_r671536987 will address your concern.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-897650431


   I've added the limit at SQL query level.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 edited a comment on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
FrankChen021 edited a comment on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-896488808


   @egor-ryashin Thanks for opening this PR. If I understand correctly, this PR tries to addresses  #11042, #11140 and #11567.
   
   There was a discussion about this problem, and I left [some comments](https://lists.apache.org/thread.html/rc659b7c681499911fb6eff46b9edb5a3d983c11d0c253909afc905b1%40%3Cdev.druid.apache.org%3E) on the dev mailing thread.
   
   One problem is that LIMIT/OFFSET is not used by the WebUI to limit the task count returned by the server. But the core problem I think is that the overlord retrieves all task records from the metadata storage to the SQL layer, the performance bottleneck might be here. So I guess limiting the task count might be less help as we expect.
   
   But if want to limit the count at the SQL layer to make a try, I think it's better to do it at the WebUI side by using LIMIT/OFFSET clause instead of introducing a new configuration at the server side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
pjain1 commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r737364717



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordResource.java
##########
@@ -519,28 +519,29 @@ public Response apply(TaskActionClient taskActionClient)
   @GET
   @Path("/waitingTasks")
   @Produces(MediaType.APPLICATION_JSON)
-  public Response getWaitingTasks(@Context final HttpServletRequest req)
+  public Response getWaitingTasks(@QueryParam("n") Integer maxActiveTasks, @Context final HttpServletRequest req)

Review comment:
       I think calling the parameter `maxActiveTasks` will be more clearer




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
jihoonson commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-982957395


   ```
   [ERROR] Errors: 
   [ERROR] org.apache.druid.indexing.overlord.http.OverlordResourceTest.testShutdownAllTasksForNonExistingDataSource(org.apache.druid.indexing.overlord.http.OverlordResourceTest)
   [ERROR]   Run 1: OverlordResourceTest.testShutdownAllTasksForNonExistingDataSource:1345 » IllegalState
   [ERROR]   Run 2: OverlordResourceTest.tearDown:158 » Runtime On mock #0 (zero indexed): calling...
   [ERROR]   Run 3: OverlordResourceTest.testShutdownAllTasksForNonExistingDataSource:1345 » IllegalState
   [ERROR]   Run 4: OverlordResourceTest.tearDown:158 » Runtime On mock #0 (zero indexed): calling...
   [ERROR]   Run 5: OverlordResourceTest.testShutdownAllTasksForNonExistingDataSource:1345 » IllegalState
   [ERROR]   Run 6: OverlordResourceTest.tearDown:158 » Runtime On mock #0 (zero indexed): calling...
   [ERROR]   Run 7: OverlordResourceTest.testShutdownAllTasksForNonExistingDataSource:1345 » IllegalState
   [ERROR]   Run 8: OverlordResourceTest.tearDown:158 » Runtime On mock #0 (zero indexed): calling...
   ```
   
   @egor-ryashin it's not test coverage, but seems like a legit failure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r672102393



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SchemaConfig.java
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.schema;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+public class SchemaConfig
+{
+  @JsonProperty
+  private final int tasksLimit;

Review comment:
       TaskStorageConfig is in `indexing` module, and I will have to add it as dependency for `sql` module because of that change.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r689667270



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordResource.java
##########
@@ -562,6 +563,7 @@ public Response getTasks(
       @QueryParam("datasource") final String dataSource,
       @QueryParam("createdTimeInterval") final String createdTimeInterval,
       @QueryParam("max") final Integer maxCompletedTasks,
+      @QueryParam("maxActiveTasks") final Integer maxActiveTasks,

Review comment:
       There're 2 items I'd like to support though:
   1. It's more flexible and UI has more control over the backend.
   2. It's backward compatible.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-896900204


   @FrankChen021 I think introducing the limit in SQL query is most efficient but needs more code changes (for each DB there's different SQL syntax). Introducing the limitation in OverlordResource takes fewer changes and still unloads UI. 
   While changes on UI only - won't solve all the problems because the UI can only limit completed tasks and there's the case with a lot of running tasks.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r672169820



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SchemaConfig.java
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.schema;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+public class SchemaConfig
+{
+  @JsonProperty
+  private final int tasksLimit;

Review comment:
       Should I move TaskStorageConfig to `core` instead?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r685253372



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SchemaConfig.java
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.schema;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+public class SchemaConfig
+{
+  @JsonProperty
+  private final int tasksLimit;

Review comment:
       So you propose to limit the result in OverlordResource instead, right? (when OverlordResource.getTasks is called)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-982952132


   coverage checks are irrelevant


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r672102393



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SchemaConfig.java
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.schema;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+public class SchemaConfig
+{
+  @JsonProperty
+  private final int tasksLimit;

Review comment:
       TaskStorageConfig is in `indexing` module, and I will have to add it as dependency for `sql` module because of that change.

##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SchemaConfig.java
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.schema;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+public class SchemaConfig
+{
+  @JsonProperty
+  private final int tasksLimit;

Review comment:
       Should I move TaskStorageConfig to `core` instead?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r688887804



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordResource.java
##########
@@ -562,6 +563,7 @@ public Response getTasks(
       @QueryParam("datasource") final String dataSource,
       @QueryParam("createdTimeInterval") final String createdTimeInterval,
       @QueryParam("max") final Integer maxCompletedTasks,
+      @QueryParam("maxActiveTasks") final Integer maxActiveTasks,

Review comment:
       I think we don't need to add this new parameter because there's already a parameter named as 'max'.  The 'max' should limit not only completedTasks but also activeTasks. When parameter 'state' is not given, it limits the total number of tasks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r758354894



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordResource.java
##########
@@ -519,28 +519,29 @@ public Response apply(TaskActionClient taskActionClient)
   @GET
   @Path("/waitingTasks")
   @Produces(MediaType.APPLICATION_JSON)
-  public Response getWaitingTasks(@Context final HttpServletRequest req)
+  public Response getWaitingTasks(@QueryParam("n") Integer maxActiveTasks, @Context final HttpServletRequest req)

Review comment:
       Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-983771476


   looking into integrations tests error


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] egor-ryashin commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
egor-ryashin commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r685253372



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SchemaConfig.java
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.schema;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+public class SchemaConfig
+{
+  @JsonProperty
+  private final int tasksLimit;

Review comment:
       So you propose to limit the result in OverlordResource instead, right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] pjain1 commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
pjain1 commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r737111100



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/overlord/http/OverlordResource.java
##########
@@ -562,6 +563,7 @@ public Response getTasks(
       @QueryParam("datasource") final String dataSource,
       @QueryParam("createdTimeInterval") final String createdTimeInterval,
       @QueryParam("max") final Integer maxCompletedTasks,
+      @QueryParam("maxActiveTasks") final Integer maxActiveTasks,

Review comment:
       Console changes will also be required to use this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r677095304



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SchemaConfig.java
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.schema;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+public class SchemaConfig
+{
+  @JsonProperty
+  private final int tasksLimit;

Review comment:
       Perhaps I should have been clearer. [`TaskStorageConfig` is already being used to provide the default value for `durationBeforeNow` for `getRecentlyCreatedAlreadyFinishedTaskInfo()`](https://github.com/apache/druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/overlord/MetadataTaskStorage.java#L232) which is the method used to compute the task statuses to return in `OverlordResource.getTasks()`. You don't have to add a new dependency, but get the default value from `TaskStorageConfig` when `maxTaskStatuses` is null at https://github.com/apache/druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/overlord/MetadataTaskStorage.java#L233.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r686482853



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SchemaConfig.java
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.schema;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+public class SchemaConfig
+{
+  @JsonProperty
+  private final int tasksLimit;

Review comment:
       Yes, that's my suggestion.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #11456:
URL: https://github.com/apache/druid/pull/11456#issuecomment-896488808


   @egor-ryashin Thanks for opening this PR. If I understand correctly, this PR tries to addresses  #11042, #11140 and #11567.
   
   There was a discussion about this problem, and I left [some comments](https://lists.apache.org/thread.html/rc659b7c681499911fb6eff46b9edb5a3d983c11d0c253909afc905b1%40%3Cdev.druid.apache.org%3E) on the dev mailing thread.
   
   One problem is that LIMIT/OFFSET is not used by the WebUI to limit the task count returned by the server. But the core problem I think is that the overlord retrieves all task records from the metadata storage to the SQL layer, the performance bottleneck might be here. So I guess limiting the task count might be less help as we expect.
   
   But if want to limit the count at the SQL layer to make a try, I think it's better to do it at the WebUI side by using LIMIT/OFFSET clause instead introducing a new configuration at the server side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #11456: Overlord limit for returned tasks by count

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #11456:
URL: https://github.com/apache/druid/pull/11456#discussion_r671536987



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SchemaConfig.java
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.calcite.schema;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+public class SchemaConfig
+{
+  @JsonProperty
+  private final int tasksLimit;

Review comment:
       `OverlordResource.getTasks()` which is the API used by the `tasks` table accepts the `max` param to limit the number of tasks returned. How about adding this config in `TaskStorageConfig` which would be used as the default when the `max` param is missing? This will change the behavior of the `getTasks()` API, but I think that's OK.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org