You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/03/21 02:23:02 UTC

[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #5169: Table level timeout implementation

Jackie-Jiang opened a new pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169
 
 
   Introduced 3 levels of query timeout, where the first available one will be used:
   - Query-level: Passed under the query options with key 'timeoutMs'
   - Table-level: Configured under the QueryConfig inside TableConfig
   - Instance-level: Configured in the instance config, or default
   
   The timeout will be picked up on the broker side, and passed to the server under the query options.
   If the query already timed out before broker sending the request to the servers, it will early terminate with BROKER_TIMEOUT_ERROR.
   If the query timed out on the server side, broker should already returned with the partial response.
   
   This PR also supports refreshing routing properties on table config change automatically.
   Because the table config refresh is implemented using the Helix USER_DEFINE_MSG, both controller and broker need to be deployed in order to support it.
   No specific deployment sequence is required (warning will be logged in BrokerUserDefinedMessageHandlerFactory when only one component is deployed).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396783285
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/broker/helix/BrokerUserDefinedMessageHandlerFactory.java
 ##########
 @@ -115,17 +121,15 @@ public HelixTaskResult handleMessage() {
 
     @Override
     public void onError(Exception e, ErrorCode errorCode, ErrorType errorType) {
-      LOGGER.error("Caught exception while updating query quota of table: {} (code: {}, type: {})", _tableNameWithType,
-          errorCode, errorType, e);
+      LOGGER.error("Caught exception while refreshing table config for table: {} (code: {}, type: {})",
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396826410
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
 ##########
 @@ -450,11 +458,22 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Returns the timeout in milliseconds for the given table, or {@code null} if the timeout is not configured in the
 
 Review comment:
   ```suggestion
      * Returns the query timeout in milliseconds for the given table, or {@code null} if the timeout is not configured in the
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396687579
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +312,58 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Get the per-query timeout from the broker request query options if exists
+    long queryTimeoutMs = QueryOptions.getTimeoutMs(brokerRequest.getQueryOptions());
+    // Remaining time in milliseconds for the server query execution
+    // Use the max of offline table remaining time and realtime table remaining time for hybrid use case
+    long remainingTimeMs = 0;
+    // NOTE: Use query-level timeout if exists, or use table-level timeout if exists, or use instance-level timeout
+    if (offlineBrokerRequest != null) {
 
 Review comment:
   You can wrap this logic into a method so that offline and realtime table can reuse the same code.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#issuecomment-602905648
 
 
   @jackjlli @mcvsubbu Addressed all the comments. I'll use `null` to denote that the timeout is not configured everywhere for readability.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396728226
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
 ##########
 @@ -354,9 +359,12 @@ public synchronized void buildRouting(String tableNameWithType) {
       }
     }
 
+    QueryConfig queryConfig = tableConfig.getQueryConfig();
+    Long timeoutMs = queryConfig != null ? queryConfig.getTimeoutMs() : null;
 
 Review comment:
   For json ser-de reason, it is risky to put `-1` for `null` value (serialize & deserialize will break). I'll just use `null` in both places then

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-io commented on issue #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
codecov-io commented on issue #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#issuecomment-601987881
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=h1) Report
   > Merging [#5169](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/d289db85b350e562d3c0b123ad64c7a2637db4ac&el=desc) will **decrease** coverage by `0.17%`.
   > The diff coverage is `77.77%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/5169/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #5169      +/-   ##
   ============================================
   - Coverage     66.22%   66.04%   -0.18%     
     Complexity       12       12              
   ============================================
     Files          1051     1051              
     Lines         54119    54164      +45     
     Branches       8064     8078      +14     
   ============================================
   - Hits          35840    35774      -66     
   - Misses        15630    15742     +112     
   + Partials       2649     2648       -1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...org/apache/pinot/common/utils/CommonConstants.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvQ29tbW9uQ29uc3RhbnRzLmphdmE=) | `40.81% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [.../helix/BrokerUserDefinedMessageHandlerFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL2hlbGl4L0Jyb2tlclVzZXJEZWZpbmVkTWVzc2FnZUhhbmRsZXJGYWN0b3J5LmphdmE=) | `68.29% <66.66%> (-0.76%)` | `0.00 <0.00> (ø)` | |
   | [...roker/requesthandler/BaseBrokerRequestHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcmVxdWVzdGhhbmRsZXIvQmFzZUJyb2tlclJlcXVlc3RIYW5kbGVyLmphdmE=) | `80.00% <66.66%> (-1.49%)` | `0.00 <0.00> (ø)` | |
   | [...core/query/executor/ServerQueryExecutorV1Impl.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9leGVjdXRvci9TZXJ2ZXJRdWVyeUV4ZWN1dG9yVjFJbXBsLmphdmE=) | `80.28% <71.42%> (+0.28%)` | `0.00 <0.00> (ø)` | |
   | [...not/common/messages/TableConfigRefreshMessage.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvVGFibGVDb25maWdSZWZyZXNoTWVzc2FnZS5qYXZh) | `81.81% <81.81%> (ø)` | `0.00 <0.00> (?)` | |
   | [...rg/apache/pinot/broker/routing/RoutingManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9Sb3V0aW5nTWFuYWdlci5qYXZh) | `82.30% <85.71%> (+0.81%)` | `0.00 <0.00> (ø)` | |
   | [...ntroller/helix/core/PinotHelixResourceManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL1Bpbm90SGVsaXhSZXNvdXJjZU1hbmFnZXIuamF2YQ==) | `62.04% <92.85%> (+0.02%)` | `0.00 <0.00> (ø)` | |
   | [.../java/org/apache/pinot/core/util/QueryOptions.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS91dGlsL1F1ZXJ5T3B0aW9ucy5qYXZh) | `94.11% <92.85%> (-5.89%)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/common/metrics/BrokerMeter.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJNZXRlci5qYXZh) | `100.00% <100.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | [...ller/validation/OfflineSegmentIntervalChecker.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci92YWxpZGF0aW9uL09mZmxpbmVTZWdtZW50SW50ZXJ2YWxDaGVja2VyLmphdmE=) | `32.05% <0.00%> (-51.29%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [36 more](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=footer). Last update [d289db8...4b5c809](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396731693
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +312,58 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Get the per-query timeout from the broker request query options if exists
+    long queryTimeoutMs = QueryOptions.getTimeoutMs(brokerRequest.getQueryOptions());
+    // Remaining time in milliseconds for the server query execution
+    // Use the max of offline table remaining time and realtime table remaining time for hybrid use case
+    long remainingTimeMs = 0;
+    // NOTE: Use query-level timeout if exists, or use table-level timeout if exists, or use instance-level timeout
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396724265
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/broker/helix/BrokerUserDefinedMessageHandlerFactory.java
 ##########
 @@ -98,15 +101,18 @@ public void onError(Exception e, ErrorCode errorCode, ErrorType errorType) {
     }
   }
 
-  private class QueryQuotaUpdateMessageHandler extends DefaultMessageHandler {
+  private class RefreshTableConfigMessageHandler extends MessageHandler {
+    final String _tableNameWithType;
 
-    public QueryQuotaUpdateMessageHandler(QueryQuotaUpdateMessage queryQuotaUpdateMessage,
-        NotificationContext context) {
-      super(queryQuotaUpdateMessage, context);
+    RefreshTableConfigMessageHandler(TableConfigRefreshMessage tableConfigRefreshMessage, NotificationContext context) {
+      super(tableConfigRefreshMessage, context);
+      _tableNameWithType = tableConfigRefreshMessage.getTableNameWithType();
     }
 
     @Override
     public HelixTaskResult handleMessage() {
+      // TODO: Fetch the table config here and pass it into the managers, or consider merging these 2 managers
 
 Review comment:
   not sure why we should merge the 2 managers. Fetching table config, checking for disabled table, etc.  maybe yes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396729783
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +312,58 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Get the per-query timeout from the broker request query options if exists
+    long queryTimeoutMs = QueryOptions.getTimeoutMs(brokerRequest.getQueryOptions());
+    // Remaining time in milliseconds for the server query execution
+    // Use the max of offline table remaining time and realtime table remaining time for hybrid use case
+    long remainingTimeMs = 0;
+    // NOTE: Use query-level timeout if exists, or use table-level timeout if exists, or use instance-level timeout
+    if (offlineBrokerRequest != null) {
+      long offlineTimeoutMs;
 
 Review comment:
   Can you make a method out of the block between lines 324 and 338?  Logic seems to be repeated. At least, lines 324 to 331?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396686958
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +312,58 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Get the per-query timeout from the broker request query options if exists
+    long queryTimeoutMs = QueryOptions.getTimeoutMs(brokerRequest.getQueryOptions());
+    // Remaining time in milliseconds for the server query execution
+    // Use the max of offline table remaining time and realtime table remaining time for hybrid use case
+    long remainingTimeMs = 0;
+    // NOTE: Use query-level timeout if exists, or use table-level timeout if exists, or use instance-level timeout
 
 Review comment:
   Can we move this line to just below Line 315 since this is the major logic of this code change in `BaseBrokerRequestHandler`?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396780766
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +312,58 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Get the per-query timeout from the broker request query options if exists
+    long queryTimeoutMs = QueryOptions.getTimeoutMs(brokerRequest.getQueryOptions());
+    // Remaining time in milliseconds for the server query execution
+    // Use the max of offline table remaining time and realtime table remaining time for hybrid use case
+    long remainingTimeMs = 0;
+    // NOTE: Use query-level timeout if exists, or use table-level timeout if exists, or use instance-level timeout
+    if (offlineBrokerRequest != null) {
 
 Review comment:
   DOne

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396730108
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
 ##########
 @@ -450,11 +458,22 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Returns the timeout in milliseconds for the given table, or {@code null} if the timeout is not configured in the
+   * table config.
+   */
+  @Nullable
+  public Long getTimeoutMs(String tableNameWithType) {
 
 Review comment:
   I don't quite understand this comment. This is the timeout for the table specified in the table config

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396828878
 
 

 ##########
 File path: pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/OfflineClusterIntegrationTest.java
 ##########
 @@ -185,6 +191,58 @@ public void testInvalidTableConfig() {
     }
   }
 
+  @Test
+  public void testRefreshTableConfigAndQueryTimeout()
+      throws Exception {
+    TableConfig tableConfig = _helixResourceManager.getOfflineTableConfig(getTableName());
+    assertNotNull(tableConfig);
+
+    // Set timeout as 5ms so that query will timeout
+    tableConfig.setQueryConfig(new QueryConfig(5L));
+    _helixResourceManager.updateTableConfig(tableConfig);
+
+    // Wait for at most 1 minute for broker to receive and process the table config refresh message
+    TestUtils.waitForCondition(aVoid -> {
 
 Review comment:
   seems like a candidate for intermittent failure, good to add a comment on the circumstances under which this test can fail. I cant think of another way to add the test

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396729511
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
 ##########
 @@ -354,9 +359,12 @@ public synchronized void buildRouting(String tableNameWithType) {
       }
     }
 
+    QueryConfig queryConfig = tableConfig.getQueryConfig();
+    Long timeoutMs = queryConfig != null ? queryConfig.getTimeoutMs() : null;
 
 Review comment:
   It should not be `0` (added a check), but `1ms` might be too long for broker compilation to timeout. Let me revise the test, merge the broker & server timeout so that test is not flaky.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396783305
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/broker/helix/BrokerUserDefinedMessageHandlerFactory.java
 ##########
 @@ -136,8 +140,7 @@ public HelixTaskResult handleMessage() {
     }
 
     @Override
-    public void onError(Exception e, ErrorCode errorCode, ErrorType errorType) {
-      LOGGER.error("Caught exception on table: {} (code: {}, type: {})", _tableNameWithType, errorCode, errorType, e);
+    public void onError(Exception e, ErrorCode code, ErrorType type) {
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396831338
 
 

 ##########
 File path: pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/OfflineClusterIntegrationTest.java
 ##########
 @@ -185,6 +191,58 @@ public void testInvalidTableConfig() {
     }
   }
 
+  @Test
+  public void testRefreshTableConfigAndQueryTimeout()
+      throws Exception {
+    TableConfig tableConfig = _helixResourceManager.getOfflineTableConfig(getTableName());
+    assertNotNull(tableConfig);
+
+    // Set timeout as 5ms so that query will timeout
+    tableConfig.setQueryConfig(new QueryConfig(5L));
+    _helixResourceManager.updateTableConfig(tableConfig);
+
+    // Wait for at most 1 minute for broker to receive and process the table config refresh message
+    TestUtils.waitForCondition(aVoid -> {
 
 Review comment:
   I removed the test on uncertain properties (such as `timeUsedMs`), and it should always pass right not.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396825275
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +313,29 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Remaining time in milliseconds for the server query execution
+    // NOTE: Use the max of offline table remaining time and realtime table remaining time for hybrid use case if the
+    //       timeout for them are not configured to be the same. Server side will have different remaining time set for
+    //       each table type, and broker should wait for both types to return.
+    long remainingTimeMs = 0;
+    try {
+      if (offlineBrokerRequest != null) {
+        remainingTimeMs = setQueryTimeout(offlineTableName, offlineBrokerRequest.getQueryOptions(), timeSpentMs);
+      }
+      if (realtimeBrokerRequest != null) {
+        remainingTimeMs = Math.max(remainingTimeMs,
 
 Review comment:
   Configuring the two tables to have different timeout values seems to be an admin error, given that pinot treats the table configs independently. Under the given constraints, we can either use `Math.max` or `Math.min`, I suppose. It is worth adding a comment that this is an arbitrary choice. Perhaps `max` is better if we dont want queries suddenly timing out or having partial results due to stricter time limits.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396725274
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/broker/helix/BrokerUserDefinedMessageHandlerFactory.java
 ##########
 @@ -136,8 +140,7 @@ public HelixTaskResult handleMessage() {
     }
 
     @Override
-    public void onError(Exception e, ErrorCode errorCode, ErrorType errorType) {
-      LOGGER.error("Caught exception on table: {} (code: {}, type: {})", _tableNameWithType, errorCode, errorType, e);
+    public void onError(Exception e, ErrorCode code, ErrorType type) {
 
 Review comment:
    It is better to leave the error message here, in case there is some issue with Helix. Most likely it wont be called anyway.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396689337
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
 ##########
 @@ -450,11 +458,22 @@ public TimeBoundaryInfo getTimeBoundaryInfo(String offlineTableName) {
     return timeBoundaryManager != null ? timeBoundaryManager.getTimeBoundaryInfo() : null;
   }
 
+  /**
+   * Returns the timeout in milliseconds for the given table, or {@code null} if the timeout is not configured in the
+   * table config.
+   */
+  @Nullable
+  public Long getTimeoutMs(String tableNameWithType) {
 
 Review comment:
   Can we specify the timeout here? Is it the max overall timeout for each query?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396830563
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +313,29 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Remaining time in milliseconds for the server query execution
+    // NOTE: Use the max of offline table remaining time and realtime table remaining time for hybrid use case if the
+    //       timeout for them are not configured to be the same. Server side will have different remaining time set for
+    //       each table type, and broker should wait for both types to return.
+    long remainingTimeMs = 0;
+    try {
+      if (offlineBrokerRequest != null) {
+        remainingTimeMs = setQueryTimeout(offlineTableName, offlineBrokerRequest.getQueryOptions(), timeSpentMs);
+      }
+      if (realtimeBrokerRequest != null) {
+        remainingTimeMs = Math.max(remainingTimeMs,
 
 Review comment:
   Updated the comments so that it is more clear

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396731102
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +312,58 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Get the per-query timeout from the broker request query options if exists
+    long queryTimeoutMs = QueryOptions.getTimeoutMs(brokerRequest.getQueryOptions());
+    // Remaining time in milliseconds for the server query execution
+    // Use the max of offline table remaining time and realtime table remaining time for hybrid use case
+    long remainingTimeMs = 0;
+    // NOTE: Use query-level timeout if exists, or use table-level timeout if exists, or use instance-level timeout
+    if (offlineBrokerRequest != null) {
+      long offlineTimeoutMs;
+      if (queryTimeoutMs > 0) {
+        offlineTimeoutMs = queryTimeoutMs;
+      } else {
+        Long tableTimeoutMs = _routingManager.getTimeoutMs(offlineTableName);
+        offlineTimeoutMs = tableTimeoutMs != null ? tableTimeoutMs : _brokerTimeoutMs;
+      }
+      long offlineRemainingTimeMs = offlineTimeoutMs - timeSpentMs;
+      if (offlineRemainingTimeMs <= 0) {
+        LOGGER.info("Offline table timed out (time spent: {}ms, timeout: {}) before scattering the request {}: {}",
+            timeSpentMs, offlineTimeoutMs, requestId, query);
+        _brokerMetrics.addMeteredTableValue(rawTableName, BrokerMeter.REQUEST_TIMEOUT_BEFORE_SCATTERED_EXCEPTIONS, 1);
+        return new BrokerResponseNative(QueryException.getException(QueryException.BROKER_TIMEOUT_ERROR, String
+            .format("Offline table timed out (time spent: %dms, timeout: %dms) before scattering the request",
+                timeSpentMs, offlineTimeoutMs)));
+      }
+      offlineBrokerRequest.getQueryOptions()
+          .put(Broker.Request.QueryOptionKey.TIMEOUT_MS, Long.toString(offlineRemainingTimeMs));
+      remainingTimeMs = offlineRemainingTimeMs;
+    }
+    if (realtimeBrokerRequest != null) {
+      long realtimeTimeoutMs;
+      if (queryTimeoutMs > 0) {
+        realtimeTimeoutMs = queryTimeoutMs;
+      } else {
+        Long tableTimeoutMs = _routingManager.getTimeoutMs(realtimeTableName);
+        realtimeTimeoutMs = tableTimeoutMs != null ? tableTimeoutMs : _brokerTimeoutMs;
+      }
+      long realtimeRemainingTimeMs = realtimeTimeoutMs - timeSpentMs;
+      if (realtimeRemainingTimeMs <= 0) {
+        LOGGER.info("Realtime table timed out (time spent: {}ms, timeout: {}) before scattering the request {}: {}",
+            timeSpentMs, realtimeTimeoutMs, requestId, query);
+        _brokerMetrics.addMeteredTableValue(rawTableName, BrokerMeter.REQUEST_TIMEOUT_BEFORE_SCATTERED_EXCEPTIONS, 1);
+        return new BrokerResponseNative(QueryException.getException(QueryException.BROKER_TIMEOUT_ERROR, String
+            .format("Realtime table timed out (time spent: %dms, timeout: %dms) before scattering the request",
+                timeSpentMs, realtimeTimeoutMs)));
+      }
+      realtimeBrokerRequest.getQueryOptions()
+          .put(Broker.Request.QueryOptionKey.TIMEOUT_MS, Long.toString(realtimeRemainingTimeMs));
+      remainingTimeMs = Math.max(remainingTimeMs, realtimeRemainingTimeMs);
 
 Review comment:
   Math.min? Should we not take the stricter time to be the timeout value? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396690977
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
 ##########
 @@ -354,9 +359,12 @@ public synchronized void buildRouting(String tableNameWithType) {
       }
     }
 
+    QueryConfig queryConfig = tableConfig.getQueryConfig();
+    Long timeoutMs = queryConfig != null ? queryConfig.getTimeoutMs() : null;
 
 Review comment:
   Also, I notice in the test you specified `TimeoutMs` to be 0. Does 0 has special meaning? If yes, could you add it to the description of this variable?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang merged pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396723128
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/broker/helix/BrokerUserDefinedMessageHandlerFactory.java
 ##########
 @@ -115,17 +121,15 @@ public HelixTaskResult handleMessage() {
 
     @Override
     public void onError(Exception e, ErrorCode errorCode, ErrorType errorType) {
-      LOGGER.error("Caught exception while updating query quota of table: {} (code: {}, type: {})", _tableNameWithType,
-          errorCode, errorType, e);
+      LOGGER.error("Caught exception while refreshing table config for table: {} (code: {}, type: {})",
 
 Review comment:
   nit: 
   ```suggestion
         LOGGER.error("Caught exception while refreshing table config for table: {} (code: {}, error type: {})",
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] codecov-io edited a comment on issue #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#issuecomment-601987881
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=h1) Report
   > Merging [#5169](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-pinot/commit/d289db85b350e562d3c0b123ad64c7a2637db4ac&el=desc) will **decrease** coverage by `0.17%`.
   > The diff coverage is `77.77%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-pinot/pull/5169/graphs/tree.svg?width=650&height=150&src=pr&token=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #5169      +/-   ##
   ============================================
   - Coverage     66.22%   66.04%   -0.18%     
     Complexity       12       12              
   ============================================
     Files          1051     1051              
     Lines         54119    54164      +45     
     Branches       8064     8078      +14     
   ============================================
   - Hits          35840    35774      -66     
   - Misses        15630    15742     +112     
   + Partials       2649     2648       -1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...org/apache/pinot/common/utils/CommonConstants.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvQ29tbW9uQ29uc3RhbnRzLmphdmE=) | `40.81% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | [.../helix/BrokerUserDefinedMessageHandlerFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL2hlbGl4L0Jyb2tlclVzZXJEZWZpbmVkTWVzc2FnZUhhbmRsZXJGYWN0b3J5LmphdmE=) | `68.29% <66.66%> (-0.76%)` | `0.00 <0.00> (ø)` | |
   | [...roker/requesthandler/BaseBrokerRequestHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcmVxdWVzdGhhbmRsZXIvQmFzZUJyb2tlclJlcXVlc3RIYW5kbGVyLmphdmE=) | `80.00% <66.66%> (-1.49%)` | `0.00 <0.00> (ø)` | |
   | [...core/query/executor/ServerQueryExecutorV1Impl.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9leGVjdXRvci9TZXJ2ZXJRdWVyeUV4ZWN1dG9yVjFJbXBsLmphdmE=) | `80.28% <71.42%> (+0.28%)` | `0.00 <0.00> (ø)` | |
   | [...not/common/messages/TableConfigRefreshMessage.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWVzc2FnZXMvVGFibGVDb25maWdSZWZyZXNoTWVzc2FnZS5qYXZh) | `81.81% <81.81%> (ø)` | `0.00 <0.00> (?)` | |
   | [...rg/apache/pinot/broker/routing/RoutingManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvcm91dGluZy9Sb3V0aW5nTWFuYWdlci5qYXZh) | `82.30% <85.71%> (+0.81%)` | `0.00 <0.00> (ø)` | |
   | [...ntroller/helix/core/PinotHelixResourceManager.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL1Bpbm90SGVsaXhSZXNvdXJjZU1hbmFnZXIuamF2YQ==) | `62.04% <92.85%> (+0.02%)` | `0.00 <0.00> (ø)` | |
   | [.../java/org/apache/pinot/core/util/QueryOptions.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS91dGlsL1F1ZXJ5T3B0aW9ucy5qYXZh) | `94.11% <92.85%> (-5.89%)` | `0.00 <0.00> (ø)` | |
   | [...a/org/apache/pinot/common/metrics/BrokerMeter.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vbWV0cmljcy9Ccm9rZXJNZXRlci5qYXZh) | `100.00% <100.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | [...ller/validation/OfflineSegmentIntervalChecker.java](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci92YWxpZGF0aW9uL09mZmxpbmVTZWdtZW50SW50ZXJ2YWxDaGVja2VyLmphdmE=) | `32.05% <0.00%> (-51.29%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [36 more](https://codecov.io/gh/apache/incubator-pinot/pull/5169/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=footer). Last update [d289db8...4b5c809](https://codecov.io/gh/apache/incubator-pinot/pull/5169?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396740125
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +312,58 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Get the per-query timeout from the broker request query options if exists
+    long queryTimeoutMs = QueryOptions.getTimeoutMs(brokerRequest.getQueryOptions());
+    // Remaining time in milliseconds for the server query execution
+    // Use the max of offline table remaining time and realtime table remaining time for hybrid use case
+    long remainingTimeMs = 0;
+    // NOTE: Use query-level timeout if exists, or use table-level timeout if exists, or use instance-level timeout
+    if (offlineBrokerRequest != null) {
 
 Review comment:
   +1, I have the same comment

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396826849
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +313,29 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Remaining time in milliseconds for the server query execution
+    // NOTE: Use the max of offline table remaining time and realtime table remaining time for hybrid use case if the
+    //       timeout for them are not configured to be the same. Server side will have different remaining time set for
+    //       each table type, and broker should wait for both types to return.
+    long remainingTimeMs = 0;
+    try {
+      if (offlineBrokerRequest != null) {
+        remainingTimeMs = setQueryTimeout(offlineTableName, offlineBrokerRequest.getQueryOptions(), timeSpentMs);
+      }
+      if (realtimeBrokerRequest != null) {
+        remainingTimeMs = Math.max(remainingTimeMs,
 
 Review comment:
   IMO, it could be very rare but is legit for some use cases to configure offline part to have lower timeout than the realtime part because in general offline data has better performance. This is not arbitrary choice though. To get the full result, broker has to wait for both parts done before merging the result.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396688679
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
 ##########
 @@ -354,9 +359,12 @@ public synchronized void buildRouting(String tableNameWithType) {
       }
     }
 
+    QueryConfig queryConfig = tableConfig.getQueryConfig();
+    Long timeoutMs = queryConfig != null ? queryConfig.getTimeoutMs() : null;
 
 Review comment:
   I saw there are two places which use different types of long. Can we unify them together? E.g. always use -1 to denote `getTimeoutMs` is null.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on a change in pull request #5169: Table level timeout implementation
URL: https://github.com/apache/incubator-pinot/pull/5169#discussion_r396785179
 
 

 ##########
 File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##########
 @@ -312,8 +312,58 @@ public BrokerResponse handleRequest(JsonNode request, @Nullable RequesterIdentit
     long routingEndTimeNs = System.nanoTime();
     _brokerMetrics.addPhaseTiming(rawTableName, BrokerQueryPhase.QUERY_ROUTING, routingEndTimeNs - routingStartTimeNs);
 
+    // Set timeout in the requests
+    long timeSpentMs = TimeUnit.NANOSECONDS.toMillis(routingEndTimeNs - compilationStartTimeNs);
+    // Get the per-query timeout from the broker request query options if exists
+    long queryTimeoutMs = QueryOptions.getTimeoutMs(brokerRequest.getQueryOptions());
+    // Remaining time in milliseconds for the server query execution
+    // Use the max of offline table remaining time and realtime table remaining time for hybrid use case
+    long remainingTimeMs = 0;
+    // NOTE: Use query-level timeout if exists, or use table-level timeout if exists, or use instance-level timeout
+    if (offlineBrokerRequest != null) {
+      long offlineTimeoutMs;
+      if (queryTimeoutMs > 0) {
+        offlineTimeoutMs = queryTimeoutMs;
+      } else {
+        Long tableTimeoutMs = _routingManager.getTimeoutMs(offlineTableName);
+        offlineTimeoutMs = tableTimeoutMs != null ? tableTimeoutMs : _brokerTimeoutMs;
+      }
+      long offlineRemainingTimeMs = offlineTimeoutMs - timeSpentMs;
+      if (offlineRemainingTimeMs <= 0) {
+        LOGGER.info("Offline table timed out (time spent: {}ms, timeout: {}) before scattering the request {}: {}",
+            timeSpentMs, offlineTimeoutMs, requestId, query);
+        _brokerMetrics.addMeteredTableValue(rawTableName, BrokerMeter.REQUEST_TIMEOUT_BEFORE_SCATTERED_EXCEPTIONS, 1);
+        return new BrokerResponseNative(QueryException.getException(QueryException.BROKER_TIMEOUT_ERROR, String
+            .format("Offline table timed out (time spent: %dms, timeout: %dms) before scattering the request",
+                timeSpentMs, offlineTimeoutMs)));
+      }
+      offlineBrokerRequest.getQueryOptions()
+          .put(Broker.Request.QueryOptionKey.TIMEOUT_MS, Long.toString(offlineRemainingTimeMs));
+      remainingTimeMs = offlineRemainingTimeMs;
+    }
+    if (realtimeBrokerRequest != null) {
+      long realtimeTimeoutMs;
+      if (queryTimeoutMs > 0) {
+        realtimeTimeoutMs = queryTimeoutMs;
+      } else {
+        Long tableTimeoutMs = _routingManager.getTimeoutMs(realtimeTableName);
+        realtimeTimeoutMs = tableTimeoutMs != null ? tableTimeoutMs : _brokerTimeoutMs;
+      }
+      long realtimeRemainingTimeMs = realtimeTimeoutMs - timeSpentMs;
+      if (realtimeRemainingTimeMs <= 0) {
+        LOGGER.info("Realtime table timed out (time spent: {}ms, timeout: {}) before scattering the request {}: {}",
+            timeSpentMs, realtimeTimeoutMs, requestId, query);
+        _brokerMetrics.addMeteredTableValue(rawTableName, BrokerMeter.REQUEST_TIMEOUT_BEFORE_SCATTERED_EXCEPTIONS, 1);
+        return new BrokerResponseNative(QueryException.getException(QueryException.BROKER_TIMEOUT_ERROR, String
+            .format("Realtime table timed out (time spent: %dms, timeout: %dms) before scattering the request",
+                timeSpentMs, realtimeTimeoutMs)));
+      }
+      realtimeBrokerRequest.getQueryOptions()
+          .put(Broker.Request.QueryOptionKey.TIMEOUT_MS, Long.toString(realtimeRemainingTimeMs));
+      remainingTimeMs = Math.max(remainingTimeMs, realtimeRemainingTimeMs);
 
 Review comment:
   Should be max. Updated the comments as following:
   ```
       // NOTE: Use the max of offline table remaining time and realtime table remaining time for hybrid use case if the
       //       timeout for them are not configured to be the same. Server side will have different remaining time set for
       //       each table type, and broker should wait for both types to return.
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org