You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/03/17 08:56:17 UTC

[GitHub] [druid] samarthjain opened a new pull request #11008: Improve performance of queries against SYSTEM.SEGMENT table.

samarthjain opened a new pull request #11008:
URL: https://github.com/apache/druid/pull/11008


   Fixes #11007 .
   
   After hooking up Yourkit against the broker instance running the queries mentioned in the above issue, it turns out that a lot of time is being spent in resizing HashMap and HashSet that have been created with the default constructor. Also, a lot of CPU cycles are wasted unnecessarily looping through and authorizing resources even though the authorizer configured is AllowAllAuthorizer. 
   
   After making these changes, on my test cluster, I see that the query times went down from ~12seconds to ~7seconds. 
   
   
   This PR has:
   - [X] been self-reviewed.
      - [X] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [X ] been tested in a test Druid cluster.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] samarthjain merged pull request #11008: Improve performance of queries against SYSTEM.SEGMENT table.

Posted by GitBox <gi...@apache.org>.
samarthjain merged pull request #11008:
URL: https://github.com/apache/druid/pull/11008


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #11008: Improve performance of queries against SYSTEM.SEGMENT table.

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on a change in pull request #11008:
URL: https://github.com/apache/druid/pull/11008#discussion_r595894211



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/DruidSchema.java
##########
@@ -702,13 +702,14 @@ private static RowSignature analysisToRowSignature(final SegmentAnalysis analysi
 
   Map<SegmentId, AvailableSegmentMetadata> getSegmentMetadataSnapshot()
   {
-    final Map<SegmentId, AvailableSegmentMetadata> segmentMetadata = new HashMap<>();
     synchronized (lock) {
+      final Map<SegmentId, AvailableSegmentMetadata> segmentMetadata = Maps.newHashMapWithExpectedSize(
+          segmentMetadataInfo.values().size());

Review comment:
       should this be `segmentMetadataInfo.values().stream().mapToInt(v -> v.size()).sum()` ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on pull request #11008: Improve performance of queries against SYSTEM.SEGMENT table.

Posted by GitBox <gi...@apache.org>.
jihoonson commented on pull request #11008:
URL: https://github.com/apache/druid/pull/11008#issuecomment-801313620


   +1 after CI


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #11008: Improve performance of queries against SYSTEM.SEGMENT table.

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on a change in pull request #11008:
URL: https://github.com/apache/druid/pull/11008#discussion_r597135242



##########
File path: server/src/main/java/org/apache/druid/server/security/AuthorizationUtils.java
##########
@@ -263,6 +263,10 @@ public static Access authorizeAllResourceActions(
       throw new ISE("No authorizer found with name: [%s].", authenticationResult.getAuthorizerName());
     }
 
+    if (authorizer instanceof AllowAllAuthorizer) {

Review comment:
       earlier this code will filter a resource if `resourceActionGenerator.apply(resource)` returns null but now it won't. does that difference matter? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #11008: Improve performance of queries against SYSTEM.SEGMENT table.

Posted by GitBox <gi...@apache.org>.
abhishekagarwal87 commented on a change in pull request #11008:
URL: https://github.com/apache/druid/pull/11008#discussion_r597779973



##########
File path: server/src/main/java/org/apache/druid/server/security/AuthorizationUtils.java
##########
@@ -263,6 +263,10 @@ public static Access authorizeAllResourceActions(
       throw new ISE("No authorizer found with name: [%s].", authenticationResult.getAuthorizerName());
     }
 
+    if (authorizer instanceof AllowAllAuthorizer) {

Review comment:
       do you intend to get rid of the duplicates or keep them? From the description in https://github.com/apache/druid/issues/11012, it seems we want to de-dup the entries though that wasn't happening earlier. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] samarthjain commented on a change in pull request #11008: Improve performance of queries against SYSTEM.SEGMENT table.

Posted by GitBox <gi...@apache.org>.
samarthjain commented on a change in pull request #11008:
URL: https://github.com/apache/druid/pull/11008#discussion_r597304022



##########
File path: server/src/main/java/org/apache/druid/server/security/AuthorizationUtils.java
##########
@@ -263,6 +263,10 @@ public static Access authorizeAllResourceActions(
       throw new ISE("No authorizer found with name: [%s].", authenticationResult.getAuthorizerName());
     }
 
+    if (authorizer instanceof AllowAllAuthorizer) {

Review comment:
       Yeah, looks like this isn't the right thing to do. 
   Also, the following loop needs to be executed because there could be duplicates in `Iterable<ResourceAction`. Will file an issue and a PR to revert this change.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] samarthjain commented on a change in pull request #11008: Improve performance of queries against SYSTEM.SEGMENT table.

Posted by GitBox <gi...@apache.org>.
samarthjain commented on a change in pull request #11008:
URL: https://github.com/apache/druid/pull/11008#discussion_r596220954



##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/DruidSchema.java
##########
@@ -702,13 +702,14 @@ private static RowSignature analysisToRowSignature(final SegmentAnalysis analysi
 
   Map<SegmentId, AvailableSegmentMetadata> getSegmentMetadataSnapshot()
   {
-    final Map<SegmentId, AvailableSegmentMetadata> segmentMetadata = new HashMap<>();
     synchronized (lock) {
+      final Map<SegmentId, AvailableSegmentMetadata> segmentMetadata = Maps.newHashMapWithExpectedSize(
+          segmentMetadataInfo.values().size());

Review comment:
       Thanks for pointing that out. Fixed. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org