Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/01/10 21:59:02 UTC

[GitHub] [druid] capistrant opened a new pull request #9165: Forbid easily misused HashSet and HashMap constructors

capistrant opened a new pull request #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165
 
 
   Fixes #8423 
   
   ### Description
   
   This PR adds the HashMap, HashSet, and LinkedHashSet constructors that take an initial capacity parameter to the forbidden APIs list. It replaces all usages of these constructors with Guava helper methods that take the caller's expected size and create a more appropriately sized object than the JDK constructors would.
   
   HashMap, HashSet, and LinkedHashSet objects are often created inefficiently when the creator specifies a size: the constructor argument is an initial capacity, not the number of elements the collection can hold before it resizes. Guava provides utilities that take the expected number of elements and compute a suitable initial capacity for the underlying object. It is explained more clearly [here](https://stackoverflow.com/a/30220944/648955). The same is true for LinkedHashMap, but unfortunately we are still using Guava 16, and the corresponding Guava helper for LinkedHashMap did not arrive until Guava 19.
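
   To make the sizing problem concrete, here is a minimal, JDK-only sketch. It is not Guava's code and not part of this PR: the class `ExpectedSizeSketch` and all of its method names are hypothetical, and the capacity formula is only an approximation of the idea behind the Guava helpers (leave headroom for the default 0.75 load factor).

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; the class and method names are hypothetical.
final class ExpectedSizeSketch
{
  private static final float LOAD_FACTOR = 0.75f;    // HashMap's default load factor
  private static final int MAX_TABLE_SIZE = 1 << 30; // largest table HashMap allocates

  // HashMap rounds the requested initial capacity up to a power of two.
  static int tableSizeFor(int initialCapacity)
  {
    int size = 1;
    while (size < initialCapacity && size < MAX_TABLE_SIZE) {
      size <<= 1;
    }
    return size;
  }

  // Number of entries a map with the default load factor holds before it resizes its table.
  static int entriesBeforeResize(int initialCapacity)
  {
    return (int) (tableSizeFor(initialCapacity) * LOAD_FACTOR);
  }

  // Assumed approximation of the sizing idea: divide by the load factor and add one.
  static int capacityForExpectedSize(int expectedSize)
  {
    if (expectedSize < 0) {
      throw new IllegalArgumentException("expectedSize must be >= 0: " + expectedSize);
    }
    long capacity = (long) (expectedSize / LOAD_FACTOR) + 1L;
    return (int) Math.min(capacity, MAX_TABLE_SIZE);
  }

  // Hypothetical stand-in for a "with expected size" factory method.
  static <K, V> Map<K, V> newHashMapWithExpectedSize(int expectedSize)
  {
    return new HashMap<>(capacityForExpectedSize(expectedSize));
  }
}
```

   Under these assumptions, `new HashMap<>(16)` holds only 12 entries before it resizes, while sizing for 16 expected entries gives an initial capacity of 22, which is a table of 32 slots that resizes only after 24 entries.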
   
   I had to add an exclude entry in the pom.xml file for `SomeAvroDatum.class` because it is generated at build time, so it can't carry an inline suppression annotation. If there is a better way to exclude it that I'm not thinking of, please let me know.
   
   <hr>
   
   This PR has:
   - [X] been self-reviewed
   - [ ] been tested in a test Druid cluster.
   
   <hr>
   
   ##### Key changed/added classes in this PR
   
   Many classes were modified, but the changes are all transparent: they only change how Map and Set objects are created in some specific cases.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] capistrant commented on a change in pull request #9165: Forbid easily misused HashSet and HashMap constructors

Posted by GitBox <gi...@apache.org>.
capistrant commented on a change in pull request #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165#discussion_r368227412
 
 

 ##########
 File path: core/src/main/java/org/apache/druid/utils/CollectionUtils.java
 ##########
 @@ -39,6 +41,8 @@
 
 public final class CollectionUtils
 {
+  public static final int MAX_EXPECTED_SIZE = (1 << 30);
 
 Review comment:
   Thanks for catching that



[GitHub] [druid] leventov commented on a change in pull request #9165: Forbid easily misused HashSet and HashMap constructors

Posted by GitBox <gi...@apache.org>.
leventov commented on a change in pull request #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165#discussion_r367281139
 
 

 ##########
 File path: codestyle/druid-forbidden-apis.txt
 ##########
 @@ -31,6 +31,12 @@ java.lang.String#replace(java.lang.CharSequence,java.lang.CharSequence) @ Use on
 java.lang.String#replaceAll(java.lang.String,java.lang.String) @ Use one of the appropriate methods in StringUtils instead, or compile and cache a Pattern explicitly
 java.lang.String#replaceFirst(java.lang.String,java.lang.String) @ Use String.indexOf() and substring methods, or compile and cache a Pattern explicitly
 java.nio.file.Files#createTempDirectory(java.lang.String prefix,java.nio.file.FileAttribute...) @ Use org.apache.druid.java.util.common.FileUtils.createTempDir()
+java.util.HashMap#<init>(int) @ Use com.google.common.collect.Maps#newHashMapWithExpectedSize(int) instead
+java.util.HashMap#<init>(int, float) @ Use com.google.common.collect.Maps#newHashMapWithExpectedSize(int) instead
+java.util.HashSet#<init>(int) @ Use com.google.common.collect.Sets#newHashSetWithExpectedSize(int) instead
+java.util.HashSet#<init>(int, float) @ Use com.google.common.collect.Sets#newHashSetWithExpectedSize(int) instead
+java.util.LinkedHashSet#<init>(int) @ Use com.google.common.collect.Sets#newLinkedHashSetWithExpectedSize(int) instead
+java.util.LinkedHashSet#<init>(int, float) @ Use com.google.common.collect.Sets#newLinkedHashSetWithExpectedSize(int) instead
 
 Review comment:
   LinkedHashMap should be prohibited, too, even though there is no method ready in Guava. A polyfill method could be added to `CollectionUtils`.
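
   A rough sketch of what such a polyfill might look like, using only the JDK. This is an assumption about its shape, not the code that was merged: the class name `LinkedHashMapPolyfillSketch` and the helper `capacity` are hypothetical, and the constant simply mirrors the `MAX_EXPECTED_SIZE = (1 << 30)` value quoted elsewhere in this thread.

```java
import java.util.LinkedHashMap;

// Illustrative sketch only; not the actual CollectionUtils implementation.
final class LinkedHashMapPolyfillSketch
{
  // Beyond this size no larger power-of-two table can be allocated.
  private static final int MAX_EXPECTED_SIZE = 1 << 30;

  // Hypothetical polyfill: a LinkedHashMap that should hold expectedSize entries without resizing.
  // In a code base that forbids LinkedHashMap(int), this single call site would need a suppression.
  static <K, V> LinkedHashMap<K, V> newLinkedHashMapWithExpectedSize(int expectedSize)
  {
    return new LinkedHashMap<>(capacity(expectedSize));
  }

  // Initial capacity that leaves headroom for the default 0.75 load factor.
  static int capacity(int expectedSize)
  {
    if (expectedSize < 0) {
      throw new IllegalArgumentException("expectedSize must be >= 0: " + expectedSize);
    }
    if (expectedSize < 3) {
      return expectedSize + 1;
    }
    if (expectedSize < MAX_EXPECTED_SIZE) {
      // expectedSize / 0.75 in integer arithmetic
      return expectedSize + expectedSize / 3;
    }
    return Integer.MAX_VALUE;
  }
}
```

   The integer form `expectedSize + expectedSize / 3` avoids floating-point rounding, and the two guard branches keep very small and very large inputs well defined.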



[GitHub] [druid] capistrant commented on issue #9165: Forbid easily misused HashSet and HashMap constructors

Posted by GitBox <gi...@apache.org>.
capistrant commented on issue #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165#issuecomment-574383926
 
 
   The travis-ci logs for the failed build steps all appear to be a 5XX error pulling external resources. Any way to get the build restarted short of pushing a dummy commit?



[GitHub] [druid] capistrant commented on issue #9165: Forbid easily misused HashSet and HashMap constructors

Posted by GitBox <gi...@apache.org>.
capistrant commented on issue #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165#issuecomment-576712675
 
 
   Looks like an unrelated TravisCI failure caused by external dependency pulls



[GitHub] [druid] leventov commented on a change in pull request #9165: Forbid easily misused HashSet and HashMap constructors

Posted by GitBox <gi...@apache.org>.
leventov commented on a change in pull request #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165#discussion_r368213591
 
 

 ##########
 File path: core/src/main/java/org/apache/druid/utils/CollectionUtils.java
 ##########
 @@ -39,6 +41,8 @@
 
 public final class CollectionUtils
 {
+  public static final int MAX_EXPECTED_SIZE = (1 << 30);
 
 Review comment:
   Nit: this would be better as private.



[GitHub] [druid] capistrant edited a comment on issue #9165: Forbid easily misused HashSet and HashMap constructors

Posted by GitBox <gi...@apache.org>.
capistrant edited a comment on issue #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165#issuecomment-574383926
 
 
   Edit: Not sure what triggered things to run again, but build appears to be green now 🤷‍♂ 
   
   The travis-ci logs for the failed build steps all appear to be a 5XX error pulling external resources. Any way to get the build restarted short of pushing a dummy commit?



[GitHub] [druid] jon-wei commented on issue #9165: Forbid easily misused HashSet and HashMap constructors

Posted by GitBox <gi...@apache.org>.
jon-wei commented on issue #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165#issuecomment-574395058
 
 
   @capistrant I restarted the failed runs earlier, was going to comment but got pulled away for a moment



[GitHub] [druid] leventov merged pull request #9165: Forbid easily misused HashSet and HashMap constructors

Posted by GitBox <gi...@apache.org>.
leventov merged pull request #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165
 
 
   



[GitHub] [druid] leventov commented on a change in pull request #9165: Forbid easily misused HashSet and HashMap constructors

Posted by GitBox <gi...@apache.org>.
leventov commented on a change in pull request #9165: Forbid easily misused HashSet and HashMap constructors
URL: https://github.com/apache/druid/pull/9165#discussion_r376250997
 
 

 ##########
 File path: pom.xml
 ##########
 @@ -1341,6 +1341,9 @@
                         <signaturesFile>${project.parent.basedir}/codestyle/joda-time-forbidden-apis.txt</signaturesFile>
                         <signaturesFile>${project.parent.basedir}/codestyle/druid-forbidden-apis.txt</signaturesFile>
                     </signaturesFiles>
+                    <excludes>
+                      <exclude>**/SomeAvroDatum.class</exclude>
 
 Review comment:
   I've created https://issues.apache.org/jira/browse/AVRO-2731 and https://github.com/apache/druid/issues/9331 to track this. FYI @Fokko.
   
   @capistrant please try to create such issues yourself, so that loose ends are tracked rather than kept in people's heads.
