Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/11/16 12:26:35 UTC

[GitHub] [lucene] thecoop opened a new pull request, #11942: Ensure collections are properly sized on creation

thecoop opened a new pull request, #11942:
URL: https://github.com/apache/lucene/pull/11942

   Replace calls to `new HashMap(int)`/`new HashSet(int)` with a helper method that sizes the backing array so that it won't be resized while the expected number of elements is added.
   
   Also includes a few other optimisations around map methods that I picked up along the way.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] thecoop commented on pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
thecoop commented on PR #11942:
URL: https://github.com/apache/lucene/pull/11942#issuecomment-1325111034

   I've added a comment for that


[GitHub] [lucene] thecoop commented on pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
thecoop commented on PR #11942:
URL: https://github.com/apache/lucene/pull/11942#issuecomment-1326208762

   Ah yes, oops, sorry about that


[GitHub] [lucene] thecoop commented on pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
thecoop commented on PR #11942:
URL: https://github.com/apache/lucene/pull/11942#issuecomment-1325260207

   Comments all addressed


[GitHub] [lucene] jpountz merged pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
jpountz merged PR #11942:
URL: https://github.com/apache/lucene/pull/11942


[GitHub] [lucene] jpountz commented on pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
jpountz commented on PR #11942:
URL: https://github.com/apache/lucene/pull/11942#issuecomment-1325093360

   In most of these cases, I would guess that getting the initial size exactly right matters less than avoiding a heavily oversized hash map, so it's not really an issue that callers pass a size rather than a capacity to the `HashMap` constructor. Thanks for pointing out the upcoming `HashMap.newHashMap` in JDK 19; it makes me feel better about the method you introduced. Can you add a comment saying that it should be removed once Lucene moves to JDK 19+ as its minimum required version?


[GitHub] [lucene] jpountz commented on pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
jpountz commented on PR #11942:
URL: https://github.com/apache/lucene/pull/11942#issuecomment-1326243950

   Thanks @thecoop !


[GitHub] [lucene] thecoop commented on a diff in pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
thecoop commented on code in PR #11942:
URL: https://github.com/apache/lucene/pull/11942#discussion_r1030556441


##########
lucene/analysis/common/src/java/org/apache/lucene/analysis/custom/CustomAnalyzer.java:
##########
@@ -594,11 +589,9 @@ public CustomAnalyzer build() {
     }
 
     private Map<String, String> applyDefaultParams(Map<String, String> map) {
-      if (defaultMatchVersion.get() != null
-          && !map.containsKey(AbstractAnalysisFactory.LUCENE_MATCH_VERSION_PARAM)) {
-        map.put(
-            AbstractAnalysisFactory.LUCENE_MATCH_VERSION_PARAM,
-            defaultMatchVersion.get().toString());
+      Version v;

Review Comment:
   Ah yes, good spot



[GitHub] [lucene] jpountz commented on pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
jpountz commented on PR #11942:
URL: https://github.com/apache/lucene/pull/11942#issuecomment-1326206507

   `gradlew precommit` fails for me because of the imports. Can you run `gradlew tidy`?


[GitHub] [lucene] thecoop commented on pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
thecoop commented on PR #11942:
URL: https://github.com/apache/lucene/pull/11942#issuecomment-1318919917

   I think a lot of the existing uses stem from confusion over what the `int` constructor parameter actually means. It is not the number of items that can be stored before resizing (as it is in `ArrayList`), but the size of the hash backing array (rounded up to a power of two), which can be at most 75% full before it is resized.
   
   Several uses already calculate the correct number inline, so that calculation should be pulled into a shared method anyway.
   
   I also note that this gives an easy way to migrate to the `HashMap.newHashMap(int)` method in JDK 19 when Lucene moves to that version.
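
To make the size-versus-capacity distinction concrete, here is a minimal sketch of such a shared helper. The class and method names (`HashMapSizing`, `capacityFor`, `newHashMap`, `newHashSet`) are hypothetical and are not taken from the PR; the sketch assumes the default load factor of 0.75 and is not necessarily how the PR implements it.

```java
import java.util.HashMap;
import java.util.HashSet;

// Hypothetical helper class, for illustration only.
final class HashMapSizing {
  private HashMapSizing() {}

  // Returns an initial capacity large enough that expectedSize entries fit
  // under the default 0.75 load factor, i.e. capacity * 0.75 >= expectedSize.
  static int capacityFor(int expectedSize) {
    if (expectedSize < 0) {
      throw new IllegalArgumentException("expectedSize must be >= 0: " + expectedSize);
    }
    // The cast saturates at Integer.MAX_VALUE for very large inputs;
    // HashMap itself then caps the capacity at its internal maximum.
    return (int) Math.ceil(expectedSize / 0.75);
  }

  // Creates a map that should not need to resize while holding expectedSize entries.
  static <K, V> HashMap<K, V> newHashMap(int expectedSize) {
    return new HashMap<>(capacityFor(expectedSize));
  }

  // Same idea for sets; HashSet is backed by a HashMap.
  static <E> HashSet<E> newHashSet(int expectedSize) {
    return new HashSet<>(capacityFor(expectedSize));
  }
}
```

For example, `new HashMap<>(16)` asks for a capacity of 16, which at a load factor of 0.75 allows only 12 entries before a resize, whereas `HashMapSizing.newHashMap(16)` asks for a capacity of 22, which leaves room for all 16 entries.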


[GitHub] [lucene] jpountz commented on a diff in pull request #11942: Ensure collections are properly sized on creation

Posted by GitBox <gi...@apache.org>.
jpountz commented on code in PR #11942:
URL: https://github.com/apache/lucene/pull/11942#discussion_r1030516381


##########
lucene/analysis/common/src/java/org/apache/lucene/analysis/custom/CustomAnalyzer.java:
##########
@@ -594,11 +589,9 @@ public CustomAnalyzer build() {
     }
 
     private Map<String, String> applyDefaultParams(Map<String, String> map) {
-      if (defaultMatchVersion.get() != null
-          && !map.containsKey(AbstractAnalysisFactory.LUCENE_MATCH_VERSION_PARAM)) {
-        map.put(
-            AbstractAnalysisFactory.LUCENE_MATCH_VERSION_PARAM,
-            defaultMatchVersion.get().toString());
+      Version v;

Review Comment:
   Maybe initialize the version here rather than in the `if` condition?
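
As a rough illustration of this suggestion, the sketch below assigns the value at the point of declaration and keeps the `if` condition a plain null check. It is a self-contained stand-in, not the code from the PR: `DefaultParamsSketch` is a hypothetical class, a `Supplier<String>` stands in for `defaultMatchVersion`, the `MATCH_VERSION_PARAM` value is a placeholder for `AbstractAnalysisFactory.LUCENE_MATCH_VERSION_PARAM`, and the use of `putIfAbsent` is an assumption, since the rest of the diff is not shown.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the relevant part of CustomAnalyzer's builder.
final class DefaultParamsSketch {
  // Placeholder for AbstractAnalysisFactory.LUCENE_MATCH_VERSION_PARAM.
  static final String MATCH_VERSION_PARAM = "luceneMatchVersion";

  private DefaultParamsSketch() {}

  static Map<String, String> applyDefaultParams(
      Map<String, String> map, Supplier<String> defaultMatchVersion) {
    // Initialize at the declaration instead of assigning inside the condition.
    String v = defaultMatchVersion.get();
    if (v != null) {
      // Only sets the default when the caller has not supplied a value (assumed call).
      map.putIfAbsent(MATCH_VERSION_PARAM, v);
    }
    return map;
  }
}
```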



##########
lucene/analysis/common/src/java/org/apache/lucene/analysis/custom/CustomAnalyzer.java:
##########
@@ -41,11 +40,7 @@
 import org.apache.lucene.analysis.miscellaneous.ConditionalTokenFilterFactory;
 import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
 import org.apache.lucene.analysis.util.FilesystemResourceLoader;
-import org.apache.lucene.util.ClasspathResourceLoader;
-import org.apache.lucene.util.ResourceLoader;
-import org.apache.lucene.util.ResourceLoaderAware;
-import org.apache.lucene.util.SetOnce;
-import org.apache.lucene.util.Version;
+import org.apache.lucene.util.*;

Review Comment:
   We prefer to avoid star imports, can you re-expand imports?



##########
lucene/codecs/src/java/org/apache/lucene/codecs/simpletext/SimpleTextSegmentInfoFormat.java:
##########
@@ -38,11 +36,7 @@
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.store.IOContext;
 import org.apache.lucene.store.IndexOutput;
-import org.apache.lucene.util.ArrayUtil;
-import org.apache.lucene.util.BytesRef;
-import org.apache.lucene.util.BytesRefBuilder;
-import org.apache.lucene.util.StringHelper;
-import org.apache.lucene.util.Version;
+import org.apache.lucene.util.*;

Review Comment:
   here too



##########
lucene/core/src/java/org/apache/lucene/index/StandardDirectoryReader.java:
##########
@@ -23,18 +23,14 @@
 import java.util.Collection;
 import java.util.Collections;
 import java.util.Comparator;
-import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
 import java.util.concurrent.CopyOnWriteArraySet;
 import org.apache.lucene.store.AlreadyClosedException;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.store.IOContext;
-import org.apache.lucene.util.Bits;
-import org.apache.lucene.util.IOFunction;
-import org.apache.lucene.util.IOUtils;
-import org.apache.lucene.util.Version;
+import org.apache.lucene.util.*;

Review Comment:
   here too



##########
lucene/core/src/java/org/apache/lucene/util/CollectionUtil.java:
##########
@@ -16,22 +16,37 @@
  */
 package org.apache.lucene.util;
 
-import java.util.Collections;
-import java.util.Comparator;
-import java.util.List;
-import java.util.RandomAccess;
+import java.util.*;

Review Comment:
   here too



##########
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/ContextSuggestField.java:
##########
@@ -17,10 +17,7 @@
 package org.apache.lucene.search.suggest.document;
 
 import java.io.IOException;
-import java.util.Collections;
-import java.util.HashSet;
-import java.util.Iterator;
-import java.util.Set;
+import java.util.*;

Review Comment:
   here too



##########
lucene/core/src/test/org/apache/lucene/search/TestCollectorManager.java:
##########
@@ -20,16 +20,8 @@
 
 import com.carrotsearch.randomizedtesting.generators.RandomNumbers;
 import java.io.IOException;
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.Collection;
-import java.util.Collections;
-import java.util.HashSet;
-import java.util.List;
-import java.util.Random;
-import java.util.Set;
+import java.util.*;

Review Comment:
   here too



##########
lucene/test-framework/src/java/org/apache/lucene/tests/index/BaseTermVectorsFormatTestCase.java:
##########
@@ -71,10 +71,7 @@
 import org.apache.lucene.tests.analysis.MockTokenizer;
 import org.apache.lucene.tests.analysis.Token;
 import org.apache.lucene.tests.util.TestUtil;
-import org.apache.lucene.util.AttributeImpl;
-import org.apache.lucene.util.AttributeReflector;
-import org.apache.lucene.util.BytesRef;
-import org.apache.lucene.util.IOUtils;
+import org.apache.lucene.util.*;

Review Comment:
   here too



##########
lucene/suggest/src/java/org/apache/lucene/search/suggest/document/ContextQuery.java:
##########
@@ -26,12 +26,7 @@
 import org.apache.lucene.search.QueryVisitor;
 import org.apache.lucene.search.ScoreMode;
 import org.apache.lucene.search.Weight;
-import org.apache.lucene.util.Accountable;
-import org.apache.lucene.util.BytesRef;
-import org.apache.lucene.util.BytesRefBuilder;
-import org.apache.lucene.util.IntsRef;
-import org.apache.lucene.util.IntsRefBuilder;
-import org.apache.lucene.util.RamUsageEstimator;
+import org.apache.lucene.util.*;

Review Comment:
   here too



##########
lucene/core/src/test/org/apache/lucene/search/TestMultiCollectorManager.java:
##########
@@ -18,12 +18,7 @@
 
 import com.carrotsearch.randomizedtesting.generators.RandomNumbers;
 import java.io.IOException;
-import java.util.ArrayList;
-import java.util.Collection;
-import java.util.HashSet;
-import java.util.List;
-import java.util.Random;
-import java.util.Set;
+import java.util.*;

Review Comment:
   here too


