Posted to issues@lucene.apache.org by "gsmiller (via GitHub)" <gi...@apache.org> on 2023/02/08 16:44:57 UTC

[GitHub] [lucene] gsmiller commented on a diff in pull request #12135: Avoid duplicate sorting and prefix-encoding in KeywordField#newSetQuery

gsmiller commented on code in PR #12135:
URL: https://github.com/apache/lucene/pull/12135#discussion_r1100407897


##########
lucene/core/src/java/org/apache/lucene/search/TermInSetQuery.java:
##########
@@ -81,34 +77,21 @@ public class TermInSetQuery extends Query implements Accountable {
   private final PrefixCodedTerms termData;
   private final int termDataHashCode; // cached hashcode of termData
 
-  /** Creates a new {@link TermInSetQuery} from the given collection of terms. */
-  public TermInSetQuery(String field, Collection<BytesRef> terms) {
-    BytesRef[] sortedTerms = terms.toArray(new BytesRef[0]);
-    // already sorted if we are a SortedSet with natural order
-    boolean sorted =
-        terms instanceof SortedSet && ((SortedSet<BytesRef>) terms).comparator() == null;
-    if (!sorted) {
-      ArrayUtil.timSort(sortedTerms);
-    }
-    PrefixCodedTerms.Builder builder = new PrefixCodedTerms.Builder();
-    BytesRefBuilder previous = null;
-    for (BytesRef term : sortedTerms) {
-      if (previous == null) {
-        previous = new BytesRefBuilder();
-      } else if (previous.get().equals(term)) {
-        continue; // deduplicate
-      }
-      builder.add(field, term);
-      previous.copyBytes(term);
-    }
-    this.field = field;
-    termData = builder.finish();
-    termDataHashCode = termData.hashCode();
+  /** Creates a new {@link TermInSetQuery} from the given prefix-coded terms. */
+  public TermInSetQuery(String field, PrefixCodedTerms terms) {

Review Comment:
   Good callout, thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

