You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@lucene.apache.org by ro...@apache.org on 2021/10/06 16:13:28 UTC

[lucene] branch main updated: LUCENE-9325: Make Sort final (#338)

This is an automated email from the ASF dual-hosted git repository.

romseygeek pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/lucene.git


The following commit(s) were added to refs/heads/main by this push:
     new 9e9c3bd  LUCENE-9325: Make Sort final (#338)
9e9c3bd is described below

commit 9e9c3bd249e777fd0355b695038922482910bf5e
Author: Alan Woodward <ro...@apache.org>
AuthorDate: Wed Oct 6 17:13:24 2021 +0100

    LUCENE-9325: Make Sort final (#338)
    
    Sort is used in all sorts of settings where we assume that it is immutable
    (for example, in IndexWriterConfig). This commit makes it so, plus it also
    updates the severely outdated javadoc.
---
 lucene/CHANGES.txt                                 |  4 +-
 lucene/MIGRATE.md                                  |  5 ++
 .../src/java/org/apache/lucene/search/Sort.java    | 84 ++--------------------
 .../apache/lucene/search/TopFieldCollector.java    |  4 +-
 .../lucene/search/TestSortedNumericSortField.java  | 17 +++--
 .../lucene/search/TestSortedSetSortField.java      |  9 +--
 6 files changed, 31 insertions(+), 92 deletions(-)

diff --git a/lucene/CHANGES.txt b/lucene/CHANGES.txt
index 01d05b5..c26b497 100644
--- a/lucene/CHANGES.txt
+++ b/lucene/CHANGES.txt
@@ -153,7 +153,9 @@ API Changes
 * LUCENE-10115: Add an extension point, BaseQueryParser#getFuzzyDistance, to allow custom
   query parsers to determine the similarity distance for fuzzy queries. (Chris Hegarty)
 
-*  LUCENE-10132: Support addition of diagnostics by custom merge policies (Chris Hegarty)
+* LUCENE-10132: Support addition of diagnostics by custom merge policies (Chris Hegarty)
+
+* LUCENE-9325: Sort is now final, and the `setSort()` method has been removed (Alan Woodward)
 
 Improvements
 
diff --git a/lucene/MIGRATE.md b/lucene/MIGRATE.md
index 9508e7e..6a7087a 100644
--- a/lucene/MIGRATE.md
+++ b/lucene/MIGRATE.md
@@ -438,3 +438,8 @@ They can now be found in the o.a.l.queries.spans package.
 
 SpanBoostQuery was a no-op unless used at the top level of a SpanQuery nested
 structure. Use a standard BoostQuery here instead.
+
+## Sort is immutable (LUCENE-9325)
+
+Rather than using `setSort()` to change sort values, you should instead create
+a new Sort instance with the new values.
diff --git a/lucene/core/src/java/org/apache/lucene/search/Sort.java b/lucene/core/src/java/org/apache/lucene/search/Sort.java
index 97a4aa2..868fbf1 100644
--- a/lucene/core/src/java/org/apache/lucene/search/Sort.java
+++ b/lucene/core/src/java/org/apache/lucene/search/Sort.java
@@ -22,67 +22,14 @@ import java.util.Arrays;
 /**
  * Encapsulates sort criteria for returned hits.
  *
- * <p>The fields used to determine sort order must be carefully chosen. Documents must contain a
- * single term in such a field, and the value of the term should indicate the document's relative
- * position in a given sort order. The field must be indexed, but should not be tokenized, and does
- * not need to be stored (unless you happen to want it back with the rest of your document data). In
- * other words:
- *
- * <p><code>
- * document.add (new Field ("byNumber", Integer.toString(x), Field.Store.NO, Field.Index.NOT_ANALYZED));
- * </code>
- *
- * <h2>Valid Types of Values</h2>
- *
- * <p>There are four possible kinds of term values which may be put into sorting fields: Integers,
- * Longs, Floats, or Strings. Unless {@link SortField SortField} objects are specified, the type of
- * value in the field is determined by parsing the first term in the field.
- *
- * <p>Integer term values should contain only digits and an optional preceding negative sign. Values
- * must be base 10 and in the range <code>Integer.MIN_VALUE</code> and <code>Integer.MAX_VALUE
- * </code> inclusive. Documents which should appear first in the sort should have low value
- * integers, later documents high values (i.e. the documents should be numbered <code>1..n</code>
- * where <code>1</code> is the first and <code>n</code> the last).
- *
- * <p>Long term values should contain only digits and an optional preceding negative sign. Values
- * must be base 10 and in the range <code>Long.MIN_VALUE</code> and <code>Long.MAX_VALUE</code>
- * inclusive. Documents which should appear first in the sort should have low value integers, later
- * documents high values.
- *
- * <p>Float term values should conform to values accepted by {@link Float Float.valueOf(String)}
- * (except that <code>NaN</code> and <code>Infinity</code> are not supported). Documents which
- * should appear first in the sort should have low values, later documents high values.
- *
- * <p>String term values can contain any valid String, but should not be tokenized. The values are
- * sorted according to their {@link Comparable natural order}. Note that using this type of term
- * value has higher memory requirements than the other two types.
- *
- * <h2>Object Reuse</h2>
- *
- * <p>One of these objects can be used multiple times and the sort order changed between usages.
- *
- * <p>This class is thread safe.
- *
- * <h2>Memory Usage</h2>
- *
- * <p>Sorting uses of caches of term values maintained by the internal HitQueue(s). The cache is
- * static and contains an integer or float array of length <code>IndexReader.maxDoc()</code> for
- * each field name for which a sort is performed. In other words, the size of the cache in bytes is:
- *
- * <p><code>4 * IndexReader.maxDoc() * (# of different fields actually used to sort)</code>
- *
- * <p>For String fields, the cache is larger: in addition to the above array, the value of every
- * term in the field is kept in memory. If there are many unique terms in the field, this could be
- * quite large.
- *
- * <p>Note that the size of the cache is not affected by how many fields are in the index and
- * <i>might</i> be used to sort - only by the ones actually used to sort a result set.
- *
- * <p>Created: Feb 12, 2004 10:53:57 AM
+ * <p>A {@link Sort} can be created with an empty constructor, yielding an object that will instruct
+ * searches to return their hits sorted by relevance; or it can be created with one or more {@link
+ * SortField}s.
  *
+ * @see SortField
  * @since lucene 1.4
  */
-public class Sort {
+public final class Sort {
 
   /**
    * Represents sorting by computed relevance. Using this sort criteria returns the same results as
@@ -95,7 +42,7 @@ public class Sort {
   public static final Sort INDEXORDER = new Sort(SortField.FIELD_DOC);
 
   // internal representation of the sort criteria
-  SortField[] fields;
+  private final SortField[] fields;
 
   /**
    * Sorts by computed relevance. This is the same sort criteria as calling {@link
@@ -106,31 +53,12 @@ public class Sort {
     this(SortField.FIELD_SCORE);
   }
 
-  /** Sorts by the criteria in the given SortField. */
-  public Sort(SortField field) {
-    setSort(field);
-  }
-
   /**
    * Sets the sort to the given criteria in succession: the first SortField is checked first, but if
    * it produces a tie, then the second SortField is used to break the tie, etc. Finally, if there
    * is still a tie after all SortFields are checked, the internal Lucene docid is used to break it.
    */
   public Sort(SortField... fields) {
-    setSort(fields);
-  }
-
-  /** Sets the sort to the given criteria. */
-  public void setSort(SortField field) {
-    this.fields = new SortField[] {field};
-  }
-
-  /**
-   * Sets the sort to the given criteria in succession: the first SortField is checked first, but if
-   * it produces a tie, then the second SortField is used to break the tie, etc. Finally, if there
-   * is still a tie after all SortFields are checked, the internal Lucene docid is used to break it.
-   */
-  public void setSort(SortField... fields) {
     if (fields.length == 0) {
       throw new IllegalArgumentException("There must be at least 1 sort field");
     }
diff --git a/lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java b/lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
index 9e3e9ba..db97704 100644
--- a/lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
+++ b/lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java
@@ -434,7 +434,7 @@ public abstract class TopFieldCollector extends TopDocsCollector<Entry> {
       HitsThresholdChecker hitsThresholdChecker,
       MaxScoreAccumulator minScoreAcc) {
 
-    if (sort.fields.length == 0) {
+    if (sort.getSort().length == 0) {
       throw new IllegalArgumentException("Sort must contain at least one field");
     }
 
@@ -447,7 +447,7 @@ public abstract class TopFieldCollector extends TopDocsCollector<Entry> {
       throw new IllegalArgumentException("hitsThresholdChecker should not be null");
     }
 
-    FieldValueHitQueue<Entry> queue = FieldValueHitQueue.create(sort.fields, numHits);
+    FieldValueHitQueue<Entry> queue = FieldValueHitQueue.create(sort.getSort(), numHits);
 
     if (after == null) {
       // inform a comparator that sort is based on this single field
diff --git a/lucene/core/src/test/org/apache/lucene/search/TestSortedNumericSortField.java b/lucene/core/src/test/org/apache/lucene/search/TestSortedNumericSortField.java
index b6cac82..febc209 100644
--- a/lucene/core/src/test/org/apache/lucene/search/TestSortedNumericSortField.java
+++ b/lucene/core/src/test/org/apache/lucene/search/TestSortedNumericSortField.java
@@ -34,15 +34,22 @@ public class TestSortedNumericSortField extends LuceneTestCase {
     IndexSearcher empty = newSearcher(new MultiReader());
     Query query = new TermQuery(new Term("contents", "foo"));
 
-    Sort sort = new Sort();
-    sort.setSort(new SortedNumericSortField("sortednumeric", SortField.Type.LONG));
-    TopDocs td = empty.search(query, 10, sort, true);
+    TopDocs td =
+        empty.search(
+            query,
+            10,
+            new Sort(new SortedNumericSortField("sortednumeric", SortField.Type.LONG)),
+            true);
     assertEquals(0, td.totalHits.value);
 
     // for an empty index, any selector should work
     for (SortedNumericSelector.Type v : SortedNumericSelector.Type.values()) {
-      sort.setSort(new SortedNumericSortField("sortednumeric", SortField.Type.LONG, false, v));
-      td = empty.search(query, 10, sort, true);
+      td =
+          empty.search(
+              query,
+              10,
+              new Sort(new SortedNumericSortField("sortednumeric", SortField.Type.LONG, false, v)),
+              true);
       assertEquals(0, td.totalHits.value);
     }
   }
diff --git a/lucene/core/src/test/org/apache/lucene/search/TestSortedSetSortField.java b/lucene/core/src/test/org/apache/lucene/search/TestSortedSetSortField.java
index b9ca0c2..d70db36 100644
--- a/lucene/core/src/test/org/apache/lucene/search/TestSortedSetSortField.java
+++ b/lucene/core/src/test/org/apache/lucene/search/TestSortedSetSortField.java
@@ -32,16 +32,13 @@ public class TestSortedSetSortField extends LuceneTestCase {
   public void testEmptyIndex() throws Exception {
     IndexSearcher empty = newSearcher(new MultiReader());
     Query query = new TermQuery(new Term("contents", "foo"));
-
-    Sort sort = new Sort();
-    sort.setSort(new SortedSetSortField("sortedset", false));
-    TopDocs td = empty.search(query, 10, sort, true);
+    TopDocs td =
+        empty.search(query, 10, new Sort(new SortedSetSortField("sortedset", false)), true);
     assertEquals(0, td.totalHits.value);
 
     // for an empty index, any selector should work
     for (SortedSetSelector.Type v : SortedSetSelector.Type.values()) {
-      sort.setSort(new SortedSetSortField("sortedset", false, v));
-      td = empty.search(query, 10, sort, true);
+      td = empty.search(query, 10, new Sort(new SortedSetSortField("sortedset", false, v)), true);
       assertEquals(0, td.totalHits.value);
     }
   }