You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/11/09 08:13:38 UTC

[GitHub] [lucene] jpountz commented on a change in pull request #431: LUCENE-10208: Ensure that the minimum competitive score does not decrease in concurrent search

jpountz commented on a change in pull request #431:
URL: https://github.com/apache/lucene/pull/431#discussion_r745367139



##########
File path: lucene/core/src/java/org/apache/lucene/search/MaxScoreAccumulator.java
##########
@@ -35,9 +35,22 @@
     this.modInterval = DEFAULT_INTERVAL;
   }
 
-  void accumulate(int docID, float score) {
-    assert docID >= 0 && score >= 0;
-    long encode = (((long) Float.floatToIntBits(score)) << 32) | docID;
+  private static long maxBinary(long v1, long v2) {

Review comment:
       we call this `long` `encode` elsewhere, so maybe use the same convention for this function name?
   ```suggestion
     /**
      * Return the max encoded DocAndScore in a way that is consistent with {@link DocAndScore#compareTo}.
      */
     private static long maxEncode(long v1, long v2) {
   ```

##########
File path: lucene/core/src/java/org/apache/lucene/search/MaxScoreAccumulator.java
##########
@@ -47,24 +60,25 @@ DocAndScore get() {
       return null;
     }
     float score = Float.intBitsToFloat((int) (value >> 32));
-    int docID = (int) value;
-    return new DocAndScore(docID, score);
+    int docBase = (int) value;
+    return new DocAndScore(docBase, score);
   }
 
   static class DocAndScore implements Comparable<DocAndScore> {
-    final int docID;
+    final int docBase;
     final float score;
 
-    DocAndScore(int docID, float score) {
-      this.docID = docID;
+    DocAndScore(int docBase, float score) {
+      this.docBase = docBase;
       this.score = score;
     }
 
     @Override
     public int compareTo(DocAndScore o) {
       int cmp = Float.compare(score, o.score);
       if (cmp == 0) {
-        return Integer.compare(docID, o.docID);
+        // tie-break on the minimum doc base
+        return -Integer.compare(docBase, o.docBase);

Review comment:
       Doing this instead helps avoid the edge case of `Integer.MIN_VALUE`. I also suggested a comment that would have helped me better understand this code.
   
   ```suggestion
           // For a given minimum competitive score, we want to know the first segment
           // where this score occurred, hence the reverse order here.
           // On segments with a lower docBase, any document whose score is greater
           // than or equal to this score would be competitive, while on segments with a
           // higher docBase, documents need to have a strictly greater score to be
           // competitive since we tie break on doc ID.
           return Integer.compare(o.docBase, docBase);
   ```

##########
File path: lucene/core/src/java/org/apache/lucene/search/MaxScoreAccumulator.java
##########
@@ -35,9 +35,22 @@
     this.modInterval = DEFAULT_INTERVAL;
   }
 
-  void accumulate(int docID, float score) {
-    assert docID >= 0 && score >= 0;
-    long encode = (((long) Float.floatToIntBits(score)) << 32) | docID;
+  private static long maxBinary(long v1, long v2) {
+    float score1 = Float.intBitsToFloat((int) (v1 >> 32));
+    float score2 = Float.intBitsToFloat((int) (v2 >> 32));
+    int cmp = Float.compare(score1, score2);
+    if (cmp == 0) {
+      // tie-break on the minimum doc base
+      return (int) v1 < (int) v2 ? v1 : v2;

Review comment:
       Ah right, that makes sense to me. For a given max score, we want to know the minimum segment ID where this max score occurred (which we identify via its docBase). Maybe add a comment, I think it would make the code easier to read.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org