You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/01/15 06:38:06 UTC

[GitHub] [lucene-solr] zacharymorn opened a new pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

zacharymorn opened a new pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205


   # Description
   
   Deprecate MinShouldMatchSumScorer with WANDScorer since the two are very similar at this point. This is a follow-up PR to https://github.com/apache/lucene-solr/pull/2141 .
   
   # Solution
   
   As `WANDScorer` can already handle the cases where `scoreMode.needsScore() == true`, and `MinShouldMatchSumScorer` can handle the cases where `scoreMode.needsScore() == false` , this PR deprecates `MinShouldMatchSumScorer` with `WANDScorer` primarily by skipping block-max scoring update related logic when `scoreMode.needsScore() == false`.  
   
   
   # Tests
   
   The existing tests as well as 3 newly added tests are passing (currently the nocommit comment fails validation check). 
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request title.
   - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] jpountz commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r558902897



##########
File path: lucene/core/src/java/org/apache/lucene/search/Boolean2ScorerSupplier.java
##########
@@ -231,10 +231,27 @@ private Scorer opt(
         optionalScorers.add(scorer.get(leadCost));
       }
 
-      if (scoreMode == ScoreMode.TOP_SCORES) {
-        return new WANDScorer(weight, optionalScorers, minShouldMatch);
-      } else if (minShouldMatch > 1) {
-        return new MinShouldMatchSumScorer(weight, optionalScorers, minShouldMatch);
+      // nocommit
+      //
+      // The following updated condition follows the previous one that also utilized
+      // MinShouldMatchSumScorer However, technically speaking, WANDScorer is also able to
+      // handle the case where minShouldMatch == 0 (proven with existing possible condition of
+      // scoreMode == SoreMode.TOP_SCORES && minShouldMatch == 0), but removing the
+      // minShouldMatch > 1 condition would also stop WANDScorer to handle the case
+      // where scoreMode.needsScore() == false, which WANDScorer is also capable of
+      // (proven with existing possible condition of scoreMode != ScoreMode.TOP_SCORES &&
+      // minShouldMatch > 1).
+      //
+      // Ultimately, WANDScorer should be able to handle the following conditions now:
+      // 1. Any ScoreMode (with scoring or not), although it might be a bit slower compared to
+      //   DisjunctionSumScorer due to data structures and algorithms usd
+      // 2. Any minCompetitiveScore ( >= 0 )
+      // 3. Any minShouldMatch ( >= 0 )
+      //
+      // So it seems we might need a different condition check to differentiate usage between
+      // WANDScorer and DisjunctionSumScorer ?

Review comment:
       The current condition looks right to me. Maybe the comment could say something like `WANDScorer can handle all cases but we specialize exhaustive pure disjunctions with DisjunctionSumScorer which is a bit faster`?

##########
File path: lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxScorer.java
##########
@@ -50,7 +50,7 @@
     if (tieBreakerMultiplier < 0 || tieBreakerMultiplier > 1) {
       throw new IllegalArgumentException("tieBreakerMultiplier must be in [0, 1]");
     }
-    if (scoreMode == ScoreMode.TOP_SCORES) {
+    if (scoreMode.needsScores()) {

Review comment:
       why do we need to change this?

##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -319,7 +356,10 @@ private void pushBackLeads(int target) throws IOException {
       }
     }
     lead = null;
-    leadMaxScore = 0;
+
+    if (needsScore) {
+      leadMaxScore = 0;
+    }

Review comment:
       I wonder if this condition could be removed in order to make this class easier to understand by minimizing differences in code paths between the scoring and the non-scoring cases? Are there other branches we could remove?

##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -149,23 +167,33 @@ private static long scaleMinScore(float minScore, int scalingFactor) {
     this.doc = -1;
     this.upTo = -1; // will be computed on the first call to nextDoc/advance
 
+    this.needsScore = needsScore;
+
     head = new DisiPriorityQueue(scorers.size());
     // there can be at most num_scorers - 1 scorers beyond the current position
     tail = new DisiWrapper[scorers.size()];
 
-    OptionalInt scalingFactor = OptionalInt.empty();
-    for (Scorer scorer : scorers) {
-      scorer.advanceShallow(0);
-      float maxScore = scorer.getMaxScore(DocIdSetIterator.NO_MORE_DOCS);
-      if (maxScore != 0 && Float.isFinite(maxScore)) {
-        // 0 and +Infty should not impact the scale
-        scalingFactor =
-            OptionalInt.of(
-                Math.min(scalingFactor.orElse(Integer.MAX_VALUE), scalingFactor(maxScore)));
+    if (needsScore) {

Review comment:
       likewise here I would have expected to check `scoreMode == TOP_SCORES` rather than `scoreMode.needsScores()` since calling `advanceShallow` `getMaxScore` only really makes sense if you plan to skip documents based on scores?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] jpountz commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559043667



##########
File path: lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxScorer.java
##########
@@ -50,7 +50,7 @@
     if (tieBreakerMultiplier < 0 || tieBreakerMultiplier > 1) {
       throw new IllegalArgumentException("tieBreakerMultiplier must be in [0, 1]");
     }
-    if (scoreMode == ScoreMode.TOP_SCORES) {
+    if (scoreMode.needsScores()) {

Review comment:
       > so I was assuming we may need to preserve this behavior when we deprecate MinShouldMatchSumScorer with WANDScorer
   
   Right I'm hoping we can use WANDScorer for minShouldMatch > 1 for all values of ScoreMode!
   
   I believe that this exception you saw should disappear if you make sure that `WANDScorer` only calls `advanceShallow`/`getMaxScore` with `ScoreMode.TOP_SCORES`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559294059



##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -257,13 +288,17 @@ public int advance(int target) throws IOException {
             // Advance 'head' as well
             advanceHead(target);
 
-            // Pop the new 'lead' from 'head'
-            moveToNextCandidate(target);
+            if (scoreMode == ScoreMode.TOP_SCORES) {
+              // Update score bounds if necessary so
+              updateMaxScoresIfNecessary(target);
 
-            if (doc == DocIdSetIterator.NO_MORE_DOCS) {
-              return DocIdSetIterator.NO_MORE_DOCS;
+              if (doc == DocIdSetIterator.NO_MORE_DOCS) {

Review comment:
       I just pushed a commit to revert back to the original `moveToNextCandidate` abstraction, and handle the non TOP_SCORES mode inside that method. I think that really helps to localize / simplify the changes (and reduce 1 more branching) a lot!  Please let me know if this looks good to you.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559041326



##########
File path: lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxScorer.java
##########
@@ -50,7 +50,7 @@
     if (tieBreakerMultiplier < 0 || tieBreakerMultiplier > 1) {
       throw new IllegalArgumentException("tieBreakerMultiplier must be in [0, 1]");
     }
-    if (scoreMode == ScoreMode.TOP_SCORES) {
+    if (scoreMode.needsScores()) {

Review comment:
       Sorry should have put in a nocommit here to explain this change. Basically without this change a NPE would be thrown for test `DisMaxRequestHandlerTest#testSomeStuff` with the following stack trace:
   
   ```
    2> 4261 ERROR (TEST-DisMaxRequestHandlerTest.testSomeStuff-seed#[66968D4353C5888]) [     ] o.a.s.h.RequestHandlerBase java.lang.NullPointerException
     2>    at org.apache.lucene.search.DisjunctionMaxScorer.advanceShallow(DisjunctionMaxScorer.java:78)
     2>    at org.apache.lucene.search.WANDScorer.<init>(WANDScorer.java:179)
     2>    at org.apache.lucene.search.Boolean2ScorerSupplier.opt(Boolean2ScorerSupplier.java:257)
     2>    at org.apache.lucene.search.Boolean2ScorerSupplier.getInternal(Boolean2ScorerSupplier.java:124)
     2>    at org.apache.lucene.search.Boolean2ScorerSupplier.get(Boolean2ScorerSupplier.java:96)
     2>    at org.apache.lucene.search.Boolean2ScorerSupplier.req(Boolean2ScorerSupplier.java:168)
     2>    at org.apache.lucene.search.Boolean2ScorerSupplier.getInternal(Boolean2ScorerSupplier.java:146)
     2>    at org.apache.lucene.search.Boolean2ScorerSupplier.get(Boolean2ScorerSupplier.java:96)
     2>    at org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:353)
     2>    at org.apache.lucene.search.Weight.bulkScorer(Weight.java:166)
     2>    at org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:343)
     2>    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:743)
     2>    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:533)
     2>    at org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:210)
     2>    at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1709)
     2>    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1412)
     2>    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:599)
     2>    at org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1508)
     2>    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:398)
     2>    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:355)
     2>    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
     2>    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2617)
     2>    at org.apache.solr.util.TestHarness.query(TestHarness.java:346)
     2>    at org.apache.solr.util.TestHarness.query(TestHarness.java:328)
     2>    at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:982)
     2>    at org.apache.solr.DisMaxRequestHandlerTest.doTestSomeStuff(DisMaxRequestHandlerTest.java:141)
     2>    at org.apache.solr.DisMaxRequestHandlerTest.testSomeStuff(DisMaxRequestHandlerTest.java:72)
   ```
   
   The scoreMode and minShouldMatch combination that caused this exception was `scoreMode == COMPLETE`, `minShouldMatch == 3`. 
   
   I think what's happening here is, `MinShouldMatchSumScorer` was already used in test for other scoring ScoreMode in addition to `ScoreMode.TOP_SCORES` (and probably already used by clients in the wild as well), as the original condition before adding `minShouldMatch` support to `WANDScorer` is 
   
   ```
   if (minShouldMatch > 1) {
     return MinShouldMatchSumScorer...
   } else if (scoreMode == ScoreMode.TOP_SCORES) {
     return WANDScorer...
   } else {
     return DisjunctionSumScorer...
   }
   ```
   
   so I was assuming we may need to preserve this behavior when we deprecate `MinShouldMatchSumScorer` with `WANDScorer`?  This was also the reason it was checking for `needsScore` instead of `scoreMode == TOP_SCORES` in `WANDScorer` constructor, and the long nocommit message in `Boolean2ScorerSupplier.java` to see if it may need a different conditional to differentiate between `WANDScorer` and `DisjunctionSumScorer`, as you noted in other review comments.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559275690



##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -257,13 +288,17 @@ public int advance(int target) throws IOException {
             // Advance 'head' as well
             advanceHead(target);
 
-            // Pop the new 'lead' from 'head'
-            moveToNextCandidate(target);
+            if (scoreMode == ScoreMode.TOP_SCORES) {
+              // Update score bounds if necessary so
+              updateMaxScoresIfNecessary(target);
 
-            if (doc == DocIdSetIterator.NO_MORE_DOCS) {
-              return DocIdSetIterator.NO_MORE_DOCS;
+              if (doc == DocIdSetIterator.NO_MORE_DOCS) {

Review comment:
       For this one I actually moved the doc advancement to `NO_MORE_DOCS` for condition `head.size() == 0` into `updateMaxScoresIfNecessary`, as initially I felt these two are closely related and only applicable for TOP_SCORES mode (I probably should have updated the method name though) https://github.com/zacharymorn/lucene-solr/blob/9243ef01de6f4a4154275919a32f5703e970f49d/lucene/core/src/java/org/apache/lucene/search/WANDScorer.java#L463-L468 . And thus the original `moveToNextCandidate` implementation was separated into `updateMaxScoresIfNecessary` and `setDocAndFreq`. 
   
   However, in hindsight I see this is not a very good abstraction, and it's probably better to keep the original `moveToNextCandidate` and handle the non-scoring mode inside. Let me give that a try.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559223900



##########
File path: lucene/core/src/java/org/apache/lucene/search/Boolean2ScorerSupplier.java
##########
@@ -231,10 +231,27 @@ private Scorer opt(
         optionalScorers.add(scorer.get(leadCost));
       }
 
-      if (scoreMode == ScoreMode.TOP_SCORES) {
-        return new WANDScorer(weight, optionalScorers, minShouldMatch);
-      } else if (minShouldMatch > 1) {
-        return new MinShouldMatchSumScorer(weight, optionalScorers, minShouldMatch);
+      // nocommit
+      //
+      // The following updated condition follows the previous one that also utilized
+      // MinShouldMatchSumScorer However, technically speaking, WANDScorer is also able to
+      // handle the case where minShouldMatch == 0 (proven with existing possible condition of
+      // scoreMode == SoreMode.TOP_SCORES && minShouldMatch == 0), but removing the
+      // minShouldMatch > 1 condition would also stop WANDScorer to handle the case
+      // where scoreMode.needsScore() == false, which WANDScorer is also capable of
+      // (proven with existing possible condition of scoreMode != ScoreMode.TOP_SCORES &&
+      // minShouldMatch > 1).
+      //
+      // Ultimately, WANDScorer should be able to handle the following conditions now:
+      // 1. Any ScoreMode (with scoring or not), although it might be a bit slower compared to
+      //   DisjunctionSumScorer due to data structures and algorithms usd
+      // 2. Any minCompetitiveScore ( >= 0 )
+      // 3. Any minShouldMatch ( >= 0 )
+      //
+      // So it seems we might need a different condition check to differentiate usage between
+      // WANDScorer and DisjunctionSumScorer ?

Review comment:
       Updated the comment here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559224756



##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -319,7 +356,10 @@ private void pushBackLeads(int target) throws IOException {
       }
     }
     lead = null;
-    leadMaxScore = 0;
+
+    if (needsScore) {
+      leadMaxScore = 0;
+    }

Review comment:
       In the latest commit, I've done some refactoring and removed a few newly added branches that can be no-op for non-scoring mode, but not by a lot. Currently there are about 5 conditional branches left for handling non-scoring cases (checking `scoreMode  == ScoreMode.TOP_SCORES`), 1 is used in constructor, 2 used in assertions, and the last 2 were used in `advance` and `doNextCompetitiveCandidate`, so I felt that might be the minimal number of branches we could have now without major refactoring changes? What do you think?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] jpountz commented on pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
jpountz commented on pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#issuecomment-766588060


   Thank you @zacharymorn !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] jpountz merged pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
jpountz merged pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] jpountz merged pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
jpountz merged pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] jpountz commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559242170



##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -149,23 +167,33 @@ private static long scaleMinScore(float minScore, int scalingFactor) {
     this.doc = -1;
     this.upTo = -1; // will be computed on the first call to nextDoc/advance
 
+    this.scoreMode = scoreMode;
+
     head = new DisiPriorityQueue(scorers.size());
     // there can be at most num_scorers - 1 scorers beyond the current position
     tail = new DisiWrapper[scorers.size()];
 
-    OptionalInt scalingFactor = OptionalInt.empty();
-    for (Scorer scorer : scorers) {
-      scorer.advanceShallow(0);
-      float maxScore = scorer.getMaxScore(DocIdSetIterator.NO_MORE_DOCS);
-      if (maxScore != 0 && Float.isFinite(maxScore)) {
-        // 0 and +Infty should not impact the scale
-        scalingFactor =
-            OptionalInt.of(
-                Math.min(scalingFactor.orElse(Integer.MAX_VALUE), scalingFactor(maxScore)));
+    if (this.scoreMode == ScoreMode.TOP_SCORES) {
+      OptionalInt scalingFactor = OptionalInt.empty();
+      for (Scorer scorer : scorers) {
+        scorer.advanceShallow(0);
+        float maxScore = scorer.getMaxScore(DocIdSetIterator.NO_MORE_DOCS);
+        if (maxScore != 0 && Float.isFinite(maxScore)) {
+          // 0 and +Infty should not impact the scale
+          scalingFactor =
+              OptionalInt.of(
+                  Math.min(scalingFactor.orElse(Integer.MAX_VALUE), scalingFactor(maxScore)));
+        }
       }
+
+      this.scalingFactor = scalingFactor.orElse(0);
+      this.maxScorePropagator = new MaxScoreSumPropagator(scorers);
+    } else {
+      this.scalingFactor = 0;
+      this.maxScorePropagator = null;
     }
+
     // Use a scaling factor of 0 if all max scores are either 0 or +Infty

Review comment:
       this comment looks misplaced now?

##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -257,13 +288,17 @@ public int advance(int target) throws IOException {
             // Advance 'head' as well
             advanceHead(target);
 
-            // Pop the new 'lead' from 'head'
-            moveToNextCandidate(target);
+            if (scoreMode == ScoreMode.TOP_SCORES) {
+              // Update score bounds if necessary so
+              updateMaxScoresIfNecessary(target);
 
-            if (doc == DocIdSetIterator.NO_MORE_DOCS) {
-              return DocIdSetIterator.NO_MORE_DOCS;
+              if (doc == DocIdSetIterator.NO_MORE_DOCS) {

Review comment:
       we may have a problem, or maybe this line is not needed anymore
   
   Before, `moveToNextCandidate` would advance `doc`. But now that you are calling `updateMaxScoresIfNecessary ` instead, which doesn't advance doc, it shouldn't be possible that doc == NO_MORE_DOCS on this line?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] jpountz commented on pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
jpountz commented on pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#issuecomment-766588060


   Thank you @zacharymorn !


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559224756



##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -319,7 +356,10 @@ private void pushBackLeads(int target) throws IOException {
       }
     }
     lead = null;
-    leadMaxScore = 0;
+
+    if (needsScore) {
+      leadMaxScore = 0;
+    }

Review comment:
       In the latest commit, I've done some refactoring and removed a few newly added branches that can be no harm for non-scoring mode, but not by a lot. Currently there are about 5 conditional branches left for handling non-scoring cases (checking `scoreMode  == ScoreMode.TOP_SCORES`), 1 is used in constructor, 2 used in assertions, and the last 2 were used in `advance` and `doNextCompetitiveCandidate`, so I felt that might be the minimal number of branches we could have now without major refactoring changes? What do you think?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#issuecomment-766615404


   > Thank you @zacharymorn !
   
   No problem! Thanks Adrien for the review and guidance!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#issuecomment-766615404


   > Thank you @zacharymorn !
   
   No problem! Thanks Adrien for the review and guidance!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559041435



##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -319,7 +356,10 @@ private void pushBackLeads(int target) throws IOException {
       }
     }
     lead = null;
-    leadMaxScore = 0;
+
+    if (needsScore) {
+      leadMaxScore = 0;
+    }

Review comment:
       Makes sense. Let me take a further look at this and try out.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559052772



##########
File path: lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxScorer.java
##########
@@ -50,7 +50,7 @@
     if (tieBreakerMultiplier < 0 || tieBreakerMultiplier > 1) {
       throw new IllegalArgumentException("tieBreakerMultiplier must be in [0, 1]");
     }
-    if (scoreMode == ScoreMode.TOP_SCORES) {
+    if (scoreMode.needsScores()) {

Review comment:
       Ahhhh I see now I had a wrong assumption / impression before. For some reason I mistakenly thought since `MinShouldMatchSumScorer` supports `ScoreMode.COMPLETE`, `WANDScorer` should also allow score computation for that mode after deprecation. Will push an update soon to check for `ScoreMode.TOP_SCORES` instead.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2205: LUCENE-9668: Deprecate MinShouldMatchSumScorer with WANDScorer

Posted by GitBox <gi...@apache.org>.
zacharymorn commented on a change in pull request #2205:
URL: https://github.com/apache/lucene-solr/pull/2205#discussion_r559223860



##########
File path: lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
##########
@@ -149,23 +167,33 @@ private static long scaleMinScore(float minScore, int scalingFactor) {
     this.doc = -1;
     this.upTo = -1; // will be computed on the first call to nextDoc/advance
 
+    this.needsScore = needsScore;
+
     head = new DisiPriorityQueue(scorers.size());
     // there can be at most num_scorers - 1 scorers beyond the current position
     tail = new DisiWrapper[scorers.size()];
 
-    OptionalInt scalingFactor = OptionalInt.empty();
-    for (Scorer scorer : scorers) {
-      scorer.advanceShallow(0);
-      float maxScore = scorer.getMaxScore(DocIdSetIterator.NO_MORE_DOCS);
-      if (maxScore != 0 && Float.isFinite(maxScore)) {
-        // 0 and +Infty should not impact the scale
-        scalingFactor =
-            OptionalInt.of(
-                Math.min(scalingFactor.orElse(Integer.MAX_VALUE), scalingFactor(maxScore)));
+    if (needsScore) {

Review comment:
       Updated to check for `scoreMode == TOP_SCORES` instead.

##########
File path: lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxScorer.java
##########
@@ -50,7 +50,7 @@
     if (tieBreakerMultiplier < 0 || tieBreakerMultiplier > 1) {
       throw new IllegalArgumentException("tieBreakerMultiplier must be in [0, 1]");
     }
-    if (scoreMode == ScoreMode.TOP_SCORES) {
+    if (scoreMode.needsScores()) {

Review comment:
       Updated to check for `scoreMode == TOP_SCORES` instead.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org