You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2019/04/10 12:37:00 UTC

[jira] [Created] (LUCENE-8759) BlockMaxConjunctionScorer's simplified way of computing max scores hurts performance

Adrien Grand created LUCENE-8759:
------------------------------------

             Summary: BlockMaxConjunctionScorer's simplified way of computing max scores hurts performance
                 Key: LUCENE-8759
                 URL: https://issues.apache.org/jira/browse/LUCENE-8759
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Adrien Grand


BlockMaxConjunctionScorer computes the minimum value that the score should have after each scorer in order to be able to interrupt scorer as soon as possible. For instance say scorers A, B and C produce maximum scores that are equal to 4, 2 and 1. If the minimum competitive score is X, then the score after scoring A, B and C must be at least X, the score after scoring A and B must be at least X-1 and the score after scoring A must be at least X-1-2.

However this is made a bit more complex than that due to floating-point numbers and the fact that intermediate score values are doubles which only get casted to a float after all values have been summed up. In order to keep things simple, BlockMaxConjunctionScore has the following comment and code

{code}
        // Also compute the minimum required scores for a hit to be competitive
        // A double that is less than 'score' might still be converted to 'score'
        // when casted to a float, so we go to the previous float to avoid this issue
        minScores[minScores.length - 1] = minScore > 0 ? Math.nextDown(minScore) : 0;
{code}

It simplifies the problem by calling Math.nextDown(minScore). However this is problematic because it defeats the fact that TopScoreDocCollector calls setMinCompetitiveScore on the float value that is immediately greater than the k-th greatest hit so far.

nextDown(minScore) is not the value that we need. The value that we need is the smallest double that converts to minScore when casted to a float, which would be half-way between nextDown(minScore) and minScore. In some cases this would help get better performance out of conjunctions, especially if some clauses produce constant scores.

MaxScoreSumPropagator#setMinCompetitiveScore has the same issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org