
Posted to issues@lucene.apache.org by "Greg Miller (Jira)" <ji...@apache.org> on 2021/07/27 23:55:00 UTC

[jira] [Created] (LUCENE-10037) Explore a single scoring implementation in DrillSidewaysScorer

Greg Miller created LUCENE-10037:
------------------------------------

             Summary: Explore a single scoring implementation in DrillSidewaysScorer
                 Key: LUCENE-10037
                 URL: https://issues.apache.org/jira/browse/LUCENE-10037
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/facet
    Affects Versions: main (9.0)
            Reporter: Greg Miller


{{DrillSidewaysScorer}} currently implements three separate strategies for bulk scoring documents: {{doQueryFirstScoring}}, {{doUnionScoring}} and {{doDrillDownAdvanceScoring}}. As far as I can tell, this code dates back to 2013, and two of the three approaches ({{doUnionScoring}} and {{doDrillDownAdvanceScoring}}) appear to emulate the {{BooleanScorer}} "window scoring" / "term-at-a-time" strategy. While this strategy is still useful in {{BooleanScorer}} in some cases, its primary benefit, from what I can tell, is avoiding re-heap operations in disjunction cases (as recently [described|http://mail-archives.apache.org/mod_mbox/lucene-dev/202106.mbox/%3CCAPsWd%2BMbYckCR2LHxHy4-%3DoZPnvX%3D9Er8hwb%2BG76jHb85JePvw%40mail.gmail.com%3E] by [~jpountz]).

I can't see any reason to prefer these two approaches in {{DrillSidewaysScorer}} anymore: we're doing pure conjunctions (no re-heaping to worry about), and {{doQueryFirstScoring}} takes advantage of skipping by advancing postings, while the other two approaches iterate their postings entirely, relying only on {{nextDoc}}. Finally, we added an optimization (LUCENE-10030) that lazily evaluates the {{score}} and can only work for {{doQueryFirstScoring}}, whereas {{doUnionScoring}} and {{doDrillDownAdvanceScoring}} evaluate it eagerly.
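
To make the skipping argument concrete, here is a minimal toy sketch in plain Java. It is not the {{DrillSidewaysScorer}} code and does not use Lucene classes: {{ToyPostings}}, {{queryFirst}}, {{nextDocOnly}} and the {{visited}} counter are hypothetical names invented for illustration, and the binary search in {{advance}} only stands in for real skip data. It intersects a sparse base query with a dense dimension, once by leading with the base query and advancing the dimension, and once by walking both lists with {{nextDoc}} alone.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryFirstSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Toy postings list over a sorted array of doc IDs (hypothetical, not a Lucene class). */
    static final class ToyPostings {
        private final int[] docs;
        private int idx = -1;
        int visited = 0; // how many postings entries this iterator landed on

        ToyPostings(int... sortedDocs) {
            this.docs = sortedDocs;
        }

        int docID() {
            if (idx < 0) return -1;
            return idx >= docs.length ? NO_MORE_DOCS : docs[idx];
        }

        /** Steps to the next entry, one at a time. */
        int nextDoc() {
            idx++;
            if (idx < docs.length) visited++;
            return docID();
        }

        /** Jumps to the first doc >= target; binary search stands in for skip data. */
        int advance(int target) {
            int lo = idx + 1, hi = docs.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (docs[mid] < target) lo = mid + 1; else hi = mid;
            }
            idx = lo;
            if (idx < docs.length) visited++;
            return docID();
        }
    }

    /** Leads with the base query and advances the dimension to each candidate. */
    static List<Integer> queryFirst(ToyPostings base, ToyPostings dim) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = base.nextDoc(); doc != NO_MORE_DOCS; doc = base.nextDoc()) {
            int d = dim.docID();
            if (d < doc) d = dim.advance(doc);
            if (d == doc) hits.add(doc);
        }
        return hits;
    }

    /** Intersects by stepping both lists with nextDoc only, never skipping. */
    static List<Integer> nextDocOnly(ToyPostings base, ToyPostings dim) {
        List<Integer> hits = new ArrayList<>();
        int b = base.nextDoc(), d = dim.nextDoc();
        while (b != NO_MORE_DOCS && d != NO_MORE_DOCS) {
            if (b == d) {
                hits.add(b);
                b = base.nextDoc();
                d = dim.nextDoc();
            } else if (b < d) {
                b = base.nextDoc();
            } else {
                d = dim.nextDoc();
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] dense = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        ToyPostings dimA = new ToyPostings(dense);
        ToyPostings dimB = new ToyPostings(dense);
        System.out.println(queryFirst(new ToyPostings(3, 9), dimA) + " visited=" + dimA.visited);
        System.out.println(nextDocOnly(new ToyPostings(3, 9), dimB) + " visited=" + dimB.visited);
    }
}
```

Both methods return the same hits, but in this toy setup the query-first variant lands on far fewer entries of the dense dimension. Real postings and skip lists behave differently in detail, so this only illustrates the shape of the argument and says nothing about actual benchmark results.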

 

All this is to say we should try sending all scoring through {{doQueryFirstScoring}} and see how it benchmarks. I'm not sure whether we already have benchmarks set up for drill sideways, but I'd love to see if we can't optimize {{DrillSidewaysScorer}} while also reducing its code complexity!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
