You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by "Greg Miller (Jira)" <ji...@apache.org> on 2021/07/20 21:48:00 UTC
[jira] [Commented] (LUCENE-10030) [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring

    [ https://issues.apache.org/jira/browse/LUCENE-10030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384494#comment-17384494 ] 

Greg Miller commented on LUCENE-10030:
--------------------------------------

Thanks for opening this! Left one small comment on the PR. Good find!

> [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-10030
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10030
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Grigoriy Troitskiy
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> *Diff*
> {code:java}
> @@ -195,11 +195,8 @@ class DrillSidewaysScorer extends BulkScorer {
>  
>        collectDocID = docID;
>  
> -      // TODO: we could score on demand instead since we are
> -      // daat here:
> -      collectScore = baseScorer.score();
> -
>        if (failedCollector == null) {
> +        collectScore = baseScorer.score();
>          // Hit passed all filters, so it's "real":
>          collectHit(collector, dims);
>        } else {
> {code}
>  
>  *Motivation*
>  1. Performance degradation: we have quite heavy custom implementation of score(). So when we started using DrillSideways, this call became top-1 in a profiler snapshot (top-3 with default scoring). We tried doUnionScoring and doDrillDownAdvanceScoring, but no luck:
>  doUnionScoring scores all baseQuery docIds
>  doDrillDownAdvanceScoring avoids some redundant docIds scorings, considering symmetric difference of top two iterator's docIds, but still scores some docIds, that will be filtered out by 3rd, 4th, ... dimension iterators
>  doQueryFirstScoring scores near-miss docIds
>  Best way is to score only true hits (where baseQuery and all N drill-down iterators match). So we suggest a small modification of doQueryFirstScoring.
>   
>  2. Speaking of doQueryFirstScoring, it doesn't look like we need to calculate a score for near-miss hit, because it won't be used anywhere.
>  FacetsCollectorManager creates FacetsCollector with default constructor
>  [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollectorManager.java#L35]
>  so FacetCollector has false for keepScores 
>  [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollector.java#L119]
>  and collectScore is not being used
>  [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysScorer.java#L200]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org