Posted to issues@lucene.apache.org by "Grigoriy Troitskiy (Jira)" <ji...@apache.org> on 2021/07/20 11:38:00 UTC

[jira] [Updated] (LUCENE-10030) [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring

     [ https://issues.apache.org/jira/browse/LUCENE-10030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grigoriy Troitskiy updated LUCENE-10030:
----------------------------------------
    Description: 
*Diff*
{code:java}
@@ -195,11 +195,8 @@ class DrillSidewaysScorer extends BulkScorer {
 
       collectDocID = docID;
 
-      // TODO: we could score on demand instead since we are
-      // daat here:
-      collectScore = baseScorer.score();
-
       if (failedCollector == null) {
+        collectScore = baseScorer.score();
         // Hit passed all filters, so it's "real":
         collectHit(collector, dims);
       } else {
{code}
 
 *Motivation*
 1. Performance degradation: we have a fairly heavy custom implementation of score(). When we started using DrillSideways, this call became the top-1 entry in a profiler snapshot (top-3 with default scoring). We tried doUnionScoring and doDrillDownAdvanceScoring, but neither helped:
 doUnionScoring scores all baseQuery docIDs.
 doDrillDownAdvanceScoring avoids some redundant scoring by considering the symmetric difference of the top two iterators' docIDs, but it still scores some docIDs that will be filtered out by the 3rd, 4th, ... dimension iterators.
 doQueryFirstScoring scores near-miss docIDs.
 The best approach is to score only true hits (docs where baseQuery and all N drill-down iterators match), so we suggest a small modification of doQueryFirstScoring.
  
 2. In doQueryFirstScoring, there appears to be no need to compute a score for a near-miss hit, because that score is not used anywhere:
 FacetsCollectorManager creates FacetsCollector with the default constructor
 [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollectorManager.java#L35]
 so FacetsCollector has keepScores set to false
 [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/FacetsCollector.java#L119]
 and collectScore is not used
 [https://github.com/apache/lucene/blob/main/lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysScorer.java#L200]
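
To illustrate the effect of moving the score() call, here is a minimal, self-contained sketch. It is a toy model and not Lucene code: LazyScoringSketch, CountingScorer, the no-op collectHit/collectNearMiss sinks, and the failedDims array are all hypothetical names introduced for this example. It assumes that docs failing two or more dimensions are skipped before scoring, that docs failing exactly one dimension are near-misses, and that near-miss collectors ignore the score (keepScores is false).

```java
/** Toy model only: these types are hypothetical stand-ins, not Lucene classes. */
final class LazyScoringSketch {

  /** Stand-in for an expensive scorer; counts how many times score() runs. */
  static final class CountingScorer {
    private int calls;

    float score(int docID) {
      calls++;
      return 1.0f / (1 + docID); // placeholder for a heavy custom score
    }

    int calls() {
      return calls;
    }
  }

  // No-op sinks standing in for the hit and sideways (near-miss) collectors.
  static void collectHit(int docID, float score) {}

  static void collectNearMiss(int docID) {}

  /**
   * Ordering before the patch: score first, then branch.
   * failedDims[docID] is the number of drill-down dimensions the doc failed.
   */
  static int scoreBeforeCheck(int[] failedDims) {
    CountingScorer scorer = new CountingScorer();
    for (int docID = 0; docID < failedDims.length; docID++) {
      if (failedDims[docID] > 1) {
        continue; // failed two or more dims: neither a hit nor a near-miss
      }
      float score = scorer.score(docID); // paid for hits and near-misses alike
      if (failedDims[docID] == 0) {
        collectHit(docID, score);
      } else {
        collectNearMiss(docID); // the score computed above is discarded
      }
    }
    return scorer.calls();
  }

  /** Ordering after the patch: score only once the doc is known to be a true hit. */
  static int scoreOnHitOnly(int[] failedDims) {
    CountingScorer scorer = new CountingScorer();
    for (int docID = 0; docID < failedDims.length; docID++) {
      if (failedDims[docID] > 1) {
        continue;
      }
      if (failedDims[docID] == 0) {
        collectHit(docID, scorer.score(docID)); // scored on demand
      } else {
        collectNearMiss(docID); // no score computed
      }
    }
    return scorer.calls();
  }

  public static void main(String[] args) {
    int[] failedDims = {0, 1, 2, 0, 1, 3}; // two hits, two near-misses, two skipped
    System.out.println("score before check: " + scoreBeforeCheck(failedDims) + " calls");
    System.out.println("score on hit only:  " + scoreOnHitOnly(failedDims) + " calls");
  }
}
```

With the sample input above, the first variant calls score() 4 times and the second 2 times, so the saving grows with the share of near-misses. If a sideways collector did need scores (keepScores set to true), the near-miss branch would presumably have to compute the score itself, so this sketch covers only the default-constructor case described above.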


> [DrillSidewaysScorer] redundant score() calculations in doQueryFirstScoring
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-10030
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10030
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Grigoriy Troitskiy
>            Priority: Major


