Posted to dev@lucene.apache.org by "Da Huang (JIRA)" <ji...@apache.org> on 2014/05/11 02:58:15 UTC

[jira] [Comment Edited] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses

    [ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994363#comment-13994363 ] 

Da Huang edited comment on LUCENE-4396 at 5/11/14 12:56 AM:
------------------------------------------------------------

luceneutil tasks file to test queries like "+a b c d e ..."

The benchmark results are as follows.
|| Task || QPS baseline || StdDev || QPS my_modified_version || StdDev || Pct diff ||
| HighAndManyLowOr | 8.50 | (3.3%) | 1.72 | (0.3%) | -79.8% (-80% - -78%) |
| PKLookup | 239.75 | (0.9%) | 239.99 | (0.9%) | 0.1% (-1% - 1%) |
| LowAndManyHighOr | 7.11 | (1.4%) | 7.76 | (1.4%) | 9.1% (6% - 12%) |
| LowAndManyLowOr | 33.83 | (0.7%) | 41.03 | (2.7%) | 21.3% (17% - 24%) |
| HighAndManyHighOr | 0.12 | (0.7%) | 0.29 | (7.8%) | 148.0% (138% - 157%) |
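
To make the last column easier to check, here is a minimal sketch that recomputes the percentage difference from the two QPS values in the table above. It assumes "Pct diff" is the relative change of the modified QPS against the baseline QPS; the class and method names (PctDiff, pctDiff) are made up for illustration and are not part of luceneutil. With the rounded values shown, four of the five rows come out as in the table, while HighAndManyHighOr comes out at roughly 142% rather than 148.0%, presumably because the report is computed from unrounded QPS values (that explanation is an assumption).

```java
import java.util.Locale;

public class PctDiff {
    /** Relative change of the modified QPS against the baseline QPS, in percent. */
    static double pctDiff(double baselineQps, double modifiedQps) {
        return 100.0 * (modifiedQps - baselineQps) / baselineQps;
    }

    public static void main(String[] args) {
        // QPS values copied from the table above (already rounded to two decimals).
        String[] tasks = {
            "HighAndManyLowOr", "PKLookup", "LowAndManyHighOr",
            "LowAndManyLowOr", "HighAndManyHighOr"
        };
        double[] baseline = {8.50, 239.75, 7.11, 33.83, 0.12};
        double[] modified = {1.72, 239.99, 7.76, 41.03, 0.29};

        for (int i = 0; i < tasks.length; i++) {
            System.out.printf(Locale.ROOT, "%-18s %8.1f%%%n",
                    tasks[i], pctDiff(baseline[i], modified[i]));
        }
    }
}
```

Read this way, the patch is a clear win when the MUST clause is a high-frequency term and the SHOULD clauses are high-frequency too, and a clear loss when the MUST clause is high-frequency and the SHOULD clauses are low-frequency.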



was (Author: dhuang):
luceneutil tasks file to test queries like "+a b c d e ..."

The performance shows as follows.
                    Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff
        HighAndManyLowOr        8.50      (3.3%)        1.72      (0.3%)  -79.8% ( -80% -  -78%)
                PKLookup      239.75      (0.9%)      239.99      (0.9%)    0.1% (  -1% -    1%)
        LowAndManyHighOr        7.11      (1.4%)        7.76      (1.4%)    9.1% (   6% -   12%)
         LowAndManyLowOr       33.83      (0.7%)       41.03      (2.7%)   21.3% (  17% -   24%)
       HighAndManyHighOr        0.12      (0.7%)        0.29      (7.8%)  148.0% ( 138% -  157%)


> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that, unless the MUST clauses have a very low hit count compared to the other clauses, BooleanScorer would perform better than BooleanScorer2.  BooleanScorer still has some vestiges from when it used to handle MUST, so it shouldn't be hard to bring back this capability ... I think the challenging part might be the heuristics on when to use which (likely we would have to use firstDocID as a proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs in this case, e.g. if suddenly the MUST clause skips 1000000 docs then you want to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you are inspired!
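
The .advance() idea in the description above can be sketched as follows. This is only an illustration under stated assumptions: Postings is a hypothetical stand-in for a sorted doc-ID iterator, MustDrivenSketch and score are invented names, and the loop does not reproduce BooleanScorer's actual implementation or the attached patch. It shows only the control flow: the MUST clause drives iteration, and each SHOULD clause is advanced to the MUST clause's current doc whenever it has fallen behind, so a large jump in the MUST clause is followed by a single advance() per SHOULD clause.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MustDrivenSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical minimal doc-ID iterator over a sorted int array. */
    static final class Postings {
        private final int[] docs;
        private int idx = -1;

        Postings(int... sortedDocs) { this.docs = sortedDocs; }

        /** Current doc, -1 before iteration starts, NO_MORE_DOCS when exhausted. */
        int docID() {
            if (idx < 0) return -1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        int nextDoc() { idx++; return docID(); }

        /** Moves to the first doc >= target (linear scan here; a real index would use skip data). */
        int advance(int target) {
            if (idx < 0) idx = 0;
            while (idx < docs.length && docs[idx] < target) idx++;
            return docID();
        }
    }

    /** Returns {doc, matchingShouldCount} pairs for a query shaped like "+must should1 should2 ...". */
    static List<int[]> score(Postings must, List<Postings> shoulds) {
        List<int[]> hits = new ArrayList<>();
        for (int doc = must.nextDoc(); doc != NO_MORE_DOCS; doc = must.nextDoc()) {
            int matched = 0;
            for (Postings s : shoulds) {
                // Only catch up when the SHOULD clause is behind the MUST clause.
                if (s.docID() < doc) s.advance(doc);
                if (s.docID() == doc) matched++;
            }
            hits.add(new int[] {doc, matched});
        }
        return hits;
    }

    public static void main(String[] args) {
        Postings must = new Postings(3, 1000000, 1000005);
        List<Postings> shoulds = Arrays.asList(
                new Postings(1, 3, 7, 1000000),
                new Postings(3, 1000005));
        for (int[] hit : score(must, shoulds)) {
            System.out.println("doc=" + hit[0] + " matchingShouldClauses=" + hit[1]);
        }
    }
}
```

The count of matching SHOULD clauses stands in for a real score here. Deciding when this MUST-driven path should be preferred over the existing one is the heuristic question raised in the description, and the sketch does not attempt to answer it.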



--
This message was sent by Atlassian JIRA
(v6.2#6252)