Posted to dev@lucene.apache.org by "Adrien Grand (JIRA)" <ji...@apache.org> on 2015/02/13 21:33:11 UTC

[jira] [Updated] (LUCENE-6244) Approximations on disjunctions

     [ https://issues.apache.org/jira/browse/LUCENE-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-6244:
---------------------------------
    Attachment: LUCENE-6244.patch

Here is a patch. To keep things simple, I made DisjunctionScorer handle all clauses as if they supported approximations. When a clause does not, I use its scorer itself as the approximation, and matches() always returns true.
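
To illustrate the wrapping idea, here is a minimal, self-contained sketch. It is not the code from the patch and does not use the Lucene API: the class ApproximationWrapperSketch, the TwoPhase interface and the helpers wrapExact, withApproximation and matchesDoc are hypothetical names, and clauses are modelled as per-document predicates rather than doc ID iterators.

```java
import java.util.function.IntPredicate;

/** Toy model of "every clause looks like it supports approximations". Not Lucene code. */
class ApproximationWrapperSketch {

  /** A clause seen as a cheap approximation plus a confirmation step (hypothetical interface). */
  interface TwoPhase {
    /** Cheap check; may accept documents that do not really match. */
    boolean approximates(int doc);

    /** Possibly costly check; only meaningful when approximates(doc) returned true. */
    boolean matches(int doc);
  }

  /** Clause with real approximation support: a cheap filter followed by a confirmation. */
  static TwoPhase withApproximation(IntPredicate approximation, IntPredicate confirmation) {
    return new TwoPhase() {
      public boolean approximates(int doc) { return approximation.test(doc); }
      public boolean matches(int doc) { return confirmation.test(doc); }
    };
  }

  /**
   * Clause without approximation support: the exact scorer serves as its own
   * approximation, so there is nothing left to confirm and matches() is always true.
   */
  static TwoPhase wrapExact(IntPredicate exactScorer) {
    return new TwoPhase() {
      public boolean approximates(int doc) { return exactScorer.test(doc); }
      public boolean matches(int doc) { return true; }
    };
  }

  /** The disjunction can now treat both kinds of clause uniformly. */
  static boolean matchesDoc(TwoPhase clause, int doc) {
    return clause.approximates(doc) && clause.matches(doc);
  }
}
```

With this shape the caller never has to branch on whether a clause supports approximations; the cost for exact clauses is one trivial matches() call.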

When scores are not needed, we stop calling matches() as soon as a single matching clause has been found.
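
The early exit can be sketched as follows. Again this is only an illustration under simplifying assumptions, not the patch itself: DisjunctionShortCircuitSketch, Clause, countMatching and the needsScores flag are hypothetical names, and the matchesCalls counter exists only to make the saved work visible.

```java
import java.util.List;
import java.util.function.IntPredicate;

/** Toy model of confirming a disjunction lazily. Not Lucene code. */
class DisjunctionShortCircuitSketch {

  /** A clause with a cheap approximation and a confirmation step that counts its calls. */
  static final class Clause {
    final IntPredicate approximation;
    final IntPredicate confirmation;
    int matchesCalls = 0;

    Clause(IntPredicate approximation, IntPredicate confirmation) {
      this.approximation = approximation;
      this.confirmation = confirmation;
    }

    boolean approximates(int doc) { return approximation.test(doc); }

    boolean matches(int doc) {
      matchesCalls++; // stands in for the potentially expensive verification
      return confirmation.test(doc);
    }
  }

  /**
   * Returns the number of clauses confirmed to match doc. When needsScores is
   * false, one confirmed clause is enough to accept the document, so the
   * remaining clauses are never asked to run matches().
   */
  static int countMatching(List<Clause> clauses, int doc, boolean needsScores) {
    int matching = 0;
    for (Clause clause : clauses) {
      if (clause.approximates(doc) && clause.matches(doc)) {
        matching++;
        if (!needsScores) {
          break; // the document matches the disjunction; skip further confirmations
        }
      }
    }
    return matching;
  }
}
```

When scores are needed every matching clause has to be confirmed, because each one contributes to the score; without scores the loop only has to answer whether the document matches at all.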

luceneutil looks happy, with every difference within the noise:

{noformat}
                   Task QPS baseline      StdDev   QPS patch      StdDev                Pct diff
                  Fuzzy2       57.41      (6.2%)       55.70      (9.3%)   -3.0% ( -17% -   13%)
              AndHighMed      204.81      (2.2%)      203.24      (2.0%)   -0.8% (  -4% -    3%)
                 Respell       72.45      (3.1%)       72.04      (3.7%)   -0.6% (  -7% -    6%)
            OrNotHighMed      190.72      (1.5%)      189.87      (1.3%)   -0.4% (  -3% -    2%)
            HighSpanNear       44.31      (3.7%)       44.12      (3.2%)   -0.4% (  -7% -    6%)
            OrHighNotMed       98.12      (2.1%)       97.74      (2.3%)   -0.4% (  -4% -    4%)
             LowSpanNear       24.78      (5.5%)       24.71      (5.6%)   -0.3% ( -10% -   11%)
               MedPhrase      135.21      (2.1%)      134.82      (1.9%)   -0.3% (  -4% -    3%)
              HighPhrase        4.29      (4.3%)        4.28      (4.5%)   -0.3% (  -8% -    8%)
         LowSloppyPhrase       96.57      (3.0%)       96.35      (3.1%)   -0.2% (  -6% -    6%)
               OrHighMed       79.74      (6.3%)       79.56      (5.8%)   -0.2% ( -11% -   12%)
         MedSloppyPhrase       52.22      (2.7%)       52.10      (2.5%)   -0.2% (  -5% -    5%)
           OrNotHighHigh       36.47      (0.9%)       36.38      (0.6%)   -0.2% (  -1% -    1%)
            OrNotHighLow      784.22      (2.9%)      782.46      (3.5%)   -0.2% (  -6% -    6%)
        HighSloppyPhrase       27.94      (3.1%)       27.89      (3.1%)   -0.2% (  -6% -    6%)
           OrHighNotHigh       35.84      (1.4%)       35.78      (1.2%)   -0.2% (  -2% -    2%)
                 Prefix3       74.10      (3.0%)       73.99      (2.6%)   -0.2% (  -5% -    5%)
                 MedTerm      306.95      (1.3%)      306.54      (1.4%)   -0.1% (  -2% -    2%)
                Wildcard       27.77      (2.1%)       27.74      (2.1%)   -0.1% (  -4% -    4%)
              OrHighHigh       41.35      (6.3%)       41.30      (6.4%)   -0.1% ( -12% -   13%)
               OrHighLow       12.41      (6.8%)       12.41      (6.6%)   -0.0% ( -12% -   14%)
               LowPhrase       75.12      (1.5%)       75.14      (1.7%)    0.0% (  -3% -    3%)
                PKLookup      266.10      (2.5%)      266.28      (2.7%)    0.1% (  -5% -    5%)
             MedSpanNear       36.94      (3.8%)       36.99      (3.9%)    0.1% (  -7% -    8%)
             AndHighHigh       90.05      (2.0%)       90.18      (1.8%)    0.1% (  -3% -    3%)
                HighTerm       74.97      (1.5%)       75.09      (1.4%)    0.2% (  -2% -    3%)
            OrHighNotLow       80.68      (2.7%)       80.87      (2.5%)    0.2% (  -4% -    5%)
                  IntNRQ        7.50      (3.1%)        7.52      (2.9%)    0.4% (  -5% -    6%)
                 LowTerm      856.75      (3.0%)      861.68      (3.1%)    0.6% (  -5% -    6%)
              AndHighLow      823.49      (3.6%)      831.80      (3.6%)    1.0% (  -5% -    8%)
                  Fuzzy1       58.81      (8.0%)       59.79     (10.2%)    1.7% ( -15% -   21%)
{noformat}

> Approximations on disjunctions
> ------------------------------
>
>                 Key: LUCENE-6244
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6244
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>             Fix For: Trunk, 5.1
>
>         Attachments: LUCENE-6244.patch
>
>
> Like we just did on exact phrases and conjunctions, we should also support approximations on disjunctions in order to apply "matches()" lazily.


