Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2014/07/10 00:19:05 UTC

[jira] [Commented] (LUCENE-5808) clean up postingsreader

    [ https://issues.apache.org/jira/browse/LUCENE-5808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056846#comment-14056846 ] 

Michael McCandless commented on LUCENE-5808:
--------------------------------------------

I really like these simplifications.

I ran a perf test against trunk:

{noformat}
Report after iter 19:
                    Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
              OrHighHigh        9.74     (13.5%)        8.28     (18.8%)  -15.0% ( -41% -   20%)
            OrNotHighLow       23.71     (13.3%)       21.22     (16.1%)  -10.5% ( -35% -   21%)
               OrHighMed       31.79      (7.6%)       28.73     (14.9%)   -9.6% ( -29% -   13%)
           OrHighNotHigh       13.18     (12.8%)       12.16     (14.1%)   -7.7% ( -30% -   21%)
                HighTerm       67.90     (14.5%)       62.68     (19.0%)   -7.7% ( -35% -   30%)
            OrHighNotLow       29.26     (14.3%)       27.10     (14.2%)   -7.4% ( -31% -   24%)
            OrNotHighMed       22.57     (15.0%)       20.93     (15.5%)   -7.3% ( -32% -   27%)
                 Prefix3       86.86     (10.2%)       81.99     (14.8%)   -5.6% ( -27% -   21%)
           OrNotHighHigh       10.41     (13.2%)        9.87     (14.7%)   -5.2% ( -29% -   26%)
                  Fuzzy1       55.92     (10.0%)       53.24     (13.4%)   -4.8% ( -25% -   20%)
        HighSloppyPhrase        3.42     (14.6%)        3.26     (17.5%)   -4.6% ( -31% -   32%)
            HighSpanNear        9.37     (15.8%)        9.09     (19.9%)   -3.0% ( -33% -   38%)
              HighPhrase        4.33     (10.8%)        4.20     (16.8%)   -2.9% ( -27% -   27%)
               OrHighLow       21.82     (15.5%)       21.38     (13.9%)   -2.1% ( -27% -   32%)
              AndHighMed       34.04      (4.8%)       33.56     (11.8%)   -1.4% ( -17% -   15%)
            OrHighNotMed       33.92     (19.4%)       33.57     (13.1%)   -1.0% ( -28% -   38%)
                 LowTerm      318.33     (15.4%)      318.45     (12.9%)    0.0% ( -24% -   33%)
                 Respell       45.80     (11.8%)       45.85     (14.5%)    0.1% ( -23% -   29%)
             AndHighHigh       28.10      (6.3%)       28.19     (11.3%)    0.3% ( -16% -   19%)
                  Fuzzy2       41.95     (10.1%)       42.40     (16.0%)    1.1% ( -22% -   30%)
                Wildcard       18.84     (11.5%)       19.13     (12.2%)    1.5% ( -19% -   28%)
                  IntNRQ        3.17     (14.0%)        3.22     (17.2%)    1.5% ( -26% -   38%)
               LowPhrase       12.83     (10.3%)       13.08     (16.3%)    2.0% ( -22% -   31%)
                 MedTerm       98.48     (18.2%)      100.63     (16.9%)    2.2% ( -27% -   45%)
               MedPhrase      197.18     (13.3%)      201.95     (12.6%)    2.4% ( -20% -   32%)
         MedSloppyPhrase        3.32     (16.1%)        3.50     (14.2%)    5.4% ( -21% -   42%)
              AndHighLow      352.18     (12.9%)      375.73     (14.4%)    6.7% ( -18% -   39%)
         LowSloppyPhrase       42.72     (12.0%)       46.52     (21.5%)    8.9% ( -22% -   48%)
             LowSpanNear       10.23     (15.7%)       11.24     (18.9%)    9.9% ( -21% -   52%)
             MedSpanNear       31.53     (14.1%)       35.40     (20.7%)   12.3% ( -19% -   54%)
{noformat}

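It looks like the OR queries lost a bit (OrHighHigh -15.0%, OrNotHighLow -10.5%, OrHighMed -9.6%), although the StdDev on both sides is large and every Pct diff range in the report still spans zero.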
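
To make it easier to read the report, here is a minimal Python sketch that parses a few of its rows, recomputes "Pct diff" from the two QPS columns, and checks whether the printed range spans zero. It assumes "Pct diff" is the relative change of QPS comp against QPS base, and it treats the parenthesized pair simply as the low/high bounds the benchmark printed. The regex and the names REPORT, ROW and parse_rows are illustrative only and are not part of the benchmark tooling.

```python
import re

# A few rows copied verbatim from the report above.
REPORT = """\
              OrHighHigh        9.74     (13.5%)        8.28     (18.8%)  -15.0% ( -41% -   20%)
            OrNotHighLow       23.71     (13.3%)       21.22     (16.1%)  -10.5% ( -35% -   21%)
               OrHighMed       31.79      (7.6%)       28.73     (14.9%)   -9.6% ( -29% -   13%)
                 LowTerm      318.33     (15.4%)      318.45     (12.9%)    0.0% ( -24% -   33%)
             MedSpanNear       31.53     (14.1%)       35.40     (20.7%)   12.3% ( -19% -   54%)
"""

# Hypothetical pattern for one report row:
# task, QPS base, (StdDev), QPS comp, (StdDev), Pct diff, (low - high).
ROW = re.compile(
    r"^\s*(?P<task>\w+)\s+(?P<base>[\d.]+)\s+\(\s*[\d.]+%\)"
    r"\s+(?P<comp>[\d.]+)\s+\(\s*[\d.]+%\)"
    r"\s+(?P<diff>-?[\d.]+)%\s+\(\s*(?P<lo>-?\d+)%\s+-\s+(?P<hi>-?\d+)%\)\s*$"
)


def parse_rows(report):
    """Return one dict per parsed row; lines that do not match are skipped."""
    rows = []
    for line in report.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        base = float(m["base"])
        comp = float(m["comp"])
        lo, hi = int(m["lo"]), int(m["hi"])
        rows.append({
            "task": m["task"],
            "diff": float(m["diff"]),                    # as printed
            "recomputed": 100.0 * (comp - base) / base,  # from rounded QPS values
            "straddles_zero": lo < 0 < hi,               # printed range spans zero
        })
    return rows


if __name__ == "__main__":
    for r in parse_rows(REPORT):
        note = "range spans zero" if r["straddles_zero"] else "range excludes zero"
        print(f"{r['task']:>16}  printed {r['diff']:+6.1f}%  "
              f"recomputed {r['recomputed']:+6.1f}%  {note}")
```

The recomputed value can differ slightly from the printed one because the report shows QPS rounded to two decimals. A row whose range spans zero is suggestive of a change rather than conclusive.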

> clean up postingsreader
> -----------------------
>
>                 Key: LUCENE-5808
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5808
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>         Attachments: LUCENE-5808.patch
>
>
> The current postingsreader is ~1,500 lines of code (mostly duplicated) calling something like 4,000 lines of generated decompression code.
> This is really heavyweight and complicated, and it bloats the Lucene jar. It would be nice to simplify it so we can eventually remove the baggage.


