Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/08/19 01:53:41 UTC

[GitHub] [lucene] gautamworah96 commented on pull request #179: LUCENE-9476: Add getBulkPath API to DirectoryTaxonomyReader

gautamworah96 commented on pull request #179:
URL: https://github.com/apache/lucene/pull/179#issuecomment-901547047


   Hey @mikemccand, my apologies for leaving this PR hanging! The unexpected benchmark results threw me off.
   
   I've updated the PR to fix the bug you previously discovered and to bring it up to date with mainline.
   
   I re-ran the benchmarks after fixing the bug where `leafReaderDocBase` was not added to `leafReaderMaxDoc`:
   
                           Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                      OrHighLow      394.22      (4.1%)      390.60      (4.7%)   -0.9% (  -9% -    8%) 0.508
                  OrHighNotHigh      393.24      (5.0%)      393.11      (4.4%)   -0.0% (  -9% -    9%) 0.982
           HighIntervalsOrdered        2.23      (1.3%)        2.23      (1.2%)    0.1% (  -2% -    2%) 0.768
                     HighPhrase      169.85      (4.9%)      170.09      (5.7%)    0.1% (  -9% -   11%) 0.932
           BrowseDateTaxoFacets        0.57      (1.6%)        0.57      (1.6%)    0.2% (  -2% -    3%) 0.684
                   HighSpanNear        3.57      (2.2%)        3.58      (2.1%)    0.2% (  -3% -    4%) 0.711
                    LowSpanNear       11.32      (1.8%)       11.35      (1.4%)    0.3% (  -2% -    3%) 0.604
                   OrNotHighMed      398.53      (6.1%)      399.62      (7.5%)    0.3% ( -12% -   14%) 0.900
                    MedSpanNear       11.37      (1.8%)       11.41      (1.2%)    0.3% (  -2% -    3%) 0.522
                        LowTerm      949.41      (5.7%)      952.69      (5.2%)    0.3% (  -9% -   11%) 0.841
                     OrHighHigh       27.45      (2.5%)       27.55      (2.1%)    0.4% (  -4% -    5%) 0.615
                         IntNRQ       19.98     (30.1%)       20.06     (29.6%)    0.4% ( -45% -   85%) 0.967
                        Respell       24.48      (2.4%)       24.58      (2.5%)    0.4% (  -4% -    5%) 0.575
                LowSloppyPhrase       19.15      (2.6%)       19.23      (2.3%)    0.4% (  -4% -    5%) 0.569
                      OrHighMed       23.02      (2.9%)       23.14      (1.8%)    0.5% (  -4% -    5%) 0.482
       BrowseDayOfYearTaxoFacets     6311.77      (3.7%)     6349.39      (4.2%)    0.6% (  -7% -    8%) 0.634
                         Fuzzy1       42.80      (3.9%)       43.06      (5.0%)    0.6% (  -7% -    9%) 0.666
                   OrHighNotMed      446.39      (5.5%)      449.32      (5.6%)    0.7% (  -9% -   12%) 0.707
               HighSloppyPhrase        3.57      (4.4%)        3.59      (3.8%)    0.7% (  -7% -    9%) 0.610
                   OrNotHighLow      418.72      (3.6%)      421.68      (3.6%)    0.7% (  -6% -    8%) 0.535
                      LowPhrase       12.47      (2.8%)       12.56      (2.8%)    0.8% (  -4% -    6%) 0.391
                   OrHighNotLow      494.57      (4.9%)      498.46      (5.4%)    0.8% (  -9% -   11%) 0.629
          HighTermDayOfYearSort        7.07     (12.8%)        7.13     (10.2%)    0.9% ( -19% -   27%) 0.799
                MedSloppyPhrase       21.85      (2.8%)       22.06      (2.5%)    0.9% (  -4% -    6%) 0.266
                         Fuzzy2       35.04      (3.6%)       35.43      (3.7%)    1.1% (  -5% -    8%) 0.330
           HighTermTitleBDVSort       39.93      (8.7%)       40.40      (6.9%)    1.2% ( -13% -   18%) 0.638
                       Wildcard       38.16      (4.4%)       38.60      (3.9%)    1.2% (  -6% -    9%) 0.370
                     AndHighMed       47.87      (2.7%)       48.45      (3.6%)    1.2% (  -4% -    7%) 0.224
                  OrNotHighHigh      367.90      (4.8%)      372.56      (3.9%)    1.3% (  -7% -   10%) 0.362
                    AndHighHigh       20.63      (3.2%)       20.94      (4.6%)    1.5% (  -6% -    9%) 0.221
              HighTermMonthSort       25.15     (12.1%)       25.54     (13.8%)    1.6% ( -21% -   31%) 0.703
                      MedPhrase       54.64      (6.1%)       55.54      (6.9%)    1.6% ( -10% -   15%) 0.424
                     TermDTSort       16.01     (14.6%)       16.28     (16.7%)    1.7% ( -25% -   38%) 0.737
          BrowseMonthSSDVFacets        2.62      (0.9%)        2.66      (2.1%)    1.7% (  -1% -    4%) 0.001
                     AndHighLow      252.16      (3.3%)      256.71      (3.3%)    1.8% (  -4% -    8%) 0.086
                       HighTerm      998.70      (6.8%)     1020.66      (5.8%)    2.2% (  -9% -   15%) 0.271
          BrowseMonthTaxoFacets     6354.02      (5.9%)     6495.78      (5.4%)    2.2% (  -8% -   14%) 0.211
                       PKLookup      113.56      (4.2%)      116.24      (4.4%)    2.4% (  -6% -   11%) 0.084
       BrowseDayOfYearSSDVFacets        2.46      (0.5%)        2.52      (2.9%)    2.5% (   0% -    5%) 0.000
                        MedTerm      815.18      (4.9%)      837.70      (6.0%)    2.8% (  -7% -   14%) 0.109
                        Prefix3      144.90     (10.0%)      149.12     (11.5%)    2.9% ( -16% -   27%) 0.392
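
   For reference when reading the table, the Pct diff column is consistent with the relative change in QPS between the two runs, `(modified - baseline) / baseline`. A minimal Java sketch that rechecks two rows (the class and method names are illustrative and not part of the benchmark tooling):

```java
import java.util.Locale;

class PctDiff {
    // Relative change of the modified run's QPS versus the baseline QPS, in percent.
    static double pctDiff(double baselineQps, double modifiedQps) {
        return 100.0 * (modifiedQps - baselineQps) / baselineQps;
    }

    public static void main(String[] args) {
        // OrHighLow row: 394.22 -> 390.60
        System.out.println(String.format(Locale.ROOT, "OrHighLow: %.1f%%", pctDiff(394.22, 390.60)));
        // prints "OrHighLow: -0.9%"

        // BrowseMonthTaxoFacets row: 6354.02 -> 6495.78
        System.out.println(String.format(Locale.ROOT, "BrowseMonthTaxoFacets: %.1f%%", pctDiff(6354.02, 6495.78)));
        // prints "BrowseMonthTaxoFacets: 2.2%"
    }
}
```

   The printed values match the -0.9% and 2.2% shown for those rows. Rows with very small QPS values can differ in the last digit because the table rounds QPS to two decimals.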
   
   The gain in taxonomy-based facets is about 2% (see BrowseMonthTaxoFacets). I ran the benchmark multiple times locally and saw a gain of around 2% in those runs as well. SSDV facets still show a gain of roughly 2-3%, which is unexplained.
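
   To illustrate the kind of bug fixed above: when walking sorted global doc IDs across index leaves, the exclusive upper bound of a leaf in global doc ID space is its doc base plus its `maxDoc`, not `maxDoc` alone. The sketch below is a simplified, Lucene-free illustration of that bound check, not the code from this PR; `resolve`, `leafSizes`, and the sample values are hypothetical.

```java
import java.util.Arrays;

class LeafLookupSketch {
    /**
     * Maps ascending global doc IDs to {leafIndex, localDocId} pairs.
     * leafSizes[i] stands in for the i-th leaf's maxDoc (assumption for this sketch).
     */
    static int[][] resolve(int[] leafSizes, int[] sortedGlobalDocIds) {
        int[][] result = new int[sortedGlobalDocIds.length][];
        int leaf = 0;
        int docBase = 0; // global doc ID of the current leaf's first document
        for (int i = 0; i < sortedGlobalDocIds.length; i++) {
            int doc = sortedGlobalDocIds[i];
            // The leaf's exclusive end is docBase + size. Comparing against
            // leafSizes[leaf] alone is only correct for the first leaf.
            while (leaf < leafSizes.length && doc >= docBase + leafSizes[leaf]) {
                docBase += leafSizes[leaf];
                leaf++;
            }
            if (doc < docBase) {
                throw new IllegalArgumentException("doc IDs must be non-negative and ascending: " + doc);
            }
            if (leaf == leafSizes.length) {
                throw new IllegalArgumentException("doc ID out of range: " + doc);
            }
            result[i] = new int[] {leaf, doc - docBase};
        }
        return result;
    }

    public static void main(String[] args) {
        int[] leafSizes = {3, 4, 2};   // three leaves holding 3, 4, and 2 docs
        int[] docs = {1, 3, 6, 8};     // ascending global doc IDs
        System.out.println(Arrays.deepToString(resolve(leafSizes, docs)));
        // prints [[0, 1], [1, 0], [1, 3], [2, 1]]
    }
}
```

   With the bound taken as `leafSizes[leaf]` only, doc 6 would be treated as past the end of the second leaf (6 >= 4) even though that leaf covers global IDs 3 through 6.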


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


