Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/01/04 19:00:34 UTC

[GitHub] [lucene-solr] msokolov edited a comment on pull request #2176: Initial rewrite of MMapDirectory for JDK-16 preview (incubating) PANAMA APIs

msokolov edited a comment on pull request #2176:
URL: https://github.com/apache/lucene-solr/pull/2176#issuecomment-754153605


   I ran some vector search performance tests. This probably isn't the most representative test, but I have been running these lately for other reasons and am all geared up for it, so I thought I would fire them up with these changes. I see a pronounced slowdown: about a 40% increase in latency for these KNN searches with this patch (I ran both conditions with JDK16). Most of the work in these searches is random-access reads of bytes from the index, conversion to `float[]`, and computing dot products.
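   
   To make that workload concrete, here is a minimal sketch of the three steps (random-access byte reads, conversion to `float[]`, dot product). It is not Lucene's code: it uses a plain `java.nio.ByteBuffer` as a stand-in for the index rather than `IndexInput` or the Panama memory segments in this patch, and the class and method names (`VectorReadSketch`, `readVector`, `dotProduct`) are made up for illustration.
   
   ```java
   import java.nio.ByteBuffer;
   
   public class VectorReadSketch {
   
       // Decode `dim` floats starting at byte offset `offset`. Absolute reads
       // are used so the buffer's position is left untouched (random access).
       static float[] readVector(ByteBuffer index, int offset, int dim) {
           float[] vector = new float[dim];
           for (int i = 0; i < dim; i++) {
               vector[i] = index.getFloat(offset + i * Float.BYTES);
           }
           return vector;
       }
   
       // Plain scalar dot product of two equal-length vectors.
       static float dotProduct(float[] a, float[] b) {
           if (a.length != b.length) {
               throw new IllegalArgumentException("vector lengths differ");
           }
           float sum = 0f;
           for (int i = 0; i < a.length; i++) {
               sum += a[i] * b[i];
           }
           return sum;
       }
   
       public static void main(String[] args) {
           int dim = 3;
           // Hypothetical stand-in for an index file: two vectors back to back.
           ByteBuffer index = ByteBuffer.allocate(2 * dim * Float.BYTES);
           float[] stored = {1f, 2f, 3f, 4f, 5f, 6f};
           for (int i = 0; i < stored.length; i++) {
               index.putFloat(i * Float.BYTES, stored[i]);
           }
           float[] first = readVector(index, 0, dim);
           float[] second = readVector(index, dim * Float.BYTES, dim);
           System.out.println(dotProduct(first, second)); // 1*4 + 2*5 + 3*6 = 32.0
       }
   }
   ```
   
   In a search like this the per-float read sits in the innermost loop, so any extra cost per access in the directory implementation would plausibly show up directly in latency.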
   
   I recently discovered that we were (inadvertently) reading and writing the `float[]` vectors these searches are based on as big-endian, so perhaps if (when) we switch that (I have a pending patch), it would work better in conjunction with this change. That said, on master the `IndexInput` level currently just reads bytes onto the heap and only afterwards converts them to `float[]`, so an interaction there doesn't seem likely. I mention it only because it's top of mind, and because I plan to add `DataInput.readFloats()`, which would be needed here too.
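   
   For readers unfamiliar with the byte-order issue, a small sketch of what "reading and writing as big-endian" means for float data. This uses only `java.nio` and is not the pending patch or the planned `DataInput.readFloats()`; `FloatEndianSketch`, `encode`, and `decode` are hypothetical names. The point is that the reader must use the same byte order as the writer, and that a bulk view (`asFloatBuffer()`) can convert a whole vector in one call.
   
   ```java
   import java.nio.ByteBuffer;
   import java.nio.ByteOrder;
   import java.util.Arrays;
   
   public class FloatEndianSketch {
   
       // Serialize floats to bytes using the given byte order.
       static byte[] encode(float[] values, ByteOrder order) {
           ByteBuffer buf = ByteBuffer.allocate(values.length * Float.BYTES).order(order);
           buf.asFloatBuffer().put(values); // the view inherits buf's byte order
           return buf.array();
       }
   
       // Bulk-convert bytes back to floats, interpreting them in the given order.
       static float[] decode(byte[] bytes, ByteOrder order) {
           float[] out = new float[bytes.length / Float.BYTES];
           ByteBuffer.wrap(bytes).order(order).asFloatBuffer().get(out);
           return out;
       }
   
       public static void main(String[] args) {
           float[] original = {1.0f, -2.5f, 3.25f};
           byte[] bigEndianBytes = encode(original, ByteOrder.BIG_ENDIAN);
   
           // Same order on both sides: round-trips to the original values.
           System.out.println(Arrays.toString(decode(bigEndianBytes, ByteOrder.BIG_ENDIAN)));
   
           // Mismatched order: the bytes of each float are reversed, so the
           // decoded values are unrelated to the originals.
           System.out.println(Arrays.toString(decode(bigEndianBytes, ByteOrder.LITTLE_ENDIAN)));
       }
   }
   ```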
   
   I then also ran the luceneutil tests and saw interesting, mixed results:
   
   ```
                       Task QPS baseline     StdDev QPS candidate      StdDev                Pct diff p-value
      BrowseMonthTaxoFacets        2.15      (7.7%)        1.19      (2.1%)  -44.5% ( -50% -  -37%) 0.000
   BrowseDayOfYearTaxoFacets        2.03      (8.1%)        1.16      (2.4%)  -43.0% ( -49% -  -35%) 0.000
       BrowseDateTaxoFacets        2.03      (8.2%)        1.16      (2.3%)  -42.9% ( -49% -  -35%) 0.000
                    Respell       52.33      (2.4%)       46.89      (1.7%)  -10.4% ( -14% -   -6%) 0.000
      BrowseMonthSSDVFacets       12.00     (13.2%)       10.81      (6.9%)  -10.0% ( -26% -   11%) 0.003
                   PKLookup      132.89      (2.3%)      120.09      (1.3%)   -9.6% ( -13% -   -6%) 0.000
                AndHighHigh       24.98      (3.0%)       22.61      (2.1%)   -9.5% ( -14% -   -4%) 0.000
            MedSloppyPhrase       15.66      (3.9%)       14.37      (2.0%)   -8.2% ( -13% -   -2%) 0.000
                LowSpanNear       96.34      (2.3%)       88.72      (1.6%)   -7.9% ( -11% -   -4%) 0.000
           HighSloppyPhrase       10.57      (4.6%)        9.75      (2.7%)   -7.7% ( -14% -    0%) 0.000
                    Prefix3       37.23      (7.2%)       34.53      (6.1%)   -7.2% ( -19% -    6%) 0.001
                MedSpanNear       15.63      (1.5%)       14.52      (2.0%)   -7.1% ( -10% -   -3%) 0.000
            LowSloppyPhrase       15.11      (4.4%)       14.06      (2.5%)   -6.9% ( -13% -    0%) 0.000
                 AndHighMed       72.87      (3.4%)       67.88      (3.0%)   -6.8% ( -12% -    0%) 0.000
                 AndHighLow      366.41      (3.2%)      342.99      (3.6%)   -6.4% ( -12% -    0%) 0.000
                  OrHighMed       69.00      (2.6%)       65.09      (2.7%)   -5.7% ( -10% -    0%) 0.000
                    LowTerm      930.90      (5.7%)      878.74      (4.3%)   -5.6% ( -14% -    4%) 0.000
                     Fuzzy2       46.43      (7.2%)       43.94      (5.4%)   -5.4% ( -16% -    7%) 0.008
                     Fuzzy1       64.31      (6.9%)       60.98      (6.5%)   -5.2% ( -17% -    8%) 0.014
               HighSpanNear       19.07      (1.7%)       18.12      (2.6%)   -5.0% (  -9% -    0%) 0.000
                  MedPhrase       21.45      (2.4%)       20.53      (1.9%)   -4.3% (  -8% -    0%) 0.000
       HighTermTitleBDVSort       73.32     (16.1%)       70.33     (17.0%)   -4.1% ( -32% -   34%) 0.437
                 OrHighHigh       16.82      (1.7%)       16.19      (1.9%)   -3.7% (  -7% -    0%) 0.000
   BrowseDayOfYearSSDVFacets        9.56      (8.7%)        9.24      (6.2%)   -3.4% ( -16% -   12%) 0.159
       HighIntervalsOrdered        9.02      (0.6%)        8.78      (0.7%)   -2.7% (  -4% -   -1%) 0.000
               OrNotHighMed      324.93      (2.7%)      316.20      (3.6%)   -2.7% (  -8% -    3%) 0.007
                   Wildcard      129.01      (2.5%)      126.91      (2.6%)   -1.6% (  -6% -    3%) 0.046
      HighTermDayOfYearSort      113.01     (11.9%)      111.68     (10.6%)   -1.2% ( -21% -   24%) 0.741
               OrNotHighLow      539.19      (2.8%)      534.10      (2.8%)   -0.9% (  -6% -    4%) 0.284
                  OrHighLow      314.01      (5.4%)      311.23      (5.9%)   -0.9% ( -11% -   11%) 0.620
          HighTermMonthSort       96.00      (9.6%)       95.48     (10.1%)   -0.5% ( -18% -   21%) 0.863
                 TermDTSort      100.42     (10.9%)      100.52     (13.4%)    0.1% ( -21% -   27%) 0.980
                  LowPhrase      259.52      (1.9%)      262.37      (2.2%)    1.1% (  -2% -    5%) 0.091
                   HighTerm      872.57      (3.9%)      892.07      (3.4%)    2.2% (  -4% -    9%) 0.052
                    MedTerm     1003.36      (4.8%)     1029.34      (4.8%)    2.6% (  -6% -   12%) 0.086
                 HighPhrase      220.30      (1.9%)      228.63      (3.3%)    3.8% (  -1% -    9%) 0.000
              OrHighNotHigh      372.34      (3.4%)      391.32      (4.2%)    5.1% (  -2% -   13%) 0.000
              OrNotHighHigh      389.61      (2.7%)      415.68      (5.5%)    6.7% (  -1% -   15%) 0.000
               OrHighNotMed      409.35      (1.9%)      445.76      (6.9%)    8.9% (   0% -   18%) 0.000
               OrHighNotLow      513.64      (4.2%)      561.96      (7.9%)    9.4% (  -2% -   22%) 0.000
                     IntNRQ      203.68      (2.3%)      224.37      (4.7%)   10.2% (   3% -   17%) 0.000
   ```
   
   A note on the setup: I had to tinker with luceneutil a bit to get it to use the JDK16 runtime and add the foreign memory access module. Gradle seemed to work OK given the environment variable setup discussed above. I checked the command line to make sure it was in fact using JDK16, and I saw it fail once (before I added the command-line argument that adds the module), so I'm sure it is.
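   
   For anyone reproducing this, a small sanity check along these lines can confirm the runtime from inside the JVM. This is a sketch, not part of luceneutil; `RuntimeCheckSketch` is a made-up name. It assumes the module in question is `jdk.incubator.foreign` (the incubating foreign memory access module in JDK 16, normally enabled with `--add-modules jdk.incubator.foreign`); the exact argument used in the runs above is not shown here.
   
   ```java
   public class RuntimeCheckSketch {
   
       // Major Java version of the running JVM, e.g. 16 for JDK 16.
       static int featureVersion() {
           return Runtime.version().feature();
       }
   
       // True if the named module was resolved into the boot layer, which for
       // an incubator module requires an explicit --add-modules argument.
       static boolean hasModule(String name) {
           return ModuleLayer.boot().findModule(name).isPresent();
       }
   
       public static void main(String[] args) {
           System.out.println("Java feature version: " + featureVersion());
           // Assumed module name for the JDK 16 incubating API.
           System.out.println("jdk.incubator.foreign resolved: "
                   + hasModule("jdk.incubator.foreign"));
       }
   }
   ```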
   

