Posted to issues@lucene.apache.org by "Chris Hegarty (Jira)" <ji...@apache.org> on 2022/04/15 14:01:00 UTC
[jira] [Commented] (LUCENE-10517) Improve performance of SortedSetDV faceting by iterating on class types

    [ https://issues.apache.org/jira/browse/LUCENE-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522838#comment-17522838 ] 

Chris Hegarty commented on LUCENE-10517:
----------------------------------------

On my M1 I get the following luceneutil benchmark results.
$ sw_vers
ProductName:	macOS
ProductVersion:	11.5.2
BuildVersion:	20G95

$ uname -a
Darwin chegar-MBP.local 20.6.0 Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:27 PDT 2021; root:xnu-7195.141.2~5/RELEASE_ARM64_T8101 arm64

$ sysctl -n machdep.cpu.brand_string
Apple M1

$ system_profiler SPHardwareDataType
Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro17,1
      Chip: Apple M1
      Total Number of Cores: 8 (4 performance and 4 efficiency)
      Memory: 16 GB
      System Firmware Version: 6723.140.2
      OS Loader Version: 6723.140.2
      Serial Number (system): FVFG731MQ05P
      Hardware UUID: 1D7BA696-DBDB-5E9C-BD46-5A18758DE699
      Provisioning UDID: 00008103-000A05E001C0801E
      Activation Lock Status: Disabled
{code:java}
                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                       LowPhrase      148.35      (2.1%)      143.66      (2.6%)   -3.2% (  -7% -    1%) 0.000
             MedIntervalsOrdered      197.27      (3.7%)      191.24      (5.7%)   -3.1% ( -12% -    6%) 0.044
            HighIntervalsOrdered       11.55      (2.6%)       11.33      (3.5%)   -1.9% (  -7% -    4%) 0.055
                      AndHighMed      447.74      (2.1%)      441.26      (2.4%)   -1.4% (  -5% -    3%) 0.042
                        HighTerm     2397.60      (4.0%)     2367.10      (2.4%)   -1.3% (  -7% -    5%) 0.223
                         LowTerm     3939.37      (2.7%)     3890.14      (2.3%)   -1.2% (  -6% -    3%) 0.111
                   OrHighNotHigh     1917.21      (2.8%)     1893.94      (3.2%)   -1.2% (  -6% -    4%) 0.198
                      HighPhrase       32.93      (1.9%)       32.55      (1.1%)   -1.2% (  -4% -    1%) 0.022
                        PKLookup      340.11      (4.5%)      336.69      (4.3%)   -1.0% (  -9% -    8%) 0.471
                      TermDTSort      145.39      (4.1%)      144.09      (2.3%)   -0.9% (  -7% -    5%) 0.394
                    HighSpanNear       10.38      (3.7%)       10.32      (1.9%)   -0.6% (  -5% -    5%) 0.531
                     MedSpanNear      206.69      (2.8%)      205.70      (1.5%)   -0.5% (  -4% -    3%) 0.500
                          Fuzzy2       91.75      (2.5%)       91.41      (1.4%)   -0.4% (  -4% -    3%) 0.562
                    OrHighNotMed     1975.22      (3.5%)     1968.91      (2.7%)   -0.3% (  -6% -    6%) 0.744
                       OrHighMed       66.62      (3.9%)       66.45      (4.8%)   -0.3% (  -8% -    8%) 0.850
                 LowSloppyPhrase       62.60      (2.1%)       62.44      (2.5%)   -0.3% (  -4% -    4%) 0.726
                    OrHighNotLow     1876.16      (2.5%)     1871.56      (2.4%)   -0.2% (  -5% -    4%) 0.756
                      OrHighHigh       55.70      (3.9%)       55.64      (4.9%)   -0.1% (  -8% -    9%) 0.940
                          Fuzzy1      100.97      (2.2%)      100.88      (2.1%)   -0.1% (  -4% -    4%) 0.898
             LowIntervalsOrdered       42.24      (0.7%)       42.21      (1.0%)   -0.1% (  -1% -    1%) 0.766
                       MedPhrase      923.85      (1.3%)      923.14      (1.6%)   -0.1% (  -2% -    2%) 0.867
                    OrNotHighMed     1427.45      (2.0%)     1428.11      (2.5%)    0.0% (  -4% -    4%) 0.949
                         Respell       82.74      (2.6%)       82.81      (1.9%)    0.1% (  -4% -    4%) 0.903
                     LowSpanNear      373.63      (2.6%)      373.97      (1.6%)    0.1% (  -4% -    4%) 0.893
           HighTermDayOfYearSort      199.64      (1.7%)      199.83      (2.5%)    0.1% (  -4% -    4%) 0.887
                   OrNotHighHigh     1523.02      (2.2%)     1526.12      (2.0%)    0.2% (  -3% -    4%) 0.759
         AndHighMedDayTaxoFacets      185.23      (0.9%)      185.79      (1.4%)    0.3% (  -1% -    2%) 0.416
                         MedTerm     3016.98      (3.4%)     3026.53      (3.2%)    0.3% (  -6% -    7%) 0.761
                    OrNotHighLow     1867.65      (2.5%)     1876.63      (2.4%)    0.5% (  -4% -    5%) 0.535
                      AndHighLow     1571.61      (3.1%)     1579.86      (2.6%)    0.5% (  -5% -    6%) 0.564
                       OrHighLow     1485.93      (3.7%)     1494.56      (2.5%)    0.6% (  -5% -    7%) 0.559
                     AndHighHigh       80.42      (2.8%)       81.06      (1.7%)    0.8% (  -3% -    5%) 0.273
                HighSloppyPhrase       50.68      (4.0%)       51.14      (4.7%)    0.9% (  -7% -    9%) 0.506
                 MedSloppyPhrase       40.76      (2.6%)       41.13      (3.6%)    0.9% (  -5% -    7%) 0.356
                        Wildcard      123.13      (7.3%)      124.34      (6.5%)    1.0% ( -11% -   15%) 0.654
        AndHighHighDayTaxoFacets       17.77      (2.8%)       17.95      (2.7%)    1.0% (  -4% -    6%) 0.256
            MedTermDayTaxoFacets       46.83      (2.6%)       47.38      (1.8%)    1.2% (  -3% -    5%) 0.097
               HighTermMonthSort      193.35      (1.5%)      195.77      (5.4%)    1.2% (  -5% -    8%) 0.320
                          IntNRQ       69.13     (17.2%)       70.81     (16.2%)    2.4% ( -26% -   43%) 0.646
            HighTermTitleBDVSort      198.10      (1.7%)      203.76      (7.8%)    2.9% (  -6% -   12%) 0.109
                         Prefix3      183.52      (9.3%)      188.79      (7.7%)    2.9% ( -12% -   21%) 0.287
          OrHighMedDayTaxoFacets       14.45     (16.8%)       15.30     (11.1%)    5.9% ( -18% -   40%) 0.191
     BrowseRandomLabelSSDVFacets       16.23      (7.5%)       18.27      (8.8%)   12.6% (  -3% -   31%) 0.000
       BrowseDayOfYearSSDVFacets       28.25     (19.0%)       31.82     (11.8%)   12.6% ( -15% -   53%) 0.011
           BrowseMonthSSDVFacets       21.41      (7.6%)       24.52     (11.7%)   14.5% (  -4% -   36%) 0.000
            BrowseDateSSDVFacets        3.64     (14.6%)        4.18     (15.0%)   14.8% ( -12% -   51%) 0.001
       BrowseDayOfYearTaxoFacets       34.48     (30.7%)       40.34     (25.7%)   17.0% ( -30% -  105%) 0.058
            BrowseDateTaxoFacets       34.36     (30.7%)       40.24     (25.6%)   17.1% ( -29% -  105%) 0.056
           BrowseMonthTaxoFacets       31.97     (35.3%)       40.02     (29.1%)   25.2% ( -28% -  138%) 0.014
     BrowseRandomLabelTaxoFacets       39.74     (46.9%)       51.86     (42.8%)   30.5% ( -40% -  226%) 0.032
{code}

> Improve performance of SortedSetDV faceting by iterating on class types
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-10517
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10517
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 9.1
>            Reporter: Chris Hegarty
>            Priority: Minor
>
> SortedSetDV faceting (and friends) can improve performance within tight loops by using _invokevirtual_ (rather than _invokeinterface_). The C2 JIT compiler can produce slightly more optimal code in this case, and since these loops are very hot, the impact can be significant (on the order of 10-20%).
> The code change amounts to using the `SortedDocValues` or `SortedSetDocValues` class types, rather than the `DocIdSetIterator` interface type, in loops (specifically for the invocation of `nextDoc()`), when the iterator type is known and not wrapped.
> This issue is in some ways similar to, and builds upon, prior optimisations in this area, such as LUCENE-5300.
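
For illustration only, here is a minimal sketch of the pattern the description refers to. It does not use Lucene: `DocIterator`, `ArrayDocValues`, and `FacetLoopSketch` are hypothetical stand-ins invented for this example, and it makes no claim about the actual patch. The two loops do the same work; the only difference is the static type of the receiver of `nextDoc()`, which decides whether javac emits _invokeinterface_ or _invokevirtual_ at the call site. It shows the shape of the change, not its performance effect.

```java
import java.util.Arrays;

// Hypothetical stand-in for an iterator interface (NOT a Lucene type).
interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    int nextDoc();
}

// Hypothetical stand-in for a concrete doc-values class (NOT a Lucene type).
final class ArrayDocValues implements DocIterator {
    private final int[] docs; // doc IDs in increasing order
    private final int[] ords; // one ordinal per doc
    private int pos = -1;

    ArrayDocValues(int[] docs, int[] ords) {
        this.docs = docs;
        this.ords = ords;
    }

    @Override
    public int nextDoc() {
        pos++;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    // Ordinal of the doc the iterator is currently positioned on.
    int ordValue() {
        return ords[pos];
    }
}

final class FacetLoopSketch {
    // Receiver typed as the interface: javac emits invokeinterface for nextDoc().
    static int[] countViaInterface(DocIterator it, ArrayDocValues values, int numOrds) {
        int[] counts = new int[numOrds];
        for (int doc = it.nextDoc(); doc != DocIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
            counts[values.ordValue()]++;
        }
        return counts;
    }

    // Receiver typed as the concrete class: javac emits invokevirtual for nextDoc().
    static int[] countViaClass(ArrayDocValues values, int numOrds) {
        int[] counts = new int[numOrds];
        for (int doc = values.nextDoc(); doc != DocIterator.NO_MORE_DOCS; doc = values.nextDoc()) {
            counts[values.ordValue()]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] docs = {0, 3, 4, 9};
        int[] ords = {1, 0, 1, 2};

        // Unwrapped case: the iterator and the values are the same object.
        ArrayDocValues a = new ArrayDocValues(docs, ords);
        ArrayDocValues b = new ArrayDocValues(docs, ords);

        System.out.println(Arrays.toString(countViaInterface(a, a, 3))); // prints [1, 2, 1]
        System.out.println(Arrays.toString(countViaClass(b, 3)));        // prints [1, 2, 1]
    }
}
```

Running `javap -c FacetLoopSketch` on the compiled class should show the different call instructions in the two methods. Whether that difference matters at run time depends on the JIT and on the type profile at the call site, which is what the luceneutil numbers above are meant to measure.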


