Posted to issues@lucene.apache.org by "Chris Hegarty (Jira)" <ji...@apache.org> on 2022/04/15 14:01:00 UTC
[jira] [Commented] (LUCENE-10517) Improve performance of SortedSetDV faceting by iterating on class types
[ https://issues.apache.org/jira/browse/LUCENE-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522838#comment-17522838 ]
Chris Hegarty commented on LUCENE-10517:
----------------------------------------
On my M1 I get the following luceneutil benchmark results.
$ sw_vers
ProductName: macOS
ProductVersion: 11.5.2
BuildVersion: 20G95
$ uname -a
Darwin chegar-MBP.local 20.6.0 Darwin Kernel Version 20.6.0: Wed Jun 23 00:26:27 PDT 2021; root:xnu-7195.141.2~5/RELEASE_ARM64_T8101 arm64
$ sysctl -n machdep.cpu.brand_string
Apple M1
$ system_profiler SPHardwareDataType
Hardware:
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro17,1
Chip: Apple M1
Total Number of Cores: 8 (4 performance and 4 efficiency)
Memory: 16 GB
System Firmware Version: 6723.140.2
OS Loader Version: 6723.140.2
Serial Number (system): FVFG731MQ05P
Hardware UUID: 1D7BA696-DBDB-5E9C-BD46-5A18758DE699
Provisioning UDID: 00008103-000A05E001C0801E
Activation Lock Status: Disabled
{code:java}
Task  QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff  p-value
LowPhrase 148.35 (2.1%) 143.66 (2.6%) -3.2% ( -7% - 1%) 0.000
MedIntervalsOrdered 197.27 (3.7%) 191.24 (5.7%) -3.1% ( -12% - 6%) 0.044
HighIntervalsOrdered 11.55 (2.6%) 11.33 (3.5%) -1.9% ( -7% - 4%) 0.055
AndHighMed 447.74 (2.1%) 441.26 (2.4%) -1.4% ( -5% - 3%) 0.042
HighTerm 2397.60 (4.0%) 2367.10 (2.4%) -1.3% ( -7% - 5%) 0.223
LowTerm 3939.37 (2.7%) 3890.14 (2.3%) -1.2% ( -6% - 3%) 0.111
OrHighNotHigh 1917.21 (2.8%) 1893.94 (3.2%) -1.2% ( -6% - 4%) 0.198
HighPhrase 32.93 (1.9%) 32.55 (1.1%) -1.2% ( -4% - 1%) 0.022
PKLookup 340.11 (4.5%) 336.69 (4.3%) -1.0% ( -9% - 8%) 0.471
TermDTSort 145.39 (4.1%) 144.09 (2.3%) -0.9% ( -7% - 5%) 0.394
HighSpanNear 10.38 (3.7%) 10.32 (1.9%) -0.6% ( -5% - 5%) 0.531
MedSpanNear 206.69 (2.8%) 205.70 (1.5%) -0.5% ( -4% - 3%) 0.500
Fuzzy2 91.75 (2.5%) 91.41 (1.4%) -0.4% ( -4% - 3%) 0.562
OrHighNotMed 1975.22 (3.5%) 1968.91 (2.7%) -0.3% ( -6% - 6%) 0.744
OrHighMed 66.62 (3.9%) 66.45 (4.8%) -0.3% ( -8% - 8%) 0.850
LowSloppyPhrase 62.60 (2.1%) 62.44 (2.5%) -0.3% ( -4% - 4%) 0.726
OrHighNotLow 1876.16 (2.5%) 1871.56 (2.4%) -0.2% ( -5% - 4%) 0.756
OrHighHigh 55.70 (3.9%) 55.64 (4.9%) -0.1% ( -8% - 9%) 0.940
Fuzzy1 100.97 (2.2%) 100.88 (2.1%) -0.1% ( -4% - 4%) 0.898
LowIntervalsOrdered 42.24 (0.7%) 42.21 (1.0%) -0.1% ( -1% - 1%) 0.766
MedPhrase 923.85 (1.3%) 923.14 (1.6%) -0.1% ( -2% - 2%) 0.867
OrNotHighMed 1427.45 (2.0%) 1428.11 (2.5%) 0.0% ( -4% - 4%) 0.949
Respell 82.74 (2.6%) 82.81 (1.9%) 0.1% ( -4% - 4%) 0.903
LowSpanNear 373.63 (2.6%) 373.97 (1.6%) 0.1% ( -4% - 4%) 0.893
HighTermDayOfYearSort 199.64 (1.7%) 199.83 (2.5%) 0.1% ( -4% - 4%) 0.887
OrNotHighHigh 1523.02 (2.2%) 1526.12 (2.0%) 0.2% ( -3% - 4%) 0.759
AndHighMedDayTaxoFacets 185.23 (0.9%) 185.79 (1.4%) 0.3% ( -1% - 2%) 0.416
MedTerm 3016.98 (3.4%) 3026.53 (3.2%) 0.3% ( -6% - 7%) 0.761
OrNotHighLow 1867.65 (2.5%) 1876.63 (2.4%) 0.5% ( -4% - 5%) 0.535
AndHighLow 1571.61 (3.1%) 1579.86 (2.6%) 0.5% ( -5% - 6%) 0.564
OrHighLow 1485.93 (3.7%) 1494.56 (2.5%) 0.6% ( -5% - 7%) 0.559
AndHighHigh 80.42 (2.8%) 81.06 (1.7%) 0.8% ( -3% - 5%) 0.273
HighSloppyPhrase 50.68 (4.0%) 51.14 (4.7%) 0.9% ( -7% - 9%) 0.506
MedSloppyPhrase 40.76 (2.6%) 41.13 (3.6%) 0.9% ( -5% - 7%) 0.356
Wildcard 123.13 (7.3%) 124.34 (6.5%) 1.0% ( -11% - 15%) 0.654
AndHighHighDayTaxoFacets 17.77 (2.8%) 17.95 (2.7%) 1.0% ( -4% - 6%) 0.256
MedTermDayTaxoFacets 46.83 (2.6%) 47.38 (1.8%) 1.2% ( -3% - 5%) 0.097
HighTermMonthSort 193.35 (1.5%) 195.77 (5.4%) 1.2% ( -5% - 8%) 0.320
IntNRQ 69.13 (17.2%) 70.81 (16.2%) 2.4% ( -26% - 43%) 0.646
HighTermTitleBDVSort 198.10 (1.7%) 203.76 (7.8%) 2.9% ( -6% - 12%) 0.109
Prefix3 183.52 (9.3%) 188.79 (7.7%) 2.9% ( -12% - 21%) 0.287
OrHighMedDayTaxoFacets 14.45 (16.8%) 15.30 (11.1%) 5.9% ( -18% - 40%) 0.191
BrowseRandomLabelSSDVFacets 16.23 (7.5%) 18.27 (8.8%) 12.6% ( -3% - 31%) 0.000
BrowseDayOfYearSSDVFacets 28.25 (19.0%) 31.82 (11.8%) 12.6% ( -15% - 53%) 0.011
BrowseMonthSSDVFacets 21.41 (7.6%) 24.52 (11.7%) 14.5% ( -4% - 36%) 0.000
BrowseDateSSDVFacets 3.64 (14.6%) 4.18 (15.0%) 14.8% ( -12% - 51%) 0.001
BrowseDayOfYearTaxoFacets 34.48 (30.7%) 40.34 (25.7%) 17.0% ( -30% - 105%) 0.058
BrowseDateTaxoFacets 34.36 (30.7%) 40.24 (25.6%) 17.1% ( -29% - 105%) 0.056
BrowseMonthTaxoFacets 31.97 (35.3%) 40.02 (29.1%) 25.2% ( -28% - 138%) 0.014
BrowseRandomLabelTaxoFacets 39.74 (46.9%) 51.86 (42.8%) 30.5% ( -40% - 226%) 0.032
{code}
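For reading the table, the "Pct diff" column appears consistent with the relative change of the modified QPS against the baseline QPS. The sketch below is an illustration only: the class and method names are made up here, and luceneutil may compute the figure from unrounded values.

```java
import java.util.Locale;

public class PctDiff {
    // Relative change of the modified run against the baseline, in percent.
    // Assumption: this mirrors how the "Pct diff" column relates to the QPS columns.
    static double pctDiff(double baselineQps, double modifiedQps) {
        return 100.0 * (modifiedQps - baselineQps) / baselineQps;
    }

    public static void main(String[] args) {
        // BrowseMonthSSDVFacets row: baseline 21.41 QPS, modified 24.52 QPS.
        System.out.println(String.format(Locale.ROOT, "%.1f%%", pctDiff(21.41, 24.52)));
        // prints "14.5%"
    }
}
```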
> Improve performance of SortedSetDV faceting by iterating on class types
> -----------------------------------------------------------------------
>
> Key: LUCENE-10517
> URL: https://issues.apache.org/jira/browse/LUCENE-10517
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/index
> Affects Versions: 9.1
> Reporter: Chris Hegarty
> Priority: Minor
>
> SortedSetDV faceting (and friends) can improve performance within tight loops by using _invokevirtual_ (rather than _invokeinterface_). The C2 JIT compiler can produce slightly better code in this case, and since these loops are very hot, the impact can be significant (on the order of 10-20%).
> The code change amounts to using the `SortedDocValues` or `SortedSetDocValues` class types, rather than the `DocIdSetIterator` interface type, in loops (specifically for invocation of `nextDoc()`) when the iterator type is known and not wrapped.
> This issue is in some ways similar to, and builds upon, prior optimisations in this area, such as LUCENE-5300.
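To illustrate the distinction the description draws, here is a minimal, self-contained sketch. It does not use Lucene: `DocIterator`, `ArrayDocValues`, and `NO_MORE_DOCS` are hypothetical stand-ins invented for this example, and it is not the actual patch. The two loops are identical except for the static type of the receiver, which is what decides whether javac emits _invokeinterface_ or _invokevirtual_ for the `nextDoc()` call.

```java
public class CallSiteSketch {
    // Hypothetical sentinel marking iterator exhaustion.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Hypothetical stand-in for an iterator declared as an interface.
    interface DocIterator {
        int nextDoc();
    }

    // Hypothetical stand-in for a concrete doc-values class.
    static final class ArrayDocValues implements DocIterator {
        private final int[] docs;
        private int pos = -1;

        ArrayDocValues(int[] docs) {
            this.docs = docs;
        }

        @Override
        public int nextDoc() {
            pos++;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }
    }

    // Receiver is typed as the interface: the call compiles to invokeinterface.
    static int countViaInterface(DocIterator it) {
        int count = 0;
        while (it.nextDoc() != NO_MORE_DOCS) {
            count++;
        }
        return count;
    }

    // Receiver is typed as the class: the same call compiles to invokevirtual.
    static int countViaClass(ArrayDocValues values) {
        int count = 0;
        while (values.nextDoc() != NO_MORE_DOCS) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] docs = {1, 4, 7, 9};
        System.out.println(countViaInterface(new ArrayDocValues(docs)));
        System.out.println(countViaClass(new ArrayDocValues(docs)));
    }
}
```

Both loops return the same count; only the emitted call instruction differs, which can be checked by compiling and running `javap -c CallSiteSketch`. Whether the JIT then produces measurably better code depends on the workload, so any gain should be confirmed with a benchmark such as the one above.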
--
This message was sent by Atlassian Jira
(v8.20.1#820001)