Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/04/08 20:15:34 UTC
[GitHub] [lucene] Yuti-G opened a new pull request, #806: LUCENE-10488: Optimize Facets#getTopDims in FloatTaxonomyFacets
Yuti-G opened a new pull request, #806:
URL: https://github.com/apache/lucene/pull/806
# Description
This change overrides and optimizes the default implementation of `getTopDims` in `FloatTaxonomyFacets`, which is extended by `TaxonomyFacetFloatAssociations`.
# Solution
Override `getTopDims` and refactor `getTopChildren` in `FloatTaxonomyFacets` to obtain dimCount (the aggregated value for a dim) more efficiently. For a dim that is hierarchical, or that is multiValued && requireDimCount, first check whether dimCount was already populated at indexing time; only otherwise aggregate dimCount by iterating over the dim's child ordinals.
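As a rough illustration of the check described above, here is a minimal Java sketch. It is not the code from this PR: the class `DimValueSketch`, the method `dimValue`, and the flat `values`/`children`/`siblings` arrays are hypothetical stand-ins for the taxonomy structures. It also assumes the condition reads as `hierarchical || (multiValued && requireDimCount)` and that aggregation is a plain sum.

```java
/** Hypothetical, simplified illustration; not the actual FloatTaxonomyFacets code. */
public class DimValueSketch {
  // Assumed sentinel marking "no child" / "no further sibling".
  static final int INVALID_ORDINAL = -1;

  /**
   * Returns the aggregated value for a dimension ordinal.
   * values[ord] is the per-ordinal aggregate, children[ord] is one child of ord,
   * and siblings[ord] is the next sibling of ord (assumed layout).
   */
  static float dimValue(
      int dimOrd,
      boolean hierarchical,
      boolean multiValued,
      boolean requireDimCount,
      float[] values,
      int[] children,
      int[] siblings) {
    // Fast path: the value for the dim ordinal is assumed to be already available.
    if (hierarchical || (multiValued && requireDimCount)) {
      return values[dimOrd];
    }
    // Slow path: sum the values of the dim's direct children.
    float sum = 0f;
    for (int ord = children[dimOrd]; ord != INVALID_ORDINAL; ord = siblings[ord]) {
      sum += values[ord];
    }
    return sum;
  }

  public static void main(String[] args) {
    // Ordinal 0 is the root, ordinal 1 is a dim, ordinals 2..4 are its children.
    float[] values = {0f, 9f, 1.5f, 2.5f, 2f};
    int[] children = {1, 4, -1, -1, -1};
    int[] siblings = {-1, -1, -1, 2, 3};
    // Fast path reads values[1]; slow path walks ordinals 4, 3, 2.
    System.out.println(dimValue(1, true, false, false, values, children, siblings));
    System.out.println(dimValue(1, false, false, false, values, children, siblings));
  }
}
```

The point of the fast path is that it avoids the per-child loop whenever the dim-level value can be read directly.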
# Tests
Added new tests for the overridden implementations of `getTopDims` and `getAllDims` in `TestTaxonomyFacetAssociations`.
# Checklist
Please review the following and check all that apply:
- [X] I have reviewed the guidelines for [How to Contribute](https://github.com/apache/lucene/blob/main/CONTRIBUTING.md) and my code conforms to the standards described there to the best of my ability.
- [X] I have created a Jira issue and added the issue ID to my pull request title.
- [X] I have given Lucene maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [X] I have developed this patch against the `main` branch.
- [X] I have run `./gradlew check`.
- [X] I have added tests for my changes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org
[GitHub] [lucene] Yuti-G commented on pull request #806: LUCENE-10488: Optimize Facets#getTopDims in FloatTaxonomyFacets
Posted by GitBox <gi...@apache.org>.
Yuti-G commented on PR #806:
URL: https://github.com/apache/lucene/pull/806#issuecomment-1093475380
I just ran the benchmark and found no regression.
Task                          QPS baseline   StdDev  QPS candidate   StdDev  Pct diff                 p-value
BrowseDateSSDVFacets                679.04  (19.1%)         628.59  (25.5%)     -7.4%  ( -43% - 46%)    0.297
BrowseRandomLabelSSDVFacets         965.92  (18.4%)         894.89  (16.3%)     -7.4%  ( -35% - 33%)    0.181
LowIntervalsOrdered                1499.67  (15.6%)        1399.39  (16.2%)     -6.7%  ( -33% - 29%)    0.183
LowTerm                            3877.68  (17.1%)        3660.19  (16.7%)     -5.6%  ( -33% - 34%)    0.294
BrowseDayOfYearSSDVFacets          2023.80  (16.5%)        1910.72  (18.4%)     -5.6%  ( -34% - 35%)    0.313
Respell                             369.84  (15.0%)         350.55  (15.4%)     -5.2%  ( -30% - 29%)    0.278
AndHighHigh                        1396.47  (14.0%)        1329.88  (15.9%)     -4.8%  ( -30% - 29%)    0.315
BrowseRandomLabelTaxoFacets        1444.68  (20.4%)        1377.72  (19.4%)     -4.6%  ( -36% - 44%)    0.461
LowSloppyPhrase                    1277.52  (14.5%)        1218.87  (15.4%)     -4.6%  ( -30% - 29%)    0.332
MedPhrase                          1326.72  (15.4%)        1266.61  (16.6%)     -4.5%  ( -31% - 32%)    0.371
AndHighLow                         1667.96  (17.5%)        1595.41  (16.6%)     -4.3%  ( -32% - 36%)    0.421
Wildcard                            503.72  (15.4%)         482.57  (14.7%)     -4.2%  ( -29% - 30%)    0.378
AndHighMed                         1165.87  (16.9%)        1118.54  (17.0%)     -4.1%  ( -32% - 35%)    0.449
HighSloppyPhrase                    926.13  (16.2%)         889.79  (16.3%)     -3.9%  ( -31% - 34%)    0.445
MedIntervalsOrdered                1708.43  (14.8%)        1642.35  (17.0%)     -3.9%  ( -31% - 32%)    0.442
HighSpanNear                        570.65  (16.7%)         549.92  (16.5%)     -3.6%  ( -31% - 35%)    0.489
Prefix3                            1220.12  (15.3%)        1176.40  (17.0%)     -3.6%  ( -31% - 33%)    0.483
OrHighMed                          1133.47  (16.2%)        1092.99  (15.6%)     -3.6%  ( -30% - 33%)    0.478
Fuzzy2                               83.72  (15.8%)          80.79  (17.3%)     -3.5%  ( -31% - 35%)    0.504
MedTerm                            3257.50  (16.4%)        3149.18  (17.0%)     -3.3%  ( -31% - 36%)    0.530
HighPhrase                          510.67  (16.2%)         494.20  (15.3%)     -3.2%  ( -29% - 33%)    0.518
BrowseDateTaxoFacets               2429.96  (16.8%)        2353.26  (21.7%)     -3.2%  ( -35% - 42%)    0.607
LowSpanNear                        1087.70  (15.2%)        1055.01  (17.4%)     -3.0%  ( -30% - 34%)    0.561
BrowseDayOfYearTaxoFacets          2243.38  (17.6%)        2185.69  (19.7%)     -2.6%  ( -33% - 42%)    0.663
HighTermDayOfYearSort              2622.10  (16.6%)        2558.03  (16.9%)     -2.4%  ( -30% - 37%)    0.644
MedSpanNear                         857.24  (16.0%)         836.66  (18.2%)     -2.4%  ( -31% - 37%)    0.658
HighIntervalsOrdered                655.43  (14.9%)         640.01  (16.6%)     -2.4%  ( -29% - 34%)    0.637
BrowseMonthSSDVFacets              2029.76  (16.5%)        1982.42  (20.6%)     -2.3%  ( -33% - 41%)    0.693
BrowseMonthTaxoFacets              2289.21  (18.7%)        2237.91  (19.1%)     -2.2%  ( -33% - 43%)    0.707
MedSloppyPhrase                    1464.38  (16.6%)        1438.57  (17.4%)     -1.8%  ( -30% - 38%)    0.743
Fuzzy1                              278.00  (18.4%)         273.52  (18.9%)     -1.6%  ( -32% - 43%)    0.785
HighTermMonthSort                  1994.56  (16.7%)        1966.87  (17.4%)     -1.4%  ( -30% - 39%)    0.797
HighTerm                           2132.65  (18.5%)        2103.69  (15.3%)     -1.4%  ( -29% - 39%)    0.800
IntNRQ                             1763.52  (18.6%)        1741.35  (17.4%)     -1.3%  ( -31% - 42%)    0.825
OrHighLow                          1241.11  (15.2%)        1225.73  (17.0%)     -1.2%  ( -28% - 36%)    0.807
PKLookup                            115.16  (18.3%)         114.43  (24.2%)     -0.6%  ( -36% - 51%)    0.925
LowPhrase                          1336.77  (17.3%)        1336.88  (14.9%)      0.0%  ( -27% - 39%)    0.999
OrHighHigh                          891.20  (15.8%)         896.93  (14.3%)      0.6%  ( -25% - 36%)    0.893
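For readers unfamiliar with this kind of table, the "Pct diff" column appears to be the relative change in QPS from baseline to candidate. The Java sketch below assumes that formula; it is consistent with the rows above but is not taken from the benchmark tool itself, and `PctDiffSketch` and `pctDiff` are hypothetical names.

```java
import java.util.Locale;

/** Hypothetical helper; assumes Pct diff = 100 * (candidate - baseline) / baseline. */
public class PctDiffSketch {
  static double pctDiff(double baselineQps, double candidateQps) {
    return 100.0 * (candidateQps - baselineQps) / baselineQps;
  }

  public static void main(String[] args) {
    // BrowseDateSSDVFacets row: baseline 679.04 QPS, candidate 628.59 QPS.
    // Locale.ROOT keeps the decimal separator a dot regardless of system locale.
    System.out.println(String.format(Locale.ROOT, "%.1f%%", pctDiff(679.04, 628.59)));
  }
}
```

With per-task standard deviations of roughly 14% to 25%, single-digit differences like these fall well within run-to-run noise, which matches the large p-values reported.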
[GitHub] [lucene] gsmiller merged pull request #806: LUCENE-10488: Optimize Facets#getTopDims in FloatTaxonomyFacets
Posted by GitBox <gi...@apache.org>.
gsmiller merged PR #806:
URL: https://github.com/apache/lucene/pull/806
[GitHub] [lucene] Yuti-G commented on pull request #806: LUCENE-10488: Optimize Facets#getTopDims in FloatTaxonomyFacets
Posted by GitBox <gi...@apache.org>.
Yuti-G commented on PR #806:
URL: https://github.com/apache/lucene/pull/806#issuecomment-1123167787
Thanks! Please see the latest commit for the update.