Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/04/08 20:15:34 UTC

[GitHub] [lucene] Yuti-G opened a new pull request, #806: LUCENE-10488: Optimize Facets#getTopDims in FloatTaxonomyFacets

Yuti-G opened a new pull request, #806:
URL: https://github.com/apache/lucene/pull/806

   # Description
   
   This change overrides and optimizes the default implementation of getTopDims in FloatTaxonomyFacets, which is extended by TaxonomyFacetFloatAssociations.
   
   # Solution
   Override getTopDims and refactor the getTopChildren function in FloatTaxonomyFacets to get dimCount (the aggregated value of a dim) more efficiently. For a dim that is hierarchical, or that is multiValued && requireDimCount, first check whether dimCount was already populated at indexing time; only otherwise aggregate dimCount by iterating over the dim's child ordinals.
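
The check described above can be sketched roughly as follows. This is a simplified, self-contained illustration and not the code from the pull request: the class `DimValueSketch`, the method `dimValue`, and the flat `values`/`children`/`siblings` arrays are hypothetical stand-ins, and the condition is read as `hierarchical || (multiValued && requireDimCount)`, which is an assumption about the intended precedence.

```java
/** Hypothetical, simplified illustration; not the actual Lucene implementation. */
public class DimValueSketch {
  static final int INVALID_ORDINAL = -1;

  /**
   * Returns the aggregated value for the dimension at dimOrd.
   *
   * values[ord]   aggregated float value stored for each ordinal
   * children[ord] first child of ord, or INVALID_ORDINAL
   * siblings[ord] next sibling of ord, or INVALID_ORDINAL
   */
  static float dimValue(
      float[] values,
      int[] children,
      int[] siblings,
      int dimOrd,
      boolean hierarchical,
      boolean multiValued,
      boolean requireDimCount) {
    // Fast path: the dimension's own ordinal was populated at indexing time,
    // so its aggregated value can be read directly.
    if (hierarchical || (multiValued && requireDimCount)) {
      return values[dimOrd];
    }
    // Slow path: aggregate by walking the dimension's direct children.
    float sum = 0f;
    for (int ord = children[dimOrd]; ord != INVALID_ORDINAL; ord = siblings[ord]) {
      sum += values[ord];
    }
    return sum;
  }

  public static void main(String[] args) {
    // Ordinal 0 is the root, 1 is a dimension, 2 and 3 are its children.
    int[] children = {1, 2, INVALID_ORDINAL, INVALID_ORDINAL};
    int[] siblings = {INVALID_ORDINAL, INVALID_ORDINAL, 3, INVALID_ORDINAL};

    // Dimension value not populated at indexing time: sum the children.
    float[] notIndexed = {0f, 0f, 1.5f, 2.5f};
    System.out.println(dimValue(notIndexed, children, siblings, 1, false, false, false));

    // Dimension value populated at indexing time: read it directly.
    float[] indexed = {0f, 4.0f, 1.5f, 2.5f};
    System.out.println(dimValue(indexed, children, siblings, 1, false, true, true));
  }
}
```

In this sketch the fast path skips the loop over child ordinals entirely, which is where the saving would come from when a dimension has many children.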
   
   # Tests
   
   Added new tests for the overridden implementations of getTopDims and getAllDims in TestTaxonomyFacetAssociations.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [X] I have reviewed the guidelines for [How to Contribute](https://github.com/apache/lucene/blob/main/CONTRIBUTING.md) and my code conforms to the standards described there to the best of my ability.
   - [X] I have created a Jira issue and added the issue ID to my pull request title.
   - [X] I have given Lucene maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [X] I have developed this patch against the `main` branch.
   - [X] I have run `./gradlew check`.
   - [X] I have added tests for my changes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] Yuti-G commented on pull request #806: LUCENE-10488: Optimize Facets#getTopDims in FloatTaxonomyFacets

Posted by GitBox <gi...@apache.org>.
Yuti-G commented on PR #806:
URL: https://github.com/apache/lucene/pull/806#issuecomment-1093475380

   Just ran the benchmark and found no regression.
    
                             Task  QPS baseline      StdDev QPS candidate    StdDev                Pct diff p-value
               BrowseDateSSDVFacets      679.04     (19.1%)      628.59     (25.5%)   -7.4% ( -43% -   46%) 0.297
        BrowseRandomLabelSSDVFacets      965.92     (18.4%)      894.89     (16.3%)   -7.4% ( -35% -   33%) 0.181
                LowIntervalsOrdered     1499.67     (15.6%)     1399.39     (16.2%)   -6.7% ( -33% -   29%) 0.183
                            LowTerm     3877.68     (17.1%)     3660.19     (16.7%)   -5.6% ( -33% -   34%) 0.294
          BrowseDayOfYearSSDVFacets     2023.80     (16.5%)     1910.72     (18.4%)   -5.6% ( -34% -   35%) 0.313
                            Respell      369.84     (15.0%)      350.55     (15.4%)   -5.2% ( -30% -   29%) 0.278
                        AndHighHigh     1396.47     (14.0%)     1329.88     (15.9%)   -4.8% ( -30% -   29%) 0.315
        BrowseRandomLabelTaxoFacets     1444.68     (20.4%)     1377.72     (19.4%)   -4.6% ( -36% -   44%) 0.461
                    LowSloppyPhrase     1277.52     (14.5%)     1218.87     (15.4%)   -4.6% ( -30% -   29%) 0.332
                          MedPhrase     1326.72     (15.4%)     1266.61     (16.6%)   -4.5% ( -31% -   32%) 0.371
                         AndHighLow     1667.96     (17.5%)     1595.41     (16.6%)   -4.3% ( -32% -   36%) 0.421
                           Wildcard      503.72     (15.4%)      482.57     (14.7%)   -4.2% ( -29% -   30%) 0.378
                         AndHighMed     1165.87     (16.9%)     1118.54     (17.0%)   -4.1% ( -32% -   35%) 0.449
                   HighSloppyPhrase      926.13     (16.2%)      889.79     (16.3%)   -3.9% ( -31% -   34%) 0.445
                MedIntervalsOrdered     1708.43     (14.8%)     1642.35     (17.0%)   -3.9% ( -31% -   32%) 0.442
                       HighSpanNear      570.65     (16.7%)      549.92     (16.5%)   -3.6% ( -31% -   35%) 0.489
                            Prefix3     1220.12     (15.3%)     1176.40     (17.0%)   -3.6% ( -31% -   33%) 0.483
                          OrHighMed     1133.47     (16.2%)     1092.99     (15.6%)   -3.6% ( -30% -   33%) 0.478
                             Fuzzy2       83.72     (15.8%)       80.79     (17.3%)   -3.5% ( -31% -   35%) 0.504
                            MedTerm     3257.50     (16.4%)     3149.18     (17.0%)   -3.3% ( -31% -   36%) 0.530
                         HighPhrase      510.67     (16.2%)      494.20     (15.3%)   -3.2% ( -29% -   33%) 0.518
               BrowseDateTaxoFacets     2429.96     (16.8%)     2353.26     (21.7%)   -3.2% ( -35% -   42%) 0.607
                        LowSpanNear     1087.70     (15.2%)     1055.01     (17.4%)   -3.0% ( -30% -   34%) 0.561
          BrowseDayOfYearTaxoFacets     2243.38     (17.6%)     2185.69     (19.7%)   -2.6% ( -33% -   42%) 0.663
              HighTermDayOfYearSort     2622.10     (16.6%)     2558.03     (16.9%)   -2.4% ( -30% -   37%) 0.644
                        MedSpanNear      857.24     (16.0%)      836.66     (18.2%)   -2.4% ( -31% -   37%) 0.658
               HighIntervalsOrdered      655.43     (14.9%)      640.01     (16.6%)   -2.4% ( -29% -   34%) 0.637
              BrowseMonthSSDVFacets     2029.76     (16.5%)     1982.42     (20.6%)   -2.3% ( -33% -   41%) 0.693
              BrowseMonthTaxoFacets     2289.21     (18.7%)     2237.91     (19.1%)   -2.2% ( -33% -   43%) 0.707
                    MedSloppyPhrase     1464.38     (16.6%)     1438.57     (17.4%)   -1.8% ( -30% -   38%) 0.743
                             Fuzzy1      278.00     (18.4%)      273.52     (18.9%)   -1.6% ( -32% -   43%) 0.785
                  HighTermMonthSort     1994.56     (16.7%)     1966.87     (17.4%)   -1.4% ( -30% -   39%) 0.797
                           HighTerm     2132.65     (18.5%)     2103.69     (15.3%)   -1.4% ( -29% -   39%) 0.800
                             IntNRQ     1763.52     (18.6%)     1741.35     (17.4%)   -1.3% ( -31% -   42%) 0.825
                          OrHighLow     1241.11     (15.2%)     1225.73     (17.0%)   -1.2% ( -28% -   36%) 0.807
                           PKLookup      115.16     (18.3%)      114.43     (24.2%)   -0.6% ( -36% -   51%) 0.925
                          LowPhrase     1336.77     (17.3%)     1336.88     (14.9%)    0.0% ( -27% -   39%) 0.999
                         OrHighHigh      891.20     (15.8%)      896.93     (14.3%)    0.6% ( -25% -   36%) 0.893
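
One way to read the table above is sketched below: recompute the percentage difference from the two QPS columns and compare each p-value against a significance threshold. The class and method names are hypothetical, and the 0.05 threshold is an assumption for illustration; the thread does not state which threshold was used.

```java
import java.util.Locale;

/** Hypothetical helper for reading benchmark rows; not part of the benchmark tooling. */
public class BenchmarkRowSketch {

  /** Percentage change of the candidate QPS relative to the baseline QPS. */
  static double pctDiff(double baselineQps, double candidateQps) {
    return 100.0 * (candidateQps - baselineQps) / baselineQps;
  }

  /** True if the p-value falls below the chosen significance level alpha. */
  static boolean isSignificant(double pValue, double alpha) {
    return pValue < alpha;
  }

  public static void main(String[] args) {
    double alpha = 0.05; // assumed threshold, not taken from the thread

    // BrowseDateSSDVFacets row: baseline 679.04, candidate 628.59, p-value 0.297
    System.out.println(String.format(Locale.ROOT, "%.1f%%", pctDiff(679.04, 628.59)));
    System.out.println(isSignificant(0.297, alpha));

    // Smallest p-value in the table (BrowseRandomLabelSSDVFacets): 0.181
    System.out.println(isSignificant(0.181, alpha));
  }
}
```

Under that assumed threshold, even the smallest p-value in the table (0.181) does not indicate a significant difference, which is consistent with the "no regression" reading.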
   




[GitHub] [lucene] gsmiller merged pull request #806: LUCENE-10488: Optimize Facets#getTopDims in FloatTaxonomyFacets

Posted by GitBox <gi...@apache.org>.
gsmiller merged PR #806:
URL: https://github.com/apache/lucene/pull/806




[GitHub] [lucene] Yuti-G commented on pull request #806: LUCENE-10488: Optimize Facets#getTopDims in FloatTaxonomyFacets

Posted by GitBox <gi...@apache.org>.
Yuti-G commented on PR #806:
URL: https://github.com/apache/lucene/pull/806#issuecomment-1123167787

   Thanks! Please see the latest commit for the update.

