Posted to dev@lucene.apache.org by "Hoss Man (JIRA)" <ji...@apache.org> on 2017/12/06 18:30:00 UTC
[jira] [Commented] (SOLR-11729) Increase default overrequest
ratio/count in json.facet to match existing defaults for
facet.overrequest.ratio & facet.overrequest.count ?
[ https://issues.apache.org/jira/browse/SOLR-11729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280631#comment-16280631 ]
Hoss Man commented on SOLR-11729:
---------------------------------
[~yonik@apache.org]: do you remember if there was an explicit reason you chose those lower constants in the json.facet code?
> Increase default overrequest ratio/count in json.facet to match existing defaults for facet.overrequest.ratio & facet.overrequest.count ?
> -----------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-11729
> URL: https://issues.apache.org/jira/browse/SOLR-11729
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Hoss Man
>
> When FacetComponent first got support for distributed search, the default "effective shard limit" used for shard requests followed the formula...
> {code}
> limit = (int)(dff.initialLimit * 1.5) + 10;
> {code}
> ...over time, this became configurable with the introduction of some expert level tuning options: {{facet.overrequest.ratio}} & {{facet.overrequest.count}} -- but the defaults (and basic formula) remain the same to this day...
> {code}
> this.overrequestRatio
>   = params.getFieldDouble(field, FacetParams.FACET_OVERREQUEST_RATIO, 1.5);
> this.overrequestCount
>   = params.getFieldInt(field, FacetParams.FACET_OVERREQUEST_COUNT, 10);
> ...
> private int doOverRequestMath(int limit, double ratio, int count) {
>   // NOTE: normally, "1.0F < ratio"
>   //
>   // if the user chooses a ratio < 1, we allow it and don't "bottom out" at
>   // the original limit until *after* we've also added the count.
>   int adjustedLimit = (int) (limit * ratio) + count;
>   return Math.max(limit, adjustedLimit);
> }
> {code}
> However...
> When {{json.facet}} multi-shard refinement was added, the code was written slightly differently:
> * there is an explicit {{overrequest:N}} (count) option
> * if {{-1 == overrequest}} (which is the default) then an "effective shard limit" is computed using the same basic formula as in FacetComponent -- _*but the constants are different*_...
> ** {{effectiveLimit = (long) (effectiveLimit * 1.1 + 4);}}
> * For any (non "-1") user specified {{overrequest}} value, it's added verbatim to the {{limit}} (which may have been user specified, or may just be the default)
> ** {{effectiveLimit += freq.overrequest;}}
> Given the design of the {{json.facet}} syntax, I can understand why the code path for an "advanced" user specified {{overrequest:N}} option avoids using any (implicit) ratio calculation and just does the straight addition of {{limit += overrequest}}.
> What I'm not clear on is the choice of the constants {{1.1}} and {{4}} in the common (default) case, and why those differ from the historically used {{1.5}} and {{10}}.
> ----
> It may seem like a small thing to worry about, but it can/will cause odd inconsistencies when people try to migrate simple {{facet.field=foo}} (or {{facet.pivot=foo,bar}}) queries to {{json.facet}} -- I have also seen it give people attempting these types of migrations the (mistaken) impression that the discrepancies they are seeing are because {{refine:true}} is not working.
> For this reason, I propose we change the (default) {{overrequest:-1}} behavior to use the same constants as the equivalent FacetComponent code...
> {code}
> if (fcontext.isShard()) {
>   if (freq.overrequest == -1) {
>     // add over-request if this is a shard request and if we have a small offset (large offsets will already be gathering many more buckets than needed)
>     if (freq.offset < 10) {
>       effectiveLimit = (long) (effectiveLimit * 1.5 + 10);
>     }
> ...
> {code}
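To make the discrepancy concrete, here is a minimal standalone sketch (not code from Solr) that compares the two default formulas quoted above for a few limits. The class and method names are hypothetical, and the sketch assumes a shard request with a small offset and no user-specified overrequest options, so the {{offset < 10}} check is left out.

```java
public class OverrequestSketch {

    // FacetComponent-style default quoted in the issue: ratio 1.5, count 10,
    // never going below the original limit.
    static int facetComponentShardLimit(int limit) {
        int adjustedLimit = (int) (limit * 1.5) + 10;
        return Math.max(limit, adjustedLimit);
    }

    // json.facet default (overrequest == -1) quoted in the issue: * 1.1 + 4.
    // Assumption: the offset is small enough that over-request is applied.
    static long jsonFacetShardLimit(long limit) {
        return (long) (limit * 1.1 + 4);
    }

    public static void main(String[] args) {
        int[] limits = {5, 10, 100};
        for (int limit : limits) {
            System.out.println("limit=" + limit
                + " FacetComponent=" + facetComponentShardLimit(limit)
                + " json.facet=" + jsonFacetShardLimit(limit));
        }
    }
}
```

With {{limit=10}}, the FacetComponent-style formula asks each shard for 25 buckets while the current json.facet default asks for only 15, which is the kind of gap that can show up as differing results after migration.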