Posted to issues@solr.apache.org by "Michael Gibney (Jira)" <ji...@apache.org> on 2021/12/06 22:43:00 UTC

[jira] [Commented] (SOLR-14595) json.facet subfacet 'sort:"index asc", refine:true' can return diff results using method:enum

    [ https://issues.apache.org/jira/browse/SOLR-14595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454278#comment-17454278 ] 

Michael Gibney commented on SOLR-14595:
---------------------------------------

I had been inadvertently coming at this from a different direction: I had realized that {{overrequest}} is (largely) irrelevant to the "index sort" case even in the non-"enum" {{FacetFieldProcessor}}, and I actually had a branch lying around that I intended to propose as a PR/optimization.

[PR #447|https://github.com/apache/solr/pull/447] seeks to address this issue with a variant of [~hossman]'s first proposal. Rather than making {{FacetFieldProcessorByEnumTermsStream}} implement overrequest identically to the base {{FacetFieldProcessor}} (where overrequest doesn't really make sense anyway for "index" sort), this PR avoids pointless overrequest for "index" sort in the base {{FacetFieldProcessor}} class (with the exception of "index" sort specified as {{prelim_sort}} -- because maybe someone would do that?).

Given the side-effect of {{overrequest}} on {{isBucketComplete}} (the essence of the inconsistency observed in this issue), I'm inclined to think we should unconditionally respect _explicit_ (non-default) overrequest for the distrib case -- this would only require a minor change. Even if the "isBucketComplete" side-effect of overrequest feels a little "leaky", I also don't see much practical benefit in silently ignoring explicitly specified {{overrequest}} for "index sort" -- unless, as is _not_ the case here, ignoring the explicit {{overrequest}} would be truly functionally equivalent.
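
To make the distinction between default and explicit overrequest concrete, here is a minimal sketch of the decision described above. It is not the code from the PR or from {{FacetFieldProcessor}}: the class name, the method names, the {{-1}} "not specified" sentinel, and the default padding formula are all assumptions chosen for illustration, and the {{prelim_sort}} exception is not modeled.

```java
// Hypothetical sketch only: names, the sentinel, and the padding heuristic
// are illustrative assumptions, not taken from Solr's source.
public class OverrequestSketch {

    /** Assumed sentinel meaning "the request did not specify overrequest". */
    static final int UNSPECIFIED = -1;

    /** Placeholder default padding; the real heuristic may differ. */
    static long defaultPadding(long limit) {
        return limit / 10 + 4;
    }

    /**
     * Number of buckets a shard would be asked for.
     *
     * @param limit       the facet's requested limit (negative means unlimited)
     * @param overrequest explicit overrequest, or UNSPECIFIED
     * @param indexSort   true when the facet is sorted by index order
     * @param distrib     true when this is a per-shard (distributed) request
     */
    static long shardLimit(long limit, int overrequest, boolean indexSort, boolean distrib) {
        if (!distrib || limit < 0) {
            return limit; // single node or unlimited: nothing to pad
        }
        if (overrequest != UNSPECIFIED) {
            // Explicit overrequest is always respected, even for index sort.
            return limit + Math.max(0, overrequest);
        }
        if (indexSort) {
            // Default overrequest is skipped for index sort.
            return limit;
        }
        return limit + defaultPadding(limit);
    }

    public static void main(String[] args) {
        // Values from the failing request's facet_2: limit=1, overrequest=38, index sort.
        System.out.println(shardLimit(1, 38, true, true));
        // Same facet with no explicit overrequest.
        System.out.println(shardLimit(1, UNSPECIFIED, true, true));
    }
}
```

Under these assumptions, the {{facet_2}} subfacet from the failing seed would still ask each shard for 39 buckets because its {{overrequest}} is explicit, while the same facet without an explicit value would ask for only 1.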

> json.facet subfacet 'sort:"index asc", refine:true' can return diff results using method:enum
> ---------------------------------------------------------------------------------------------
>
>                 Key: SOLR-14595
>                 URL: https://issues.apache.org/jira/browse/SOLR-14595
>             Project: Solr
>          Issue Type: Bug
>          Components: Facet Module
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-14595.patch, SOLR-14595.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> jenkins found a failing seed for TestCloudJSONFacetSKGEquiv that has nothing to do with SKG -- it shows that using {{method:enum}} can sometimes return a different set of buckets than {{method:smart}} when computing a facet that uses {{"sort":"index asc", "refine":true}} _as a subfacet_ of some other facet.
> (In all the cases I've been able to trigger with more targeted testing, the "parent facet" needs to use a sort option that causes buckets to "sort worse" when more data is known about them -- ie: "count asc" or SKG -- but I haven't determined if that's actually necessary to trigger the failure)
> original jenkins failure...
> {noformat}
> master jenkins (@ 541fc984e90) ...
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestCloudJSONFacetSKGEquiv -Dtests.method=testRandom -Dtests.seed=356C5A0B17DE491 -Dtests.multiplier=2 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=en-KN -Dtests.timezone=Asia/Ho_Chi_Minh -Dtests.asserts=true -Dtests.file.encoding=UTF-8
>    [junit4] FAILURE 1.05s | TestCloudJSONFacetSKGEquiv.testRandom <<<
>    [junit4]    > Throwable #1: java.lang.AssertionError: rows=0&q=(field_7_multi_sds:19+OR+field_11_multi_sdsS:61+OR+field_8_multi_sdsS:45+OR+field_10_multi_sds:21+OR+field_2_multi_sdsS:28+OR+field_8_multi_sdsS:33+OR+field_10_multi_sds:54+OR+field_12_multi_ss:41)&fore=(field_5_multi_sdsS:48+OR+field_7_multi_sds:24+OR+field_13_multi_sds:61+OR+field_10_multi_sds:32+OR+field_9_multi_ss:45+OR+field_10_multi_sds:16+OR+field_11_multi_sdsS:28+OR+field_2_multi_sdsS:33+OR+field_8_multi_sdsS:43+OR+field_7_multi_sds:9)&back=(field_2_multi_sdsS:5+OR+field_9_multi_ss:16+OR+field_0_multi_ss:40+OR+field_0_multi_ss:16+OR+field_10_multi_sds:34+OR+field_10_multi_sds:58+OR+field_9_multi_ss:15+OR+field_1_multi_sds:44+OR+field_13_multi_sds:51+OR+field_10_multi_sds:21)&json.facet={"facet_1":{"method":"${method_val:smart}","limit":12,"sort":"count+asc","refine":true,"type":"terms","field":"field_12_multi_ss","facet":{"skg":{"type":"func","func":"relatedness($fore,$back)"},"facet_2":{"method":"${method_val:smart}","limit":1,"overrequest":38,"prefix":"2","sort":"index+asc","refine":true,"type":"terms","field":"field_3_multi_ss","facet":{"skg":{"type":"func","func":"relatedness($fore,$back)"},"facet_3":{"method":"${method_val:smart}","overrequest":0,"perSeg":false,"sort":"skg+desc","refine":true,"type":"terms","field":"field_8_multi_idsS","facet":{"skg":{"type":"func","func":"relatedness($fore,$back)"}}}}}}}}&_stateVer_=org.apache.solr.search.facet.TestCloudJSONFacetSKGEquiv_collection:4 ===> Mismatch: .facet_1.buckets[8][facet_2].buckets.length:1!=0 using method_val=enum
> {noformat}


