Posted to issues@lucene.apache.org by "David Smiley (Jira)" <ji...@apache.org> on 2021/01/23 06:29:00 UTC

[jira] [Commented] (SOLR-15053) Remove ref-guide preference for collapse over grouping

    [ https://issues.apache.org/jira/browse/SOLR-15053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270570#comment-17270570 ] 

David Smiley commented on SOLR-15053:
-------------------------------------

The preference is there because of changes I made some time ago.  Yes, Collapse & Expand may not satisfy all use cases, but if it satisfies yours then please use it (IMO).  Another motivation of mine is that I think grouping is a bit of a mess in terms of code organization, with concerns spilling out into QueryComponent and elsewhere.  Honestly, I wish we didn't have it in the first place.

> Remove ref-guide preference for collapse over grouping
> ------------------------------------------------------
>
>                 Key: SOLR-15053
>                 URL: https://issues.apache.org/jira/browse/SOLR-15053
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Jason Gerlowski
>            Priority: Major
>
> Currently, the ref-guide states a clear preference for collapse over grouping.
> bq. Generally, you should prefer Collapse & Expand.
> But the reality is more complicated.  Collapse has a lot of limitations on when it can be used: it works in single-shard environments only, with the exception that multi-shard environments can be made to work by ensuring that all documents sharing a value in the grouping field are colocated on the same shard.  Further, it's not necessarily more or less performant than traditional grouping.
> As Joel Bernstein put it in a recent mailing list thread:
> bq. There is a very specific use case where collapse performs better and in these scenarios collapse might be the only option that would work.  The use case where collapse works better is: (1) High cardinality grouping field, like product id, (2) Larger result sets, (3) The need to know the full number of groups that match the result set. In grouping this is group.ngroups.  At a certain point grouping will become too slow under the scenario described above. It will all depend on the scale of #1 and #2 above.
> We should correct the ref-guide wording here, as it's misleading for novices and experts alike.
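
For readers comparing the two approaches discussed above, here is a minimal sketch of how the request parameters differ. It only builds query strings and sends nothing. The base URL, the collection name "products", and the field name "product_id" are hypothetical placeholders, not taken from the issue; the parameter names (group, group.field, group.ngroups, the {!collapse} filter query, expand) are the standard Solr ones as far as I know.

```python
from urllib.parse import urlencode


def grouping_params(field, q="*:*"):
    """Parameters for traditional result grouping, including the group count."""
    return {
        "q": q,
        "group": "true",
        "group.field": field,
        # group.ngroups asks for the total number of matching groups; this is
        # the part described above as expensive on high-cardinality fields.
        "group.ngroups": "true",
    }


def collapse_params(field, q="*:*"):
    """Parameters for Collapse & Expand on the same field."""
    return {
        "q": q,
        # The collapsing query parser is applied as a filter query.
        "fq": "{!collapse field=%s}" % field,
        # expand=true returns the other members of each collapsed group.
        "expand": "true",
    }


def select_url(base_url, params):
    """Build a /select URL; base_url is a hypothetical collection endpoint."""
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    base = "http://localhost:8983/solr/products"  # assumed, not from the issue
    print(select_url(base, grouping_params("product_id")))
    print(select_url(base, collapse_params("product_id")))
```

Note that in a multi-shard collection the collapse variant is only correct when all documents sharing a product_id value live on the same shard, as described above.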



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
