Posted to solr-dev@lucene.apache.org by "Domingo Gómez García (JIRA)" <ji...@apache.org> on 2009/05/05 10:54:31 UTC

[jira] Issue Comment Edited: (SOLR-236) Field collapsing

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705959#action_12705959 ] 

Domingo Gómez García edited comment on SOLR-236 at 5/5/09 1:53 AM:
-------------------------------------------------------------------

The results of collapse_counts are not what I expected. Many categories are lost and only a few are shown. I tried increasing the collapse.max parameter:

max=1 results 

<lst name="doc">
<int name="2008/LICOBLE-00023">109</int>
<int name="2008/LICOBLE-3">5</int>
<int name="2009/LICOBLE-00036">4</int>
<int name="2009/LICOBLE-00095">1</int>
</lst>
<lst name="count">
<int name="12740">109</int>
<int name="12741">5</int>
<int name="13282">4</int>
<int>1</int>
</lst>


max=2 results

<lst name="doc">
<int name="2009/LICOBLE-00008">108</int>
<int name="2007/LICOBLE-1">4</int>
</lst>
<lst name="count">
<int name="12740">108</int>
<int name="12741">4</int>
</lst>


max=3 results

<lst name="doc">
<int name="2008/LICOBLE-00020">107</int>
<int name="2008/LICOBLE-00021">3</int>
</lst>
<lst name="count">
<int name="12740">107</int>
<int name="12741">3</int>
</lst>


max=4 results

<lst name="doc">
<int name="2009/LICOBLE-00060">106</int>
</lst>
<lst name="count">
<int name="12740">106</int>
</lst>

How is it possible to get fewer results each time? There are about 70 categories; is there any way to obtain all of those counts? Am I missing some collapsing concept?
Thanks.

  
> Field collapsing
> ----------------
>
>                 Key: SOLR-236
>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Emmanuel Keller
>             Fix For: 1.5
>
>         Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site is collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and spelling corrections are welcome ;-)
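
For illustration only, the three parameters listed in the description could be combined into a request URL roughly as follows. This is a minimal sketch, not part of the patch: the host, the /solr/select path, the "category" field name and the helper function itself are assumptions, and the collapse.* parameters are only understood by a Solr build with one of the SOLR-236 patches applied.

```python
from urllib.parse import urlencode


def build_collapse_query(base_url, query, field,
                         collapse_type="normal", collapse_max=1):
    """Hypothetical helper: build a select URL carrying the collapse.* parameters."""
    # The issue description names only these two values for collapse.type.
    if collapse_type not in ("normal", "adjacent"):
        raise ValueError("collapse.type must be 'normal' or 'adjacent'")
    params = {
        "q": query,
        "collapse.field": field,        # field used to group results
        "collapse.type": collapse_type,  # normal (default) or adjacent
        "collapse.max": collapse_max,    # results allowed before collapsing
    }
    return base_url + "?" + urlencode(params)


# Assumed local Solr instance and an assumed "category" field.
url = build_collapse_query("http://localhost:8983/solr/select",
                           "*:*", "category", collapse_max=2)
print(url)
```

Issuing the same request with different collapse.max values and comparing the returned collapse_counts sections is one way to reproduce the behaviour reported in the comment above.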

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.