Posted to solr-dev@lucene.apache.org by "Doug Steigerwald (JIRA)" <ji...@apache.org> on 2008/12/09 21:31:45 UTC

[jira] Commented: (SOLR-236) Field collapsing

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654950#action_12654950 ] 

Doug Steigerwald commented on SOLR-236:
---------------------------------------

I'm having an issue with Ivan's latest patch.  I'm testing on a data set of 8113 documents.  All the documents have a string field called site.  There are only two sites, Site1 and Site2.

Site1 has 3466 documents.
Site2 has 4647 documents.

With the following simple query, I only get 1 result:
http://localhost:8983/solr/core1/search?q=*:*&collapase=true&collapse.field=site

....
<lst name="collapse_counts">
 <str name="field">site</str>
 <lst name="doc">
  <int name="site2-doc-2981790">4646</int>
 </lst>
 <lst name="count">
  <int name="Site2">4646</int>
 </lst>
 <str name="debug">HashDocSet(2) Time(ms): 0/0/0/0</str>
</lst>
<result name="response" numFound="1" start="0">
....

The only result displayed is for Site2.
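
As a sanity check on the numbers above, here is a minimal Python sketch (not taken from any of the attached patches) of what collapsing on site would be expected to return for this data set. It assumes that one document is kept per distinct site value and that the reported collapse count is the number of documents hidden behind the kept one. The function name expected_collapse and the max_per_group parameter are hypothetical.

```python
from collections import Counter

def expected_collapse(site_values, max_per_group=1):
    """Return (kept, hidden) document counts per site value.

    Assumption: at most `max_per_group` documents survive per distinct
    value, and everything else in that group is counted as collapsed.
    """
    totals = Counter(site_values)
    kept = {site: min(n, max_per_group) for site, n in totals.items()}
    hidden = {site: n - kept[site] for site, n in totals.items()}
    return kept, hidden

# Data set from the report: 3466 Site1 documents and 4647 Site2 documents.
sites = ["Site1"] * 3466 + ["Site2"] * 4647
kept, hidden = expected_collapse(sites)

print(len(sites))          # total documents in the index
print(sum(kept.values()))  # results expected after collapsing
print(hidden)              # documents collapsed per site
```

Under those assumptions the query should return 2 results, with 3465 documents collapsed for Site1 and 4646 for Site2. The Site2 figure matches the count in the response above, so the Site1 group appears to be missing from the output entirely.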

I have an older patch working with Solr 1.3.0, but I can't get it to mesh with localsolr properly.  My localsolr query gives 1656 results.  Collapsed on the site field it should give 2 results, but it gives 8, some of which are duplicate documents.  Without localsolr, my field collapsing patch seems to work fine.

> Field collapsing
> ----------------
>
>                 Key: SOLR-236
>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Emmanuel Keller
>             Fix For: 1.4
>
>         Attachments: collapsing-patch-to-1.3.0-ivan.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, solr-236.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site is collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and spelling corrections are welcome ;-)
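
The three parameters in the quoted description can be illustrated with a short Python sketch. This is one reading of that description, not the patch's actual code. It assumes "normal" counts every earlier result with the same field value, "adjacent" counts only the current run of consecutive results with the same value, and collapse.max is the number of results allowed through before the rest are dropped. The function collapse and its argument names are hypothetical.

```python
def collapse(docs, field, collapse_type="normal", collapse_max=1):
    """Return the documents that survive collapsing on `field`.

    Assumed semantics (an interpretation of the issue description):
      normal   - keep at most `collapse_max` docs per distinct value
      adjacent - keep at most `collapse_max` docs per consecutive run
    """
    kept = []
    seen = {}                  # normal: docs seen so far per value
    run_value = object()       # adjacent: value of the current run
    run_length = 0             # adjacent: length of the current run
    for doc in docs:
        value = doc[field]
        if collapse_type == "adjacent":
            if value == run_value:
                run_length += 1
            else:
                run_value, run_length = value, 1
            count = run_length
        else:
            seen[value] = seen.get(value, 0) + 1
            count = seen[value]
        if count <= collapse_max:
            kept.append(doc)
    return kept

# Six ranked results, ids 1..6, alternating between two sites.
docs = [{"id": i, "site": s}
        for i, s in enumerate(["A", "A", "B", "A", "B", "B"], start=1)]

normal = [d["id"] for d in collapse(docs, "site", "normal")]
adjacent = [d["id"] for d in collapse(docs, "site", "adjacent")]
normal_max2 = [d["id"] for d in collapse(docs, "site", "normal", 2)]
print(normal, adjacent, normal_max2)
```

With this input, normal collapsing keeps ids 1 and 3, adjacent collapsing keeps ids 1, 3, 4 and 5, and normal collapsing with a maximum of 2 keeps ids 1, 2, 3 and 5.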

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.