Posted to solr-user@lucene.apache.org by Martijn v Groningen <ma...@gmail.com> on 2009/11/15 16:34:07 UTC

Re: field collapse using 'adjacent' & 'includeCollapsedDocs' + 'sort' query field

Hi Michael,

What you are saying seems logical, but that is currently not how the
collapsedDocs functionality works. It was built with computing
aggregated statistics in mind, not really to provide a separate search
result per collapse group. Although the collapsed documents are
collected in the order they appear in the search result (only if
collapsetype is adjacent), they are not saved in that order.

If you really need the collapsed documents of a group in the order in
which they were collapsed, you need to tweak the code. What you can do
is change the CollapsedDocumentCollapseCollector class in the
DocumentFieldsCollapseCollectorFactory.java source file. Currently the
document ids are stored inside an OpenBitSet per collapse group. You
can change that into an ArrayList<Integer>, for example. That way
the order in which the documents were collapsed is preserved.
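
To illustrate why the container matters, here is a minimal, self-contained
sketch. It is not the patch code: it uses java.util.BitSet as a stand-in
for Lucene's OpenBitSet, and the class name, method names and doc ids are
made up for the example. The point is only that a bit set can be read back
in ascending id order, while a list keeps the order of insertion.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical demo class; not part of Solr or the field collapse patch.
public class CollapseOrderSketch {

    // Store the ids in a bit set (stand-in for OpenBitSet) and read them back.
    // A bit set only records which ids are present, so iteration is ascending.
    static List<Integer> viaBitSet(int[] collapsedInOrder) {
        BitSet bits = new BitSet();
        for (int docId : collapsedInOrder) {
            bits.set(docId);
        }
        List<Integer> out = new ArrayList<>();
        for (int id = bits.nextSetBit(0); id >= 0; id = bits.nextSetBit(id + 1)) {
            out.add(id);
        }
        return out;
    }

    // Store the ids in a list; the order in which they were added is kept.
    static List<Integer> viaList(int[] collapsedInOrder) {
        List<Integer> out = new ArrayList<>();
        for (int docId : collapsedInOrder) {
            out.add(docId);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up doc ids, in the order they were collapsed.
        int[] collapsed = {42, 7, 19};
        System.out.println(viaBitSet(collapsed)); // prints [7, 19, 42]
        System.out.println(viaList(collapsed));   // prints [42, 7, 19]
    }
}
```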

I think the downside of this change will be increased memory
usage. OpenBitSet is more memory-efficient than an ArrayList of
integers. I think that this will only be a real problem when the
collapse groups become very large.
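
A rough back-of-envelope sketch of that trade-off follows. The numbers
are assumptions, not measurements: it assumes a bit set costs about one bit
per doc id up to the highest id it holds, and that each entry in an
ArrayList<Integer> costs about 20 bytes (a 4-byte reference plus a 16-byte
Integer object). Real figures depend on the JVM and its settings, and the
class and method names are made up for the example.

```java
// Hypothetical estimate only; not part of Solr or the field collapse patch.
public class CollapseMemorySketch {

    // ASSUMPTION: one bit per doc id up to the highest id, stored in 64-bit words.
    static long bitSetBytes(int highestDocId) {
        return ((long) highestDocId / 64 + 1) * 8;
    }

    // ASSUMPTION: 4-byte reference + 16-byte boxed Integer per collapsed document.
    static long integerListBytes(int groupSize) {
        return (long) groupSize * (4 + 16);
    }

    public static void main(String[] args) {
        int highestDocId = 999_999; // made-up index size of one million documents
        int groupSize = 100_000;    // made-up, very large collapse group

        System.out.println(bitSetBytes(highestDocId));   // prints 125000
        System.out.println(integerListBytes(groupSize)); // prints 2000000
    }
}
```

Under these assumptions the bit set's cost follows the doc id range, while
the list's cost grows with the number of documents in the group.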

I hope this answers your question.

Martijn

2009/11/14 michael8 <mi...@saracatech.com>:
>
> Hi,
>
> This almost seems like a bug, but I can't be sure so I'm seeking
> confirmation.  Basically I am building a site that presents search results
> in reverse chronological order.  I am also leveraging the field collapse
> feature so that I can group results using 'adjacent' mode and have solr
> return the collapsed results as well via 'includeCollapsedDocs'.  My
> collapsing field is a custom grouping_id that I have specified.
>
> What I'm noticing is that my search results are coming back in the correct
> order by descending time (via 'sort' param in the main query) as expected.
> However, the results returned within the 'collapsedDocs' section via
> 'includeCollapsedDocs' are not in the same descending time order.
>
> My question is, shouldn't the collapsedDocs results also be in the same
> 'sort' order and key I have specified in the overall query, particularly
> since 'adjacent' mode is enabled, and that would mean results that are
> 'adjacent' in the sort order of the results?
>
> I'm using Solr 1.4.0 + field collapse patch as of 10/27/2009
>
> Thanks,
> Michael
>
> --
> View this message in context: http://old.nabble.com/field-collapse-using-%27adjacent%27---%27includeCollapsedDocs%27-%2B-%27sort%27-query-field-tp26351840p26351840.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

Re: field collapse using 'adjacent' & 'includeCollapsedDocs' + 'sort' query field

Posted by michael8 <mi...@saracatech.com>.
Hi Martijn,

Thanks for your insight into collapsedDocs, and for explaining what I
need to modify to get the functionality I want.

Michael


