Posted to solr-user@lucene.apache.org by Joe Calderon <ca...@gmail.com> on 2009/09/30 22:18:41 UTC

field collapsing sums

hello all, i have a question on the field collapsing patch, say i have
an integer field called "num_in_stock" and i collapse by some other
column, is it possible to sum up that integer field and return the
total in the output, if not how would i go about extending the
collapsing component to support that?


thx much

--joe

Re: field collapsing sums

Posted by Martijn v Groningen <ma...@gmail.com>.
Well that is odd. How have you configured field collapsing with the
dismax request handler?
The collapse counts should be X - 1 (if collapse.threshold=1).

Martijn

-- 
Met vriendelijke groet,

Martijn van Groningen

Re: field collapsing sums

Posted by Joe Calderon <ca...@gmail.com>.
thx for the reply, i just want the number of dupes in the query
result, but it seems i don't get the correct totals.

for example, a non-collapsed dismax query for belgian beer returns X
results, but when i collapse and sum the number of docs under
collapse_counts, it's much less than X.

it does seem to work when the collapsed results fit on one page (10
rows in my case)


--joe

Re: field collapsing sums

Posted by Martijn v Groningen <ma...@gmail.com>.
1) That is correct. Including fields of collapsed documents can make your
search significantly slower (depending on how many documents are
returned).
2) It seems that you are using the parameters as intended. The
collapsed documents will contain all documents (from the whole query
result) that have been collapsed on a certain field value that occurs
in the result set that is being displayed. That is how it should work.
But if I'm understanding you correctly, you want to display all dupes
from the whole query result set (also those whose collapse field value
does not occur in the displayed result set)?
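
The paging effect described in point 2 can be illustrated with a toy model. This is only a sketch in plain Python, not the patch's actual code: the function name collapse_counts_on_page, the sample brands, and the assumption that each group of n documents reports n - 1 collapsed documents (as with collapse.threshold=1) are all made up for illustration.

```python
from collections import Counter

def collapse_counts_on_page(docs, field, rows):
    """Toy model: one representative per distinct field value is displayed,
    and only the groups that fit on the displayed page report a count."""
    sizes = Counter(doc[field] for doc in docs)
    # Distinct field values in the order they first appear in the result.
    heads = list(dict.fromkeys(doc[field] for doc in docs))
    # Assumed rule: a group of n documents has n - 1 collapsed documents.
    return {value: sizes[value] - 1 for value in heads[:rows]}

# Hypothetical result set: 11 documents spread over 5 brands.
docs = [{"brand": b} for b in "aaabbcccdde"]

first_page = collapse_counts_on_page(docs, "brand", rows=2)
all_groups = collapse_counts_on_page(docs, "brand", rows=5)
print(sum(first_page.values()), sum(all_groups.values()))
```

In this model the sum over the first page only covers the groups shown there, so it stays well below the total number of hits until every group fits on one page.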

Martijn

-- 
Met vriendelijke groet,

Martijn van Groningen

Re: field collapsing sums

Posted by Joe Calderon <ca...@gmail.com>.
hello martijn, thx for the tip, i tried that approach but ran into two
snags: 1. returning the fields makes collapsing a lot slower, but that
might just be the nature of iterating large results.
2. it seems like only dupes of records on the first page are returned

or is there a setting i'm missing? currently i'm only sending
collapse.field=brand and collapse.includeCollapseDocs.fl=num_in_stock
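
For reference, a request with those two parameters could be assembled roughly like this. It is a sketch only: the base URL is an assumed local Solr instance, and the q and rows values are borrowed from the belgian beer example elsewhere in this thread. The two collapse.* parameter names are the ones quoted above.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance; adjust host, port and path as needed.
base_url = "http://localhost:8983/solr/select"

collapse_params = {
    "q": "belgian beer",
    "rows": 10,
    "collapse.field": "brand",
    "collapse.includeCollapseDocs.fl": "num_in_stock",
}

# urlencode escapes the space in the query and leaves the dotted names alone.
collapse_url = base_url + "?" + urlencode(collapse_params)
print(collapse_url)
```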

--joe


Re: field collapsing sums

Posted by Martijn v Groningen <ma...@gmail.com>.
Hi Joe,

Currently the patch does not do that, but you can do something else
that might help you in getting your summed stock.

In the latest patch you can include fields of collapsed documents in
the result per distinct field value.
If you specify collapse.includeCollapseDocs.fl=num_in_stock in the
request and, let's say, you collapse on brand, then in the response you
will receive the following xml:
<lst name="collapsedDocs">
  <result name="brand1" numFound="48" start="0">
    <doc>
      <str name="num_in_stock">2</str>
    </doc>
    <doc>
      <str name="num_in_stock">3</str>
    </doc>
    ...
  </result>
  <result name="brand2" numFound="9" start="0">
    ...
  </result>
</lst>

On the client side you can do whatever you want with this data, for
example sum it together. Although the patch does not sum for you, I
think it will allow you to implement your requirement without too much
hassle.
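
A minimal client-side sketch of that summing, using only the Python standard library. The sample response is a made-up, complete version of the fragment above (the real one is elided with "..."), and the helper name sum_collapsed_field is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical complete response fragment, shaped like the xml above.
SAMPLE = """<lst name="collapsedDocs">
  <result name="brand1" numFound="2" start="0">
    <doc><str name="num_in_stock">2</str></doc>
    <doc><str name="num_in_stock">3</str></doc>
  </result>
  <result name="brand2" numFound="1" start="0">
    <doc><str name="num_in_stock">7</str></doc>
  </result>
</lst>"""

def sum_collapsed_field(xml_text, field):
    """Sum an integer field over the collapsed docs of each collapse group."""
    root = ET.fromstring(xml_text)
    totals = {}
    # iter() also visits the root, so this works for the bare fragment
    # as well as for a full response that nests the collapsedDocs list.
    for lst in root.iter("lst"):
        if lst.get("name") != "collapsedDocs":
            continue
        for result in lst.findall("result"):
            total = 0
            for doc in result.findall("doc"):
                for value in doc:
                    if value.get("name") == field:
                        total += int(value.text)
            totals[result.get("name")] = total
    return totals

print(sum_collapsed_field(SAMPLE, "num_in_stock"))
```

Note that these totals only cover the documents listed under collapsedDocs. If the representative document of each group is returned separately in the main result list, its value would have to be added as well.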

Cheers,

Martijn


Re: field collapsing sums

Posted by Matt Weber <ma...@mattweber.org>.
You might want to see how the stats component works with field collapsing.
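
A request that turns on the stats component might look like the sketch below. The stats, stats.field and stats.facet parameters belong to Solr's StatsComponent, but the base URL is an assumed local instance, and whether the stats are computed before or after collapsing with the patch applied is exactly what would need checking.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance; adjust host, port and path as needed.
stats_base_url = "http://localhost:8983/solr/select"

stats_params = [
    ("q", "belgian beer"),
    ("stats", "true"),                # enable the stats component
    ("stats.field", "num_in_stock"),  # field to compute sum/min/max/... over
    ("stats.facet", "brand"),         # break the stats down per brand
]

stats_url = stats_base_url + "?" + urlencode(stats_params)
print(stats_url)
```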

Thanks,

Matt Weber



Re: field collapsing sums

Posted by Uri Boness <ub...@gmail.com>.
Hi,

At the moment I think the most appropriate place to put it is in the 
AbstractDocumentCollapser (in the getCollapseInfo method). Though, it 
might not be the most efficient.

Cheers,
Uri
