Posted to java-user@lucene.apache.org by prasenjit mukherjee <pr...@gmail.com> on 2010/03/22 13:56:31 UTC

access payload from HitCollector.collect()

I am trying to implement an Oracle-style SQL aggregation ( e.g.
SUM(col3) WHERE col1='foo' AND col2='bar' ) using Lucene's payload
feature.

I can add the integer value ( of col3 ) as a payload to my searchable
fields ( col1 and col2 ). I can probably extend
DefaultSimilarity's scorePayload() to compute the aggregated value,
but I was wondering whether I can access the payload values from
HitCollector.collect(), as I really don't need to do any score
computation. I just want the sum of col3's values for all matching
documents.
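
To make the payload idea concrete, here is a minimal sketch of packing an
integer into payload bytes and reading it back. It assumes col3 fits in a
32-bit int and is stored as 4 big-endian bytes; IntPayload is a hypothetical
helper written for illustration, not a Lucene class, and attaching the bytes
to a token at index time is not shown.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not part of Lucene): packs an int into the 4 bytes
// that would be stored as a term's payload, and unpacks it again.
final class IntPayload {
    private IntPayload() {}

    // Encode the value as 4 big-endian bytes (ByteBuffer's default order).
    static byte[] encode(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Decode a 4-byte big-endian payload back into an int.
    static int decode(byte[] payload) {
        if (payload == null || payload.length != 4) {
            throw new IllegalArgumentException("expected a 4-byte payload");
        }
        return ByteBuffer.wrap(payload).getInt();
    }
}
```

With this layout, IntPayload.decode(IntPayload.encode(n)) returns n for any
int, including negative values.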

What is a good approach to achieve the above? The use case is slightly
different and more like an RDBMS, but still ( methinks ) Lucene seems to
be the best tool available for doing this kind of computation at large
scale.

-Thanks,
Prasen

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: access payload from HitCollector.collect()

Posted by Grant Ingersoll <gs...@apache.org>.
On Mar 22, 2010, at 8:56 AM, prasenjit mukherjee wrote:

> I am trying to implement an Oracle-style SQL aggregation ( e.g.
> SUM(col3) WHERE col1='foo' AND col2='bar' ) using Lucene's payload
> feature.
> 
> I can add the integer value ( of col3 ) as a payload to my searchable
> fields ( col1 and col2 ). I can probably extend
> DefaultSimilarity's scorePayload() to compute the aggregated value,
> but I was wondering whether I can access the payload values from
> HitCollector.collect(), as I really don't need to do any score
> computation. I just want the sum of col3's values for all matching
> documents.
> 
> What is a good approach to achieve the above? The use case is slightly
> different and more like an RDBMS, but still ( methinks ) Lucene seems to
> be the best tool available for doing this kind of computation at large
> scale.

See the SpanQuery class and the associated Spans class, which should give you access to the matching docs, without scoring, as well as the payload values.
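
A rough sketch of the summing loop this suggests, with no scoring involved. MatchIterator below is a hypothetical stand-in whose method names are loosely modeled on the Spans class; it is not Lucene's API, so check the Spans Javadoc for your version before adapting it. The sketch also assumes each payload starts with a 4-byte big-endian int, that matches arrive in document order, and that each document should be counted once. ListMatches exists only to exercise the loop in memory.

```java
import java.nio.ByteBuffer;
import java.util.Collection;
import java.util.Collections;

// Hypothetical stand-in for a Spans-style match iterator; not Lucene's class.
interface MatchIterator {
    boolean next();                   // advance to the next match, false when done
    int doc();                        // document id of the current match
    boolean isPayloadAvailable();     // whether the current match carries a payload
    Collection<byte[]> getPayload();  // payload bytes of the current match
}

// In-memory implementation used only to exercise the sketch.
final class ListMatches implements MatchIterator {
    private final int[] docs;
    private final byte[][] payloads;
    private int pos = -1;

    ListMatches(int[] docs, byte[][] payloads) {
        if (docs.length != payloads.length) {
            throw new IllegalArgumentException("docs and payloads must align");
        }
        this.docs = docs;
        this.payloads = payloads;
    }

    public boolean next() { return ++pos < docs.length; }
    public int doc() { return docs[pos]; }
    public boolean isPayloadAvailable() { return payloads[pos] != null; }
    public Collection<byte[]> getPayload() {
        return Collections.singletonList(payloads[pos]);
    }
}

final class PayloadSum {
    private PayloadSum() {}

    // Sums one int payload per matching document; no scores are computed.
    static long sum(MatchIterator matches) {
        long total = 0;
        int lastCountedDoc = -1;
        while (matches.next()) {
            int doc = matches.doc();
            // Skip further matches in a document that was already counted,
            // and matches that carry no payload.
            if (doc == lastCountedDoc || !matches.isPayloadAvailable()) {
                continue;
            }
            for (byte[] payload : matches.getPayload()) {
                if (payload.length >= 4) {
                    total += ByteBuffer.wrap(payload, 0, 4).getInt();
                    lastCountedDoc = doc;
                    break; // assumption: one value per document
                }
            }
        }
        return total;
    }
}
```

The total is accumulated in a long so that summing many int payloads does not overflow.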

-Grant