Posted to java-user@lucene.apache.org by emerson cargnin <ec...@gmail.com> on 2006/03/11 20:18:51 UTC

collapsing the result by a given field

In my company's system we need to run a search that would return
hundreds of results.
It's a search over extracts of the websites of the companies we list. We
have the IDs of the companies (usually a maximum of 10 will be used in
each search), which are used to fetch the extracts (each ID may have
hundreds of extracts).
If we send 10 IDs in the search, it should return the 10 most relevant
extracts, one for each company. I thought of using sorting for this,
but it won't help, as there's no way to sort that will return
one document per ID.

What would be the easiest way to achieve that?

Emerson

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: collapsing the result by a given field

Posted by Chris Hostetter <ho...@fucit.org>.
: What would be the easiest way to achieve that?

The easiest way is to do the search 10 times, and use either a Filter or a
BooleanQuery with a mandatory companyId:XXX clause in each to restrict the
results.

Which approach you take depends on how many total companies might be used
over time, and whether individual companies are re-searched over and over
again so that caching the Filter BitSet is advantageous.
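
As a rough illustration of the "one search per company" idea, here is a
minimal sketch in plain Java. It deliberately does not use Lucene classes:
Hit, RestrictedSearcher, inMemory and bestPerCompany are all hypothetical
names made up for this example, and the in-memory searcher ignores the
query text. In a real index the restricted search would be the Filter or
the BooleanQuery with the mandatory companyId clause.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types for illustration; none of these are Lucene classes.
class PerCompanySearch {

    /** One scored extract belonging to a company. */
    static final class Hit {
        final int docId;
        final String companyId;
        final float score;

        Hit(int docId, String companyId, float score) {
            this.docId = docId;
            this.companyId = companyId;
            this.score = score;
        }
    }

    /** Stands in for "run the query restricted to one companyId"; returns hits best first. */
    interface RestrictedSearcher {
        List<Hit> search(String query, String companyId);
    }

    /** Toy searcher over an in-memory list; it ignores the query text and uses the stored scores. */
    static RestrictedSearcher inMemory(List<Hit> all) {
        return (query, companyId) -> {
            List<Hit> matches = new ArrayList<>();
            for (Hit h : all) {
                if (h.companyId.equals(companyId)) {
                    matches.add(h);
                }
            }
            // Highest score first.
            matches.sort((a, b) -> Float.compare(b.score, a.score));
            return matches;
        };
    }

    /** Issues one restricted search per company and keeps only the top hit of each. */
    static Map<String, Hit> bestPerCompany(RestrictedSearcher searcher,
                                           String query,
                                           List<String> companyIds) {
        Map<String, Hit> best = new LinkedHashMap<>();
        for (String companyId : companyIds) {
            List<Hit> hits = searcher.search(query, companyId);
            if (!hits.isEmpty()) {
                best.put(companyId, hits.get(0));
            }
        }
        return best;
    }
}
```

Companies with no matching extract are simply left out of the returned map,
so the caller may get fewer results than the number of IDs it sent.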


A more complicated way that could conceivably be faster is to use a
HitCollector and the FieldCache on your companyId field to record the
highest scoring doc for each companyId ... but that's a lot more work, and
if you are really only going to be dealing with 1-10 companies at a time,
issuing the search 10 times really isn't that bad.
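
To make the single-pass idea a bit more concrete, here is a minimal sketch
in plain Java, again without any Lucene classes. BestPerCompanyCollector
is a hypothetical name; its collect(int doc, float score) method is meant
to mirror the callback a HitCollector receives for each matching document,
and the String[] indexed by document number is an assumption standing in
for the per-document companyId values the FieldCache would supply.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a HitCollector subclass; not a Lucene class.
class BestPerCompanyCollector {

    // Assumed shape of the cached field values: document number -> companyId.
    private final String[] companyIdByDoc;
    private final Map<String, Integer> bestDoc = new HashMap<>();
    private final Map<String, Float> bestScore = new HashMap<>();

    BestPerCompanyCollector(String[] companyIdByDoc) {
        this.companyIdByDoc = companyIdByDoc;
    }

    /** Called once per matching document; keeps the highest-scoring doc per company. */
    void collect(int doc, float score) {
        String companyId = companyIdByDoc[doc];
        Float current = bestScore.get(companyId);
        if (current == null || score > current) {
            bestScore.put(companyId, score);
            bestDoc.put(companyId, doc);
        }
    }

    /** companyId -> document number of its best hit, for companies that matched at all. */
    Map<String, Integer> bestDocs() {
        return bestDoc;
    }
}
```

On a tie the first document collected for a company is kept, since the
comparison is a strict "greater than".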


-Hoss

