You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Jay Hill <ja...@gmail.com> on 2005/06/15 01:23:21 UTC

Need a way to set a result limit on a particular field

I have a need to limit my Hits returned based on one of the indexed
fields. This is a web application and we want to limit the number of
hits from any one host. We have a field named "host_id" and I'd like
to be able to limit my results to no more than three results for any
one host_id.

Any help is appreciated.

Thanks,
-Jay

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Need a way to set a result limit on a particular field

Posted by Jay Hill <ja...@gmail.com>.
Thanks Tony and Erik for the replies. The trick is we don't know the
hosts that will be returned in advance, we just don't want more than 3
from any one host. It's not unlike searching on Google where you might
see a link that says "More results from foo.com". We essentially want
to discard any results > 3 for any one host. In some of our searches
we might get high scores on 20 or 30 documents, but we don't want to
show page after page from the same host, we'd rather limit it to 3
from each for more diversity.

I may have to use a brute force approach using HitCollector as Tony
suggests. I was hoping to avoid the HitCollector, but there may be no
other way right now.

Many thanks,
-Jay


On 6/14/05, Erik Hatcher <er...@ehatchersolutions.com> wrote:
> 
> On Jun 14, 2005, at 7:23 PM, Jay Hill wrote:
> > I have a need to limit my Hits returned based on one of the indexed
> > fields. This is a web application and we want to limit the number of
> > hits from any one host. We have a field named "host_id" and I'd like
> > to be able to limit my results to no more than three results for any
> > one host_id.
> 
> I may not be fully understanding your question, but I'll go with my
> assumptions... wrap the users query into a BooleanQuery as a required
> clause and then add another clause with a TermQuery for the specific
> host_id.  Then simply constrain the number of Hits shown to the first
> 3.  Does that do what you're after?
> 
>      Erik
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Need a way to set a result limit on a particular field

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jun 14, 2005, at 7:23 PM, Jay Hill wrote:
> I have a need to limit my Hits returned based on one of the indexed
> fields. This is a web application and we want to limit the number of
> hits from any one host. We have a field named "host_id" and I'd like
> to be able to limit my results to no more than three results for any
> one host_id.

I may not be fully understanding your question, but I'll go with my  
assumptions... wrap the users query into a BooleanQuery as a required  
clause and then add another clause with a TermQuery for the specific  
host_id.  Then simply constrain the number of Hits shown to the first  
3.  Does that do what you're after?

     Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: Need a way to set a result limit on a particular field

Posted by Tony Schwartz <to...@simpleobjects.com>.
I don't see a way to do this today.  How many different hosts are there?  If
it's small, you could execute the query that many times only grabbing the
top 3 results from each.  Otherwise, you'll have to use brute force with a
HitCollector and read the field for each doc.  Good luck!

Tony Schwartz
tony@simpleobjects.com


-----Original Message-----
From: Jay Hill [mailto:jayallenhill@gmail.com] 
Sent: Tuesday, June 14, 2005 7:23 PM
To: java-user@lucene.apache.org
Subject: Need a way to set a result limit on a particular field

I have a need to limit my Hits returned based on one of the indexed
fields. This is a web application and we want to limit the number of
hits from any one host. We have a field named "host_id" and I'd like
to be able to limit my results to no more than three results for any
one host_id.

Any help is appreciated.

Thanks,
-Jay

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org