Posted to java-user@lucene.apache.org by "raymondcreel (sent by Nabble.com)" <li...@nabble.com> on 2005/08/29 22:18:19 UTC

custom sort

Is it possible to write a custom sort for a query such that the first N documents that match certain additional criteria get pushed to the top of the sort?  For instance, say you sort your query on field A, but you want to tweak the results so that the first 10 documents in the result set which have field B = some criteria appear at the beginning of the resulting hits collection?

thanks much,
raymond
--
Sent from the Lucene - Java Users forum at Nabble.com:
http://www.nabble.com/custom-sort-t262833.html#a737222

Re: custom sort

Posted by Jason Haruska <jh...@gmail.com>.
I had to do something similar, though I plan on rewriting it into something
more elegant. I hope this gives you some ideas.

1. Create a QueryFilter on only those items that match the criteria (have
a required clause in your boolean query)
2. Create a BitFilter which takes the BitSet from step 1 and flips the bits
3. Perform a search with each Filter and display the results of each search

Here is my BitFilter code:
--------------------------------------------------
package org.apache.lucene.search;

import java.io.IOException;
import java.util.BitSet;

import org.apache.lucene.index.IndexReader;

public class BitFilter extends Filter {
    private BitSet bitSet;

    public BitFilter(BitSet bs) {
        bitSet = bs;
    }

    /** Inverts every bit, so the filter accepts the documents it used to reject. */
    public void flipAll() {
        // BitSet.flip(from, to) excludes 'to'; size() - 1 would skip the last bit.
        bitSet.flip(0, bitSet.size());
    }

    /* (non-Javadoc)
     * @see org.apache.lucene.search.Filter#bits(org.apache.lucene.index.IndexReader)
     */
    public BitSet bits(IndexReader reader) throws IOException {
        return bitSet;
    }
}
--------------------------------------------------------------------------------
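
A note on the flip step: java.util.BitSet.flip(fromIndex, toIndex) treats
toIndex as exclusive, and BitSet.size() reports allocated storage rather than
the number of documents. The sketch below is one way to build the inverted set
without touching the original; it uses only the JDK, and the maxDoc value is a
hypothetical stand-in for what IndexReader.maxDoc() would return.

```java
import java.util.BitSet;

class ComplementBitsDemo {
    // Returns a copy of 'matches' with bits 0..maxDoc-1 inverted.
    // The input BitSet is left unchanged.
    static BitSet complement(BitSet matches, int maxDoc) {
        BitSet inverted = (BitSet) matches.clone();
        inverted.flip(0, maxDoc); // upper bound is exclusive
        return inverted;
    }

    public static void main(String[] args) {
        int maxDoc = 8; // hypothetical document count
        BitSet promoted = new BitSet(maxDoc);
        promoted.set(1);
        promoted.set(7);

        BitSet rest = complement(promoted, maxDoc);
        System.out.println("promoted = " + promoted);
        System.out.println("rest     = " + rest);
    }
}
```

Working on a copy means the BitSet from step 1 can still be used for the
first search after the second filter has been built.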

On 8/31/05, raymondcreel (sent by Nabble.com) <lists@nabble.com> wrote:
> 
> 
> Actually in this case I am sorting by score already but I'm not sure if 
> that helps. Regardless of how I do my primary sort, I want to tweak the 
> results such that some hardcoded number of documents that match some 
> criteria get pushed or frontloaded to the top of the results. For instance 
> think of a search engine where you generally are displaying a list of pages 
> sorted by score, but you want 10 pages from a featured site to always show 
> at the top of the first page, while leaving the rest of the sort order as it 
> is. That's why it's not something I can really do at the index stage by 
> assigning boosts - I only want to boost those first 10 items that match the 
> criteria, not all of them.
> 
> What I'm doing now is taking the whole resulting document collection, 
> iterating through it and manually moving these 10 documents to the front of 
> the collection. This is slow and ugly. I was hoping there might be a slicker 
> way to do it as part of the actual sort. I will play around with the custom 
> sorting and report back if I figure out an elegant way to do it.
> 
> Thanks for all your replies.
> Raymond
> --
> Sent from the Lucene - Java Users forum at Nabble.com:
> http://www.nabble.com/custom-sort-t262833.html#a750675
> 
>

Re: custom sort

Posted by "raymondcreel (sent by Nabble.com)" <li...@nabble.com>.
Hi, thanks for the reply. Yes, the two-search approach sounds like it would work.  A custom sort might have less overhead since it would be just one search, but I think your solution will work for my purposes.  Thanks much.

raymond
--
Sent from the Lucene - Java Users forum at Nabble.com:
http://www.nabble.com/custom-sort-t262833.html#a818124

Re: custom sort

Posted by Chris Hostetter <ho...@fucit.org>.
: What I'm doing now is taking the whole resulting document collection,
: iterating through it and manually moving these 10 documents to the front
: of the collection.  This is slow and ugly.  I was hoping there might be
: a slicker way to do it as part of the actual sort.  I will play around
: with the custom sorting and report back if I figure out an elegant way
: to do it.

you could definitely do this with a Custom Sort ... but a simpler way to
go would be to do two searches.  if the user's basic search criteria is
something like "foo:bar +yak:wak" and the criteria for identifying the
documents you want to return is "promote:yes" then first issue a search
for "+promote:yes +(foo:bar +yak:wak)" and display the first N, and keep
track of the identity of the N that you display.  Then do
a search for "foo:bar +yak:wak -id:id1 -id:id2 -id:id3 ... -id:idN"

this is assuming that you only want to display the first N (sounds
like N is ten) but there might be a lot more than N which match, and you
want the rest to display in their regular spots.  if you want them *ALL*
to be in the front, then making your second search "-promote:yes +(foo:bar
+yak:wak)" should work just fine.
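
a rough sketch of how the two query strings above could be assembled, in plain
Java with no Lucene calls.  the class and method names are made up for
illustration, the "id" field name is taken from the example above, and the
base query is wrapped in "+( ... )" in the second search as a defensive
choice (the example above leaves it unwrapped).  ids are assumed not to need
query-syntax escaping.

```java
import java.util.Arrays;
import java.util.List;

class TwoSearchQueries {
    // First search: the user's criteria restricted to promoted documents.
    static String promotedQuery(String base, String promoteClause) {
        return "+" + promoteClause + " +(" + base + ")";
    }

    // Second search: the user's criteria minus the documents already shown.
    static String remainderQuery(String base, String idField, List<String> shownIds) {
        StringBuilder sb = new StringBuilder("+(").append(base).append(")");
        for (String id : shownIds) {
            sb.append(" -").append(idField).append(':').append(id);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String base = "foo:bar +yak:wak";
        // Hypothetical ids of the N documents displayed from the first search.
        List<String> shown = Arrays.asList("id1", "id2", "id3");

        System.out.println(promotedQuery(base, "promote:yes"));
        System.out.println(remainderQuery(base, "id", shown));
    }
}
```

with a large N the second query grows by one prohibited clause per displayed
document, which is worth keeping in mind if N is ever much bigger than ten.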



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: custom sort

Posted by "raymondcreel (sent by Nabble.com)" <li...@nabble.com>.
Actually in this case I am sorting by score already but I'm not sure if that helps.  Regardless of how I do my primary sort, I want to tweak the results such that some hardcoded number of documents that match some criteria get pushed or frontloaded to the top of the results.  For instance think of a search engine where you generally are displaying a list of pages sorted by score, but you want 10 pages from a featured site to always show at the top of the first page, while leaving the rest of the sort order as it is.  That's why it's not something I can really do at the index stage by assigning boosts - I only want to boost those first 10 items that match the criteria, not all of them.

What I'm doing now is taking the whole resulting document collection, iterating through it and manually moving these 10 documents to the front of the collection.  This is slow and ugly.  I was hoping there might be a slicker way to do it as part of the actual sort.  I will play around with the custom sorting and report back if I figure out an elegant way to do it.
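
For reference, the manual reordering can be written as a single stable pass over the results. This is only a sketch in plain Java, not Lucene code: the ranked list stands in for the hits collection, and the class name, method name, and "featured" test are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FrontLoad {
    // Moves the first n elements that satisfy 'featured' to the front,
    // keeping the relative order of everything else unchanged.
    static <T> List<T> frontLoad(List<T> ranked, Predicate<T> featured, int n) {
        List<T> front = new ArrayList<>();
        List<T> rest = new ArrayList<>();
        for (T doc : ranked) {
            if (front.size() < n && featured.test(doc)) {
                front.add(doc);
            } else {
                rest.add(doc);
            }
        }
        front.addAll(rest);
        return front;
    }

    public static void main(String[] args) {
        // Hypothetical result list, already sorted by score.
        List<String> ranked = Arrays.asList("a", "f1", "b", "f2", "c", "f3");
        System.out.println(frontLoad(ranked, d -> d.startsWith("f"), 2));
    }
}
```

The pass itself is linear in the number of results; the cost that remains is loading the whole result set before reordering it.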

Thanks for all your replies.
Raymond
--
Sent from the Lucene - Java Users forum at Nabble.com:
http://www.nabble.com/custom-sort-t262833.html#a750675

Re: custom sort

Posted by Chris Hostetter <ho...@fucit.org>.
: You can just assign the field B some weight when creating the index?

that implies that the field "A" being sorted on is SCORE ... which isn't
always the case.

: Is it possible to write a custom sort for a query such that the first
: N documents that match a certain additional criteria get pushed to the
: top of the sort?  For instance say you sort your query based on field A,
: but you want to tweak the results such that the first 10 documents in
: the result set which have field B = some criteria will appear at the

absolutely, you can put just about any code you want in a custom
SortComparatorSource to order documents using whatever rules you want.
You'll probably need to hard code your special field name "B" into the
code however, either that or fetch it from a global variable or a system
property.
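
to give a feel for the ordering rule such a comparator would apply, here is a
sketch using a plain java.util.Comparator.  it is NOT the SortComparatorSource
API: the Doc class and its fields "a" and "b" are made up stand-ins for the
indexed field values, and the special value is passed in as a parameter rather
than hard coded.  note that a rule like this puts *every* matching document
first, not just the first N.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class PromotedFirstSort {
    // Hypothetical stand-in for the two field values of an indexed document.
    static final class Doc {
        final String a;
        final String b;
        Doc(String a, String b) { this.a = a; this.b = b; }
        @Override public String toString() { return a + "/" + b; }
    }

    // Documents whose field b equals specialB sort first; ties fall back to field a.
    static Comparator<Doc> promotedFirst(final String specialB) {
        return (x, y) -> {
            boolean px = specialB.equals(x.b);
            boolean py = specialB.equals(y.b);
            if (px != py) {
                return px ? -1 : 1;
            }
            return x.a.compareTo(y.a);
        };
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>(Arrays.asList(
                new Doc("1", "x"), new Doc("2", "special"),
                new Doc("3", "y"), new Doc("0", "special")));
        docs.sort(promotedFirst("special"));
        System.out.println(docs);
    }
}
```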


-Hoss




Re: custom sort

Posted by Chris Lu <ch...@gmail.com>.
You can just assign the field B some weight when creating the index?

-- 
Chris Lu
------------
Lucene Search RAD on Any Database
http://www.dbsight.net

On 8/29/05, raymondcreel (sent by Nabble.com) <li...@nabble.com> wrote:
> 
> Is it possible to write a custom sort for a query such that the first N documents that match a certain additional criteria get pushed to the top of the sort?  For instance say you sort your query based on field A, but you want to tweak the results such that the first 10 documents in the result set which have field B = some criteria will appear at the beginning of the resulting hits collection?
> 
> thanks much,
> raymond
> --
> Sent from the Lucene - Java Users forum at Nabble.com:
> http://www.nabble.com/custom-sort-t262833.html#a737222
> 
>
