Posted to java-user@lucene.apache.org by Yanick Gamelin <ya...@ericsson.com> on 2011/08/25 21:02:26 UTC

Lucene scoring and random result order

Hi all,

I have the following problem: Lucene returns results in a non-deterministic order.

I use a MultiSearcher to process a search, and when I get hits with the same score, they are returned in a random order.
I wouldn't care much about the order of the hits with the same score if I could get them all, so I could sort them myself.
But if we request a maximum number of results lower than the number of hits with the same score, we only get a subset of those hits, and that subset changes from one search to the next because the order is not guaranteed.
Sometimes the first part of the result list is consistent because the scores differ for those hits, but then we have a big block of hits with equal scores, so Lucene only takes what it needs to fill the rest of the list. Lucene randomly takes what it needs from the big block of equal scores.

As an example, imagine that x, y, and z have high scores and all the other letters share the same score.
Three consecutive searches will give:
[x,y,z,a,b,c,d,f,g,h,i,j]
[x,y,z,q,w,e,r,t,u,i,o,p]
[x,y,z,m,n,b,v,c,a,s,d,g]

Pretty annoying eh? So, what can I do about that?
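
To make it concrete, here is a minimal sketch in plain Java of the behaviour I am after. It does not use the Lucene API at all: Hit, BY_SCORE_THEN_ID, and topN are hypothetical names made up for illustration, and the scores are invented. The idea is simply that ties on score get broken by some stable key (here the id) before the list is cut to the requested size, so the same subset comes back every time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class DeterministicTopN {

    // Hypothetical stand-in for a search hit: a stable identifier plus a score.
    static final class Hit {
        final String id;
        final float score;

        Hit(String id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    // Highest score first; equal scores are ordered by id so the ordering is total.
    static final Comparator<Hit> BY_SCORE_THEN_ID = new Comparator<Hit>() {
        @Override
        public int compare(Hit a, Hit b) {
            int byScore = Float.compare(b.score, a.score);
            return byScore != 0 ? byScore : a.id.compareTo(b.id);
        }
    };

    // Sorts a copy of the hits and returns the ids of the first n.
    static List<String> topN(List<Hit> hits, int n) {
        List<Hit> sorted = new ArrayList<Hit>(hits);
        Collections.sort(sorted, BY_SCORE_THEN_ID);
        List<String> ids = new ArrayList<String>();
        for (Hit h : sorted.subList(0, Math.min(n, sorted.size()))) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // x, y, z score high; the remaining letters all share one (made-up) score.
        List<Hit> hits = new ArrayList<Hit>(Arrays.asList(
                new Hit("x", 3.0f), new Hit("y", 2.5f), new Hit("z", 2.0f),
                new Hit("a", 1.0f), new Hit("b", 1.0f), new Hit("c", 1.0f),
                new Hit("d", 1.0f), new Hit("f", 1.0f)));
        for (int run = 0; run < 3; run++) {
            Collections.shuffle(hits); // simulate hits arriving in arbitrary order
            System.out.println(topN(hits, 5)); // prints [x, y, z, a, b] on every run
        }
    }
}
```

Of course this only works here because the sketch sees every hit before truncating, which is exactly what I cannot do cheaply after the fact; the tie-break would have to happen inside the search itself.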