Posted to general@lucene.apache.org by mark <ma...@justmags.com> on 2011/04/15 09:03:29 UTC

Result order when no sort defined

Hi,

I'm doing a query where it pulls back a collection of results based upon a
list of id values.  eg:

+(id:deb9d38d-58cb-4ef4-bb30-8deba344f7a5 OR
id:bf643b9b-7f1e-4a57-8218-1998d2ce2e0c OR
id:4a697a1e-f133-4e81-aacd-a0b73aa3b67f OR
id:f522b46c-f039-4bd6-806b-0e33f5142f3b OR
id:5a557987-9834-4cb9-8704-bb963c06d1ce OR
id:95b0f383-1f19-4641-8551-4dfd965b4e9b OR
id:28eacd12-1705-4a0e-a512-eccf692e3adb OR
id:be6f7b7f-26c2-45f4-8003-22cfb351ddfd)

This query returns the results in a consistent order, but I can't discern
how it works out that order.

About half of the results all come back with a score of 0.5912891 and then
some others have a lower 0.55549824 and the last one has 0.53078187.

id is a string field and each of these should be an exact match.

I don't really understand why they've got differing score values.

I'm trying to get them returned in the order that the ids are defined in the
query.  eg.  +(id:a or id:b or id:c) returns a list of [a,b,c]




Re: Result order when no sort defined

Posted by Chris Hostetter <ho...@fucit.org>.
: I'm doing a query where it pulls back a collection of results based upon a
: list of id values.  eg:
	....
: This query returns the results in a consistent order, but I can't discern
: how it works out that order.
: 
: About half of the results all come back with a score of 0.5912891 and then
: some others have a lower 0.55549824 and the last one has 0.53078187.
: 
: id is a string field and each of these should be an exact match.

the specifics of how you executed the search (ie: what API you used and 
what your code looks like) are important.   At the lowest level, docs come 
back in "index order" by default -- that historically has been the order 
they were put into the index, but as Lucene evolves and new merge 
algorithms are added it may not always be true.

based on what you're describing however, it's possible you are using a 
TopDocs based API which returns them sorted by score.

The real question is why the docs don't all have identical scores ... the 
score explanation feature should help you understand that.  the details 
really depend on whether there are any docs with the same id value (even if 
they are deleted) and how the id value is indexed. (and if there are doc 
boosts, etc...)
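
as a rough illustration of the duplicate-id case only: assuming the default 
similarity of the Lucene 3.x era, the idf factor is 1 + ln(numDocs / (docFreq + 1)), 
and docFreq still counts deleted docs until they are merged away.  the sketch 
below just re-implements that formula in plain Java; the class name and the 
numDocs value are made up, and the real scores also involve norms, coord and 
queryNorm, so the explanation output remains the thing to check.

```java
// Sketch only: re-implements the idf formula documented for Lucene 3.x
// DefaultSimilarity. Assumes the default similarity is in use; the class
// name and the numbers below are hypothetical.
class IdfSketch {

    // idf = 1 + ln(numDocs / (docFreq + 1))
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        int numDocs = 1000; // hypothetical index size

        // An id that occurs in exactly one doc.
        System.out.println("unique id:         " + idf(1, numDocs));

        // An id that also occurs in a deleted (not yet merged away) doc,
        // so its docFreq is 2 and its idf is lower.
        System.out.println("id with duplicate: " + idf(2, numDocs));
    }
}
```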

: I'm trying to get them returned in the order that the ids are defined in the
: query.  eg.  +(id:a or id:b or id:c) returns a list of [a,b,c]

i don't know of any simple way to do that with Lucene scoring.  if these 
are "unique keys" and you know in advance the upper bound number of docs 
you'll get back (ie: the number of ids) and you want them all at once (ie: 
no pagination) then it's pretty trivial to sort them in the client.
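
a minimal sketch of that client-side sort, in plain Java with no Lucene calls. 
it assumes you have already collected the hits into a map keyed by the id 
field; the class name, method name and sample values are all made up for 
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: reorders search hits to follow the order of the requested ids.
// All names here are hypothetical; nothing in this class is a Lucene API.
class RequestedOrder {

    // Walks the requested ids in order and picks the matching hit for each.
    // Ids that were requested but not found in the results are skipped.
    static <T> List<T> inRequestedOrder(List<String> requestedIds,
                                        Map<String, T> hitsById) {
        List<T> ordered = new ArrayList<>(requestedIds.size());
        for (String id : requestedIds) {
            T hit = hitsById.get(id);
            if (hit != null) {
                ordered.add(hit);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("a", "b", "c");

        // Hits as they might come back from the search, in any order.
        Map<String, String> hitsById = new HashMap<>();
        hitsById.put("c", "doc c");
        hitsById.put("a", "doc a");

        // prints [doc a, doc c]
        System.out.println(inRequestedOrder(requested, hitsById));
    }
}
```

building the map is one pass over the hits and the reorder is one pass over 
the ids, so this stays cheap as long as the whole result set fits in memory.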


-Hoss