Posted to solr-user@lucene.apache.org by Scott Smith <ss...@mainstreamdata.com> on 2014/01/26 19:01:38 UTC

Tie breakers when sorting equal items

I promised to ask this on the forum just to confirm what I assume is true.

Suppose you're returning results using a sort order based on some field (so, not relevancy). For example, suppose it's a date field which indicates when the document was loaded into the Solr index. Suppose two items have exactly the same date/time in that field. Would Solr return the two items in the order in which they were inserted? I would assume that the answer is "not necessarily".

I know that you can add a secondary sort field if one exists that would provide the desired ordering. I also know that I could set up some kind of numbering scheme that would provide the same result (the customer doesn't want to pay for that).
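
For illustration, a minimal sketch of what a secondary sort looks like in a request, using Solr's comma-separated `sort` parameter. The field names `load_date` and `id` and the helper `build_sorted_query` are assumptions made up for this example, not anything from the actual schema:

```python
from urllib.parse import urlencode


def build_sorted_query(primary_field, tiebreak_field, q="*:*", rows=10):
    """Return a URL query string for a Solr select request with a two-level sort.

    Both field names are hypothetical; substitute fields from your own schema.
    """
    params = {
        "q": q,
        "rows": rows,
        # The first clause is the main sort; the second clause is only
        # consulted when two documents have equal values in the first.
        "sort": f"{primary_field} desc, {tiebreak_field} asc",
    }
    return urlencode(params)


query_string = build_sorted_query("load_date", "id")
```

The tie-breaking field only gives a stable order if its values are unique per document and do not change, which is why a unique key is a common choice. It does not by itself reproduce insertion order.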

So, I'm really just asking if Solr has any guarantees that when you sort on a field and two items have the same value, they will be sorted in the order they were inserted into the index.  Again, I assume the answer is "no", but I said I would ask.

Re: Tie breakers when sorting equal items

Posted by Erick Erickson <er...@gmail.com>.
It's even worse: the order may change. The tiebreaker is the internal
Lucene doc ID, and the relative order of two documents' IDs may change
as segments are merged. So docA may sort < docB for a while, then
eventually sort docA > docB.

I _think_ you can play games with the merge policy to work around this,
but really it'd be much cheaper to add some kind of counter that is
guaranteed not to change.
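
A rough sketch of the counter idea, under stated assumptions: the field name `insert_seq`, the helper `with_insert_seq`, and the sample documents are all hypothetical, and the in-process counter stands in for whatever persistent sequence source a real indexer would need. The sort is done locally in Python only to show the ordering that a two-level sort on the date and the counter would be expected to give:

```python
from itertools import count

# In-process counter for illustration only; it resets on restart, so a real
# indexer would need a persistent, monotonically increasing source instead.
_seq = count(1)


def with_insert_seq(doc):
    """Return a copy of doc with a hypothetical insert_seq field attached."""
    stamped = dict(doc)
    stamped["insert_seq"] = next(_seq)
    return stamped


# Two documents share the same load_date; "b" is stamped before "a".
docs = [
    with_insert_seq(d)
    for d in (
        {"id": "b", "load_date": "2014-01-26T19:01:38Z"},
        {"id": "a", "load_date": "2014-01-26T19:01:38Z"},
        {"id": "c", "load_date": "2014-01-25T08:00:00Z"},
    )
]

# Local stand-in for sorting on load_date first, then insert_seq:
# ties on load_date are resolved by the stored counter, not by doc ID.
ordered = sorted(docs, key=lambda d: (d["load_date"], d["insert_seq"]))
ordered_ids = [d["id"] for d in ordered]
```

Because the counter is stored with the document, the tie order no longer depends on internal doc IDs and so should not shift when segments merge.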

Best,
Erick
