Posted to solr-user@lucene.apache.org by mark12345 <ma...@yahoo.com.au> on 2013/03/12 05:43:09 UTC

Solr _docid_ parameter

In Solr, I noticed that I can sort by the internal Lucene _docid_.

->   http://wiki.apache.org/solr/CommonQueryParameters

> You can sort by index id using sort=_docid_ asc or sort=_docid_ desc

* I have also read that the docid is a sequential number.

->   http://lucene.472066.n3.nabble.com/Get-DocID-after-Document-insert-td556278.html

>  Your document IDs may change, and in fact *will* change if you delete a
> document and then optimize. Say you index 100 docs, delete number 50 and
> optimize. Documents that originally had IDs 51-100 will now have IDs 50-99
> and your hierarchy will be messed up. 
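
The renumbering described in that quote can be illustrated with a toy model. This is only a sketch and not Lucene code: it assumes a single "segment" represented as a Python list, treats a document's internal ID as its position in that list, and follows the quote's 1-based numbering. The `optimize` function and the `doc-N` names are made up for illustration.

```python
# Toy model only: this is not Lucene code. A "segment" is a Python list and a
# document's internal ID is simply its position in that list.

def optimize(docs, deleted):
    """Drop deleted documents and compact the rest, as a full merge would."""
    return [doc for doc in docs if doc not in deleted]

docs = [f"doc-{n}" for n in range(1, 101)]   # index 100 docs, numbered 1..100
before = {doc: pos for pos, doc in enumerate(docs, start=1)}

compacted = optimize(docs, deleted={"doc-50"})
after = {doc: pos for pos, doc in enumerate(compacted, start=1)}

print(before["doc-51"], "->", after["doc-51"])    # 51 -> 50
print(before["doc-100"], "->", after["doc-100"])  # 100 -> 99
print(before["doc-49"], "->", after["doc-49"])    # 49 -> 49
```

Anything stored elsewhere that pointed at "ID 51" would now point at a different document, which is the breakage the quote warns about.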

->   http://www.garethfreeman.com/2011/11/sorting-results-by-order-indexed-in.html

> Just a quick one. If you are looking to sort your Solr results by the
> order they were indexed you can use sort=_docid_ asc or sort=_docid_ desc
> as your sorting query parameter. 

So there is a slight chance that the _docid_ represents document creation
order, but the sources above disagree on whether that order is stable.
Can anyone with knowledge of the Solr/Lucene 4.x internals clarify how
the _docid_ field behaves?




--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-docid-parameter-tp4046544.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr _docid_ parameter

Posted by mark12345 <ma...@yahoo.com.au>.
The following relates directly to my question above.  Thanks Erick.


Erick Erickson wrote
> Don't use the internal Lucene doc ID. It _will_ change, even the
> relationship between existing docs will change. When cores are merged, the
> Lucene doc IDs are renumbered. Segments are NOT merged in insertion order,
> they're merged to try to not keep rewriting large segments.
> 
> So if you rely on any ordering based on insertion order by trying to use
> internal Lucene doc ID, you'll be disappointed.
> 
> I really think you'll have to generate something yourself that you can
> count on.
> 
> Best
> Erick
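
One way to "generate something yourself", sketched under assumptions: the application stamps each document with its own sequence number before sending it to Solr and then sorts on that field instead of _docid_. The field name `insert_seq`, the helper names, and the base URL are hypothetical; the field would have to be declared in the schema as a sortable numeric field, and nothing here talks to a real Solr instance.

```python
import itertools
from urllib.parse import urlencode

# Hypothetical field name; it would have to be declared in the schema as a
# sortable numeric field.
SEQ_FIELD = "insert_seq"

# Process-local counter, used here only to keep the sketch self-contained.
_counter = itertools.count(1)

def stamp(doc):
    """Return a copy of doc carrying the next insertion sequence number."""
    stamped = dict(doc)
    stamped[SEQ_FIELD] = next(_counter)
    return stamped

def select_url(base_url, query="*:*", descending=False):
    """Build a select URL that sorts on the application-generated field."""
    order = "desc" if descending else "asc"
    params = {"q": query, "sort": f"{SEQ_FIELD} {order}"}
    return f"{base_url}/select?{urlencode(params)}"

docs = [stamp({"id": "a"}), stamp({"id": "b"}), stamp({"id": "c"})]
url = select_url("http://localhost:8983/solr/collection1")
print([d[SEQ_FIELD] for d in docs])
print(url)
```

An in-memory counter does not survive a restart and is not shared between indexing clients, so a real setup would need a persistent source for the value, such as a database sequence or an index-time timestamp. The point of the sketch is only that the ordering value lives in the document itself, so segment merges cannot change it.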





--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-docid-parameter-tp4046544p4046840.html