Posted to solr-user@lucene.apache.org by Adrian Pemsel <ap...@gmail.com> on 2010/03/29 09:37:33 UTC

Solr not returning all documents?

Hi,

As part of our application I have written a reindex task that runs through
all documents in a core one by one (using *:*, a start offset and a row
limit of 1) and adds them to a new core (potentially with a new schema).
However, while this approach works well for small sets, it does not seem
to work for larger data sets. The reindex task counts its offset into the
old core; this count stops at about 118000 and no more documents are
returned. However, numDocs says there are around 582000 documents in the old
core.
Am I making a wrong assumption in believing I should get all documents like
this?

Thanks,

Adrian

Re: Solr not returning all documents?

Posted by Lance Norskog <go...@gmail.com>.
Yes, this should work. It will be very slow.

There is a special hack: you can say sort=_docid_+asc (or +desc).
_docid_ is a magic field name that avoids sorting the results. With it,
pulling documents at row #1,000,000 should be only a little slower
than pulling documents at row #0.
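
For illustration, here is a minimal Python sketch of that kind of paging
loop. It is not the original reindex task: the core URL, the function
names, the page size and the use of wt=json are all assumptions made for
the example. It walks the old core with q=*:*, start and rows, asks for
sort=_docid_ asc, and stops when a page comes back empty.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, start, rows):
    """Build a /select URL for one page of a match-all query.

    base_url is a hypothetical core URL, e.g. http://localhost:8983/solr/oldcore
    """
    params = {
        "q": "*:*",             # match every document
        "start": start,         # offset into the result list
        "rows": rows,           # page size
        "sort": "_docid_ asc",  # index order, no real sorting needed
        "wt": "json",           # assumes the JSON response writer is enabled
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def fetch_page(url):
    """Fetch one page over HTTP and parse the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_all_docs(base_url, rows=100, fetch=fetch_page):
    """Yield every document in the core, one page at a time."""
    start = 0
    while True:
        page = fetch(build_select_url(base_url, start, rows))
        docs = page["response"]["docs"]
        if not docs:
            break  # an empty page means the offset ran past the last hit
        yield from docs
        start += len(docs)
```

A caller would loop over iter_all_docs("http://localhost:8983/solr/oldcore")
and add each document to the new core. Comparing the number of documents
yielded with response.numFound from the first page is one way to notice a
loop that stops early, as described in the question.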

On Mon, Mar 29, 2010 at 12:37 AM, Adrian Pemsel <ap...@gmail.com> wrote:
> Hi,
>
> As part of our application I have written a reindex task that runs through
> all documents in a core one by one (using *:*, a start offset and a row
> limit of 1) and adds them to a new core (potentially with a new schema).
> However, while this approach works well for small sets, it does not seem
> to work for larger data sets. The reindex task counts its offset into the
> old core; this count stops at about 118000 and no more documents are
> returned. However, numDocs says there are around 582000 documents in the old
> core.
> Am I making a wrong assumption in believing I should get all documents like
> this?
>
> Thanks,
>
> Adrian
>



-- 
Lance Norskog
goksron@gmail.com