Posted to solr-user@lucene.apache.org by Pulkit Singhal <pu...@gmail.com> on 2011/10/05 03:00:46 UTC

DIH full-import with clean=false is still removing old data

Hello,

I have a dataset of 1,110,000 unique products, each in its own file.
It is split across three directories of 500,000, 110,000, and 500,000
files respectively.

When I run:
http://localhost:8983/solr/bbyopen/dataimport?command=full-import&clean=false&commit=true
The first 500,000 entries are indexed successfully, and the next
110,000 entries also work ... but after I run the third full-import on
the last set of 500,000 entries, the document count stays at 610,000
... it doesn't go up to 1,110,000!
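For reference, the kind of DIH config I mean looks roughly like this
(the paths, processors, and field names below are only placeholders to
illustrate the setup, not my exact data-import.xml):

  <dataConfig>
    <dataSource type="FileDataSource" encoding="UTF-8"/>
    <document>
      <!-- FileListEntityProcessor walks one directory and hands each
           file to the inner entity -->
      <entity name="files"
              processor="FileListEntityProcessor"
              baseDir="/data/bbyopen/batch1"
              fileName=".*\.xml"
              recursive="true"
              rootEntity="false">
        <!-- XPathEntityProcessor turns each product file into one document -->
        <entity name="product"
                processor="XPathEntityProcessor"
                url="${files.fileAbsolutePath}"
                forEach="/product">
          <field column="sku"  xpath="/product/sku"/>
          <field column="name" xpath="/product/name"/>
        </entity>
      </entity>
    </document>
  </dataConfig>

Between runs only baseDir changes to point at the next directory, and
each full-import is issued with clean=false so the earlier batches are
supposed to stay in the index.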

1) Is there some kind of limit here? Why does full-import keep the
initial 500,000 entries and let me add 110,000 more with a second
full-import, but when I try a 3rd full-import, the document count
doesn't go up?

2) I know for sure that all the data is unique. Since I am not doing
delta-imports, I have NOT specified any primary key in the
data-import.xml file. But I do have a uniqueKey in the schema.xml
file.
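For what it's worth, by uniqueKey I mean the standard schema.xml
declaration; something like this (the field name "sku" is just an
example, not necessarily my actual key field):

  <field name="sku" type="string" indexed="true" stored="true" required="true"/>
  <uniqueKey>sku</uniqueKey>

Solr overwrites any incoming document whose uniqueKey value matches one
already in the index, so the count only grows for keys it hasn't seen
before.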

Any tips?
- Pulkit

Re: DIH full-import with clean=false is still removing old data

Posted by Pulkit Singhal <pu...@gmail.com>.
Bah, it worked after cleaning it out for the 3rd time; I don't know
what I did differently this time :(

<result name="response" numFound="1110983" start="0"/>
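(That count is from a plain match-all query; something along these
lines returns the total without fetching any documents:

  http://localhost:8983/solr/bbyopen/select?q=*:*&rows=0

numFound in the response is the total number of indexed documents.)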
