Posted to solr-user@lucene.apache.org by antoniosi <an...@gmail.com> on 2011/05/25 01:42:30 UTC

newbie question for DataImportHandler

Hi,

I am new to Solr; I apologize in advance if this is a stupid question.

I have created a simple database with a single table of 3 columns: id,
name, and last_update.

I populate the database with 1 million test rows.
I run Solr, go to the data import handler development console, and do a full
import. I then use the "Luke" tool to look at the content of the Lucene index.

This all works fine so far.

I remove all 1 million rows from my table and populate the table with
another million rows of data.
I remove the index that Solr previously created. I restart Solr, go to the
data import handler development console, and do the full import again.

I use the "Luke" tool to look at the content of the Lucene index. However, I
am seeing the old data in my new index.

Does Solr keep a cached copy of the index somewhere?

I hope I have described my problem clearly.

Thanks in advance.

--
View this message in context: http://lucene.472066.n3.nabble.com/newbie-question-for-DataImportHandler-tp2982277p2982277.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: newbie question for DataImportHandler

Posted by Zac Smith <za...@trinkit.com>.
Sounds like you might not be committing the delete. How are you deleting it?
If you run the data import handler with clean=true (which is the default), it will delete the existing data for you anyway, so you don't need to delete it yourself.
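
For reference, a minimal sketch of what such a request could look like when built by hand. The command, clean, and commit parameters are the ones the data import handler accepts; the host, port, and the /dataimport handler path are assumptions that depend on how the handler is registered in solrconfig.xml, and full_import_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def full_import_url(base_url, clean=True, commit=True):
    """Build a DataImportHandler full-import URL.

    base_url and the "/dataimport" path are assumptions; adjust them to
    match the handler name configured in solrconfig.xml.
    """
    params = {
        "command": "full-import",
        # clean=true asks the handler to remove existing documents first
        "clean": str(clean).lower(),
        # commit=true makes the deletes and adds visible after the import
        "commit": str(commit).lower(),
    }
    return base_url.rstrip("/") + "/dataimport?" + urlencode(params)


if __name__ == "__main__":
    # Only prints the URL; send it with a browser, curl, or urllib.request.
    print(full_import_url("http://localhost:8983/solr"))
```

Opening the printed URL in a browser should have the same effect as pressing the full-import button in the development console with clean and commit checked.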

Hope that helps.


RE: newbie question for DataImportHandler

Posted by Kevin Bootz <kb...@caci.com>.
The original post states that the index was deleted. I'm guessing that means the physical files, /data/....
quote
> populate the table with another million rows of data.
> I remove the index that solr previously create. I restart solr and go to the
> data import handler development console and do the full import again.
endquote

Is there a separate cache that could be causing the issue? I'm a newbie as well, and it seems that if I delete the index there shouldn't be any vestigial info left anywhere.

Thanks

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Sunday, May 29, 2011 9:00 PM
To: solr-user@lucene.apache.org
Subject: Re: newbie question for DataImportHandler

This trips up a lot of folks. Solr just marks docs as deleted; the terms etc. are left in the index until an optimize is performed or the segments are merged. The latter isn't very predictable, so just do an optimize.

The docs aren't returned as results though.

Best
Erick

Re: newbie question for DataImportHandler

Posted by Erick Erickson <er...@gmail.com>.
This trips up a lot of folks. Solr just marks docs as deleted; the terms etc.
are left in the index until an optimize is performed or the segments are
merged. The latter isn't very predictable, so just do an optimize.

The docs aren't returned as results though.
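
As a rough illustration of the point above, the sketch below builds an optimize request and estimates how many deleted docs are still sitting in the segments. The optimize=true parameter on the update handler is standard; the host, port, and /update path are assumptions about the local setup, and both function names are hypothetical. The numDocs and maxDoc figures are the ones Luke and the Solr statistics page report; maxDoc still counts deleted docs that have not been merged away.

```python
from urllib.parse import urlencode


def optimize_url(base_url):
    """Build an update-handler URL that asks Solr to optimize the index.

    base_url and the "/update" path are assumptions about the local setup.
    """
    return base_url.rstrip("/") + "/update?" + urlencode({"optimize": "true"})


def deleted_doc_count(num_docs, max_doc):
    """Estimate docs marked deleted but still present in the segments.

    num_docs counts live docs only; max_doc also includes deleted docs
    that have not yet been merged out, so the difference is the leftover.
    """
    return max_doc - num_docs


if __name__ == "__main__":
    print(optimize_url("http://localhost:8983/solr"))
    # Illustrative numbers only: 1M live docs on top of 1M deleted ones.
    print(deleted_doc_count(1000000, 2000000))
```

After an optimize, numDocs and maxDoc should report the same value, and the old terms should no longer show up in Luke.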

Best
Erick