Posted to solr-user@lucene.apache.org by con <co...@gmail.com> on 2008/11/08 20:53:42 UTC

full-import vs delta-import to update solr index

Hi guys,
I have a simple question.

The main difference between a full-import and a delta-import, as I
understand it, is that a full-import rebuilds the whole index each time it is
run, while a delta-import updates the existing index. So I preferred using the
delta-import, since my application may grow later.
In my data-config.xml, I have defined an entity like this:
		<entity name="test" transformer="TemplateTransformer" pk="USER_ID"
query="select * FROM EMP, CUSTOMER where EMP.userId=CUST.userId"
deltaQuery="select * FROM EMP, CUSTOMER where EMP.userId=CUST.userId">

This way I can update the index at a regular interval without restarting the
server or rebuilding the index, by running
http://localhost:8080/solr/core0/dataimport?command=delta-import

But the problem is that each time I run the delta-import, I get duplicate
entries in my index.
Is this because of a wrong data-config, or do I need to add more
parameters to some configuration files? Or is there a better
approach, or would a full-import each time serve better?

I would appreciate any suggestions or hints to solve this issue. Waiting for
your response.
Thanks in advance,
con.


Re: full-import vs delta-import to update solr index

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
You must let the tool identify the changed rows instead of providing a
"select * from" query as the deltaQuery.

See this section for more details:
http://wiki.apache.org/solr/DataImportHandler#head-9ee74e0ad772fd57f6419033fb0af9828222e041
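
For illustration only (this is not from the original reply), a minimal sketch of an entity whose deltaQuery returns just the primary keys of changed rows might look like the following. It assumes a hypothetical LAST_MODIFIED timestamp column on EMP, introduces the table aliases E and C, and uses the deltaImportQuery attribute, which may not be available in older Solr releases. The variables ${dataimporter.last_index_time} and ${dataimporter.delta.USER_ID} are supplied by the DataImportHandler.

```xml
<!-- Sketch only: LAST_MODIFIED is an assumed column, E and C are assumed aliases. -->
<entity name="test" transformer="TemplateTransformer" pk="USER_ID"
        query="select * from EMP E, CUSTOMER C where E.userId = C.userId"
        deltaQuery="select E.userId as USER_ID from EMP E
                    where E.LAST_MODIFIED &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="select * from EMP E, CUSTOMER C
                          where E.userId = C.userId
                          and E.userId = '${dataimporter.delta.USER_ID}'">
</entity>
```

This sketch only detects changes in EMP; rows changed in CUSTOMER would need their own check. Note also that Solr overwrites an existing document only when schema.xml declares a uniqueKey field and the imported rows populate it, so a missing uniqueKey is another possible cause of duplicates.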

-- 
--Noble Paul