Posted to solr-user@lucene.apache.org by "Mohan, Sowmya" <So...@icf.com> on 2016/10/18 15:06:38 UTC

CachedSqlEntityProcessor with delta-import

Good morning,

Can CachedSqlEntityProcessor be used with delta-import? In my setup, when running a delta-import with CachedSqlEntityProcessor, the child entity values are not correctly updated for the parent record. I am on Solr 4.3. Has anyone experienced this and, if so, how did you resolve it?

Thanks,
Sowmya.


RE: CachedSqlEntityProcessor with delta-import

Posted by "Mohan, Sowmya" <So...@icf.com>.
Thanks. We did implement the delete by query on another core and thought of giving the delta import a try here. It looks like differential indexing via full-import, with deletes handled by delete by id/query, is the way to go.

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Tuesday, October 25, 2016 12:31 PM
To: solr-user <so...@lucene.apache.org>
Subject: Re: CachedSqlEntityProcessor with delta-import

Why not use delete by id rather than query? It'll be more efficient....

Probably not a big deal though.

On Tue, Oct 25, 2016 at 1:47 AM, Aniket Khare <an...@gmail.com> wrote:
> Hi Sowmya,
>
> In my case I have implemented the data indexing approach suggested by
> James, and for deleting the records I have created my own indexing job
> which periodically calls the delete API, passing the list of unique IDs.
> https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
>
> http://localhost:8983/solr/update?stream.body=
> <delete><query>id:1234</query></delete>&commit=true
>
> Thanks,
> Aniket S. Khare
>
> On Tue, Oct 25, 2016 at 1:32 AM, Mohan, Sowmya <So...@icf.com> wrote:
>
>> Thanks James. That's what I was using before. But I also wanted to 
>> perform deletes using deletedPkQuery and hence switched to delta 
>> imports. The problem with using deletedPkQuery with the full import 
>> is that dataimporter.last_index_time is no longer accurate.
>>
>> Below is an example of my deletedPkQuery. If I run the full-import for
>> a differential index, that would update the last index time. Running
>> the delta import to remove the deleted records then wouldn't do
>> anything since nothing changed since the last index time.
>>
>>
>>  deletedPkQuery="SELECT id
>>                  FROM content
>>                  WHERE active = 1 AND lastUpdate > '${dataimporter.last_index_time}'"
>>
>> -----Original Message-----
>> From: Dyer, James [mailto:James.Dyer@ingramcontent.com]
>> Sent: Friday, October 21, 2016 4:23 PM
>> To: solr-user@lucene.apache.org
>> Subject: RE: CachedSqlEntityProcessor with delta-import
>>
>> Sowmya,
>>
>> My memory is that the cache feature does not work with Delta Imports.  
>> In fact, I believe that nearly all DIH features except straight JDBC 
>> imports do not work with Delta Imports.  My advice is to not use the 
>> Delta Import feature at all as the same result can (often 
>> more-efficiently) be accomplished following the approach outlined here:
>> https://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport
>>
>> James Dyer
>> Ingram Content Group
>>
>> -----Original Message-----
>> From: Mohan, Sowmya [mailto:Sowmya.Mohan@icf.com]
>> Sent: Tuesday, October 18, 2016 10:07 AM
>> To: solr-user@lucene.apache.org
>> Subject: CachedSqlEntityProcessor with delta-import
>>
>> Good morning,
>>
>> Can CachedSqlEntityProcessor be used with delta-import? In my setup,
>> when running a delta-import with CachedSqlEntityProcessor, the child
>> entity values are not correctly updated for the parent record. I am on
>> Solr 4.3. Has anyone experienced this and, if so, how did you resolve it?
>>
>> Thanks,
>> Sowmya.
>>
>>
>
>
> --
> Regards,
>
> Aniket S. Khare

Re: CachedSqlEntityProcessor with delta-import

Posted by Erick Erickson <er...@gmail.com>.
Why not use delete by id rather than query? It'll be more efficient....

Probably not a big deal though.
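
For reference, a minimal sketch of the two delete forms as XML update messages. The ID 1234 comes from the example elsewhere in this thread; 5678 is a made-up placeholder, and the sketch assumes `id` is the schema's uniqueKey field.

```xml
<!-- Message 1, delete by query: Solr runs the query to find matching documents. -->
<delete><query>id:1234</query></delete>

<!-- Message 2, delete by id: documents are named directly by uniqueKey value.
     Several <id> elements can be sent in a single message. -->
<delete>
  <id>1234</id>
  <id>5678</id>
</delete>
```

Either message can be sent to the update handler in the same way as the stream.body URL quoted in this thread, followed by a commit.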


Re: CachedSqlEntityProcessor with delta-import

Posted by Aniket Khare <an...@gmail.com>.
Hi Sowmya,

In my case I have implemented the data indexing approach suggested by James,
and for deleting the records I have created my own indexing job which
periodically calls the delete API, passing the list of unique IDs.
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers

http://localhost:8983/solr/update?stream.body=
<delete><query>id:1234</query></delete>&commit=true

Thanks,
Aniket S. Khare


RE: CachedSqlEntityProcessor with delta-import

Posted by "Mohan, Sowmya" <So...@icf.com>.
Thanks James. That's what I was using before. But I also wanted to perform deletes using deletedPkQuery and hence switched to delta imports. The problem with using deletedPkQuery with the full import is that dataimporter.last_index_time is no longer accurate. 

Below is an example of my deletedPkQuery. If I run the full-import for a differential index, that would update the last index time. Running the delta import to remove the deleted records then wouldn't do anything since nothing changed since the last index time.


 deletedPkQuery="SELECT id
                 FROM content
                 WHERE active = 1 AND lastUpdate > '${dataimporter.last_index_time}'"



RE: CachedSqlEntityProcessor with delta-import

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
Sowmya,

My memory is that the cache feature does not work with Delta Imports.  In fact, I believe that nearly all DIH features except straight JDBC imports do not work with Delta Imports.  My advice is to not use the Delta Import feature at all as the same result can (often more-efficiently) be accomplished following the approach outlined here: https://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport

James Dyer
Ingram Content Group
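
A minimal sketch of the approach described on that wiki page, combined with a cached child entity. The table and column names (`item`, `feature`, `last_modified`, `item_id`, `description`) are hypothetical, and the `${dataimporter.request.clean}` test is the pattern the wiki page describes. Treat it as an illustration under those assumptions, not a configuration tested on Solr 4.3.

```xml
<dataConfig>
  <!-- dataSource definition omitted -->
  <document>
    <!-- Parent entity: when the request passes clean=false, only rows changed
         since the last import are selected; otherwise every row is selected. -->
    <entity name="item" pk="id"
            query="SELECT * FROM item
                   WHERE '${dataimporter.request.clean}' != 'false'
                      OR last_modified &gt; '${dataimporter.last_index_time}'">
      <!-- Child entity: its query runs once, the rows are cached,
           and each parent row looks up its children by key. -->
      <entity name="feature"
              processor="CachedSqlEntityProcessor"
              query="SELECT item_id, description FROM feature"
              cacheKey="item_id"
              cacheLookup="item.id"/>
    </entity>
  </document>
</dataConfig>
```

With a config like this, a differential run would be requested with `command=full-import&clean=false`, and a complete rebuild with `command=full-import&clean=true`. Note that the cached child query still reads its whole table on every run, so the saving applies to the parent rows only.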
