Posted to solr-user@lucene.apache.org by Selvam <s....@gmail.com> on 2016/09/26 10:36:02 UTC

Retaining a field value during DataImport

Hi All,

We use DataImportHandler to import data from Redshift. We want to overwrite
some 250M existing records (each with around 350 columns) while retaining
the value of just one field in those 250M records. The reason is that this
field is multi-valued and requires a costly query to rebuild its values.

I learned about Transformers, but I am not sure whether it is possible to
get the old document's value during that process. Any help would be
appreciated.


-- 
Regards,
Selvam

Re: Retaining a field value during DataImport

Posted by Selvam <s....@gmail.com>.
Hi,

Thanks, I will look into the options you suggested.


-- 
Regards,
Selvam
KnackForge <http://knackforge.com>

Re: Retaining a field value during DataImport

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Transformers do not see what's in the Solr index; they run too early
in the processing chain.

You could probably do something by exporting that field's value,
caching it, and injecting it back from that cache with a transformer.
Messy but doable.
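
The export, cache, and inject idea above can be sketched roughly as follows.
This is only an illustration of the logic in Python, not a DIH transformer
(those are written in Java or as script transformers). The field names "id"
and "tags" and both function names are hypothetical, and the exported
documents are assumed to be already available as dictionaries, for example
from a prior query against the existing index.

```python
from typing import Any, Dict, Iterable


def build_cache(exported_docs: Iterable[Dict[str, Any]],
                key_field: str,
                keep_field: str) -> Dict[Any, Any]:
    """Map each document key to the previously indexed value of keep_field."""
    cache: Dict[Any, Any] = {}
    for doc in exported_docs:
        # Only remember documents that actually carry the field to retain.
        if key_field in doc and keep_field in doc:
            cache[doc[key_field]] = doc[keep_field]
    return cache


def inject_cached_value(row: Dict[str, Any],
                        cache: Dict[Any, Any],
                        key_field: str,
                        keep_field: str) -> Dict[str, Any]:
    """Return a copy of an incoming row with keep_field restored from the cache."""
    merged = dict(row)
    cached = cache.get(row.get(key_field))
    if cached is not None:
        # The old value wins over whatever the import produced for this field.
        merged[keep_field] = cached
    return merged
```

With a cache built from the old index, each row coming from Redshift would
pass through inject_cached_value before being sent to Solr. At 250M records
the cache would most likely need to live in an external key-value store
rather than in memory; that choice is outside this sketch.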

UpdateRequestProcessor would be able to do it, but your request from
DIH is coming as a new document, not an update. So the old one would
be overridden.

SOLR-9530 could be an answer to that, but it is just a design so far -
no implementation. You could write one yourself or see if showing
excitement on the JIRA and being ready to debug the patch would get
the committer's attention.
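
For illustration only, and as an assumption about the general direction
rather than a description of what SOLR-9530 specifies: Solr's atomic update
syntax lets a document carry {"set": value} modifiers, and fields that are
not mentioned keep their existing values (this requires the fields to be
stored or to have docValues). A rough Python sketch of turning a full
document into such an update while leaving one field untouched could look
like this. The function name and the field names "id" and "tags" are
hypothetical.

```python
from typing import Any, Dict


def to_atomic_update(doc: Dict[str, Any],
                     key_field: str,
                     keep_field: str) -> Dict[str, Any]:
    """Build an atomic-update document that sets every field except keep_field."""
    # The unique key is sent as a plain value so Solr can find the old document.
    update: Dict[str, Any] = {key_field: doc[key_field]}
    for name, value in doc.items():
        if name in (key_field, keep_field):
            # Leaving keep_field out means its indexed value is retained.
            continue
        update[name] = {"set": value}
    return update
```

The resulting dictionaries could be posted as JSON to the /update handler
instead of going through DIH, which is a different import path from the one
discussed above.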


Regards,
    Alex.
----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/
