Posted to solr-user@lucene.apache.org by Roxana Danger <ro...@reedonline.co.uk> on 2015/09/28 17:23:19 UTC

entity processing order during updates

Hello,
     I am importing two entities into Solr from two different tables, and
I have defined an update request processor chain with two custom processor
factories:
     - the first processor needs to run on one type of entity first and
then on the other (I distinguish the "entity type" with a field called
table). In the data import config file I list the entities in the order in
which they should be processed.
     - the second processor needs to be executed after the first one has
completed.
     With only the first processor in the chain, the updates work fine.
However, when I added the second processor, the first update processor no
longer seems to receive the entities in the order I expected.
     Has anyone had this problem before? Could anyone help me configure
this?
     Thank you very much in advance,
             Roxana







Re: entity processing order during updates

Posted by Roxana Danger <ro...@reedonline.co.uk>.
Of course, thank you!
Hopefully it will be clearer now. I have:
    - in db-config:
          <document>
                 <entity> ...
                         <field column="table" template="E1" />
                 </entity>
                 <entity> ....
                         <field column="table" template="E2" />
                 </entity>
          </document>
    - in config:
            <updateRequestProcessorChain name="retrieveDetails">
                 <processor class="myClass1"/>
                 <processor class="myClass2"/>
                 <processor class="solr.LogUpdateProcessorFactory" />
                 <processor class="solr.RunUpdateProcessorFactory" />
            </updateRequestProcessorChain>

I need the following order to be executed:
           - import data from DB for E1
           - import data from DB for E2
           - execute myClass1 for all the docs
           - execute myClass2 for all the docs

Sometimes it seems to load data for E2 before all the data for E1 has been
imported.
Also, when the import for E2 begins, have the analyzers for the fields
associated with E1 already been executed?
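
A note on the requested order: an update request processor chain is applied
per document, so each document passes through every processor in the chain
before the next document enters it. A single chain therefore interleaves
myClass1 and myClass2 instead of running myClass1 over all documents first.
The sketch below only illustrates that flow in plain Java. It is not Solr
code: Stage, ChainOrderSketch and the document ids are hypothetical
stand-ins for the real processors and documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChainOrderSketch {

    /** Hypothetical stand-in for one update processor in a chain. */
    static class Stage {
        private final String name;
        private final Stage next;        // null marks the end of the chain
        private final List<String> log;  // records the order of calls

        Stage(String name, Stage next, List<String> log) {
            this.name = name;
            this.next = next;
            this.log = log;
        }

        /** Handle one document, then hand it to the next stage. */
        void processAdd(String doc) {
            log.add(name + ":" + doc);
            if (next != null) {
                next.processAdd(doc);
            }
        }
    }

    /** Push each document through the two-stage chain, one document at a time. */
    static List<String> run(List<String> docs) {
        List<String> log = new ArrayList<>();
        Stage second = new Stage("myClass2", null, log);
        Stage first = new Stage("myClass1", second, log);
        for (String doc : docs) {
            first.processAdd(doc);
        }
        return log;
    }

    public static void main(String[] args) {
        // Two E1 documents followed by one E2 document (made-up ids).
        System.out.println(run(Arrays.asList("E1-a", "E1-b", "E2-a")));
    }
}
```

In this model myClass2 sees E1-a before myClass1 sees E1-b, so a step that
must wait for all documents cannot live in the same chain as a per-document
processor. It would have to run as a separate pass after the import.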

Thank you very much again,
Roxana


On 30 September 2015 at 15:59, Alexandre Rafalovitch <ar...@gmail.com>
wrote:

> Hmm. It seems I misread "the second processor needs to be executed
> after complete the first one." In fact, I am still unsure what that is
> supposed to mean.
>
> Could you give a more concrete example of the sequence with, say, 2
> items of each type, and what you see vs. what you expect to see?
>
> And I assume for DIH, you have two top level entity definitions next
> to each other. Not nested entities, no update clauses (just full
> import), etc.



-- 
Roxana Danger | Data Scientist
Dragon Court, 27-29 Macklin Street, London, WC2B 5LX
Tel: 020 7067 4568
reed.co.uk <http://www.reed.co.uk/>

Re: entity processing order during updates

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Hmm. It seems I misread "the second processor needs to be executed
after complete the first one." In fact, I am still unsure what that is
supposed to mean.

Could you give a more concrete example of the sequence with, say, 2
items of each type, and what you see vs. what you expect to see?

And I assume for DIH, you have two top level entity definitions next
to each other. Not nested entities, no update clauses (just full
import), etc.

Regards,
   Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 30 September 2015 at 10:53, Roxana Danger
<ro...@reedonline.co.uk> wrote:
> Do you mean creating 2 instances and then generating a third one (or
> updating one of them) for merging their data?
> Is it not guaranteed that the entities in the DIH are imported in the order
> described in the db-config file?
> Thank you very much,
> Roxana

Re: entity processing order during updates

Posted by Roxana Danger <ro...@reedonline.co.uk>.
Do you mean creating 2 instances and then generating a third one (or
updating one of them) for merging their data?
Is it not guaranteed that the entities in the DIH are imported in the order
described in the db-config file?
Thank you very much,
Roxana



On 30 September 2015 at 14:48, Alexandre Rafalovitch <ar...@gmail.com>
wrote:

> Have you tried just having two separate endpoints each with its own
> definition of DIH and URP? Then, you just hit those end-points one at
> a time in whatever order you need.
>
> Seems easier than a custom switching logic.




Re: entity processing order during updates

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Have you tried just having two separate endpoints each with its own
definition of DIH and URP? Then, you just hit those end-points one at
a time in whatever order you need.

Seems easier than a custom switching logic.
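
A rough sketch of how the two endpoints could be driven in order. It only
builds the request URLs and sends nothing. The handler paths /dataimport-e1
and /dataimport-e2, the core name mycore and the base URL are assumptions
made for illustration, while command, clean and commit are standard DIH
request parameters. The second request passes clean=false because a
full-import clears the index first by default, which would otherwise wipe
the E1 documents.

```java
import java.util.ArrayList;
import java.util.List;

public class OrderedImportSketch {

    /** Build a full-import URL for one DIH handler (handler path is hypothetical). */
    static String importUrl(String baseUrl, String core, String handler, boolean clean) {
        return baseUrl + "/" + core + handler
                + "?command=full-import&clean=" + clean + "&commit=true";
    }

    /** The two requests, in the order they would be issued. */
    static List<String> plan(String baseUrl, String core) {
        List<String> urls = new ArrayList<>();
        // E1 first: start from an empty index.
        urls.add(importUrl(baseUrl, core, "/dataimport-e1", true));
        // E2 second: keep the E1 documents that are already indexed.
        urls.add(importUrl(baseUrl, core, "/dataimport-e2", false));
        return urls;
    }

    public static void main(String[] args) {
        for (String url : plan("http://localhost:8983/solr", "mycore")) {
            System.out.println(url);
        }
    }
}
```

A DIH import runs in the background, so a real driver would also need to
poll command=status on the first handler until it reports idle before
sending the second request.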

Regards,
   Alex.

