Posted to solr-user@lucene.apache.org by James Greene <ja...@jamesaustingreene.com> on 2020/05/05 15:38:51 UTC

Data Import Handler - Concurrent Entity Importing

Hello, I'm new to the group here so please excuse me if I do not have the
etiquette down yet.

Is it possible to import multiple entities (customer configurable, currently
up to 40) in a DIH configuration at the same time?  Right now I have
multiple root entities in my configuration, but they get indexed
sequentially, which means the entities that come last are always delayed in
reaching the index.

I'm trying to migrate an existing setup (Solr 6.6) that uses a different
collection for each "entity type" into a single collection (Solr 8.4). The
goal is to get around some of the hurdles we face with searches that
require multiple block joins, which currently do not work across cores.

I'm also wondering whether it is better to fully qualify a field name or to
use two different fields for performing the "same" search, i.e.:


{
    type_A_status: Active
    type_A_value: Test
}
vs
{
    type: A
    status: Active
    value: Test
}
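
To make the two layouts concrete, here is a minimal Python sketch of the
filter queries each one implies. The helper names (filter_queries,
select_params) are hypothetical and exist only for this illustration. The
values are simple tokens, so no query escaping is attempted.

```python
from urllib.parse import urlencode

# Layout 1: the entity type is baked into each field name.
qualified_doc = {"type_A_status": "Active", "type_A_value": "Test"}

# Layout 2: one shared set of fields plus a "type" discriminator field.
generic_doc = {"type": "A", "status": "Active", "value": "Test"}


def filter_queries(doc):
    """Turn each field/value pair into a Solr fq clause (field:value)."""
    return [f"{field}:{value}" for field, value in doc.items()]


def select_params(doc):
    """Build /select query parameters that match the given document."""
    pairs = [("q", "*:*")] + [("fq", fq) for fq in filter_queries(doc)]
    return urlencode(pairs)
```

With the first layout, every entity type needs its own field names in the
query. With the second, the same status and value clauses are reused and
only the type clause changes.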

Re: Data Import Handler - Concurrent Entity Importing

Posted by Mikhail Khludnev <mk...@apache.org>.
Hello, James.

DataImportHandler has a lock that prevents concurrent execution. If you need
to run several imports in parallel on the same core, you need to duplicate
the "/dataimport" handler definition in solrconfig.xml under different
names, and then run one import per handler. Regarding the schema, I prefer
the latter (separate type, status and value fields), but your mileage may
vary.
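
As a rough sketch of what this could look like from the client side, the
Python below builds one full-import request per duplicated handler. The
handler names (/dataimport-a, /dataimport-b), the core name (mycore), the
entity names and the base URL are all assumptions made up for this example.
Each handler would first have to be registered in solrconfig.xml. The
clean=false parameter matters here: full-import cleans the index by default,
so one import would otherwise delete what the others have indexed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_import_url(base_url, core, handler, entity):
    """Build a DIH full-import URL for one handler/entity pair."""
    params = urlencode({
        "command": "full-import",
        "entity": entity,
        # full-import cleans the index by default, which would remove the
        # documents written by the other handlers, so switch it off.
        "clean": "false",
        "commit": "true",
    })
    return f"{base_url}/{core}{handler}?{params}"


def start_imports(urls, opener=urlopen):
    """Send each import request in turn and return the HTTP status codes."""
    statuses = []
    for url in urls:
        with opener(url) as response:
            statuses.append(response.status)
    return statuses


# Hypothetical handler-to-entity mapping, for illustration only.
HANDLERS = {"/dataimport-a": "type_a", "/dataimport-b": "type_b"}
URLS = [
    build_import_url("http://localhost:8983/solr", "mycore", handler, entity)
    for handler, entity in HANDLERS.items()
]
```

Calling start_imports(URLS) against a running Solr instance would kick off
both imports. A plain loop should be enough because the full-import command
starts the import in a background thread and returns right away; progress
can then be polled per handler with command=status.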

--
Mikhail.


-- 
Sincerely yours
Mikhail Khludnev