Posted to solr-user@lucene.apache.org by Brian Narsi <bn...@gmail.com> on 2016/01/22 18:29:07 UTC

Scaling DataImportHandler

What are the various ways DataImportHandler can be scaled?

Thanks

Re: Scaling DataImportHandler

Posted by Shawn Heisey <ap...@elyograg.org>.
On 1/22/2016 10:29 AM, Brian Narsi wrote:
> What are the various ways DataImportHandler can be scaled?

I'm not very familiar with how DIH interacts with SolrCloud.  I know you
can use it with SolrCloud, but that is all I know about that combination.
Assuming you're not running SolrCloud, the following information applies:

The only way I know of to scale DIH performance on a single Solr core is 
to define multiple DIH handlers, each of which imports a subset of your 
data.  Then you can run multiple imports simultaneously.  A single DIH 
handler can only run one import at a time, and it will be single-threaded.
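
As a rough illustration of the multiple-handler approach, here is a minimal
Python sketch that starts a full-import on several DIH handlers of one core
at the same time.  The core URL and the handler paths (/dataimport1 and so
on) are hypothetical; they would have to match whatever handlers are
actually defined in solrconfig.xml.  The sketch passes clean=false because
full-import cleans the index by default, and a cleaning import on one
handler would wipe what the others are adding.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed values for illustration only; adjust to your own setup.
SOLR_CORE_URL = "http://localhost:8983/solr/mycore"
HANDLERS = ["/dataimport1", "/dataimport2", "/dataimport3"]  # hypothetical names


def import_url(core_url, handler, clean=False):
    """Build the URL that starts a full-import on one DIH handler."""
    params = {
        "command": "full-import",
        "clean": str(clean).lower(),  # "false" keeps documents from other handlers
        "wt": "json",
    }
    return core_url.rstrip("/") + handler + "?" + urlencode(params)


def http_get(url):
    """Send a GET request and return the response body as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def start_imports(core_url, handlers, fetch=http_get):
    """Send the start request to every handler concurrently.

    Returns a dict mapping each handler path to the response body.
    """
    urls = [import_url(core_url, h) for h in handlers]
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        return dict(zip(handlers, pool.map(fetch, urls)))
```

Calling start_imports(SOLR_CORE_URL, HANDLERS) would send the three requests
in parallel.  Each handler's config still has to select its own subset of
the data, for example with a different WHERE clause per handler.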

If you've got a sharded index, each shard would be in its own core, and 
therefore each one would have its own instance of your dataimport 
handler with its own config, so you can run an import on each shard 
simultaneously.  This is what happens on my index when I need to fully 
rebuild it.
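
When imports run on several shard cores at once, a rebuild script usually
needs to know when all of them have finished.  The sketch below polls the
DIH status command on each core until none reports "busy".  The shard core
URLs and the /dataimport handler path are assumptions for illustration, and
the sketch relies on the status response carrying a top-level "status"
field with the value "busy" or "idle".

```python
import json
import time
from urllib.request import urlopen

# Hypothetical shard core URLs; replace with your own.
SHARD_CORE_URLS = [
    "http://localhost:8983/solr/shard1",
    "http://localhost:8983/solr/shard2",
]


def status_url(core_url, handler="/dataimport"):
    """Build the URL that asks a core's DIH handler for its status."""
    return core_url.rstrip("/") + handler + "?command=status&wt=json"


def is_busy(status_body):
    """Return True if the JSON status response reports a running import."""
    return json.loads(status_body).get("status") == "busy"


def http_get(url):
    """Send a GET request and return the response body as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def wait_until_idle(core_urls, fetch=http_get, poll_seconds=10.0, sleep=time.sleep):
    """Poll every core until no import is running.

    Returns the number of polling rounds that were needed.
    """
    pending = set(core_urls)
    rounds = 0
    while pending:
        rounds += 1
        # Keep only the cores that still report a running import.
        pending = {u for u in pending if is_busy(fetch(status_url(u)))}
        if pending:
            sleep(poll_seconds)
    return rounds
```

Calling wait_until_idle(SHARD_CORE_URLS) after starting the imports would
block until every shard reports idle.  It does not check whether an import
succeeded or failed, so the status messages still need to be inspected.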

Thanks,
Shawn


Re: Scaling DataImportHandler

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Mileage can vary:
http://blog.griddynamics.com/2015/07/how-to-import-structured-data-into-solr.html

On Fri, Jan 22, 2016 at 8:29 PM, Brian Narsi <bn...@gmail.com> wrote:

> What are the various ways DataImportHandler can be scaled?
>
> Thanks
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>