Posted to solr-user@lucene.apache.org by Shawn Heisey <so...@elyograg.org> on 2012/01/09 20:26:27 UTC

Multiple dataimport processes to same core?

Is it safe or advisable to run multiple dataimport handler requests on 
one Solr core simultaneously?

Thanks,
Shawn


RE: Multiple dataimport processes to same core?

Posted by "Dyer, James" <Ja...@ingrambook.com>.
We do this in production and haven't had any issues.  This is a 1.4.1 installation, from before DIH had a "threads" option.  We divide the index into 8 parts and run 8 DIH handlers at the same time, all indexing into the same core.  Lucene itself is a bottleneck, but DIH has to join a lot of data sources, run transformers, etc., so running multiple DIH handlers at once still gives us more throughput.

One annoyance: because of how DIH is designed, you need a separate handler set up in solrconfig.xml for each DIH you plan to run.  So you have to decide in advance how many DIH instances you want to run, which config files they'll use, etc.

The other thing is that you want to avoid the scenario where more than one DIH handler finishes around the same time and they auto-commit on top of one another (or worse, optimize).  Because in our case we split the work into equal parts, we just turned auto-commit off in DIH and do one big commit at the end, once all 8 DIHs are done running.
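
For illustration, here is a rough sketch of that "start everything, then commit once" flow as a small Python script.  It is not what we actually run.  The handler paths and base URL are made up, and it assumes each handler accepts the usual DIH request parameters (command, clean, commit, optimize) and reports "status" as "busy" or "idle" when queried with wt=json.  I pass clean=false on the assumption that you don't want each full-import wiping out what the others have indexed.

```python
import json
import time
import urllib.parse
import urllib.request


def http_get_json(url):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def dih_url(base, handler, **params):
    """Build a request URL for one DIH handler, asking for JSON output."""
    params["wt"] = "json"
    return f"{base}{handler}?{urllib.parse.urlencode(params)}"


def run_parallel_imports(base, handlers, fetch=http_get_json, poll_seconds=30):
    """Start all imports without commits, wait for them, then commit once."""
    # Kick off every handler. commit/optimize are off so that handlers
    # finishing close together do not commit on top of one another.
    # clean=false is an assumption: it keeps one import from deleting
    # the documents the other imports are adding.
    for handler in handlers:
        fetch(dih_url(base, handler, command="full-import",
                      clean="false", commit="false", optimize="false"))

    # Poll until every handler reports idle. Note that a failed import
    # also ends up idle, so a real script should inspect the status
    # messages before committing.
    pending = set(handlers)
    while pending:
        for handler in sorted(pending):
            status = fetch(dih_url(base, handler, command="status"))
            if status.get("status") == "idle":
                pending.discard(handler)
        if pending:
            time.sleep(poll_seconds)

    # One big commit at the end, after all handlers are done.
    fetch(f"{base}/update?commit=true&wt=json")


if __name__ == "__main__":
    # Hypothetical core URL and handler names; each handler must already
    # be defined in solrconfig.xml.
    run_parallel_imports(
        "http://localhost:8983/solr",
        [f"/dataimport{i}" for i in range(1, 9)],
    )
```

The fetch argument is only there so the HTTP call can be swapped out for testing.  In practice you would probably also want a timeout and some error handling around the polling loop.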

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311
