Posted to solr-user@lucene.apache.org by Saumitra Srivastav <sa...@gmail.com> on 2014/01/05 15:59:51 UTC

Solr maxIndexingThreads

I have a Solr cluster with 8 servers (4 shards with one replica for each). I
have 80 client threads indexing to this cluster. The client is running on a
different machine. I am trying to figure out the optimal number of indexing
threads.

Now, solrconfig.xml has a setting for *maxIndexingThreads*:

"The maximum number of simultaneous threads that may be indexing documents
at once in IndexWriter; if more than this many threads arrive they will wait
for others to finish. Default in Solr/Lucene is 8. "

I want to know whether this configuration is per Solr instance or per
core (or collection).







--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-maxIndexingThreads-tp4109604.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr maxIndexingThreads

Posted by Otis Gospodnetic <ot...@gmail.com>.
Yes, I believe you got it right.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Sun, Jan 5, 2014 at 5:27 PM, Saumitra Srivastav <
saumitra.srivastav7@gmail.com> wrote:

> Yes, I am using SolrCloud.
>
> So let's say I have a 2-host cluster with 2 collections and 1 shard and 1
> replica for each collection.
>
> Host-1 (10.0.0.111)
>     -solr/
>           -collectionAAA_shard1_replica1/
>           -collectionBBB_shard1_replica1/
>
> Host-2 (10.0.0.222)
>     -solr/
>           -collectionAAA_shard1_replica2/
>           -collectionBBB_shard1_replica2/
>
>
> So, if my understanding of the term "core" is correct, then I have 4 cores in
> total. That means that, while indexing, up to 8 threads can be inside the
> IndexWriter at one time per core. That means up to 16 indexing threads per
> host. Please let me know if I got this correct.
>
>
> Thanks,
> Saumitra
>
>
>

Re: Solr maxIndexingThreads

Posted by Saumitra Srivastav <sa...@gmail.com>.
Yes, I am using SolrCloud. 

So let's say I have a 2-host cluster with 2 collections and 1 shard and 1
replica for each collection.

Host-1 (10.0.0.111)
    -solr/
          -collectionAAA_shard1_replica1/
          -collectionBBB_shard1_replica1/

Host-2 (10.0.0.222)
    -solr/
          -collectionAAA_shard1_replica2/
          -collectionBBB_shard1_replica2/


So, if my understanding of the term "core" is correct, then I have 4 cores in
total. That means that, while indexing, up to 8 threads can be inside the
IndexWriter at one time per core. That means up to 16 indexing threads per
host. Please let me know if I got this correct.


Thanks,
Saumitra




Re: Solr maxIndexingThreads

Posted by Shawn Heisey <so...@elyograg.org>.
On 1/5/2014 11:56 AM, Saumitra Srivastav wrote:
> But what about the *maxIndexingThreads* config in solrconfig.xml? Does it apply
> to the collection or the Solr instance?
> 
> For example, let's say I have two indexing clients with multiple threads,
> each sending docs to 2 different collections, and maxIndexingThreads is set
> to 8 for both collections. In that case, will maxIndexingThreads be 16 *(for
> one Solr instance)*?

Because it's in solrconfig.xml, it applies per Solr core.  You haven't
said whether you're running SolrCloud, where a collection is a specific
kind of entity separate from a core.
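
To make the per-core scope concrete, here is a minimal sketch in Python. It
assumes each core's solrconfig.xml carries its own <maxIndexingThreads>
element under <indexConfig>, as in the stock Solr 4.x config. The XML
fragments, the value 4, and the helper name max_indexing_threads are made up
for illustration; the core names are borrowed from the example layout in this
thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed solrconfig.xml fragments, one per core.
# Real files contain far more; only the element of interest is shown.
CORE_CONFIGS = {
    "collectionAAA_shard1_replica1": (
        "<config><indexConfig>"
        "<maxIndexingThreads>8</maxIndexingThreads>"
        "</indexConfig></config>"
    ),
    "collectionBBB_shard1_replica1": (
        "<config><indexConfig>"
        "<maxIndexingThreads>4</maxIndexingThreads>"
        "</indexConfig></config>"
    ),
}

# Default quoted in the solrconfig.xml comment cited in this thread.
DEFAULT_MAX_INDEXING_THREADS = 8


def max_indexing_threads(solrconfig_xml):
    """Return maxIndexingThreads for one core's config, or the default."""
    root = ET.fromstring(solrconfig_xml)
    node = root.find("./indexConfig/maxIndexingThreads")
    if node is None or not (node.text or "").strip():
        return DEFAULT_MAX_INDEXING_THREADS
    return int(node.text.strip())


# Each core is read independently: two cores in one Solr instance can
# carry two different limits.
per_core = {name: max_indexing_threads(xml) for name, xml in CORE_CONFIGS.items()}
```

The point of the sketch is only that the value is looked up per core config,
not once per Solr instance.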

A typical collection in a production SolrCloud system usually has at
least one shard and at least two replicas.  If you multiply that out,
you'll usually have at least two cores per collection, and they'll
normally be on separate Solr instances, which will also normally be on
separate hosts.  Ideally you'll make sure they are on separate physical
hardware, even if there's virtualization involved.

If you're not running SolrCloud, then "collection" doesn't have any
specific meaning ... although the core that exists in the main example
is called "collection1" which can lead to a little bit of confusion.

Thanks,
Shawn


Re: Solr maxIndexingThreads

Posted by Saumitra Srivastav <sa...@gmail.com>.
Thanks Shawn.

But what about the *maxIndexingThreads* config in solrconfig.xml? Does it apply
to the collection or the Solr instance?

For example, let's say I have two indexing clients with multiple threads,
each sending docs to 2 different collections, and maxIndexingThreads is set
to 8 for both collections. In that case, will maxIndexingThreads be 16 *(for
one Solr instance)*?






Re: Solr maxIndexingThreads

Posted by Shawn Heisey <so...@elyograg.org>.
On 1/5/2014 9:19 AM, Saumitra Srivastav wrote:
> Also, is there a way to specify the number of threads for queries?

I have not seen anything like this.  That doesn't mean it doesn't exist,
just that I haven't seen it.

The servlet container can put a limit on the total thread count.  For
the Jetty that's included in the example, this is configured via
etc/jetty.xml to a value of 10000, which is a lot more than most people
will ever need.

Thanks,
Shawn


Re: Solr maxIndexingThreads

Posted by Saumitra Srivastav <sa...@gmail.com>.
Also, is there a way to specify the number of threads for queries?




Re: Solr maxIndexingThreads

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,

On Sun, Jan 5, 2014 at 9:59 AM, Saumitra Srivastav <
saumitra.srivastav7@gmail.com> wrote:

> I have a Solr cluster with 8 servers (4 shards with one replica for each). I
> have 80 client threads indexing to this cluster. The client is running on a
> different machine. I am trying to figure out the optimal number of indexing
> threads.
>

You need to take into account a few more things:
* the number of CPU cores
* whether you are CPU or disk or network or memory-bound

Use a tool like SPM for Solr (see
http://sematext.com/spm/solr-performance-monitoring/ ) to gain insight into
various Solr, JVM, and OS metrics to understand what your bottleneck is and
how things change when you vary different parameters (e.g. soft commit
frequency, JVM heap size, ramBufferSizeMB, the number of indexing threads
you are mentioning, etc.).

If you are using SolrCloud, use the new CloudSolrServer and send docs in
batches.
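
As a rough illustration of sending docs in batches, here is a minimal Python
sketch. It is not SolrJ: send_batch is a hypothetical caller-supplied function
standing in for whatever client call actually posts a list of documents, and
the batch size of 500 is an arbitrary placeholder to tune, not a
recommendation.

```python
from itertools import islice


def batches(docs, batch_size):
    """Yield successive lists of at most batch_size docs from any iterable."""
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def index_all(docs, send_batch, batch_size=500):
    """Send docs in batches via send_batch; return the number of requests made.

    send_batch is a hypothetical callable that posts one list of docs.
    """
    requests = 0
    for batch in batches(docs, batch_size):
        send_batch(batch)  # one request per batch instead of one per document
        requests += 1
    return requests
```

For example, index_all(my_docs, my_client_add) would make one call per 500
documents, where my_docs and my_client_add are whatever your client provides.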

btw. re:
"The maximum number of simultaneous threads that may be indexing documents
at once in IndexWriter; if more than this many threads arrive they will wait
for others to finish. Default in Solr/Lucene is 8. "

IndexWriter is a Lucene-level object. 1 IndexWriter writes to 1 Lucene
index.  1 Solr core == 1 Lucene index.
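
The arithmetic that follows from this can be sketched in Python, assuming the
two-host layout described elsewhere in this thread and maxIndexingThreads left
at 8 for every core. The function name indexing_ceiling is made up for
illustration, and the numbers are an upper bound on concurrent indexing
threads, not a measurement.

```python
MAX_INDEXING_THREADS = 8  # per-core value, as set in each core's solrconfig.xml

# Core placement taken from the two-host example in this thread.
CORES_BY_HOST = {
    "10.0.0.111": ["collectionAAA_shard1_replica1", "collectionBBB_shard1_replica1"],
    "10.0.0.222": ["collectionAAA_shard1_replica2", "collectionBBB_shard1_replica2"],
}


def indexing_ceiling(cores_by_host, max_threads_per_core):
    """Per host: one IndexWriter per core, each admitting up to N threads."""
    summary = {}
    for host, cores in cores_by_host.items():
        summary[host] = {
            "index_writers": len(cores),
            "max_indexing_threads": len(cores) * max_threads_per_core,
        }
    return summary


ceiling = indexing_ceiling(CORES_BY_HOST, MAX_INDEXING_THREADS)
```

With that layout, each host has 2 IndexWriters and a ceiling of 16 concurrent
indexing threads.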

So another factor to consider is the optimal number of shards and replicas,
not just indexing threads.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


