Posted to solr-user@lucene.apache.org by KNitin <ni...@gmail.com> on 2014/04/08 17:48:59 UTC

Cannot get shard id error - Hitting limits on creating collections

Hi

I am running SolrCloud 4.3.1 (there is a plan to upgrade to a later
version, but that will take a few months). I noticed some very peculiar
behavior: beyond *2496* cores I am unable to create any more
collections, due to this error:

*Could not get shard id for core.....*

I also noticed in the Solr "tree" view that the overseer's collections work
queue gets stuck:

  /overseer
    collection-queue-work
      qn-0000000360
      qn-0000000362
      qn-0000000364

The test results are as follows.

With 8 shards and 2 replicas, I can create 156 collections (and then hit
the above error)
With 4 shards and 2 replicas, I can create 312 collections (and then hit
the above error)
With 2 shards and 2 replicas, I can create 624 collections (and then hit
the above error)

The total number of cores is 2496 in all of the above cases. Beyond that
point I cannot create any more collections because of the "Could not get
shard id" error.

Is this a known bug, or is there a workaround? Is it fixed in later
releases?

Thanks much
-Nitin
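
The three test runs above all stop at the same core count. A minimal sketch of that arithmetic, assuming each collection contributes shards x replicas cores (the function name is only illustrative):

```python
# Each tuple is (shards, replicas, collections) for one run reported above.
runs = [(8, 2, 156), (4, 2, 312), (2, 2, 624)]


def total_cores(shards, replicas, collections):
    """Cores per collection is shards * replicas; multiply by collection count."""
    return shards * replicas * collections


totals = [total_cores(*run) for run in runs]

for (shards, replicas, collections), total in zip(runs, totals):
    print(f"{shards} shards x {replicas} replicas x {collections} collections "
          f"= {total} cores")
```

Since the ceiling follows the number of cores rather than the number of collections, the limit is more likely tied to something that grows per core than to a cap on collections as such.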

Re: Cannot get shard id error - Hitting limits on creating collections

Posted by KNitin <ni...@gmail.com>.
Thanks, Shawn. Adding it to all clients and servers worked.


Re: Cannot get shard id error - Hitting limits on creating collections

Posted by KNitin <ni...@gmail.com>.
Thanks. I missed the "clients" part of the doc. I will try it and post the
results here.


Re: Cannot get shard id error - Hitting limits on creating collections

Posted by Shawn Heisey <so...@elyograg.org>.
On 4/8/2014 4:13 PM, KNitin wrote:
> I have already raised jute.maxbuffer to 5 MB on the ZooKeeper server
> side but am still hitting the same problem. Should I make any changes on the
> Solr server side for this (client-side changes)?

The jute.maxbuffer system property needs to be set on everything that
uses ZooKeeper, which includes Solr itself as well as the ZooKeeper
servers. This would also include any call to zkCli. Here's that
documentation link I gave you before where it says this:

http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#Unsafe+Options

Thanks,
Shawn
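
As a rough illustration of "set it everywhere", the sketch below builds the same JVM flag once and places it on both a ZooKeeper server and a Solr node command line. The host name zk1:2181, the script paths, and the helper name are placeholders, not taken from this thread, and it assumes the stock zkServer.sh passes the JVMFLAGS environment variable through to the JVM.

```python
def jute_maxbuffer_flag(size_bytes):
    """Return the JVM system-property flag for a buffer size given in bytes."""
    if size_bytes <= 0:
        raise ValueError("buffer size must be positive")
    return f"-Djute.maxbuffer={size_bytes}"


# 5 MB, the size mentioned in the thread.
FLAG = jute_maxbuffer_flag(5 * 1024 * 1024)

# Hypothetical launch commands: hosts and paths are placeholders.
commands = {
    "zookeeper server": f'JVMFLAGS="{FLAG}" bin/zkServer.sh start',
    "solr node": f"java {FLAG} -DzkHost=zk1:2181 -jar start.jar",
}

for role, command in commands.items():
    print(f"{role}: {command}")
```

Any other JVM that talks to ZooKeeper, such as a command-line client, would need the same flag with the same value; a mismatch between clients and servers is what the advice above warns against.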


Re: Cannot get shard id error - Hitting limits on creating collections

Posted by KNitin <ni...@gmail.com>.
Thanks, Shawn.

I have already raised jute.maxbuffer to 5 MB on the ZooKeeper server
side but am still hitting the same problem. Should I make any changes on the
Solr server side for this (client-side changes)?


Re: Cannot get shard id error - Hitting limits on creating collections

Posted by Shawn Heisey <so...@elyograg.org>.
You're probably hitting configuration limits, which are set high enough 
for "typical" scalability requirements.  Certain things need to be 
increased for extreme scalability.  I don't know about all of them, so 
this is likely an incomplete list:

One of them, most likely the one involved here, is the maximum size of
the data that ZooKeeper will store in a single znode - the jute.maxbuffer
system property, which defaults to just under one megabyte.  Another is
the maximum number of threads allowed by the servlet container. In Jetty,
this is the maxThreads parameter.  Still others are the various connection
and thread pool settings in the ShardHandler config.

http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#Unsafe+Options
http://wiki.apache.org/solr/SolrConfigXml#Configuration_of_Shard_Handlers_for_Distributed_searches
https://cwiki.apache.org/confluence/display/solr/Moving+to+the+New+solr.xml+Format

As usual, I could be entirely incorrect about everything I'm saying.

Thanks,
Shawn
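
If the jute.maxbuffer default really is the limit being hit, the numbers in this thread allow a back-of-the-envelope estimate. The sketch below assumes the whole cluster state is serialized into one znode and that creation fails roughly when that znode reaches the default buffer size; both are assumptions for illustration, not measurements from this cluster.

```python
# ZooKeeper documents the jute.maxbuffer default as 0xfffff bytes (just under 1 MB).
DEFAULT_JUTE_MAXBUFFER = 0xFFFFF

# Total core count at which collection creation started failing (from the thread).
CORES_AT_FAILURE = 2496

# Assumption: state size grows linearly with cores and failure happens at the limit.
bytes_per_core = DEFAULT_JUTE_MAXBUFFER / CORES_AT_FAILURE
print(f"approx. {bytes_per_core:.0f} bytes of cluster state per core")


def cores_that_fit(max_buffer_bytes, per_core=bytes_per_core):
    """Rough ceiling on cores for a given jute.maxbuffer, under the same assumption."""
    return int(max_buffer_bytes // per_core)


# Estimated ceiling after raising the buffer to 5 MB everywhere.
print(cores_that_fit(5 * 1024 * 1024))
```

The estimate only shows that a per-core footprint of a few hundred bytes is enough to explain a ceiling near 2496 cores; the real footprint depends on host names, collection names, and the other properties stored per replica.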