You are viewing a plain text version of this content. The canonical link for it is here.

Posted to solr-user@lucene.apache.org by Jamie Johnson <je...@gmail.com> on 2011/09/26 17:42:39 UTC

Solr Cloud Number of Shard Limitation?

Is there any limitation, be it technical or for sanity reasons, on the
number of shards that can be part of a solr cloud implementation?

Re: Solr Cloud Number of Shard Limitation?

Posted by Jamie Johnson <je...@gmail.com>.

So I tested what I wrote, and man was that wrong.  I have updated it
and created a JIRA for this issue.  I also attached a patch which will
patch CloudState to address this issue.  Feedback is appreciated.

https://issues.apache.org/jira/browse/SOLR-2799

On Wed, Sep 28, 2011 at 11:46 PM, Jamie Johnson <je...@gmail.com> wrote:
> I'll definitely create a JIRA for this.  Looking at the code in
> CloudState I think we could do the following
>
> as we iterate over shardINames we check to see if the oldCloudState
> had the slice already, if so get the state from there, otherwise do
> what is already happening.  Something like the following:
>
> for (String shardIdZkPath : shardIdNames) {
>                        Slice slice = null;
>                        if(oldCloudState.liveNodesContain(shardIdZkPath)) {
>                                slice = oldCloudState.getCollectionStates().get(collection).get(shardIdZkPath);
>                        }
>
>                        if(slice == null){
>                                Map<String,ZkNodeProps> shardsMap = readShards(zkClient,
> shardIdPaths + "/" + shardIdZkPath);
>                                slice = new Slice(shardIdZkPath, shardsMap);
>                        }
>
>          slices.put(shardIdZkPath, slice);
>        }
> I don't see a need to remove the old states since we only keep the
> states that are already in oldCloudState and read new ones.  Does that
> make sense?
>
> On Wed, Sep 28, 2011 at 11:01 PM, Mark Miller <ma...@gmail.com> wrote:
>> No, we don't have any patches for it yet. You might make a JIRA issue for it?
>>
>> I think the big win is a fairly easy one - basically, right now when we update the cloud state, we look at the children of the 'shards' node, and then we read the data at each node individually. I imagine this is the part that breaks down :)
>>
>> We have already likely have most of that info though - really, you should just have to compare the children of the 'shards' node with the list we already have from the last time we got the cloud state - remove any that are no longer in the list, read the data for those not in the list, and get your new state efficiently.
>>
>> - Mark Miller
>> lucidimagination.com
>> 2011.lucene-eurocon.org | Oct 17-20 | Barcelona
>>
>> On Sep 28, 2011, at 10:35 PM, Jamie Johnson wrote:
>>
>>> Thanks Mark found the TODO in ZkStateReader.java
>>>
>>> // TODO: - possibly: incremental update rather than reread everything
>>>
>>> Was there a patch they provided back to address this?
>>>
>>> On Tue, Sep 27, 2011 at 9:20 PM, Mark Miller <ma...@gmail.com> wrote:
>>>>
>>>> On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote:
>>>>
>>>>> Is there any limitation, be it technical or for sanity reasons, on the
>>>>> number of shards that can be part of a solr cloud implementation?
>>>>
>>>>
>>>> The loggly guys ended up hitting a limit somewhere. Essentially, whenever the cloud state is updated, info is read about each shard to update the state (from zookeeper). There is a TODO that I put in there that says something like, "consider updating this incrementally" - usually the data on most shards has not changed, so no reason to read it all. They implemented that today in their own code, but we have not yet done this in trunk. What that places the upper limit at, I don't know - I imagine it takes quite a few shards before it ends up being too much of a problem - they shard by user I believe, so lot's of shards.
>>>>
>>>>
>>>> - Mark Miller
>>>> lucidimagination.com
>>>> 2011.lucene-eurocon.org | Oct 17-20 | Barcelona
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>

Re: Solr Cloud Number of Shard Limitation?

Posted by Jamie Johnson <je...@gmail.com>.

I'll definitely create a JIRA for this.  Looking at the code in
CloudState I think we could do the following

as we iterate over shardINames we check to see if the oldCloudState
had the slice already, if so get the state from there, otherwise do
what is already happening.  Something like the following:

for (String shardIdZkPath : shardIdNames) {
			Slice slice = null;
			if(oldCloudState.liveNodesContain(shardIdZkPath)) {
				slice = oldCloudState.getCollectionStates().get(collection).get(shardIdZkPath);
			}
			
			if(slice == null){
				Map<String,ZkNodeProps> shardsMap = readShards(zkClient,
shardIdPaths + "/" + shardIdZkPath);
				slice = new Slice(shardIdZkPath, shardsMap);
			}

          slices.put(shardIdZkPath, slice);
        }
I don't see a need to remove the old states since we only keep the
states that are already in oldCloudState and read new ones.  Does that
make sense?

On Wed, Sep 28, 2011 at 11:01 PM, Mark Miller <ma...@gmail.com> wrote:
> No, we don't have any patches for it yet. You might make a JIRA issue for it?
>
> I think the big win is a fairly easy one - basically, right now when we update the cloud state, we look at the children of the 'shards' node, and then we read the data at each node individually. I imagine this is the part that breaks down :)
>
> We have already likely have most of that info though - really, you should just have to compare the children of the 'shards' node with the list we already have from the last time we got the cloud state - remove any that are no longer in the list, read the data for those not in the list, and get your new state efficiently.
>
> - Mark Miller
> lucidimagination.com
> 2011.lucene-eurocon.org | Oct 17-20 | Barcelona
>
> On Sep 28, 2011, at 10:35 PM, Jamie Johnson wrote:
>
>> Thanks Mark found the TODO in ZkStateReader.java
>>
>> // TODO: - possibly: incremental update rather than reread everything
>>
>> Was there a patch they provided back to address this?
>>
>> On Tue, Sep 27, 2011 at 9:20 PM, Mark Miller <ma...@gmail.com> wrote:
>>>
>>> On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote:
>>>
>>>> Is there any limitation, be it technical or for sanity reasons, on the
>>>> number of shards that can be part of a solr cloud implementation?
>>>
>>>
>>> The loggly guys ended up hitting a limit somewhere. Essentially, whenever the cloud state is updated, info is read about each shard to update the state (from zookeeper). There is a TODO that I put in there that says something like, "consider updating this incrementally" - usually the data on most shards has not changed, so no reason to read it all. They implemented that today in their own code, but we have not yet done this in trunk. What that places the upper limit at, I don't know - I imagine it takes quite a few shards before it ends up being too much of a problem - they shard by user I believe, so lot's of shards.
>>>
>>>
>>> - Mark Miller
>>> lucidimagination.com
>>> 2011.lucene-eurocon.org | Oct 17-20 | Barcelona
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>
>
>
>
>
>
>
>
>
>
>
>
>

Re: Solr Cloud Number of Shard Limitation?

Posted by Mark Miller <ma...@gmail.com>.

No, we don't have any patches for it yet. You might make a JIRA issue for it?

I think the big win is a fairly easy one - basically, right now when we update the cloud state, we look at the children of the 'shards' node, and then we read the data at each node individually. I imagine this is the part that breaks down :)

We have already likely have most of that info though - really, you should just have to compare the children of the 'shards' node with the list we already have from the last time we got the cloud state - remove any that are no longer in the list, read the data for those not in the list, and get your new state efficiently.

- Mark Miller
lucidimagination.com
2011.lucene-eurocon.org | Oct 17-20 | Barcelona

On Sep 28, 2011, at 10:35 PM, Jamie Johnson wrote:

> Thanks Mark found the TODO in ZkStateReader.java
> 
> // TODO: - possibly: incremental update rather than reread everything
> 
> Was there a patch they provided back to address this?
> 
> On Tue, Sep 27, 2011 at 9:20 PM, Mark Miller <ma...@gmail.com> wrote:
>> 
>> On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote:
>> 
>>> Is there any limitation, be it technical or for sanity reasons, on the
>>> number of shards that can be part of a solr cloud implementation?
>> 
>> 
>> The loggly guys ended up hitting a limit somewhere. Essentially, whenever the cloud state is updated, info is read about each shard to update the state (from zookeeper). There is a TODO that I put in there that says something like, "consider updating this incrementally" - usually the data on most shards has not changed, so no reason to read it all. They implemented that today in their own code, but we have not yet done this in trunk. What that places the upper limit at, I don't know - I imagine it takes quite a few shards before it ends up being too much of a problem - they shard by user I believe, so lot's of shards.
>> 
>> 
>> - Mark Miller
>> lucidimagination.com
>> 2011.lucene-eurocon.org | Oct 17-20 | Barcelona
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>

Re: Solr Cloud Number of Shard Limitation?

Posted by Jamie Johnson <je...@gmail.com>.

Thanks Mark found the TODO in ZkStateReader.java

// TODO: - possibly: incremental update rather than reread everything

Was there a patch they provided back to address this?

On Tue, Sep 27, 2011 at 9:20 PM, Mark Miller <ma...@gmail.com> wrote:
>
> On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote:
>
>> Is there any limitation, be it technical or for sanity reasons, on the
>> number of shards that can be part of a solr cloud implementation?
>
>
> The loggly guys ended up hitting a limit somewhere. Essentially, whenever the cloud state is updated, info is read about each shard to update the state (from zookeeper). There is a TODO that I put in there that says something like, "consider updating this incrementally" - usually the data on most shards has not changed, so no reason to read it all. They implemented that today in their own code, but we have not yet done this in trunk. What that places the upper limit at, I don't know - I imagine it takes quite a few shards before it ends up being too much of a problem - they shard by user I believe, so lot's of shards.
>
>
> - Mark Miller
> lucidimagination.com
> 2011.lucene-eurocon.org | Oct 17-20 | Barcelona
>
>
>
>
>
>
>
>
>
>
>

Re: Solr Cloud Number of Shard Limitation?

Posted by Mark Miller <ma...@gmail.com>.

On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote:

> Is there any limitation, be it technical or for sanity reasons, on the
> number of shards that can be part of a solr cloud implementation?

The loggly guys ended up hitting a limit somewhere. Essentially, whenever the cloud state is updated, info is read about each shard to update the state (from zookeeper). There is a TODO that I put in there that says something like, "consider updating this incrementally" - usually the data on most shards has not changed, so no reason to read it all. They implemented that today in their own code, but we have not yet done this in trunk. What that places the upper limit at, I don't know - I imagine it takes quite a few shards before it ends up being too much of a problem - they shard by user I believe, so lot's of shards.

- Mark Miller
lucidimagination.com
2011.lucene-eurocon.org | Oct 17-20 | Barcelona

Re: Solr Cloud Number of Shard Limitation?

Posted by Erick Erickson <er...@gmail.com>.

No, not really. The administration becomes "interesting",
especially if the slaves are replicated.

One thing to be aware of is the "laggard shard" issue.
Essentially, your aggregated response is limited by the
slowest shard to respond. As you have more and more
shards, the odds that at least one of them will have an
anomalously long response time increases, and your
average response time starts to increase. This is only
really a question when you start getting to a pretty
large number of machines...

Best
Erick

On Mon, Sep 26, 2011 at 8:42 AM, Jamie Johnson <je...@gmail.com> wrote:
> Is there any limitation, be it technical or for sanity reasons, on the
> number of shards that can be part of a solr cloud implementation?
>