Posted to solr-user@lucene.apache.org by Timothy Potter <th...@gmail.com> on 2013/04/21 19:28:49 UTC

CloudSolrServer and update requests

Today is my day for conceptual questions ;-)

From what I understand, CloudSolrServer is "smart" because it uses
cluster state information pulled from Zookeeper to send update
requests to leaders instead of replicas. This provides a slight
benefit in that the update request will land on the correct leader 1/S
of the time on average, where S is the shard count, which is better than
1/N, where N is the total node count in the cluster. The benefit
increases as the replication factor goes up.
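
To make the 1/S versus 1/N comparison concrete, here is a rough sketch of the arithmetic. It assumes evenly sized shards and exactly one replica per node, so that N = S * R for replication factor R; the class and method names are made up for illustration and are not part of SolrJ.

```java
// Sketch: chance that an update reaches the correct shard leader on the first hop.
// Assumptions (not from Solr itself): one replica per node, so N = S * R.
class LeaderHitOdds {
    // Probability when the client picks any node in the cluster at random.
    static double randomNode(int shards, int replicationFactor) {
        int totalNodes = shards * replicationFactor; // N = S * R
        return 1.0 / totalNodes;
    }

    // Probability when the client picks only among the shard leaders.
    static double randomLeader(int shards) {
        return 1.0 / shards;
    }

    public static void main(String[] args) {
        int shards = 4;
        for (int rf = 1; rf <= 3; rf++) {
            System.out.printf("S=%d R=%d  random node: %.4f  random leader: %.4f%n",
                    shards, rf, randomNode(shards, rf), randomLeader(shards));
        }
    }
}
```

Under these assumptions the ratio between the two probabilities is R, which is why the advantage of targeting leaders grows with the replication factor.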

My question is whether the document ID-based routing logic could be
done in CloudSolrServer too? It has the document ID right there in the
request and knows the hash-ranges of each shard (from Zk).
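
As a rough illustration of what client-side routing could look like, the sketch below maps a document ID to a shard by hashing it and finding the hash range that contains the result. Everything in it is an assumption for illustration: the class and shard names are hypothetical, the ranges are an even split of the signed 32-bit space rather than ranges read from ZooKeeper, and `String.hashCode()` is only a stand-in for whatever hash function the server actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of hash-range routing; not SolrJ code.
class HashRangeRouter {
    // Inclusive range of signed 32-bit hash values owned by one shard.
    static final class Range {
        final int min;
        final int max;
        Range(int min, int max) { this.min = min; this.max = max; }
        boolean includes(int hash) { return hash >= min && hash <= max; }
    }

    // Split the full signed 32-bit space into contiguous ranges, one per shard.
    // A real client would read the ranges from cluster state instead.
    static Map<String, Range> partition(int shards) {
        Map<String, Range> ranges = new LinkedHashMap<>();
        long span = (1L << 32) / shards;
        long start = Integer.MIN_VALUE;
        for (int i = 0; i < shards; i++) {
            // The last shard absorbs any remainder up to Integer.MAX_VALUE.
            long end = (i == shards - 1) ? Integer.MAX_VALUE : start + span - 1;
            ranges.put("shard" + (i + 1), new Range((int) start, (int) end));
            start = end + 1;
        }
        return ranges;
    }

    // Stand-in hash (assumption): the client must use the same hash as the server.
    static int hash(String docId) {
        return docId.hashCode();
    }

    // Return the name of the shard whose range contains the document's hash.
    static String shardFor(String docId, Map<String, Range> ranges) {
        int h = hash(docId);
        for (Map.Entry<String, Range> e : ranges.entrySet()) {
            if (e.getValue().includes(h)) {
                return e.getKey();
            }
        }
        throw new IllegalStateException("no shard covers hash " + h);
    }

    public static void main(String[] args) {
        Map<String, Range> ranges = partition(4);
        for (String id : new String[] {"doc-1", "doc-2", "doc-3"}) {
            System.out.println(id + " -> " + shardFor(id, ranges));
        }
    }
}
```

With the shard known, the client could send the update straight to that shard's leader. The server would still need the same logic for clients that do not route.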

Any background information you can share on why routing is done on the
server-side and not in CloudSolrServer? I understand that it would
need to be done on the server too to support clients that don't have
CloudSolrServer, but it seems like a nice optimization to be able to do
it on the client.

Thanks.
Tim

Re: CloudSolrServer and update requests

Posted by Timothy Potter <th...@gmail.com>.

Ok, thanks for both responses - agreed on the "bigger fish" part too,
but for this one I wanted to make sure I wasn't overlooking something.
Now that I know it's a "reasonable" approach, I'll give it some more
thought.

Thanks.
Tim

On Sun, Apr 21, 2013 at 11:59 AM, Erick Erickson
<er...@gmail.com> wrote:
> Same reply as your other question I think.... It's on the drawing
> board but hasn't percolated up past other urgent issues...
>
> Erick

Re: CloudSolrServer and update requests

Posted by Erick Erickson <er...@gmail.com>.

Same reply as your other question I think.... It's on the drawing
board but hasn't percolated up past other urgent issues...

Erick

Re: CloudSolrServer and update requests

Posted by Mark Miller <ma...@gmail.com>.

https://issues.apache.org/jira/browse/SOLR-3154

- Mark