Posted to solr-user@lucene.apache.org by da...@ontrenet.com on 2012/02/02 15:51:42 UTC

Federation in SolrCloud?

Hi,
  I want to use SolrCloud in a more federated mode rather than
replication. The failover is nice, but I am more interested in
increasing capacity of an index through horizontal scaling (shards).

How can I configure shards such that they retain their own documents and
don't replicate (or replicate to some shards and not all)? Thus, when I
search from any shard I want results from all shards (being different
results from each).

Currently, if I kill a shard (using the example provided), no search works
and it errors out.

thanks!

Re: Federation in SolrCloud?

Posted by Mark Miller <ma...@gmail.com>.

So it sounds like what you want is partial results. We don't support that yet, but there is a JIRA issue for it. 

Currently we require that the full index be available, which means that if you want to survive any given instance going down, you need a replica for every shard (both shard A and shard B).
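
To make the difference concrete, here is a toy Python model contrasting the current all-or-nothing behavior with the partial results being asked for. It is not Solr code: the names `SHARDS`, `search`, `ShardUnavailable`, and the `tolerant` flag are made up for illustration, and the documents are the ones from the Shard A / Shard B example in this thread.

```python
# Toy model only: none of these names are Solr APIs.
SHARDS = {
    "A": ["DocA", "DocB"],
    "B": ["DocC", "DocD"],
}


class ShardUnavailable(Exception):
    """Raised when a shard has no live instance and partial results are off."""


def search(live_shards, tolerant=False):
    """Collect documents from every shard.

    live_shards: set of shard names that have at least one instance up.
    tolerant: hypothetical switch; if True, dead shards are skipped
              instead of failing the whole request.
    """
    results = []
    for name in sorted(SHARDS):
        if name in live_shards:
            results.extend(SHARDS[name])
        elif not tolerant:
            raise ShardUnavailable(f"no live instance for shard {name}")
    return results
```

With both shards live, `search({"A", "B"})` returns all four documents. With shard B down, the strict call raises `ShardUnavailable` (roughly the behavior described today), while `search({"A"}, tolerant=True)` returns only DocA and DocB (the partial results being requested).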

On Feb 2, 2012, at 8:15 PM, Darren Govoni wrote:

> Thanks for the reply Mark.
> 
> I did example A. One of the instances had zookeeper. If I shut down the other instance, all searches on the other (running) instance produced an error in the browser.
> I don't have the error handy but it was one line. Something like missing shard in collection IIRC.
> 
> What I'm hoping to achieve is this.
> 
> Shard A: DocA, DocB
> Shard B: DocC, DocD
> 
> if I do a query with both shards running I get DocA,DocB,DocC,DocD. If Shard B goes down, I only get DocA, DocB.
> 
> After that I will fold replication in to understand it.
> 
> On 02/02/2012 04:22 PM, Mark Miller wrote:
>> What example are you trying? Are you following it exactly? In order to serve requests at least one instance has to be up for every shard - but what you describe is how things work if you have enough replicas.
>> 
>> Example A splits the index across two shards, but there are no replicas - if an instance goes down, search will not work.
>> 
>> Examples B and C add replicas. This means that one instance can die per shard and you will still be able to serve requests.
>> 
>> Keep in mind that if you are running ZooKeeper with Solr (as the examples do), you have to make sure more than half of the nodes running ZooKeeper are up (ZooKeeper needs a majority). If there is only one such node, you cannot kill it - it will be a single point of failure unless you create a ZooKeeper ensemble.
>> 
>> - Mark Miller
>> lucidimagination.com

- Mark Miller
lucidimagination.com

Re: Federation in SolrCloud?

Posted by Darren Govoni <da...@ontrenet.com>.

Thanks for the reply Mark.

I ran example A. One of the instances was running ZooKeeper. When I shut
down the other instance, every search against the remaining (running)
instance produced an error in the browser.
I don't have the error handy, but it was one line, something like
"missing shard in collection", IIRC.

What I'm hoping to achieve is this.

Shard A: DocA, DocB
Shard B: DocC, DocD

If I do a query with both shards running, I get DocA, DocB, DocC, DocD. If
Shard B goes down, I get only DocA, DocB.

After that I will fold replication in to understand it.

On 02/02/2012 04:22 PM, Mark Miller wrote:
> What example are you trying? Are you following it exactly? In order to serve requests at least one instance has to be up for every shard - but what you describe is how things work if you have enough replicas.
>
> Example A splits the index across two shards, but there are no replicas - if an instance goes down, search will not work.
>
> Examples B and C add replicas. This means that one instance can die per shard and you will still be able to serve requests.
>
> Keep in mind that if you are running ZooKeeper with Solr (as the examples do), you have to make sure more than half of the nodes running ZooKeeper are up (ZooKeeper needs a majority). If there is only one such node, you cannot kill it - it will be a single point of failure unless you create a ZooKeeper ensemble.
>
> - Mark Miller
> lucidimagination.com

Re: Federation in SolrCloud?

Posted by Mark Miller <ma...@gmail.com>.

On Feb 2, 2012, at 9:51 AM, darren@ontrenet.com wrote:

> Hi,
>  I want to use SolrCloud in a more federated mode rather than
> replication. The failover is nice, but I am more interested in
> increasing capacity of an index through horizontal scaling (shards).
> 
> How can I configure shards such that they retain their own documents and
> don't replicate (or replicate to some shards and not all)? Thus, when I
> search from any shard I want results from all shards (being different
> results from each).
> 
> Currently, if I kill a shard (using the example provided), no search works
> and it errors out.
> 
> thanks!

What example are you trying? Are you following it exactly? In order to serve requests at least one instance has to be up for every shard - but what you describe is how things work if you have enough replicas.

Example A splits the index across two shards, but there are no replicas - if an instance goes down, search will not work.

Examples B and C add replicas. This means that one instance can die per shard and you will still be able to serve requests.
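
The rule above (at least one instance up for every shard) can be sketched in a few lines of Python. This is only an illustration, not Solr code: the shard names `shard1` and `shard2` and the helper `can_serve` are assumptions, and the layouts just mirror the descriptions of examples A and B given here (two shards with no replicas, then the same shards with one replica each).

```python
def can_serve(cluster):
    """Return True if every shard has at least one live instance.

    cluster maps a shard name to a list of booleans, one per instance
    hosting that shard (True = instance is up).
    """
    return all(any(instances) for instances in cluster.values())


# Example A as described: two shards, one instance each, no replicas.
example_a = {"shard1": [True], "shard2": [True]}

# Example B as described: each shard also has a replica.
example_b = {"shard1": [True, True], "shard2": [True, True]}

# Kill one instance in each layout.
example_a_one_down = {"shard1": [True], "shard2": [False]}
example_b_one_down = {"shard1": [True, True], "shard2": [False, True]}

# Losing both instances of the same shard still breaks search.
example_b_shard_lost = {"shard1": [True, True], "shard2": [False, False]}
```

Here `can_serve(example_a_one_down)` is False, which matches the error seen after killing an instance in example A, while `can_serve(example_b_one_down)` is True. `can_serve(example_b_shard_lost)` is False again, since replicas only help while one copy of each shard survives.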

Keep in mind that if you are running ZooKeeper with Solr (as the examples do), you have to make sure more than half of the nodes running ZooKeeper are up (ZooKeeper needs a majority). If there is only one such node, you cannot kill it - it will be a single point of failure unless you create a ZooKeeper ensemble.
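
For sizing an ensemble, the arithmetic is simple enough to write down. The sketch below assumes the usual ZooKeeper rule that a strict majority of the ensemble must be up; the function names `quorum_size` and `tolerated_failures` are made up for this example and are not part of any ZooKeeper or Solr API.

```python
def quorum_size(ensemble_size):
    """Smallest number of ZooKeeper nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many ZooKeeper nodes can go down while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


# Ensemble size -> number of node failures it can survive.
survivable = {n: tolerated_failures(n) for n in (1, 2, 3, 4, 5)}
```

A single ZooKeeper node tolerates zero failures, which is the single point of failure mentioned above. Three nodes tolerate one failure and five tolerate two; going from three to four nodes buys nothing, which is why odd ensemble sizes are the usual choice.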

- Mark Miller
lucidimagination.com