
Posted to solr-user@lucene.apache.org by Eric Bus <er...@websight.nl> on 2013/11/05 10:09:11 UTC

Using all SolrCloud servers in round-robin setup

Hi,

I'm currently using a SolrCloud setup with 3 nodes. The setup hosts about 50 (small) collections of a few thousand documents each. Until now, I've created every collection with replicationFactor = 3, so each node holds a replica of every collection.

Now I want to add a fourth node. With four nodes, a new collection may end up on nodes 1, 2 and 4, or on 1, 3 and 4, because I don't specify target nodes at creation time. My problem is that I can no longer query every collection through every node: if a collection is not hosted on node 2, I cannot use node 2 to query it. Is that normal behavior? Does that mean I'll have to keep a list of nodes per collection (or query it from ZooKeeper and cache it) and use that in my client application?

Currently my client application talks to one node at a fixed IP address. That node hosts all the collections, because new collections are always created on it. But when it goes down, no other node hosts all the collections.

Best regards,
Eric Bus

Re: Using all SolrCloud servers in round-robin setup

Posted by Anshum Gupta <an...@anshumgupta.net>.

Hi Eric,

You can use CloudSolrServer from SolrJ. It is ZooKeeper-aware: it reads the
cluster state from ZooKeeper and sends each request to a live node that hosts
the target collection.
http://lucene.apache.org/solr/4_5_0/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrServer.html

All it needs is the ZooKeeper host address, so you do not have to worry
about tracking which nodes host which collection.
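
For comparison, here is a rough sketch of the bookkeeping a client would
otherwise have to do by hand: a per-collection list of node URLs, handed out
in round-robin order. This is only an illustration of the idea, not how
CloudSolrServer is implemented. The class name CollectionNodeRouter, its
methods, and the node URLs in the usage note are hypothetical, and the sketch
assumes something else refreshes the lists from ZooKeeper when the cluster
changes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of SolrJ: remembers which nodes host each
// collection and hands them out in round-robin order.
class CollectionNodeRouter {
    private final ConcurrentMap<String, List<String>> nodesByCollection =
            new ConcurrentHashMap<>();
    private final ConcurrentMap<String, AtomicInteger> counters =
            new ConcurrentHashMap<>();

    // Replace the node list for a collection, for example after the
    // cluster state has been re-read (that part is assumed, not shown).
    void update(String collection, List<String> nodeUrls) {
        if (nodeUrls.isEmpty()) {
            throw new IllegalArgumentException("No nodes for " + collection);
        }
        // Register the counter first so nextNode() never sees a node list
        // without a matching counter.
        counters.putIfAbsent(collection, new AtomicInteger());
        nodesByCollection.put(collection,
                Collections.unmodifiableList(new ArrayList<>(nodeUrls)));
    }

    // Return the next node that hosts the collection, cycling through
    // the list; unknown collections are rejected.
    String nextNode(String collection) {
        List<String> nodes = nodesByCollection.get(collection);
        if (nodes == null) {
            throw new IllegalArgumentException("Unknown collection: " + collection);
        }
        int i = counters.get(collection).getAndIncrement();
        // floorMod keeps the index valid even if the counter overflows.
        return nodes.get(Math.floorMod(i, nodes.size()));
    }
}
```

A client would call update("products", ...) with the URLs of the nodes that
host that collection and then call nextNode("products") before each query.
With a ZooKeeper-aware client this bookkeeping is not needed.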





-- 

Anshum Gupta
http://www.anshumgupta.net