Posted to solr-user@lucene.apache.org by Rahul Goswami <ra...@gmail.com> on 2019/07/01 21:59:36 UTC

Re: Configuration recommendation for SolrCloud

Hi Toke,

Thank you for following up. Reading back, I could certainly have explained
this better. Thanks for asking again.

>> What is a cluster? Is it a fully separate SolrCloud?
Yes, by cluster I mean a fully separate SolrCloud.


>> If so, does that mean you can divide your collection into (at least) 4
>> independent parts, where the indexing flow and the clients knows which
>> cluster to use?
Yes. We can divide the documents across 4 SolrClouds, each with multiple
nodes, and the clients would know which SolrCloud to index to.


>> Can it be divided further?
For the sake of maintainability and ease of configuration, we wouldn't want
to go beyond 4 SolrClouds, so at this point I would say no. But I am open to
ideas if you think it would be greatly advantageous.


So if we go with the 3rd configuration option, we would be indexing roughly
1 billion documents (with an analyzed 'content' field possibly containing
large text) per SolrCloud.

I have also since learned of additional configuration details and updated
hardware specs, so let me revise that. We would index with a replication
factor of 2. Hence each SolrCloud would have 4 x 2 = 8 nodes and
1 billion x 2 = 2 billion documents indexed (with an analyzed 'content' field
possibly containing large text). We would have up to 12 GB of heap space
allocated per node. By node I mean an individual Solr instance running on a
certain port. To break down the specs:

For each SolrCloud:

8 nodes, each with 12 GB heap for Solr. Each node hosting 16 replicas
(cores).
2 billion documents (replication factor = 2, so 1 billion unique documents)
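
To put rough numbers on the specs above, here is a small back-of-the-envelope
sketch. The input figures are the ones listed above; the even spread of
replicas and documents across shards is an assumption made only for
illustration, not something we have measured.

```python
# Inputs taken from the per-SolrCloud specs above.
NODES_PER_CLOUD = 8
REPLICAS_PER_NODE = 16            # cores hosted by each Solr instance
REPLICATION_FACTOR = 2
UNIQUE_DOCS_PER_CLOUD = 1_000_000_000

# Total cores (replicas) in one SolrCloud.
cores_per_cloud = NODES_PER_CLOUD * REPLICAS_PER_NODE

# Assumption: every shard has exactly REPLICATION_FACTOR replicas.
shards_per_cloud = cores_per_cloud // REPLICATION_FACTOR

# Assumption: documents are spread evenly across shards. Each replica holds
# a full copy of its shard, so this is also the document count per core.
docs_per_core = UNIQUE_DOCS_PER_CLOUD // shards_per_cloud

print(cores_per_cloud, shards_per_cloud, docs_per_core)
```

Under those assumptions each core would hold on the order of 15 to 16 million
documents.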

Would SolrCloud scale well with the given configuration under a moderate to
heavy indexing and search load?

Additional consideration: We have 4 beefy physical servers at our disposal
for this deployment. If we go with 4 SolrClouds, we would have 4 x 8 = 32
nodes (Solr instances) running across these 4 physical servers.
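
As a rough illustration of what that means per physical server, the sketch
below works out the node count and the total JVM heap per machine. It assumes
the 32 nodes are spread evenly over the 4 servers and that every node uses
the full 12 GB heap (the upper bound mentioned earlier); both are assumptions
for the sake of the arithmetic.

```python
# Inputs taken from this mail.
PHYSICAL_SERVERS = 4
SOLR_CLOUDS = 4
NODES_PER_CLOUD = 8
HEAP_GB_PER_NODE = 12             # "up to 12 GB", so this is an upper bound

total_nodes = SOLR_CLOUDS * NODES_PER_CLOUD

# Assumption: Solr instances are distributed evenly across the servers.
nodes_per_server = total_nodes // PHYSICAL_SERVERS

# JVM heap only; this does not include memory for the OS or its file cache.
heap_gb_per_server = nodes_per_server * HEAP_GB_PER_NODE
total_heap_gb = total_nodes * HEAP_GB_PER_NODE

print(total_nodes, nodes_per_server, heap_gb_per_server, total_heap_gb)
```

This counts JVM heap only; whatever memory the operating system needs for
caching index files would come on top of it.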

Any issues that you might see with this configuration or additional
considerations that I might be missing?

Thanks,
Rahul

On Sat, Jun 29, 2019 at 1:13 PM Toke Eskildsen <to...@kb.dk> wrote:

> Rahul Goswami <ra...@gmail.com> wrote:
> > We are running Solr 7.2.1 and planning for a deployment which will grow to
> > 4 billion documents over time. We have 16 nodes at disposal. I am thinking
> > between 3 configurations:
> >
> > 1 cluster - 16 nodes
> > vs
> > 2 clusters - 8 nodes each
> > vs
> > 4 clusters - 4 nodes each
>
> You haven't got any answers. Maybe because it is a bit unclear what you're
> asking. What is a cluster? Is it a fully separate SolrCloud? If so, does
> that mean you can divide your collection into (at least) 4 independent
> parts, where the indexing flow and the clients knows which cluster to use?
> Can it be divided further?
>
> - Toke Eskildsen
>