Posted to solr-user@lucene.apache.org by Troy Edwards <te...@gmail.com> on 2015/11/19 05:02:36 UTC

Shards and Replicas

I am looking for some good articles/guidance on how to determine the number
of shards and replicas for an index.

Thanks

Re: Shards and Replicas

Posted by Jack Krupansky <ja...@gmail.com>.
1. No more than 100 million documents per shard.
2. Enough replicas to meet your query load and to allow for the
possibility that a replica might go down: 2 or 3, maybe 4.
3. Build a proof-of-concept implementation to validate that queries perform
well at your chosen number of documents per shard. But be aware that a
query against a sharded index will be slower than the same query against a
single-shard implementation.
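
As a rough illustration, the first two rules of thumb reduce to simple
arithmetic. The sketch below is only a starting point for a proof of concept,
not a sizing formula: the function name, the default replica count, and the
example document counts are assumptions made up for illustration, and the
100 million figure is just the ceiling from point 1.

```python
# Rule-of-thumb ceiling from point 1; treat it as an upper bound, not a target.
MAX_DOCS_PER_SHARD = 100_000_000


def estimate_layout(total_docs, replicas_per_shard=2):
    """Return (shards, total_replicas) as a starting point for a prototype.

    Hypothetical helper: replicas_per_shard counts every copy of a shard,
    and should come from query load and failure tolerance (point 2).
    """
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    if replicas_per_shard < 1:
        raise ValueError("replicas_per_shard must be at least 1")
    # Integer ceiling division: the fewest shards that keep each shard
    # at or under the per-shard ceiling.
    shards = (total_docs + MAX_DOCS_PER_SHARD - 1) // MAX_DOCS_PER_SHARD
    return shards, shards * replicas_per_shard


if __name__ == "__main__":
    # Example: 250 million documents with 3 copies of each shard.
    shards, total = estimate_layout(250_000_000, replicas_per_shard=3)
    print(f"{shards} shards, {total} replicas in total")
```

Whatever numbers come out of this still need to be validated against real
queries, as point 3 says.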

-- Jack Krupansky

On Wed, Nov 18, 2015 at 11:02 PM, Troy Edwards <te...@gmail.com>
wrote:

> I am looking for some good articles/guidance on how to determine the number
> of shards and replicas for an index.
>
> Thanks
>

Re: Shards and Replicas

Posted by Shawn Heisey <ap...@elyograg.org>.
On 11/18/2015 9:02 PM, Troy Edwards wrote:
> I am looking for some good articles/guidance on how to determine the number
> of shards and replicas for an index.

The long version:

https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

The short version:

There's no quick formula for figuring out how much hardware you need and
how to divide your index across that hardware.  There are too many
variables involved.  Building a prototype (or ideally a full-scale
environment) is the only reliable way to figure it out.

Those of us who have been doing this for a long time can make educated
guesses if we are presented with the right pieces of information, but
frequently users will not know some of that information until the system
is put into production and actually handles real queries.

The only general advice I have is this:  It's probably going to cost
more than you think it will.

Thanks,
Shawn