You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Manohar Sripada <ma...@gmail.com> on 2015/01/16 06:58:59 UTC

How to select the correct number of Shards in SolrCloud

Hi All,

My Setup is as follows. There are 16 nodes in my SolrCloud and 4 CPU cores
on each Solr Node VM. Each having 64 GB of RAM, out of which I have
allocated 32 GB to Solr. I have a collection which contains around 100
million Docs, which I created with 64 shards, replication factor 2, and 8
shards per node. Each shard is getting around 1.6 Million Documents.

The reason I have created 64 Shards is there are 4 CPU cores on each VM;
while querying I can make use of all the CPU cores. On an average, Solr
QTime is around 500ms here.

Last time to my other discussion, Erick suggested that I might be over
sharding, So, I tried reducing the number of shards to 32 and then 16. To
my surprise, it started performing better. It came down to 300 ms (for 32
shards) and 100 ms (for 16 shards). I haven't tested with filters and
facets yet here. But, the simple search queries had shown lot of
improvement.

So, how come the less number of shards performing better?? Is it because
there are less number of posting lists to search on OR less merges that are
happening? And how to determine the correct number of shards?

Thanks,
Manohar

Re: How to select the correct number of Shards in SolrCloud

Posted by Daniel Collins <da...@gmail.com>.
Sharding a query lets you parallel the actual querying the index part of
the search. But remember that as soon as you spread the query out more, you
also need to bring all 64 results sets back together and consolidate them
into a single result set for the end user.  At some point, the gain of
being able to search the data quicker is outweighed by the cost of this
consolidation activity.

One other point to mention, which we noticed as a by-product of some
large-scale sharding we were testing (256 shards, no caches, whole
different kettle of fish!)

The resulting query is only as fast as the slowest shard.  If you have 64
shards, and 8 shards/cores per machine, how many JVMs are you running per
machine?  If you have a single JVM with 8 cores in it, then remember as
soon as that JVM enters a GC cycle, all those 8 cores will stall
processing.  If you have a query and it needs to get results from 64 cores,
if 63 return in 100ms but the last core is in GC pause and takes 500ms,
your query will take just over 500ms.

With respect to sharding, I would never start with a large number of shards
(and 64 is reasonably large in Solr terms). You might be able to get away
without sharding at all, if that meets your latency requirements, then why
bother with the complexity of sharding?  Use those extra CPUs for
processing more QPS instead of a single query faster?

Lastly, you mentioned you allocated 32Gb to "solr", do you mean to the JVM
heap?  That's quite a lot of a 64Gb machine, you haven't left much for the
page cache.  The general rule for Solr is to make the JVM heap as small as
you can get away with, to let the OS page cache (which is needed to cache
all the index files) with as much memory as possible.

On 16 January 2015 at 05:58, Manohar Sripada <ma...@gmail.com> wrote:

> Hi All,
>
> My Setup is as follows. There are 16 nodes in my SolrCloud and 4 CPU cores
> on each Solr Node VM. Each having 64 GB of RAM, out of which I have
> allocated 32 GB to Solr. I have a collection which contains around 100
> million Docs, which I created with 64 shards, replication factor 2, and 8
> shards per node. Each shard is getting around 1.6 Million Documents.
>
> The reason I have created 64 Shards is there are 4 CPU cores on each VM;
> while querying I can make use of all the CPU cores. On an average, Solr
> QTime is around 500ms here.
>
> Last time to my other discussion, Erick suggested that I might be over
> sharding, So, I tried reducing the number of shards to 32 and then 16. To
> my surprise, it started performing better. It came down to 300 ms (for 32
> shards) and 100 ms (for 16 shards). I haven't tested with filters and
> facets yet here. But, the simple search queries had shown lot of
> improvement.
>
> So, how come the less number of shards performing better?? Is it because
> there are less number of posting lists to search on OR less merges that are
> happening? And how to determine the correct number of shards?
>
> Thanks,
> Manohar
>

Re: How to select the correct number of Shards in SolrCloud

Posted by Manohar Sripada <ma...@gmail.com>.
Thanks Daniel and Shawn for your valuable suggestions,

Daniel,
If you have a query and it needs to get results from 64 cores, if 63 return
in 100ms but the last core is in GC pause and takes 500ms, your query will
take just over 500ms.
> There is only single JVM running per machine. I will get the QTime from
each Solr Core and will check if this is the root cause.

Lastly, you mentioned you allocated 32Gb to "solr", do you mean to the
JVM heap?
That's quite a lot of a 64Gb machine, you haven't left much for the page
cache.
> Yes, 32GB to Solr's JVM heap. I wanted to enable Filter & FieldValue
Cache, as most of my search queries revolves around filters and facets.
Also, I am planning to use Document cache.

Shawn,
Each server has 8 CPU cores and 64GB of RAM.  Solr requires a 6GB heap
> Can you please tell me what is the size of your index? And what is the
size of the large cold shard?
> Can you please suggest if any tool that you use for collecting the
statistics? like the QTime's for the queries etc.

Thanks,
Manohar


On Fri, Jan 16, 2015 at 3:23 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 1/15/2015 10:58 PM, Manohar Sripada wrote:
> > The reason I have created 64 Shards is there are 4 CPU cores on each VM;
> > while querying I can make use of all the CPU cores. On an average, Solr
> > QTime is around 500ms here.
> >
> > Last time to my other discussion, Erick suggested that I might be over
> > sharding, So, I tried reducing the number of shards to 32 and then 16. To
> > my surprise, it started performing better. It came down to 300 ms (for 32
> > shards) and 100 ms (for 16 shards). I haven't tested with filters and
> > facets yet here. But, the simple search queries had shown lot of
> > improvement.
> >
> > So, how come the less number of shards performing better?? Is it because
> > there are less number of posting lists to search on OR less merges that
> are
> > happening? And how to determine the correct number of shards?
>
> Daniel has replied with good information.
>
> One additional problem I can think of when there are too many shards: If
> your Solr server is busy enough to have any possibility of simultaneous
> requests, then you will find that it's NOT a good idea to create enough
> shards to use all your CPU cores.  In that situation, when you do a
> single query, all your CPU cores will be in use.  When multiple queries
> happen at the same time, they have to share the available CPU resources,
> slowing them down.  With a smaller number of shards, the additional CPU
> cores can handle simultaneous queries.
>
> I have an index with nearly 100 million documents.  I've divided it into
> six large cold shards and one very small hot shard.  It's not SolrCloud.
>  I put three large shards on each of two servers, and the small shard on
> one of those two servers.  The distributed query normally happens on the
> server without the small shard.  Each server has 8 CPU cores and 64GB of
> RAM.  Solr requires a 6GB heap.
>
> My median QTime over the last 231836 queries is 25 milliseconds and my
> 95th percentile QTime is 376 milliseconds.  My query rate is pretty low
> - I've never seen Solr's statistics for the 15 minute query rate go
> above a single digit per second.
>
> Thanks,
> Shawn
>
>

Re: How to select the correct number of Shards in SolrCloud

Posted by Shawn Heisey <ap...@elyograg.org>.
On 1/15/2015 10:58 PM, Manohar Sripada wrote:
> The reason I have created 64 Shards is there are 4 CPU cores on each VM;
> while querying I can make use of all the CPU cores. On an average, Solr
> QTime is around 500ms here.
> 
> Last time to my other discussion, Erick suggested that I might be over
> sharding, So, I tried reducing the number of shards to 32 and then 16. To
> my surprise, it started performing better. It came down to 300 ms (for 32
> shards) and 100 ms (for 16 shards). I haven't tested with filters and
> facets yet here. But, the simple search queries had shown lot of
> improvement.
> 
> So, how come the less number of shards performing better?? Is it because
> there are less number of posting lists to search on OR less merges that are
> happening? And how to determine the correct number of shards?

Daniel has replied with good information.

One additional problem I can think of when there are too many shards: If
your Solr server is busy enough to have any possibility of simultaneous
requests, then you will find that it's NOT a good idea to create enough
shards to use all your CPU cores.  In that situation, when you do a
single query, all your CPU cores will be in use.  When multiple queries
happen at the same time, they have to share the available CPU resources,
slowing them down.  With a smaller number of shards, the additional CPU
cores can handle simultaneous queries.

I have an index with nearly 100 million documents.  I've divided it into
six large cold shards and one very small hot shard.  It's not SolrCloud.
 I put three large shards on each of two servers, and the small shard on
one of those two servers.  The distributed query normally happens on the
server without the small shard.  Each server has 8 CPU cores and 64GB of
RAM.  Solr requires a 6GB heap.

My median QTime over the last 231836 queries is 25 milliseconds and my
95th percentile QTime is 376 milliseconds.  My query rate is pretty low
- I've never seen Solr's statistics for the 15 minute query rate go
above a single digit per second.

Thanks,
Shawn