Posted to solr-user@lucene.apache.org by Bram Van Dam <br...@intix.eu> on 2020/09/21 10:06:23 UTC

Many small instances, or few large instances?

Hey folks,

I've always heard that it's preferable to run a SolrCloud setup with
many smaller instances whose heaps stay under the CompressedOops limit
(roughly 32GB), rather than fewer larger instances with, say, 256GB
worth of heap space.

Does this recommendation still hold true with newer garbage collectors?
G1 is pretty fast on large heaps. ZGC and Shenandoah promise even more
improvements.
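For anyone curious where that CompressedOops boundary falls on their
own JVM, a quick diagnostic sketch (assumes a local `java` on the PATH;
the exact threshold varies slightly with object alignment):

```shell
# Compressed oops are typically disabled somewhere just under 32GB;
# compare the flag at two heap sizes to see where your JVM flips it.
java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops
java -Xmx33g -XX:+PrintFlagsFinal -version | grep UseCompressedOops
```

On most 64-bit HotSpot builds the first command reports the flag as
true and the second as false, which is the boundary the "many small
instances" advice is built around.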

Thx,

 - Bram

Re: Many small instances, or few large instances?

Posted by Bram Van Dam <br...@intix.eu>.
Thanks, Erick. I should probably keep a tally of how many beers I owe
you ;-)


On 21/09/2020 14:50, Erick Erickson wrote:


Re: Many small instances, or few large instances?

Posted by Erick Erickson <er...@gmail.com>.
In a word, yes. G1GC still has pause spikes, and the larger the heap, the more likely you are to encounter them. So running multiple JVMs rather than one large JVM with a ginormous heap is still recommended.
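In practice, splitting one big node into several smaller ones on the
same host is just a matter of starting extra Solr processes on their
own ports. A sketch (the ports, heap sizes, ZooKeeper address, and
second Solr home are assumptions, not from this thread):

```shell
# Two SolrCloud nodes on one host, each with a modest heap that stays
# well under the CompressedOops limit, instead of one huge-heap JVM.
bin/solr start -c -p 8983 -m 16g -z localhost:2181
bin/solr start -c -p 8984 -m 16g -z localhost:2181 -s /var/solr/node2
```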

I’ve seen some cases that used the Zing zero-pause product with very large heaps, but they were forced into that by the project requirements.

That said, when Java has a ZGC option, I think we’re in uncharted territory. I frankly don’t know what using very large heaps without having to worry about GC pauses will mean for Solr. I suspect we’ll have to do something to take advantage of that. For instance, could we support a topology where all shards had at least one replica in the same JVM that didn’t make any HTTP requests? Would that topology be common enough to support? Maybe extend “rack aware” to be “JVM aware”? Etc.
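For anyone who wants to experiment with ZGC under Solr today, the
usual place is GC_TUNE in solr.in.sh. A sketch, not a recommendation
from this thread (note that on JDK 11–14 ZGC is still experimental and
needs the unlock flag):

```shell
# solr.in.sh -- swap Solr's default GC settings for ZGC.
# -XX:+UnlockExperimentalVMOptions is required on JDK 11-14;
# from JDK 15 onward, -XX:+UseZGC alone is enough.
GC_TUNE="-XX:+UnlockExperimentalVMOptions -XX:+UseZGC"
SOLR_HEAP="31g"
```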

One thing that does worry me is that it’ll be easier and easier to “just throw more memory at it” rather than examine whether you’re choosing options that minimize heap requirements. And Lucene has done a lot to move memory out of the Java heap and into the OS (e.g. docValues, MMapDirectory, etc.).
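Concretely, enabling docValues on a sorted or faceted field keeps that
data in memory-mapped files served from the OS page cache rather than
on the Java heap. Via Solr's Schema API that might look like the
following (the field name and collection are hypothetical):

```shell
# Add a docValues field via Solr's Schema API; Lucene reads it through
# MMapDirectory, so the data lives in the OS page cache, not the heap.
curl -X POST -H 'Content-type:application/json' \
  http://localhost:8983/solr/mycollection/schema -d '{
    "add-field": {
      "name": "price",
      "type": "plong",
      "stored": true,
      "docValues": true
    }
  }'
```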

Anyway, carry on as before for the nonce.

Best,
Erick
