Posted to solr-user@lucene.apache.org by Derek Poh <dp...@globalsources.com> on 2018/03/26 08:22:59 UTC

edit gc parameters in solr.in.sh or solr?

Hi

From your experience, I would like to know whether it is advisable to
change the GC parameters in solr.in.sh or solrfile?
The documentation says to edit solr.in.sh, but I would like
to know which file you actually edit.

I am using Solr 6.6.2 at the moment.

Regards,
Derek


----------------------
CONFIDENTIALITY NOTICE 

This e-mail (including any attachments) may contain confidential and/or privileged information. If you are not the intended recipient or have received this e-mail in error, please inform the sender immediately and delete this e-mail (including any attachments) from your computer, and you must not use, disclose to anyone else or copy this e-mail (including any attachments), whether in whole or in part. 

This e-mail and any reply to it may be monitored for security, legal, regulatory compliance and/or other appropriate reasons.

Re: edit gc parameters in solr.in.sh or solr?

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/26/2018 6:41 PM, Derek Poh wrote:
> On my installation, "solr.in.sh" is in solr-6.6.2/bin directory. It is 
> recommended to place the file in /etc/default?
>
> Regarding the "solrfile", I was referring to the file "solr". Sorry 
> for the typo.
> The file "solr" is not edited normally?

I've redirected my reply to this private message back to the list.

http://people.apache.org/~hossman/#private_q

If the active solr.in.sh file is not in /etc/default, that means that 
the service installer script was NOT used.  I strongly recommend using 
the service installer script on systems that will support it.

https://lucene.apache.org/solr/guide/6_6/taking-solr-to-production.html

Moving the location of that file manually to /etc/default might not 
actually work.  I have not reviewed the script recently enough to know 
for sure.  It *might* work.  My memory is fuzzy, but checking that 
location might be part of the bin/solr script already.  Whether that 
memory is correct or not, I still recommend running the service 
installer script.

The bin/solr script should not be edited unless you're fixing a bug in 
that script.  A lot of Solr's startup settings can be changed with 
solr.in.sh.  More settings are being added over time as Solr evolves.

Thanks,
Shawn


Re: edit gc parameters in solr.in.sh or solr?

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/26/2018 2:22 AM, Derek Poh wrote:
> From your experience, I would like to know whether it is advisable to
> change the GC parameters in solr.in.sh or solrfile?
> The documentation says to edit solr.in.sh, but I would like
> to know which file you actually edit.

You need a GC_TUNE variable in solr.in.sh.  The java commandline 
parameters specified there will replace the standard GC tuning 
parameters.  If recommendations are followed, this file will be found in 
/etc/default, and could have "solr" in the filename replaced with 
something different, specifically the name given to the installed 
service.  On my dev server, it is named "solr6.in.sh".
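
A minimal sketch, not part of the original message, for checking which include file on a given machine actually sets GC_TUNE. The function name list_gc_tune_files and the SOLR_DIR variable are hypothetical; the candidate paths are only the locations mentioned in this thread, and your service name may differ.

```shell
#!/bin/sh
# Print each candidate include file that exists and sets GC_TUNE
# on an uncommented line. Candidate paths are supplied by the caller.
list_gc_tune_files() {
  for f in "$@"; do
    if [ -f "$f" ] && grep -q '^[[:space:]]*GC_TUNE=' "$f"; then
      echo "$f"
    fi
  done
}

# Assumed locations: the service-installer directory and the bin
# directory of an unpacked distribution (SOLR_DIR is an assumption).
SOLR_DIR="${SOLR_DIR:-/opt/solr}"
list_gc_tune_files /etc/default/solr.in.sh \
                   /etc/default/solr6.in.sh \
                   "$SOLR_DIR/bin/solr.in.sh"
```

If nothing is printed, none of the listed files overrides the standard GC tuning parameters.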

What is the "solrfile" you have referenced?  I've not heard of this.

Thanks,
Shawn


Re: edit gc parameters in solr.in.sh or solr?

Posted by Walter Underwood <wu...@wunderwood.org>.
We use the G1 collector in Java 8u131 and it works well. We are running 6.6.2. Our Solr instances do a LOT of allocation. We have long queries (25 terms average) and many unique queries.

SOLR_HEAP=8g
# Use G1 GC  -- wunder 2017-01-23
# Settings from https://wiki.apache.org/solr/ShawnHeisey
GC_TUNE=" \
-XX:+UseG1GC \
-XX:+ParallelRefProcEnabled \
-XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=200 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)



Re: edit gc parameters in solr.in.sh or solr?

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/27/2018 12:13 AM, Bernd Fehling wrote:
> May I give you the advice to _NOT_ set -XX:G1HeapRegionSize.
> That is computed at JVM startup according to heap size and available memory.
> A wrongly set size can force even a huge machine with a 31GB heap and 157GB RAM into OOM.
> Guess how I figured that out; it took me about one week to locate it.

I have some notes on why I included that parameter on my wiki page.

https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector

Basically, the filterCache entries were being marked as humongous
allocations, because each one for my indexes is over 2MB in size. 
Apparently it takes a full collection to collect humongous allocations
that become garbage, at least in the versions of Java that I was
experimenting with.  So without that parameter, full GCs were required,
and that will always make GC slow unless the heap size is very small.

If Oracle has made it so that humongous allocations can be collected by
the generation-specific collectors, then that parameter may no longer be
required in newer Java versions.  I do not know if this has happened.

Thanks,
Shawn


Re: edit gc parameters in solr.in.sh or solr?

Posted by Shawn Heisey <el...@elyograg.org>.
On 3/28/2018 1:06 AM, Bernd Fehling wrote:
> Humongous allocations are not generally bad. Sure, the part of G1 that handles
> humongous allocations is not that performant and takes time. But just try to limit
> humongous allocations rather than avoid them under all circumstances.

I was told by Oracle engineers that humongous allocations can ONLY be 
collected by a full GC. A full GC on my servers, with an 8GB heap, can 
take 10-15 seconds.  This isn't a guess -- I've examined the GC logs.  
And that's a full stop-the-world pause for the entire collection.  If I 
increase the region size, then filterCache entries are not tagged as 
humongous, and full GCs almost never happen.

Every single filterCache entry on my large shards was more than 2MB at 
the time I was experimenting.  With an 8GB heap, the calculated G1 
region size is 4MB.  Anything that's half the region size or larger is 
considered a humongous allocation, so objects over 2MB get that tag.
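
The arithmetic above can be sketched roughly as follows. This is an approximation, not the JVM's actual code: it assumes G1 aims for about 2048 regions, rounds the region size down to a power of two, and clamps it between 1MB and 32MB. The function names are made up for illustration.

```shell
#!/bin/sh
# Approximate default G1 region size in MB for a heap size given in MB:
# target about 2048 regions, round down to a power of two, clamp to 1..32.
g1_region_mb() {
  target=$(( $1 / 2048 ))
  size=1
  while [ $(( size * 2 )) -le "$target" ] && [ "$size" -lt 32 ]; do
    size=$(( size * 2 ))
  done
  echo "$size"
}

# Humongous threshold in KB: half of the region size.
humongous_threshold_kb() {
  echo $(( $(g1_region_mb "$1") * 1024 / 2 ))
}

g1_region_mb 8192            # 8GB heap: prints 4
humongous_threshold_kb 8192  # prints 2048, i.e. 2MB
```

With -XX:G1HeapRegionSize=8m the threshold moves to 4MB, which is why entries a little over 2MB stop being tagged as humongous.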

Each server has three of those large shards, so when a new query goes 
into the filterCache, three large filterCache entries are created in one 
JVM.  These days, each of those shards has 31 million docs, which means 
that a filterCache entry has almost doubled in size since then.  It's 
almost 4 MB.

If one of those cores gets updated, multiple large objects become 
garbage, and because of cache autowarming (with autowarmCount set to 4), 
a few more get created.  Each of our queries typically has several 
filter queries in it.  Some of them don't really change, but a couple of 
them are highly variable from user to user.  Solr burns through the heap 
pretty fast with those large filterCache entries.  When it fills up with 
garbage objects tagged humongous, the only way to clear it out is a full 
GC, because the concurrent low-pause collectors are unable to collect 
humongous objects.

Because the region size tops out at 32MB, that means that any object 
over 16MB is always humongous.  So for indexes with a very large number 
of documents per core (more than about 134 million), G1 is not a good 
choice.  Unless the filterCache is completely disabled, which isn't 
normally recommended, G1 can't be tuned to avoid full GC with an index 
that size.
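
A quick check of those numbers, assuming a filterCache entry for a large result set is a bitset with one bit per document in the core. That assumption is what the 134 million figure implies; the function names are hypothetical.

```shell
#!/bin/sh
# Bytes needed for a bitset with one bit per document, rounded up.
bitset_bytes() {
  echo $(( ($1 + 7) / 8 ))
}

# Document count at which such a bitset reaches half of a region
# of the given size in MB (the humongous threshold).
docs_at_half_region() {
  echo $(( $1 * 1024 * 1024 / 2 * 8 ))
}

bitset_bytes 31000000     # prints 3875000, a little under 4MB
docs_at_half_region 32    # prints 134217728, about 134 million
```

So with the maximum 32MB region size, a core of roughly 134 million documents or more produces filterCache bitsets that G1 always treats as humongous.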

Thanks,
Shawn


Re: edit gc parameters in solr.in.sh or solr?

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
Hi Shawn,

The problem with heap regions is that you can't get an advantage without a disadvantage.

According to your G1 example:
4GB heap with default 2MB region size = 2048 heap regions

4GB heap with G1HeapRegionSize set to 8MB = 512 heap regions

You see, you only have a quarter of the heap regions left.
This also means that objects which are only 1MB in size occupy 8MB on the heap
and therefore a whole region, and the number of regions is already very low.
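
The region counts in this example are simply heap size divided by region size. A small sketch of that division (region_count is a made-up helper name):

```shell
#!/bin/sh
# Number of G1 regions for a heap size and a region size, both in MB.
region_count() {
  echo $(( $1 / $2 ))
}

region_count 4096 2   # 4GB heap, 2MB regions: prints 2048
region_count 4096 8   # 4GB heap, 8MB regions: prints 512
```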

Humongous allocations are not generally bad. Sure, the part of G1 that handles
humongous allocations is not that performant and takes time. But just try to limit
humongous allocations rather than avoid them under all circumstances.

Regards
Bernd



Re: edit gc parameters in solr.in.sh or solr?

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
Hi Walter,

May I give you the advice to _NOT_ set -XX:G1HeapRegionSize.
That is computed at JVM startup according to heap size and available memory.
A wrongly set size can force even a huge machine with a 31GB heap and 157GB RAM into OOM.
Guess how I figured that out; it took me about one week to locate it.

Regards
Bernd
