Posted to users@solr.apache.org by Shawn Heisey <ap...@elyograg.org> on 2021/10/11 20:53:41 UTC

Requesting help from the community on GC config testing

I would like to request help from the community on something.  I'm not 
in a position to do the kind of testing that I want, as I no longer have 
access to Solr servers with large amounts of data.

What I want to test is the Shenandoah garbage collector.  I've done some
testing on my own, but the index is very small (629MB) and so is the 
heap size (512MB).

Here is a GC log from my most recent test:

https://www.dropbox.com/s/8cbncuax7kv0x9c/solr_gc.log?dl=0

For this test, I deleted all the GC logs, restarted Solr, deleted all 
docs and optimized the index so it had 0 segments, and then asked 
dovecot (POP/IMAP server) to do a full reindex.  At this moment there 
are 158905 docs in the index.  Then I grabbed the GC log linked above 
and had the gceasy.io website analyze it.  The GC performance looks very 
good ... but with the heap at only 512MB, even a bad GC config would 
probably look good.  Here are the GC settings that I put in 
/etc/default/solr.in.sh:

GC_TUNE=" \
   -XX:+AlwaysPreTouch \
   -XX:+UseNUMA \
   -XX:+UseShenandoahGC \
   -XX:+ParallelRefProcEnabled \
   -XX:+UseStringDeduplication \
   -XX:ParallelGCThreads=2 \
"

I'm running this on a t3a.medium EC2 instance, which only has 2 CPUs, so 
I limited the GC threads to 2.  This instance is my personal mail 
server.  If anyone brave enough to help me test wants to try it, and you 
have a server with a LOT of cores, you could increase the number of threads.
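A minimal sketch of how the thread count could be filled in from the machine's CPU count instead of being hard-coded. The helper name build_gc_tune is hypothetical, and using the full core count reported by nproc is only an assumption; the message above does not say how many threads to use on larger machines.

```shell
# Hypothetical helper: print the GC flags from the message above,
# with ParallelGCThreads set to the value passed as $1.
build_gc_tune() {
  threads="$1"
  printf '%s ' \
    -XX:+AlwaysPreTouch \
    -XX:+UseNUMA \
    -XX:+UseShenandoahGC \
    -XX:+ParallelRefProcEnabled \
    -XX:+UseStringDeduplication \
    "-XX:ParallelGCThreads=${threads}"
}

# Assumption: one GC thread per CPU reported by nproc, falling back to 2
# (the value used on the 2-CPU instance) if nproc is not available.
GC_TUNE="$(build_gc_tune "$(nproc 2>/dev/null || echo 2)")"
```

If something like this were placed in solr.in.sh, the resulting GC_TUNE value would be picked up the same way as the hand-written one.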

What I need to see is the GC logs that Solr creates, along with some 
details about the indexes on the server that generated the log.  Best 
results will come from very busy servers that have a large index ... 
hoping for 100GB or more of index per Solr core, and a max heap size of
at least 4GB.  If you want to get really adventurous, you could gather GC
logs with the default GC settings (which in later Solr versions is G1GC) 
and with Shenandoah.
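A rough sketch of one way to gather what is asked for above: the GC logs plus the on-disk size of the index. The function name bundle_gc_logs and its three arguments are hypothetical, and the solr_gc.log* file name pattern is an assumption based on Solr's usual GC log naming; adjust it to match your install.

```shell
# Hypothetical helper: bundle GC logs together with the index size.
#   $1 = directory holding the solr_gc.log* files
#   $2 = directory holding the core's index data
#   $3 = tarball to create
bundle_gc_logs() {
  log_dir="$1"
  data_dir="$2"
  out="$3"
  # Record the on-disk index size in a small text file next to the logs.
  du -sh "$data_dir" > "$log_dir/index_size.txt" || return 1
  # Archive the GC logs and the size file with paths relative to log_dir.
  ( cd "$log_dir" && tar -czf - solr_gc.log* index_size.txt ) > "$out"
}
```

Note that this writes index_size.txt into the log directory as a side effect. Details such as document counts and heap size would still need to be described separately.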

A recent version of Java 11 is required to enable the Shenandoah 
collector.  I think it was made available in 11.0.3.  I am running 
OpenJDK 11.0.11, the latest available on Ubuntu 20.04 LTS.
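A quick way to check whether a given JVM accepts the Shenandoah flag before changing any Solr config is to start it with that flag and -version. This is only a sketch: the function name supports_shenandoah is hypothetical, and a JVM build that needs additional unlock flags for Shenandoah would be reported as unsupported here.

```shell
# Hypothetical helper: return 0 if the JVM starts with -XX:+UseShenandoahGC.
#   $1 = java binary to test (defaults to "java" on the PATH)
supports_shenandoah() {
  "${1:-java}" -XX:+UseShenandoahGC -version >/dev/null 2>&1
}

# Only run the check when a java binary is actually present.
if command -v java >/dev/null 2>&1; then
  if supports_shenandoah; then
    echo "Shenandoah is available in this JVM"
  else
    echo "Shenandoah is not available in this JVM"
  fi
fi
```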

I'm not advocating that anyone try this on a mission-critical production 
system, but I would not expect it to cause problems on such a setup.  
Use your own judgement.

Thanks,
Shawn


Re: Requesting help from the community on GC config testing

Posted by dinesh naik <di...@gmail.com>.
Hi Shawn,
I can try to help you with the test.
I have a 6-node Solr cluster (machines with 4 cores, 28GB RAM, and a 250GB
hard disk, running OpenJDK 11.0.11) with 2 shards and 3 replicas
each.

Currently, the cluster has 27GB of data per core. I can ingest more data to
bring it to around 100GB per core.
The nodes have a 20GB heap as of now; I will change it to 4GB for the test.

Here are the current GC settings from my cluster. Please let me know if we
need to change anything before the test, apart from heap size.

-XX:+AggressiveOpts
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ParallelRefProcEnabled
-XX:+PerfDisableSharedMem
-XX:+UseG1GC
-XX:+UseLargePages
-XX:-OmitStackTraceInFastThrow
-XX:ConcGCThreads=4
-XX:G1ReservePercent=18
-XX:HeapDumpPath=/app/solrdata8/logs/heapdump
-XX:InitiatingHeapOccupancyPercent=50
-XX:MaxGCPauseMillis=250
-XX:MaxNewSize=4G
-XX:OnOutOfMemoryError=/app/solr8/bin/oom_solr.sh 8983 /app/solrdata8/logs
-XX:ParallelGCThreads=8
-Xlog:gc*:file=/app/solrdata8/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Xms20g
-Xmx20g
-Xss256k



-- 
Best Regards,
Dinesh Naik

Re: Requesting help from the community on GC config testing

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/11/2021 2:53 PM, Shawn Heisey wrote:
> I would like to request help from the community on something.  I'm not 
> in a position to do the kind of testing that I want, as I no longer have 
> access to Solr servers with large amounts of data.

Because of the small scale of my Solr server, I don't think this is all 
that useful as a GC test, but here is my most recent GC log.  It covers 
36 hours.  Max heap size is 256MB, using Shenandoah.  Longest GC pause 
in this log is 65 milliseconds:

https://www.dropbox.com/s/8lgwu8f7o1jf90v/gclog-elyograg.zip?dl=0
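For anyone who wants to pull a "longest pause" figure out of their own logs without uploading them to an analysis site, here is a rough sketch. It assumes the unified JVM logging format (-Xlog:gc*), where each completed pause is logged on a line that contains the word "Pause" and ends with a duration such as "0.321ms". The function name max_pause_ms is hypothetical, and other log formats would need a different pattern.

```shell
# Hypothetical helper: print the longest GC pause (in milliseconds)
# found in the given GC log files.
max_pause_ms() {
  awk '
    # Consider only pause lines that end in a duration like "12.345ms".
    /Pause/ && match($0, /[0-9]+\.[0-9]+ms$/) {
      v = substr($0, RSTART, RLENGTH - 2) + 0   # drop the "ms" suffix
      if (v > max) max = v
    }
    END { printf "%.3f\n", max }
  ' "$@"
}

# Example call (the path is an assumption, adjust to your install):
#   max_pause_ms /var/solr/logs/solr_gc.log*
```

Concurrent phases are deliberately skipped, since they run alongside the application and are not pauses.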

Thanks,
Shawn