Posted to users@solr.apache.org by Adam Sjøgren <as...@koldfront.dk.INVALID> on 2023/01/26 13:26:03 UTC

Recommended Java version on Ubuntu 20.04, GC

  Hi,


We have a Solr Cloud cluster running 8.11.2 on 16 servers that have just
been upgraded to Ubuntu 20.04 (from 18.04).

It looks like we are getting shards down/recovering more often than
previously, so I'm wondering what version of Java and which garbage
collector is recommended on Ubuntu 20.04?

On https://solr.apache.org/docs/8_11_2/SYSTEM_REQUIREMENTS.html it says
Java 8 or higher - we are running 11.0.17+8-1ubuntu2~20.04.

That page also links to
https://cwiki.apache.org/confluence/display/lucene/JavaBugs which very
clearly says not to use the G1 garbage collector and that the page isn't
outdated. Hm.


  Best regards,

    Adam

-- 
 "The laws of perspective have been repealed!               Adam Sjøgren
  Objects no longer diminish in size with distance!"   asjo@koldfront.dk


Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Adam Sjøgren <as...@koldfront.dk.INVALID>.
Jan writes:

> Can you say something about the root cause for solr processes to
> crash? Are they killed by Linux?

They are usually not crashing, I "just" see shards go into down/recovery
state.

Some of the time they recover without intervention, some of the time a
shard or two stays down, and only recover if I restart the affected Solr
instance.

Usually the load on the affected server is high and following the log
files does show more GC-activity when this happens.

> Which version of Java did you run on 18.04?

We were running Java 11 on Ubuntu 18.04 as well, so that would have been
11.0.17+8-1ubuntu2~18.04.

> Other changes done at the same time, such as OS-level settings for
> ulimits, vm.max_map_count, swappiness etc.

No other changes - just everything that the Ubuntu 18.04 to 20.04
upgrade did, including the Linux kernel version. Which makes it a little
hard to pin down, so I was fishing for known Ubuntu 20.04 gotchas :-)

> If you have not fine-tuned your JVM settings, it is recommended to run
> with the default JVM/GC settings. G1 in latest Java-11 should be fine.

A long time ago we adjusted according to the suggestions on
https://cwiki.apache.org/confluence/display/solr/ShawnHeisey#ShawnHeisey-G1(GarbageFirst)Collector

So we are running with:

    # These GC settings have shown to work well for a number of common Solr workloads
    GC_TUNE=" \
    -XX:+UseG1GC \
    -XX:+ParallelRefProcEnabled \
    -XX:MaxGCPauseMillis=250 \
    -XX:+UseLargePages \
    -XX:AutoBoxCacheMax=20000 \
    -XX:BiasedLockingStartupDelay=500 \
    -XX:G1HeapRegionSize=32m \
    -XX:InitiatingHeapOccupancyPercent=75 \
    -XX:+HeapDumpOnOutOfMemoryError \
    -XX:HeapDumpPath=/var/log/solr \
    "

and SOLR_HEAP="12g".


  Best regards,

    Adam

-- 
 "Tears in waves                                            Adam Sjøgren
  Lights on fire"                                      asjo@koldfront.dk


Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Jan Høydahl <ja...@cominvent.com>.
Can you say something about the root cause for solr processes to crash? Are they killed by Linux?
Which version of Java did you run on 18.04? Other changes done at the same time, such as OS-level settings for ulimits, vm.max_map_count, swappiness etc.

If you have not fine-tuned your JVM settings, it is recommended to run with the default JVM/GC settings. G1 in latest Java-11 should be fine.
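
A minimal sketch for collecting the OS-level values asked about above on each host, so they can be compared across servers or before and after an OS upgrade. The function name report_os_settings is made up for illustration, and the /proc/sys paths assume Linux. The limits shown are those of the current shell, so it is most useful when run as the user that runs Solr.

```shell
# Print a few OS-level settings that commonly matter for Solr.
# report_os_settings is a hypothetical helper, not part of Solr.
report_os_settings() {
  # Limits of the current shell; run this as the Solr user.
  echo "open_files=$(ulimit -n)"
  # Not every shell supports "ulimit -u", so fall back to "unknown".
  echo "max_user_processes=$(ulimit -u 2>/dev/null || echo unknown)"
  for key in vm.max_map_count vm.swappiness; do
    # Kernel parameters are exposed under /proc/sys on Linux.
    path="/proc/sys/$(echo "$key" | tr . /)"
    if [ -r "$path" ]; then
      echo "$key=$(cat "$path")"
    else
      echo "$key=unknown"
    fi
  done
}

report_os_settings
```

Saving the output per host and diffing the files is one way to spot a setting that changed with the upgrade.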

Jan


> 26. jan. 2023 kl. 14:26 skrev Adam Sjøgren <as...@koldfront.dk.INVALID>:
> 
>  Hi,
> 
> 
> We have a Solr Cloud cluster running 8.11.2 on 16 servers that have just
> been upgraded to Ubuntu 20.04 (from 18.04).
> 
> It looks like we are getting shards down/recovering more often than
> previously, so I'm wondering what version of Java and which garbage
> collector is recommended on Ubuntu 20.04?
> 
> On https://solr.apache.org/docs/8_11_2/SYSTEM_REQUIREMENTS.html it says
> Java 8 or higher - we are running 11.0.17+8-1ubuntu2~20.04.
> 
> That page also links to
> https://cwiki.apache.org/confluence/display/lucene/JavaBugs which very
> clearly says not to use the G1 garbage collector and that the page isn't
> outdated. Hm.
> 
> 
>  Best regards,
> 
>    Adam
> 
> -- 
> "The laws of perspective have been repealed!               Adam Sjøgren
>  Objects no longer diminish in size with distance!"   asjo@koldfront.dk
> 


Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Shawn Heisey <el...@elyograg.org>.
On 2/6/23 17:15, Wei wrote:
> Do you see ZGC having better query performance than G1?  We are migrating
> from Solr 8 / JDK 11 / G1 to Solr 9 with JDK 17 / G1.  Also are there any
> notable performance changes from 9.0.0 to the latest 9.1.1?

I don't have enough information available to answer the last question. 
I don't really have access to Solr installs any more except my tiny 
little personal install.  One node, 200K docs, 800MB index size.  I've 
got it running with a 1GB heap, which is much bigger than it needs.

ZGC's individual pauses are VERY short compared to G1, but it does a lot 
more collections.  I don't have GC logs for detailed comparison any 
more, but if I recall correctly, the total time spent in GC was less for 
ZGC compared to G1.  General latency would be better, but I don't see it 
doing a lot for the speed of an individual query.
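
One rough way to compare collectors on pause count, total pause time and longest pause is to sum the pause lines in the GC log. The sketch below assumes the JVM unified logging format (Java 9 and later), where pause lines contain the word "Pause" and end in a duration such as "45.678ms". The function name gc_pause_summary and the sample log path are assumptions, and the exact line layout depends on the -Xlog options in use.

```shell
# Summarise GC pauses from a unified-logging GC log read on stdin.
# Assumes pause lines look roughly like:
#   [12.3s][info][gc] GC(42) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(12288M) 45.678ms
# gc_pause_summary is a hypothetical helper, not part of Solr or the JDK.
gc_pause_summary() {
  awk '
    / Pause / && $NF ~ /^[0-9.]+ms$/ {
      ms = $NF
      sub(/ms$/, "", ms)          # strip the unit, keep the number
      total += ms
      count++
      if (ms + 0 > max) max = ms + 0
    }
    END { printf "pauses=%d total_ms=%.3f max_ms=%.3f\n", count, total, max }
  '
}
```

Usage would be something like "gc_pause_summary < /var/solr/logs/solr_gc.log", where the file name is an assumption to be checked against the local install. Concurrent phases are deliberately not counted, since they do not stop application threads.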

I can say for sure that on my tiny install, indexing is a little bit 
slower with ZGC.  I suspect that in a setup doing highly threaded 
indexing, that it probably wouldn't be substantially affected and might 
be faster than G1.  My indexing with dovecot is mostly single-threaded. 
It does index multiple mailboxes simultaneously, but the vast majority 
of the messages are in one mailbox (mine).

Thanks,
Shawn

Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Wei <we...@gmail.com>.
Hi Shawn,

Do you see ZGC having better query performance than G1?  We are migrating
from Solr 8 / JDK 11 / G1 to Solr 9 with JDK 17 / G1.  Also are there any
notable performance changes from 9.0.0 to the latest 9.1.1?

Thanks,
Wei

On Mon, Feb 6, 2023 at 1:34 PM Shawn Heisey <ap...@elyograg.org> wrote:

> On 2/6/23 14:24, Paul Ryder wrote:
> > I thought that was Solr 9 only?
>
> Solr 8.x can still use CaffeineCache, and it's the best choice in most
> circumstances, so it's not something I would risk, even if it turns out
> that it's not affected.  In 9.x, all the older cache implementations are
> gone, leaving only Caffeine, so it's a lot more likely for users to run
> into it with 9.x.
>
> Thanks,
> Shawn
>

Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Shawn Heisey <ap...@elyograg.org>.
On 2/6/23 14:24, Paul Ryder wrote:
> I thought that was Solr 9 only?

Solr 8.x can still use CaffeineCache, and it's the best choice in most 
circumstances, so it's not something I would risk, even if it turns out 
that it's not affected.  In 9.x, all the older cache implementations are 
gone, leaving only Caffeine, so it's a lot more likely for users to run 
into it with 9.x.

Thanks,
Shawn

Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Paul Ryder <pa...@greenwaymediatech.com>.
I thought that was Solr 9 only?

________________________________
From: Shawn Heisey <ap...@elyograg.org>
Sent: Monday, February 6, 2023 9:19:18 PM
To: users@solr.apache.org <us...@solr.apache.org>
Subject: Re: Recommended Java version on Ubuntu 20.04, GC

On 2/6/23 14:06, Paul Ryder wrote:
> Hi Shawn, what is the known bug in Java 17/Solr 8.x? ta! Paul

https://issues.apache.org/jira/browse/SOLR-16463

Thanks,
Shawn

Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Shawn Heisey <ap...@elyograg.org>.
On 2/6/23 14:06, Paul Ryder wrote:
> Hi Shawn, what is the known bug in Java 17/Solr 8.x? ta! Paul

https://issues.apache.org/jira/browse/SOLR-16463

Thanks,
Shawn

Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Paul Ryder <pa...@greenwaymediatech.com>.
Hi Shawn, what is the known bug in Java 17/Solr 8.x? ta! Paul

________________________________
From: Shawn Heisey <el...@elyograg.org>
Sent: Monday, February 6, 2023 8:59:07 PM
To: users@solr.apache.org <us...@solr.apache.org>
Subject: Re: Recommended Java version on Ubuntu 20.04, GC

On 2/6/23 13:18, Adam Sjøgren wrote:
> Thanks for the input - just to be sure I understand you correctly, using
> ZGC with Java 11 for Solr 8.11.2 on Ubuntu 20.04 might be worth a try
> when we are having problems with G1GC?
>
> Or would you run Java 17?

For Solr 8.x I would stick with Java 11.  8.x has Java 8 as a minimum
requirement, so I wouldn't go too far up in major versions.  Also there
is a known bug in Java 17 that I think would probably affect Solr 8.x,
and 8.x doesn't have a workaround for it.

I don't think I would trust Java 17 with anything older than Solr 9.1.1.
9.1.0 might be OK, but since there is a new patch release I'd prefer that.

The same ZGC settings I gave earlier do work with Java 11.

Thanks,
Shawn

Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Adam Sjøgren <as...@koldfront.dk.INVALID>.
Just to give some feedback on this thread:

After upgrading from Ubuntu 18.04 to Ubuntu 20.04 our 80 node Solr
8.11.2 / Java 11 cluster became less stable, with shards regularly
switching into down or recovering state, and with manual intervention
sometimes needed (restart of Solr instances to recover).

We switched from G1GC to ZGC 10 days ago, after the discussion in this
thread, and since then we have only had two or three times where shards
have gone into down or recovering state, and we haven't had to intervene
manually.

So far the - I guess - lower latency of ZGC has been beneficial in our
setup.


  Best regards,

    Adam

-- 
 "Lovely day for a stroll, eh Hobbes?                       Adam Sjøgren
  I certainly enjoy my afternoon constitutional!"      asjo@koldfront.dk
 "Yes, it's quite invigorating!"


Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Shawn Heisey <el...@elyograg.org>.
On 2/6/23 13:18, Adam Sjøgren wrote:
> Thanks for the input - just to be sure I understand you correctly, using
> ZGC with Java 11 for Solr 8.11.2 on Ubuntu 20.04 might be worth a try
> when we are having problems with G1GC?
> 
> Or would you run Java 17?

For Solr 8.x I would stick with Java 11.  8.x has Java 8 as a minimum 
requirement, so I wouldn't go too far up in major versions.  Also there 
is a known bug in Java 17 that I think would probably affect Solr 8.x, 
and 8.x doesn't have a workaround for it.

I don't think I would trust Java 17 with anything older than Solr 9.1.1.
9.1.0 might be OK, but since there is a new patch release I'd prefer that.

The same ZGC settings I gave earlier do work with Java 11.
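
As a sketch of why the same flags carry over: ZGC is an experimental option in Java 11 through 14 and needs -XX:+UnlockExperimentalVMOptions there, while from Java 15 on it is a production feature and the unlock flag is no longer required (it is harmless to keep). The helper names java_major and zgc_flags below are made up for illustration and only cover the two flags that select the collector.

```shell
# Extract the major Java version from a version string such as
# "11.0.17", "17.0.5+8" or the legacy "1.8.0_352" form.
# java_major and zgc_flags are hypothetical helpers, not part of Solr.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;
    *)   echo "$1" | cut -d. -f1 | cut -d+ -f1 | cut -d- -f1 ;;
  esac
}

# Print the flags needed to select ZGC for a given major Java version.
zgc_flags() {
  major="$1"
  if [ "$major" -lt 11 ]; then
    echo "ZGC is not available before Java 11" >&2
    return 1
  elif [ "$major" -lt 15 ]; then
    # Experimental in Java 11-14: must be unlocked explicitly.
    echo "-XX:+UnlockExperimentalVMOptions -XX:+UseZGC"
  else
    # Production feature from Java 15 on.
    echo "-XX:+UseZGC"
  fi
}
```

For example, GC_TUNE="$(zgc_flags "$(java_major 11.0.17)")" would yield the two-flag form for Java 11.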

Thanks,
Shawn

Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Adam Sjøgren <as...@koldfront.dk.INVALID>.
Shawn writes:

> On 1/27/23 00:19, Adam Sjøgren wrote:

>> Any tips on reasonable settings for ZGC on a 80 node Solr cloud with ~3B
>> documents in a handful of collections and quite a bit of updates?
>
> This is my /etc/default/solr.in.sh config on my little install for Dovecot:

[...]

> GC_TUNE=" \
>   -XX:+UnlockExperimentalVMOptions \
>   -XX:+UseZGC \

[...]

> openjdk version "17.0.5" 2022-10-18

Thanks for the input - just to be sure I understand you correctly, using
ZGC with Java 11 for Solr 8.11.2 on Ubuntu 20.04 might be worth a try
when we are having problems with G1GC?

Or would you run Java 17?


  Best regards,

    Adam

-- 
 "Yeah, the revolution starts now                           Adam Sjøgren
  In your own backyard"                                asjo@koldfront.dk


Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Shawn Heisey <ap...@elyograg.org>.
On 1/27/23 00:19, Adam Sjøgren wrote:
> Any tips on reasonable settings for ZGC on a 80 node Solr cloud with ~3B
> documents in a handful of collections and quite a bit of updates?

This is my /etc/default/solr.in.sh config on my little install for Dovecot:

SOLR_PID_DIR="/var/solr"
SOLR_HOME="/var/solr/data"
LOG4J_PROPS="/var/solr/log4j2.xml"
SOLR_LOGS_DIR="/var/solr/logs"
SOLR_PORT="8983"
SOLR_HEAP="1g"
GC_TUNE=" \
   -XX:+UnlockExperimentalVMOptions \
   -XX:+UseZGC \
   -XX:+ParallelRefProcEnabled \
   -XX:+ExplicitGCInvokesConcurrent \
   -XX:+UseStringDeduplication \
   -XX:+AlwaysPreTouch \
   -XX:+UseNUMA \
"
SOLR_JAVA_STACK_SIZE="-Xss1m"
SOLR_ULIMIT_CHECKS=false
SOLR_GZIP_ENABLED=true
SOLR_JETTY_HOST=0.0.0.0
SOLR_MODE="solrcloud"
SOLR_MODULES="analysis-extras"

I don't think there are any settings in this GC_TUNE that might vary 
based on the specific server environment.

This is the Java version:

openjdk version "17.0.5" 2022-10-18
OpenJDK Runtime Environment (build 17.0.5+8-Ubuntu-2ubuntu120.04)
OpenJDK 64-Bit Server VM (build 17.0.5+8-Ubuntu-2ubuntu120.04, mixed mode, sharing)

Thanks,
Shawn

Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Adam Sjøgren <as...@koldfront.dk.INVALID>.
Shawn writes:

> Solr 8.x and later uses G1 by default.  I haven't seen any problems
> with it, even though Lucene recommends not using it.

Ok, good.

> For 8.x, I would use OpenJDK 11.  For 9.x, OpenJDK 17.

Sounds like we're on the right track version-wise then.

> I would choose the ZGC collector in most cases.  But I have noticed
> that indexing throughput is a little bit better with G1 than ZGC.  If
> every little bit of indexing speed is critical, stick with G1.

I will happily trade throughput for more stability (replicas not going
into recovering or down state) currently, so I think we will give ZGC a
go.

Any tips on reasonable settings for ZGC on a 80 node Solr cloud with ~3B
documents in a handful of collections and quite a bit of updates?


Thanks for the input, all!


  Best regards,

    Adam

-- 
 "Why do you put your answer below the question and         Adam Sjøgren
  trim the quoted text?" "It's about having minimal    asjo@koldfront.dk
  courtesy to your readers by not forcing them to
  re-read stuff they just read."


Re: Recommended Java version on Ubuntu 20.04, GC

Posted by Shawn Heisey <ap...@elyograg.org>.
On 1/26/23 06:26, Adam Sjøgren wrote:
> We have a Solr Cloud cluster running 8.11.2 on 16 servers that have just
> been upgraded to Ubuntu 20.04 (from 18.04).
> 
> It looks like we are getting shards down/recovering more often than
> previously, so I'm wondering what version of Java and which garbage
> collector is recommended on Ubuntu 20.04?
> 
> On https://solr.apache.org/docs/8_11_2/SYSTEM_REQUIREMENTS.html it says
> Java 8 or higher - we are running 11.0.17+8-1ubuntu2~20.04.
> 
> That page also links to
> https://cwiki.apache.org/confluence/display/lucene/JavaBugs which very
> clearly says not to use the G1 garbage collector and that the page isn't
> outdated. Hm.

Solr 8.x and later uses G1 by default.  I haven't seen any problems with 
it, even though Lucene recommends not using it.

For 8.x, I would use OpenJDK 11.  For 9.x, OpenJDK 17.

17 might work with 8.x, but its minimum requirement is Java 8.  Jumping 
a lot of major Java versions beyond the minimum requirement might cause 
problems.  The latest version of Solr has a workaround for a problem 
with Java 17, but I don't think that workaround is there for 8.11.x.

I would choose the ZGC collector in most cases.  But I have noticed that 
indexing throughput is a little bit better with G1 than ZGC.  If every 
little bit of indexing speed is critical, stick with G1.

Thanks,
Shawn