Posted to java-user@lucene.apache.org by Gautam Worah <wo...@gmail.com> on 2022/01/31 23:42:28 UTC

Re: Java 17 and Lucene

Just circling back here with some of our (Product Search at Amazon)
findings on running Lucene with JDK 17 in production (to hopefully motivate
other teams to also upgrade their JDKs). We have not had any stability
issues or any hanging machines so far.

When we switched from JDK 15 to JDK 17, production saw a 10-13.5%
decrease in p99.9 search time and an 11-13% decrease in CPU time.
This improvement was driven mostly by the JDK 17 runtime itself
(it showed up even with older bytecode).
In general, we found that the newer low-pause GCs caused a
10-25% drop in our throughput. This matches what Uwe described in an
earlier email.
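
For anyone comparing numbers like these across collectors, it helps to
confirm which collector a JVM is actually running. Below is a minimal
sketch using the standard java.lang.management API. The class name
GcReport and the output format are made up for illustration and are not
taken from our setup.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from this thread): lists the collectors that the
// running JVM has registered, with their cumulative counts and times.
class GcReport {

    // Builds one line per registered collector: name, number of collections,
    // and approximate accumulated collection time in milliseconds.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: count=%d, timeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

Note that the reported time is accumulated collection time as the JVM
defines it, which is not necessarily the same thing as application pause
time, so treat it as a rough sanity check rather than a latency metric.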


Regards,
Gautam Worah.


On Wed, Oct 27, 2021 at 1:13 AM Robert Muir <rc...@gmail.com> wrote:

> Or just keep your small heap and tweak the 200ms target to be a lower
> value. It is just a JVM parameter, and it is dumb as rocks:
>
> That 200ms constant doesn't consider how many cores you have, or how
> small your heap is.
>
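The 200 ms target mentioned here is, as far as I know, G1's default for
-XX:MaxGCPauseMillis. Here is a small sketch, under that assumption, that
checks whether a JVM was started with an explicit pause target. The class
name PauseTarget is hypothetical. It only inspects the startup arguments
and does not read the collector's effective setting.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper (not from this thread): looks for an explicit
// -XX:MaxGCPauseMillis=<n> among the JVM's startup arguments.
class PauseTarget {
    private static final String FLAG = "-XX:MaxGCPauseMillis=";

    // Returns the configured pause target in milliseconds, or empty when the
    // flag is absent. If the flag appears more than once, this sketch reports
    // the last occurrence it sees.
    static OptionalLong fromArgs(List<String> jvmArgs) {
        OptionalLong target = OptionalLong.empty();
        for (String arg : jvmArgs) {
            if (arg.startsWith(FLAG)) {
                target = OptionalLong.of(Long.parseLong(arg.substring(FLAG.length())));
            }
        }
        return target;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        OptionalLong target = fromArgs(jvmArgs);
        if (target.isPresent()) {
            System.out.println("Explicit pause target: " + target.getAsLong() + " ms");
        } else {
            System.out.println("No explicit pause target; the collector's default applies");
        }
    }
}
```

To try it, start the JVM with something like
"java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 PauseTarget.java"; the value 50
is an arbitrary example, not a recommendation.
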
> On Tue, Oct 26, 2021 at 11:42 PM Michael Sokolov <ms...@gmail.com>
> wrote:
> >
> > Uwe, thanks for pointing out that ZGC is associated with all the
> > pauses you've observed. I'm feeling more confident now (since we are
> > generally using G1GC anyway, although sometimes experimenting with
> > other things). Indeed GC pauses have been much less of a problem since
> > we started using G1 to the point we don't worry about them much now. I
> > will say that with most of these so-called "pauseless" collectors it
> > seems they can really only achieve their promise by having a lot of
> > spare heap available, so if 200ms pauses are unacceptable this could
> > be a reason to run with a larger heap.
> >
> > On Tue, Oct 26, 2021 at 1:14 PM Uwe Schindler <uw...@thetaphi.de> wrote:
> > >
> > > Hi,
> > >
> > > > > Is this recommended "-XX:+UseZGC options to enable ZGC." as it
> > > > > claims very low pauses.
> > >
> > > > You may have seen my previous post that JDK 16, 17 and 18 have hangs
> > > > on our build server. All of those hanging builds have one thing in
> > > > common: They are running with ZGC. So my answer in short: Don't use
> > > > ZGC, which is not a good idea with Lucene anyway. It reduces pauses,
> > > > but on the other hand reduces throughput by >10%. So IMHO, better use
> > > > G1GC and have higher throughput. With G1GC the average pauses are
> > > > limited, too. But I would say, with common workloads it is better to
> > > > have 10% faster queries and maybe have some of them wait 200 ms
> > > > because of a pause!? If you have multiple replicas just distribute
> > > > your queries and the pause will not really be visible to many people.
> > > > And: Why is 200 ms response time bad if it happens seldom?
> > >
> > > > In addition: Lucene does not put pressure on the garbage collector,
> > > > so use a small heap and use docvalues and other off-heap features of
> > > > Lucene. Anybody running Lucene/Solr/Elasticsearch with huge heap space
> > > > is doing something wrong!
> > >
> > > Uwe
> > >
> > > > For "*DY* (2021-10-19 08:14:33): Upgrade to JDK17+35" execution for
> > > > "Indexing
> > > > throughput
> > > > <https://home.apache.org/~mikemccand/lucenebench/indexing.html>"
> > > > is ZGC used for the "Indexing throughput
> > > > <https://home.apache.org/~mikemccand/lucenebench/indexing.html>"
> > > > > test?
> > > >
> > > >
> > > > On Wed, Oct 20, 2021 at 8:27 AM Michael McCandless <
> > > > lucene@mikemccandless.com> wrote:
> > > >
> > > > > Nightly benchmarks managed to succeed (once, so far) on JDK 17:
> > > > > https://home.apache.org/~mikemccand/lucenebench/
> > > > >
> > > > > No obvious performance changes on quick look.
> > > > >
> > > > > Mike McCandless
> > > > >
> > > > > http://blog.mikemccandless.com
> > > > >
> > > > >
> > > > > > On Tue, Oct 19, 2021 at 8:42 PM Gautam Worah <wo...@gmail.com> wrote:
> > > > >
> > > > > > Thanks for the note of caution Uwe.
> > > > > >
> > > > > > > > On our Jenkins server running with AMD Ryzen CPU it happens
> > > > > > > > quite often that JDK 16, JDK 17 and JDK 18 hang during tests
> > > > > > > > and stay unkillable (only a hard kill with "kill -9")
> > > > > > >
> > > > > > > Scary stuff.
> > > > > > > I'll try to reproduce the hang first and then try to get the JVM
> > > > > > > logs. I'll respond back here if I find something useful.
> > > > > > >
> > > > > > > > Do you get this error in lucene:core:ecjLintMain and not during
> > > > > > > > compile? Then this is
> > > > > > > > https://issues.apache.org/jira/browse/LUCENE-10185, solved
> > > > > > > > already.
> > > > > > >
> > > > > > > Ahh. I should've been clearer with my comment. The error we see is
> > > > > > > because we have forked the class and have modified it a bit.
> > > > > > > I just assumed that the upstream Lucene package would've also
> > > > > > > gotten errors on the JDK17 build because it was untouched.
> > > > > >
> > > > > > -
> > > > > > Gautam Worah.
> > > > > >
> > > > > >
> > > > > > > On Tue, Oct 19, 2021 at 5:07 AM Michael Sokolov <msokolov@gmail.com> wrote:
> > > > > >
> > > > > > > > > I would be a bit careful: On our Jenkins server running with AMD
> > > > > > > > > Ryzen CPU it happens quite often that JDK 16, JDK 17 and JDK 18
> > > > > > > > > hang during tests and stay unkillable (only a hard kill with
> > > > > > > > > "kill -9"). Previous Java versions don't hang. It does not happen
> > > > > > > > > all the time (about 1/4th of all builds) and due to the fact that
> > > > > > > > > the JVM is unresponsible it is not possible to get a stack trace
> > > > > > > > > with "jstack". If you know a way to get the stack trace, I'd be
> > > > > > > > > happy to get help.
> > > > > > >
> > > > > > > > ooh that sounds scary. I suppose one could maybe get core dumps
> > > > > > > > using the right signal and debug that way? Oh wait you said only 9
> > > > > > > > works, darn! How about attaching using gdb? Do we maintain GC logs
> > > > > > > > for these Jenkins builds? Maybe something suspicious would show up
> > > > > > > > there.
> > > > > > > >
> > > > > > > > By the way the JDK is absolutely "responsible" in this situation!
> > > > > > > > Not responsive maybe ...
> > > > > > >
> > > > > > > > On Tue, Oct 19, 2021 at 4:46 AM Uwe Schindler <uwe@thetaphi.de> wrote:
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > > Hey,
> > > > > > > > >
> > > > > > > > > > Our team at Amazon Product Search recently ran our internal
> > > > > > > > > > benchmarks with JDK 17.
> > > > > > > > > > We saw a ~5% increase in throughput and are in the process of
> > > > > > > > > > experimenting/enabling it in production.
> > > > > > > > > > We also plan to test the new Corretto Generational Shenandoah GC.
> > > > > > > >
> > > > > > > > > I would be a bit careful: On our Jenkins server running with AMD
> > > > > > > > > Ryzen CPU it happens quite often that JDK 16, JDK 17 and JDK 18
> > > > > > > > > hang during tests and stay unkillable (only a hard kill with
> > > > > > > > > "kill -9"). Previous Java versions don't hang. It does not happen
> > > > > > > > > all the time (about 1/4th of all builds) and due to the fact that
> > > > > > > > > the JVM is unresponsible it is not possible to get a stack trace
> > > > > > > > > with "jstack". If you know a way to get the stack trace, I'd be
> > > > > > > > > happy to get help.
> > > > > > > > >
> > > > > > > > > Once I have figured out what makes it hang, I will open issues in
> > > > > > > > > OpenJDK (I am an OpenJDK member/editor). I now have many stuck JVMs
> > > > > > > > > running on the server to analyze, so you're invited to help! At the
> > > > > > > > > moment, I have no time to look into it, so any help is useful.
> > > > > > > >
> > > > > > > > > > On a side note, the Lucene codebase still uses the deprecated (as
> > > > > > > > > > of JDK17) AccessController
> > > > > > > > > > in the RamUsageEstimator class.
> > > > > > > > > > We suppressed the warning for now (based on recommendations
> > > > > > > > > > <http://mail-archives.apache.org/mod_mbox/db-derby-dev/202106.mbox/%3CJIRA.13369440.1617476525000.615331.1623951480059@Atlassian.JIRA%3E>
> > > > > > > > > > from the Apache Derby mailing list).
> > > > > > > >
> > > > > > > > > This should not be an issue, because we compile Lucene with the
> > > > > > > > > javac parameter "--release 11", so it won't show any warning that
> > > > > > > > > you need to suppress. It looks like your build system at Amazon is
> > > > > > > > > not Lucene's original Gradle build, which shows no warnings at all.
> > > > > > > >
> > > > > > > > Uwe
> > > > > > > >
> > > > > > > > > Gautam Worah.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Oct 18, 2021 at 3:02 PM Michael McCandless <
> > > > > > > > > lucene@mikemccandless.com> wrote:
> > > > > > > > >
> > > > > > > > > > > Also, I try to semi-aggressively upgrade Lucene's nightly benchmarks
> > > > > > > > > > > to new JDK releases and leave an annotation on the nightly charts:
> > > > > > > > > > > https://home.apache.org/~mikemccand/lucenebench/
> > > > > > > > > > >
> > > > > > > > > > > I just now upgraded to JDK 17 and kicked off a new benchmark run ...
> > > > > > > > > > > in a few hours it should show the new data points and then I'll try
> > > > > > > > > > > to remember to annotate it tomorrow.
> > > > > > > > > > >
> > > > > > > > > > > So let's see whether nightly benchmarks uncover any performance
> > > > > > > > > > > changes from JDK17 :)
> > > > > > > > > >
> > > > > > > > > > Mike McCandless
> > > > > > > > > >
> > > > > > > > > > http://blog.mikemccandless.com
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > On Mon, Oct 18, 2021 at 5:36 PM Robert Muir <rcmuir@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > > We test different releases on different platforms (e.g. Linux,
> > > > > > > > > > > > Windows, Mac).
> > > > > > > > > > > > We also test EA (Early Access) releases of openjdk versions during
> > > > > > > > > > > > the development process.
> > > > > > > > > > > > This finds bugs before they get released.
> > > > > > > > > > > >
> > > > > > > > > > > > More information about versions/EA testing: https://jenkins.thetaphi.de/
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Oct 18, 2021 at 5:33 PM Kevin Rosendahl
> > > > > > > > > > > <ke...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hello,
> > > > > > > > > > > >
> > > > > > > > > > > > > We are using Lucene 8 and planning to upgrade from Java 11 to Java 17.
> > > > > > > > > > > > > We are curious:
> > > > > > > > > > > > >
> > > > > > > > > > > > >    - How lucene is testing against java versions. Are there correctness
> > > > > > > > > > > > >      and performance tests using java 17?
> > > > > > > > > > > > >       - Additionally, besides Java 17, how are new Java releases tested?
> > > > > > > > > > > > >    - Are there any other orgs using Java 17 with Lucene?
> > > > > > > > > > > > >    - Any other considerations we should be aware of?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Kevin Rosendahl
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > >
> ---------------------------------------------------------------------
> > > > > > > > > > > To unsubscribe, e-mail:
> > > > > java-user-unsubscribe@lucene.apache.org
> > > > > > > > > > > For additional commands, e-mail:
> > > > > > java-user-help@lucene.apache.org
> > > > > > > > > > >