Posted to user@zookeeper.apache.org by Guy Laden <gu...@gmail.com> on 2016/05/13 14:18:32 UTC

garbage collector choice and tuning

Hi, We are considering CMS vs G1 for ZooKeeper running under Oracle JDK8.
The expected heap size is 4-6GB.
How workload-specific is this choice in your opinion, and in what ways? E.g.
if there are many short sessions, prefer G1, etc...
Has anybody had experience they're willing to share regarding this?
We'd also be very interested to hear about any gc-tuning flags you've had
good experience with.
Thanks

Re: garbage collector choice and tuning

Posted by Guy Laden <gu...@gmail.com>.
Okay, will do.
I'll give things some thought and then will create a separate wiki page for
GC/tuning.



On Mon, May 16, 2016 at 2:10 PM, Flavio Junqueira <fp...@apache.org> wrote:

> I have granted you access, Guy, so you can edit the wiki now.
>
> That section in troubleshooting is pretty short, and it is such an
> important issue that I suggest we create a page for it. Feel free to
> suggest a different path; I've suggested this because I get this question
> outside the list a lot and I've seen a lot of insight in the mail thread.
>
> -Flavio

Re: garbage collector choice and tuning

Posted by Flavio Junqueira <fp...@apache.org>.
I have granted you access, Guy, so you can edit the wiki now. 

That section in troubleshooting is pretty short, and it is such an important issue that I suggest we create a page for it. Feel free to suggest a different path; I've suggested this because I get this question outside the list a lot and I've seen a lot of insight in the mail thread.

-Flavio


> On 16 May 2016, at 12:01, Guy Laden <gu...@gmail.com> wrote:
> 
> Hi Flavio, not sure if that was addressed to me?
> I noticed the GC section in
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/Troubleshooting
> (I assume this is more recent than the pages at wiki.apache.org).
> Perhaps adding a link there to this thread makes sense?
> If you/Patrick think this is a good idea, I could do this if I were granted
> the permissions.
> So far I've created a wiki account, but I don't think I have update
> permissions.
> Guy


Re: garbage collector choice and tuning

Posted by Guy Laden <gu...@gmail.com>.
Hi Flavio, not sure if that was addressed to me?
I noticed the GC section in
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Troubleshooting
(I assume this is more recent than the pages at wiki.apache.org).
Perhaps adding a link there to this thread makes sense?
If you/Patrick think this is a good idea, I could do this if I were granted
the permissions.
So far I've created a wiki account, but I don't think I have update
permissions.
Guy

Re: garbage collector choice and tuning

Posted by Flavio P JUNQUEIRA <fp...@apache.org>.
Perhaps start a wiki page on GC tuning? Sounds like some great experience
reported on this thread.

-Flavio
On 15 May 2016 20:34, "Guy Laden" <gu...@gmail.com> wrote:

> Maugli, Thanks so much for taking the time to write at length, and for the
> great pointers.
> Thanks all for sharing.
>

Re: garbage collector choice and tuning

Posted by Guy Laden <gu...@gmail.com>.
Maugli, Thanks so much for taking the time to write at length, and for the
great pointers.
Thanks all for sharing.

Re: garbage collector choice and tuning

Posted by Tom Crayford <tc...@heroku.com>.
We've been running hundreds of ZooKeeper clusters with G1 on relatively
small heaps (some even as low as 2GB) with almost no tuning and have had a
very pleasant time so far. Granted, our use is relatively light most of the
time, but still, GC has just not been an issue for us since we switched to
G1.

Tom Crayford
Heroku Kafka

On Fri, May 13, 2016 at 11:15 PM, Patrick Hunt <ph...@apache.org> wrote:

> Please do report your experiences to the list. I think generally folks
> would be interested (I would).
>
> My 0.02 - with such small heap sizes with ZK I typically use CMS. It's
> pretty much fire/forget with the defaults. My experience with G1 (granted
> typically larger heaps) is that it requires more tuning to get the benefit.
> YMMV.
>
> Patrick

Re: garbage collector choice and tuning

Posted by Patrick Hunt <ph...@apache.org>.
Please do report your experiences to the list. I think generally folks
would be interested (I would).

My 0.02 - with such small heap sizes with ZK I typically use CMS. It's
pretty much fire/forget with the defaults. My experience with G1 (granted
typically larger heaps) is that it requires more tuning to get the benefit.
YMMV.

Patrick



Re: garbage collector choice and tuning

Posted by Attila Szabo <as...@cloudera.com>.
Hi all,

I do have one specific story worth mentioning on this subject.
About one year ago I was working for a financial company, where we had to
maintain a system responsible for tons of EOD stock-related calculations.
The system was working quite okay, but during the quarterly rebalances we
always experienced serious performance issues. We were still using CMS and
just 12GB of memory, and despite my team's advice to do GC tuning and put
more memory into the machine, we were not allowed to change anything. That
lasted until, at the next rebalancing date, we hit an 8-minute SLA miss
(you can imagine what kind of complaint storm that started among our
customers...).

So I spent one weekend in the office analyzing GC logs, playing with
GC params, and learning about G1.

The result was the following:
Once I fully understood how our application was working at that time, I
was able to tune it (without touching the code) to work perfectly in the
12GB scenario without any stop-the-world event, and brought the full
runtime 2 minutes below the average runtime (so not the worst case, but
the average everyday runtime!). The throughput increased from 68% to
99.8%. I had to keep in mind that our heap contained quite a large amount
of old objects that were still in use, so I had to set both
-XX:InitiatingHeapOccupancyPercent and -XX:G1MixedGCLiveThresholdPercent
higher, and AFAIR I also set -XX:MaxGCPauseMillis to 500ms (but I'm not
totally sure about this last one).

When I switched to Java 8 and turned on string deduplication, the
performance got even better. (With that turned on, that specific use case
was even okay with just 10GB of memory.)
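
To make the flags named above concrete, here is a minimal sketch that
assembles such a G1 command line. The helper g1_flags is hypothetical, and
every numeric value except the 500ms pause goal is a placeholder, since the
mail does not state the thresholds that were actually used. On JDK 8,
-XX:G1MixedGCLiveThresholdPercent should be an experimental option, so the
sketch unlocks experimental options first; -XX:+UseStringDeduplication is
expected to need G1 on JDK 8u20 or later.

```python
# Hypothetical helper: assembles a JDK 8 G1 flag set like the one described
# above. The numeric values passed in below are placeholders, not
# recommendations.

def g1_flags(heap_gb, ihop_percent, live_threshold_percent,
             max_pause_ms, string_dedup=False):
    """Return a list of HotSpot (JDK 8) command-line flags for G1."""
    flags = [
        f"-Xms{heap_gb}g",
        f"-Xmx{heap_gb}g",
        "-XX:+UseG1GC",
        # Start concurrent marking once this share of the heap is occupied.
        f"-XX:InitiatingHeapOccupancyPercent={ihop_percent}",
        # Assumption: the next option is experimental on JDK 8 and must be
        # unlocked before it is accepted.
        "-XX:+UnlockExperimentalVMOptions",
        # Old regions with more live data than this are skipped by mixed GCs.
        f"-XX:G1MixedGCLiveThresholdPercent={live_threshold_percent}",
        # A pause-time goal that G1 aims for, not a guarantee.
        f"-XX:MaxGCPauseMillis={max_pause_ms}",
    ]
    if string_dedup:
        flags.append("-XX:+UseStringDeduplication")
    return flags


flags = g1_flags(heap_gb=12, ihop_percent=60, live_threshold_percent=90,
                 max_pause_ms=500, string_dedup=True)
print(" ".join(flags))
```

With ZooKeeper's stock scripts, flags like these are typically handed to
the server through the SERVER_JVMFLAGS environment variable, for example
from conf/java.env.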

The best resources I used on this journey were:
https://github.com/chewiebug/GCViewer
and
http://www.infoq.com/news/2015/12/java-garbage-collection-minibook (since
removed, but you can find it here: https://www.reddit.com/comments/3d8nfo )

Since that time I've really been in love with G1. I'm very willing to use
it for small heap sizes (like the one you described), and I consider it
nearly a must for big heap sizes.

However, I did learn a very pragmatic approach during this journey.

Whenever you plan a change like this, follow this approach:

   1. Build an easily repeatable test case for measurements.
   2. Measure with both the old and the new GC settings.
   3. Analyze the logs with GCViewer. Understand how your application
   works from a consumption POV.
   4. Repeat steps 2 and 3 for as long as you can still tune any of the
   parameters of each GC algorithm.
   5. Use the better one.
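
Steps 2 and 3 can also be approximated with a few lines of scripting. The
sketch below assumes the JVM was started with
-XX:+PrintGCApplicationStoppedTime (JDK 8) and sums the reported stop times
to estimate throughput for two runs of the same test. The function name
summarize, the 60-second test length, and both log excerpts are made up for
illustration. Stopped time also includes non-GC safepoints, so treat the
result as a rough comparison and use GCViewer for the detailed picture.

```python
import re

# Matches the lines written by -XX:+PrintGCApplicationStoppedTime on JDK 8.
STOPPED = re.compile(
    r"Total time for which application threads were stopped: "
    r"([0-9.]+) seconds")


def summarize(log_text, wall_clock_seconds):
    """Return (pause count, longest pause in seconds, throughput in %)."""
    pauses = [float(m.group(1)) for m in STOPPED.finditer(log_text)]
    stopped = sum(pauses)
    # Throughput here means the share of wall-clock time not spent stopped.
    throughput = 100.0 * (1.0 - stopped / wall_clock_seconds)
    return len(pauses), max(pauses, default=0.0), throughput


# Made-up excerpts standing in for two runs of the same 60-second test.
run_cms = """\
1.234: Total time for which application threads were stopped: 0.2500000 seconds, Stopping threads took: 0.0000500 seconds
9.876: Total time for which application threads were stopped: 1.2500000 seconds, Stopping threads took: 0.0000400 seconds
"""
run_g1 = """\
1.111: Total time for which application threads were stopped: 0.1250000 seconds, Stopping threads took: 0.0000300 seconds
8.888: Total time for which application threads were stopped: 0.0625000 seconds, Stopping threads took: 0.0000200 seconds
"""

for name, log in (("CMS", run_cms), ("G1", run_g1)):
    count, longest, throughput = summarize(log, wall_clock_seconds=60.0)
    print(f"{name}: {count} pauses, longest {longest:.3f}s, "
          f"throughput {throughput:.2f}%")
```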

For the scenario with a massive load of short sessions, I would consider
playing with the -XX:G1HeapRegionSize and -XX:G1MaxNewSizePercent params
as a start, and checking the results.
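
Before changing either parameter it helps to know the starting point. The
sketch below approximates the region size G1 picks when
-XX:G1HeapRegionSize is not set, and the upper bound on the young
generation implied by -XX:G1MaxNewSizePercent. It assumes the commonly
described JDK 8 behavior: a target of about 2048 regions, rounded down to a
power of two between 1MB and 32MB, and a default G1MaxNewSizePercent of 60.
The function names are hypothetical; check the actual values on your JVM
with -XX:+PrintFlagsFinal.

```python
MB = 1024 * 1024
GB = 1024 * MB


def default_region_size(initial_heap_bytes, max_heap_bytes):
    """Approximate G1's region size when -XX:G1HeapRegionSize is unset."""
    average = (initial_heap_bytes + max_heap_bytes) // 2
    target = max(average // 2048, MB)        # aim for about 2048 regions
    size = 1 << (target.bit_length() - 1)    # round down to a power of two
    return min(max(size, MB), 32 * MB)       # clamp to 1MB..32MB


def max_young_bytes(heap_bytes, max_new_size_percent=60):
    """Upper bound on the young generation for a given G1MaxNewSizePercent."""
    return heap_bytes * max_new_size_percent // 100


for heap_gb in (4, 6):
    heap = heap_gb * GB
    region = default_region_size(heap, heap)
    # Objects of roughly half a region or more are treated as humongous.
    print(f"{heap_gb}GB heap: {region // MB}MB regions, "
          f"humongous at about {region // 2 // 1024}KB, "
          f"young gen up to {max_young_bytes(heap) // MB}MB")
```

For the 4-6GB heaps discussed in this thread, this estimate gives 2MB
regions in both cases, so an explicit -XX:G1HeapRegionSize only changes
anything if it is set to a different power of two.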

My 2 cents,

Cheers,
Maugli

p.s.: If you're interested, I'd happily offer my help with the measurements
and tuning.


On Fri, May 13, 2016 at 4:18 PM, Guy Laden <gu...@gmail.com> wrote:

> Hi, We are considering CMS vs G1 for ZooKeeper running under Oracle JDK8.
> The expected heap size is 4-6GB.
> How workload-specific is this choice in your opinion, and in what ways? E.g.
> if there are many short sessions, prefer G1, etc...
> Has anybody had experience they're willing to share regarding this?
> We'd also be very interested to hear about any gc-tuning flags you've had
> good experience with.
> Thanks
>



-- 
Best regards,

Attila Szabo
Software Engineer

<http://www.cloudera.com>