Posted to dev@hbase.apache.org by Nicolas Liochon <nk...@gmail.com> on 2013/06/13 05:35:26 UTC

large heaps

Hi there,

During the hackathon I had some discussions around GC on large heaps.

This guy, who seems to know what he is talking about and had a patch
accepted into the HotSpot JDK, said in 2011 that he had a configuration
working reasonably well with large heaps at that time:

"I was able to keep GC pause on 32Gb Oracle Coherence storage node below
150ms on 8 core server."

(in http://java.dzone.com/articles/how-tame-java-gc-pauses)

There is a lot of stuff in his blog, some of it in Russian only, but at
least one of us will understand it.

http://blog.ragozin.info/2011/07/openjdk-patch-cutting-down-gc-pause.html
http://fr.slideshare.net/aragozin/garbage-collection-in-jvm-mailru-fin

Cheers,

Nicolas

Re: large heaps

Posted by Nick Dimiduk <nd...@gmail.com>.
It looks like there's some interest from RedHat in solving these problems
using OpenJDK.

http://rkennke.wordpress.com/2013/06/10/shenandoah-a-pauseless-gc-for-openjdk/

Re: large heaps

Posted by Nicolas Liochon <nk...@gmail.com>.
Thanks a lot Todd, your conclusion does confirm the feeling I had around
large heaps with HBase.


Re: large heaps

Posted by Todd Lipcon <to...@cloudera.com>.
Hey Nicolas,

I've corresponded with that guy a few times in the past -- back when I
was attempting to hack some patches into G1 for better performance on
HBase. The end result of that investigation was the MSLAB feature
which made it into 0.90.x.

The main thing I learned about GC is that big heaps aren't in
themselves problematic -- they don't tend to make young gen pauses
take longer. The only problem is that if you eventually hit a
stop-the-world CMS pause, the length of the pause grows linearly with
the size of the heap. So, the trick is avoiding stop-the-world CMS.
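
To check whether old-gen collections are happening at all, one option is to
poll the JVM's standard GarbageCollectorMXBean counters. The following is a
minimal sketch, not something from HBase: the class name GcPauseSnapshot is
made up for illustration, collector names vary by JVM and collector, and
getCollectionTime() reports approximate accumulated collection time in
milliseconds, not the length of any single pause. A count of -1 means the
collector does not report the value.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reads per-collector counters exposed by the JVM. */
final class GcPauseSnapshot {

    /** Returns collector name -> {collection count, accumulated time in ms}. */
    static Map<String, long[]> snapshot() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            stats.put(gc.getName(),
                      new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return stats;
    }

    public static void main(String[] args) {
        // Print one line per collector; compare two snapshots taken some time
        // apart to see whether the old-gen collector has been active.
        for (Map.Entry<String, long[]> e : snapshot().entrySet()) {
            System.out.println(e.getKey() + ": count=" + e.getValue()[0]
                    + " timeMs=" + e.getValue()[1]);
        }
    }
}
```

Taking a snapshot periodically and diffing the old-gen collector's counters
gives a rough signal; GC logs remain the better source for actual pause
lengths.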

In order to avoid that, you need to do a few things:
- make sure you don't have any short-lived super-large objects: when
large objects are promoted from the young generation, they need to
find contiguous space in the old gen. If you allocate, say, a 400MB
array, even if it's short-lived, it's unlikely you'll find 400MB of
contiguous space in the old gen without defragmenting. This will cause
a STW pause.

If you have some super-large objects allocated at startup, that's OK,
they'll just park themselves in the old gen and not cause trouble.

- make sure that most of your objects are "around the same size". This
prevents fragmentation build-up in the old gen.

- move big memory consumers off-heap if possible
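
The last two points can be sketched together as a slab allocator: many small
values are copied into fixed-size chunks, so the long-lived allocations all
have the same size, and the chunks can optionally live off-heap in direct
buffers. This is only an illustration of the idea under stated assumptions,
not HBase's MSLAB code; the class SlabAllocator, its methods, and the chunk
size are hypothetical, and it is not thread-safe. Values larger than a chunk
are rejected here for simplicity.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: copies values into same-sized chunks. */
final class SlabAllocator {
    private final int chunkSize;
    private final boolean offHeap;
    private ByteBuffer current;     // chunk currently being filled
    private int chunksAllocated;

    SlabAllocator(int chunkSize, boolean offHeap) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.offHeap = offHeap;
    }

    /** Copies data into the current chunk and returns a read-only view of the copy. */
    ByteBuffer copyIn(byte[] data) {
        if (data.length > chunkSize) {
            throw new IllegalArgumentException("value larger than a chunk");
        }
        if (current == null || current.remaining() < data.length) {
            // Every chunk has the same size, whether on-heap or direct.
            current = offHeap ? ByteBuffer.allocateDirect(chunkSize)
                              : ByteBuffer.allocate(chunkSize);
            chunksAllocated++;
        }
        int start = current.position();
        current.put(data);
        // Build a view limited to the bytes just written.
        ByteBuffer view = current.duplicate();
        view.position(start);
        view.limit(start + data.length);
        return view.slice().asReadOnlyBuffer();
    }

    int chunksAllocated() {
        return chunksAllocated;
    }
}
```

For example, new SlabAllocator(2 * 1024 * 1024, true).copyIn(bytes) would
place the copy in a 2MB direct buffer. The caller keeps the returned view and
can let the original byte[] die young.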

We've done a pretty good job of the above so far, and with a bit more
careful analysis I think it's possible to fully avoid old-gen STW
pauses.

-Todd


-- 
Todd Lipcon
Software Engineer, Cloudera