Posted to user@cassandra.apache.org by Daniel Woo <da...@gmail.com> on 2012/10/11 08:04:27 UTC

cassandra 1.0.8 memory usage

Hi guys,

I am running a small cluster of 6 nodes, and recently we have been seeing
very frequent ParNew GCs on two of them. A collection takes 200 - 800 ms on
average and sometimes 5 seconds. As you know, ParNew is a stop-the-world
collector, and our client throws a SocketTimeoutException every 3 minutes.
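
To put numbers on pauses like these, the ParNew entries in the GC log can be summarized with a short script. This is only a minimal sketch: it assumes the JVM was started with -XX:+PrintGCDetails and writes HotSpot-style lines like the samples below. The sample lines and the name summarize_parnew are made up for illustration, and the exact log layout depends on the JVM version and flags.

```python
import re

# Matches the ParNew part of a HotSpot GC log line, e.g.
# "[ParNew: 655360K->81920K(737280K), 0.2345670 secs]".
# Assumption: the log was produced with -XX:+PrintGCDetails.
PARNEW = re.compile(r"\[ParNew: (\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\]")


def summarize_parnew(lines, slow_ms=200.0):
    """Return count, average, max, and number of slow ParNew pauses (ms)."""
    pauses = []
    for line in lines:
        match = PARNEW.search(line)
        if match:
            # The log reports seconds; convert to milliseconds.
            pauses.append(float(match.group(4)) * 1000.0)
    if not pauses:
        return {"count": 0, "avg_ms": 0.0, "max_ms": 0.0, "slow": 0}
    return {
        "count": len(pauses),
        "avg_ms": sum(pauses) / len(pauses),
        "max_ms": max(pauses),
        "slow": sum(1 for p in pauses if p >= slow_ms),
    }


# Made-up sample lines, for illustration only (not from a real node).
SAMPLE = [
    "12.345: [GC 12.345: [ParNew: 655360K->81920K(737280K), 0.2500000 secs] "
    "1234567K->987654K(4112384K), 0.2501000 secs]",
    "15.678: [GC 15.678: [ParNew: 655360K->81920K(737280K), 0.7500000 secs] "
    "1334567K->1087654K(4112384K), 0.7501000 secs]",
    "20.000: [CMS-concurrent-mark-start]",
]

if __name__ == "__main__":
    print(summarize_parnew(SAMPLE))
```

Running the same function over the logs of a healthy node and an affected node would show whether the long pauses are limited to the two nodes.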

I checked the load and it seems well balanced. The two nodes run on the same
hardware: 2 * 4-core Xeon with 16G RAM. We give Cassandra a 4G heap,
including an 800MB young generation. We did not see any swap usage during
the GC. Any idea what could cause this?

Then I took a heap dump. It shows that 5 instances of JmxMBeanServer hold
500MB of memory, and most of the referenced objects are JMX MBean related.
That seems weird to me and looks like a memory leak.

-- 
Thanks & Regards,
Daniel

Re: cassandra 1.0.8 memory usage

Posted by Jason Wee <pe...@gmail.com>.
what jvm version?

On Thu, Oct 11, 2012 at 2:04 PM, Daniel Woo <da...@gmail.com> wrote:

> Hi guys,
>
> I am running a small cluster of 6 nodes, and recently we have been seeing
> very frequent ParNew GCs on two of them. A collection takes 200 - 800 ms on
> average and sometimes 5 seconds. As you know, ParNew is a stop-the-world
> collector, and our client throws a SocketTimeoutException every 3 minutes.
>
> I checked the load and it seems well balanced. The two nodes run on the same
> hardware: 2 * 4-core Xeon with 16G RAM. We give Cassandra a 4G heap,
> including an 800MB young generation. We did not see any swap usage during
> the GC. Any idea what could cause this?
>
> Then I took a heap dump. It shows that 5 instances of JmxMBeanServer hold
> 500MB of memory, and most of the referenced objects are JMX MBean related.
> That seems weird to me and looks like a memory leak.
>
> --
> Thanks & Regards,
> Daniel
>

Re: cassandra 1.0.8 memory usage

Posted by Rob Coli <rc...@palominodb.com>.
On Fri, Oct 12, 2012 at 1:26 AM, Daniel Woo <da...@gmail.com> wrote:
>>>What version of Cassandra? What JVM? Are JNA and Jamm working?
> cassandra 1.0.8. Sun JDK 1.7.0_05-b06, JNA memlock enabled, jamm works.

The unusual aspect here is Sun JDK 1.7. Can you use 1.6 on an affected
node and see if the problem disappears?

https://issues.apache.org/jira/browse/CASSANDRA-4571

That issue exists in 1.1.x (not your case) and is about leaking descriptors
rather than memory, but it affects both JDK 1.6 and 1.7.

> jmap shows that the perm gen is only 40% used.

What is the usage of the other gens?

> I have very few column families, maybe 30-50. The nodetool shows each node
> has 5 GB load.

Most of your heap being consumed by the MBeans for 30-50 column families
seems excessive.

>>> Disable swap for cassandra node
> I am gonna change swappiness to 20%

Even setting swappiness to 0% does not prevent the kernel from
swapping if swap is defined/enabled. I reiterate my suggestion that
you undefine/disable swap on any node running Cassandra. :)
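
One way to confirm whether a node still has swap defined, regardless of the swappiness value, is to read the kernel's own counters. This is only a minimal sketch under stated assumptions: it expects a Linux host where /proc/meminfo and /proc/sys/vm/swappiness are readable, and the function names parse_meminfo and swap_report are illustrative.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_report(meminfo_text, swappiness_text):
    """Report whether swap is defined, how much is in use, and swappiness."""
    info = parse_meminfo(meminfo_text)
    total = info.get("SwapTotal", 0)
    free = info.get("SwapFree", 0)
    return {
        # Swap counts as enabled whenever the kernel reports a non-zero total,
        # whatever the swappiness setting is.
        "swap_enabled": total > 0,
        "swap_used_kb": total - free,
        "swappiness": int(swappiness_text.strip()),
    }


def read_file(path):
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        # Linux only: both files are provided by procfs.
        print(swap_report(read_file("/proc/meminfo"),
                          read_file("/proc/sys/vm/swappiness")))
    except OSError as error:
        print("could not read procfs:", error)
```

A result with swappiness 0 and swap_enabled still True is the situation described above: the kernel remains free to swap.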

=Rob

-- 
=Robert Coli
AIM&GTALK - rcoli@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb

Re: cassandra 1.0.8 memory usage

Posted by Tyler Hobbs <ty...@datastax.com>.
On Fri, Oct 12, 2012 at 3:26 AM, Daniel Woo <da...@gmail.com> wrote:

>
> >> Disable swap for cassandra node
> I am gonna change swappiness to 20%


Dead nodes are better than crippled nodes.  I'll echo Rob's suggestion that
you disable swap entirely.

-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: cassandra 1.0.8 memory usage

Posted by Daniel Woo <da...@gmail.com>.
Hi Rob,

>>What version of Cassandra? What JVM? Are JNA and Jamm working?
cassandra 1.0.8. Sun JDK 1.7.0_05-b06, JNA memlock enabled, jamm works.

>>It sounds like the two nodes that are pathological right now have
exhausted the perm gen with actual non-garbage, probably mostly the Bloom
filters and the JMX MBeans.
jmap shows that the perm gen is only 40% used.

>>Do you have a "large" number of ColumnFamilies? How large is the data
stored per node?
I have very few column families, maybe 30-50. The nodetool shows each node
has 5 GB load.

>> Disable swap for cassandra node
I am gonna change swappiness to 20%

Thanks,
Daniel


On Fri, Oct 12, 2012 at 2:02 AM, Rob Coli <rc...@palominodb.com> wrote:

> On Wed, Oct 10, 2012 at 11:04 PM, Daniel Woo <da...@gmail.com>
> wrote:
> > I am running a small cluster of 6 nodes, and recently we have been
> > seeing very frequent ParNew GCs on two of them. A collection takes
> > 200 - 800 ms on average and sometimes 5 seconds. As you know, ParNew is
> > a stop-the-world collector, and our client throws a
> > SocketTimeoutException every 3 minutes.
>
> What version of Cassandra? What JVM? Are JNA and Jamm working?
>
> > I checked the load and it seems well balanced. The two nodes run on the
> > same hardware: 2 * 4-core Xeon with 16G RAM. We give Cassandra a 4G
> > heap, including an 800MB young generation. We did not see any swap usage
> > during the GC. Any idea what could cause this?
>
> It sounds like the two nodes that are pathological right now have
> exhausted the perm gen with actual non-garbage, probably mostly the
> Bloom filters and the JMX MBeans.
>
> > Then I took a heap dump. It shows that 5 instances of JmxMBeanServer
> > hold 500MB of memory, and most of the referenced objects are JMX MBean
> > related. That seems weird to me and looks like a memory leak.
>
> Do you have a "large" number of ColumnFamilies? How large is the data
> stored per node?
>
> =Rob
>
> --
> =Robert Coli
> AIM&GTALK - rcoli@palominodb.com
> YAHOO - rcoli.palominob
> SKYPE - rcoli_palominodb
>



-- 
Thanks & Regards,
Daniel

Re: cassandra 1.0.8 memory usage

Posted by Rob Coli <rc...@palominodb.com>.
On Thu, Oct 11, 2012 at 11:02 AM, Rob Coli <rc...@palominodb.com> wrote:
> On Wed, Oct 10, 2012 at 11:04 PM, Daniel Woo <da...@gmail.com> wrote:
>  We did not see any swap usage during the GC, any idea about this?

As an aside, you generally shouldn't have swap enabled on a Cassandra
node. As a simple example, if you have swap enabled and use the
off-heap row cache, the kernel might swap out your row cache.
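
To see whether a running process already has pages in swap, Linux exposes a VmSwap field in /proc/<pid>/status. This is only a rough sketch: it assumes a kernel recent enough to report VmSwap, the helper name vmswap_kb is illustrative, and the Cassandra PID is passed as a command-line argument.

```python
import re
import sys


def vmswap_kb(status_text):
    """Return the VmSwap value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Usage: python vmswap.py <pid>; defaults to this script's own process.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    try:
        with open("/proc/%s/status" % pid) as handle:
            print("VmSwap (kB):", vmswap_kb(handle.read()))
    except OSError as error:
        print("could not read process status:", error)
```

A non-zero value for the Cassandra process means some of its memory, possibly including off-heap allocations, has been swapped out.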

=Rob

Re: cassandra 1.0.8 memory usage

Posted by Rob Coli <rc...@palominodb.com>.
On Wed, Oct 10, 2012 at 11:04 PM, Daniel Woo <da...@gmail.com> wrote:
> I am running a small cluster of 6 nodes, and recently we have been seeing
> very frequent ParNew GCs on two of them. A collection takes 200 - 800 ms on
> average and sometimes 5 seconds. As you know, ParNew is a stop-the-world
> collector, and our client throws a SocketTimeoutException every 3 minutes.

What version of Cassandra? What JVM? Are JNA and Jamm working?

> I checked the load and it seems well balanced. The two nodes run on the same
> hardware: 2 * 4-core Xeon with 16G RAM. We give Cassandra a 4G heap,
> including an 800MB young generation. We did not see any swap usage during
> the GC. Any idea what could cause this?

It sounds like the two nodes that are pathological right now have
exhausted the perm gen with actual non-garbage, probably mostly the
Bloom filters and the JMX MBeans.

> Then I took a heap dump. It shows that 5 instances of JmxMBeanServer hold
> 500MB of memory, and most of the referenced objects are JMX MBean related.
> That seems weird to me and looks like a memory leak.

Do you have a "large" number of ColumnFamilies? How large is the data
stored per node?

=Rob