Posted to user@cassandra.apache.org by Chris Burroughs <ch...@gmail.com> on 2011/04/05 21:04:48 UTC

Minor Follow-up: reduced cached mem; resident set size growth

This is a minor followup to this thread which includes required context:

http://www.mail-archive.com/user@cassandra.apache.org/msg09279.html

I haven't solved the problem, but since negative results can also be
useful I thought I would share them.  Things I tried unsuccessfully (on
individual nodes except for the upgrade):

- Upgrade from Cassandra 0.6 to 0.7
- Different collectors: -XX:+UseParallelGC -XX:+UseParallelOldGC
- JNA (but not mlockall)
- Switch disk_access_mode from standard to mmap_index_only (obviously in
this case RSS is less than useful, but the overall memory graph still
looked bad, like this [1]).  A sketch of the JNA and disk_access_mode
changes follows this list.
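
For reference, roughly what the JNA and disk_access_mode changes look
like (paths and names here are illustrative, not copied from my nodes):

# JNA is not bundled, so drop the jar on Cassandra's classpath:
cp jna.jar /usr/share/cassandra/lib/

# conf/cassandra.yaml (0.7); valid values are auto, mmap,
# mmap_index_only, and standard:
#   disk_access_mode: mmap_index_only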


On #cassandra there was speculation that a large (200k) row cache may be
inducing heap fragmentation.  I have not ruled this out, but I have been
unable to reproduce it in stand-alone ConcurrentLinkedHashMap stress
testing.  Since turning off the row cache would be a cure worse than the
disease, I have not tried that yet on a real cluster (see the sketch
below for how I would do it).
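
If I do get to testing without the row cache on a real node, it should
be possible to drop it at runtime rather than restarting, something
like this (from memory -- check nodetool's usage output on your version
for the exact argument order; the keyspace/CF names are just examples):

# <keyspace> <cfname> <key cache capacity> <row cache capacity>
nodetool -h localhost setcachecapacity Keyspace1 Standard1 200000 0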

Future possibilities are getting the limits set right for mlockall
(sketched below), trying combinations of the above, and running without
caches.
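
The mlockall piece should just be the memlock limit for the user that
runs Cassandra; roughly what I intend to try (the user name and
"unlimited" are my guesses, not something verified yet):

# /etc/security/limits.conf
cassandra  soft  memlock  unlimited
cassandra  hard  memlock  unlimited

# or in the shell/init script that launches the JVM:
ulimit -l unlimited

# If it takes effect, the mlockall warning should disappear from the
# startup log.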

I have gc logs if anyone is interested.

[1] http://img194.imageshack.us/img194/383/2weekmem.png

Re: Minor Follow-up: reduced cached mem; resident set size growth

Posted by Chris Burroughs <ch...@gmail.com>.
On 04/05/2011 03:04 PM, Chris Burroughs wrote:

> I have gc logs if anyone is interested.

This is from a node with standard disk_access_mode and JNA enabled, but
the limits were not set for mlockall to succeed.  One can see the
-/+ buffers/cache "free" value shrinking and the C* pid's RSS growing.


Includes several days of:
gc log
free -s
/proc/$PID/status

http://www.filefactory.com/file/ca94892/n/04-08.tar.gz

Please enjoy!  (If there is a preferred way to share the tarball let me
know.)
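
For anyone who wants to reproduce the collection, something like this
works (the exact intervals, paths, and file names in the tarball may
differ):

# GC log, via cassandra-env.sh (or wherever JVM_OPTS is set):
JVM_OPTS="$JVM_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"

# System-wide memory every 60 seconds:
free -s 60 > free.log &

# RSS (and the rest of /proc status) for the Cassandra pid every 60s:
while true; do
    date >> status.log
    cat /proc/$CASSANDRA_PID/status >> status.log
    sleep 60
done &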

Re: Minor Follow-up: reduced cached mem; resident set size growth

Posted by Chris Burroughs <ch...@gmail.com>.
On 04/05/2011 04:38 PM, Peter Schuller wrote:
>> - Different collectors: -XX:+UseParallelGC -XX:+UseParallelOldGC
> 
> Unless you also removed the -XX:+UseConcMarkSweepGC I *think* it takes
> precedence, so that the above options would have no effect. I didn't
> test. In either case, did you definitely confirm CMS was no longer
> being used? (Should be pretty obvious if you ran with
> -XX:+PrintGCDetails which looks plenty different w/o CMS)
> 

More precisely, I did this:

# GC tuning options
#JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
#JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
#JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
#JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
#JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
#JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
#JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseParallelGC"
JVM_OPTS="$JVM_OPTS -XX:+UseParallelOldGC"


>> I have gc logs if anyone is interested.
> 
> Yes :)
>


By "have gc logs" I meant "had them until I accidental blew them away
while restarting a server".  Will post them in a day or two when there
is a reasonable amount of data or the quantum state collapses and the
problem vanishes when it is observed.


>> [1] http://img194.imageshack.us/img194/383/2weekmem.png
> 
> I did go back and revisit the old thread... maybe I'm missing
> something, but just to be real sure:
> 
> What does the "no color"/white mean on this graph? Is that application
> memory (resident set)?
> 
> I'm not really sure what I'm looking for since you already said you
> tested with 'standard' which rules out the
> resident-set-memory-as-a-result-of-mmap being counted towards the
> leak. But still.
> 

I will be the first to admit that Zabbix's graphs are not the... easiest
to read.  My interpretation is that "no color" means "none of the
above", and memory that is not otherwise accounted for is in use by
applications.  This fits with what I see with free and with
measurements of the JVM's RSS from /proc/.  I'll leave free -s running
for a few days while waiting on the gc logs as an extra sanity check
(the arithmetic I mean is sketched below).  That's probably easier to
reason about anyway.
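
Concretely, these are the two numbers I expect to track each other if
the JVM is what is eating the RAM ($CASSANDRA_PID is a placeholder for
the C* pid):

# Application memory per free's own accounting ("-/+ buffers/cache" row):
free -m | awk '/buffers\/cache/ {print "used by applications: " $3 " MB"}'

# Resident set of the Cassandra JVM, for comparison:
grep VmRSS /proc/$CASSANDRA_PID/status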

Re: Minor Follow-up: reduced cached mem; resident set size growth

Posted by Peter Schuller <pe...@infidyne.com>.
> - Different collectors: -XX:+UseParallelGC -XX:+UseParallelOldGC

Unless you also removed the -XX:+UseConcMarkSweepGC I *think* it takes
precedence, so that the above options would have no effect. I didn't
test. In either case, did you definitely confirm CMS was no longer
being used? (Should be pretty obvious if you ran with
-XX:+PrintGCDetails which looks plenty different w/o CMS)

> On #cassandra there was speculation that a large (200k) row cache may be
> inducing heap fragmentation.  I have not ruled this out but have been
> unable to do that in stand alone ConcurrentLinkedHashMap stress testing.
>  Since turning off the row cache would be a cure worse than the disease
> I have not tried that yet with a real cluster.

I didn't follow the IRC discussion, but I think the most likely way I
can see the row cache causing fragmentation and growth of
*non-java-heap* memory would be if it did so by way of the data
structures maintained by CMS for old-gen.

If you really made it run without CMS... I can't really claim a lot of
certainty but I'd be pretty surprised if the row cache was responsible
for out-of-heap memory leakage with the default compacting collectors.

> Future possibilities would be to get the limits set right for mlockall,
> trying combinations of the above, and running without caches.
>
> I have gc logs if anyone is interested.

Yes :)

> [1] http://img194.imageshack.us/img194/383/2weekmem.png

I did go back and revisit the old thread... maybe I'm missing
something, but just to be real sure:

What does the "no color"/white mean on this graph? Is that application
memory (resident set)?

I'm not really sure what I'm looking for since you already said you
tested with 'standard' which rules out the
resident-set-memory-as-a-result-of-mmap being counted towards the
leak. But still.

-- 
/ Peter Schuller