Posted to user@cassandra.apache.org by Chris Burroughs <ch...@gmail.com> on 2011/07/12 15:28:49 UTC

Survey: Cassandra/JVM Resident Set Size increase

### Preamble

There have been several reports on the mailing list of the JVM running
Cassandra using "too much" memory.  That is, the resident set size is much
larger than (max Java heap size + mmapped segments) and continues to grow
until the process swaps, the kernel OOM killer comes along, or performance
just degrades too far due to the lack of space for the page cache.  It has
been unclear from these reports if there is a pattern.  My hope here is
that by comparing JVM versions, OS versions, JVM configuration, etc., we
will find something.  Thank you everyone for your time.
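
As a rough way to put a number on that comparison, here is a minimal Python sketch (not from the original post). It assumes a Linux host where `/proc/<pid>/status` exposes a `VmRSS` line; the function names and the idea of passing in the heap and mmapped sizes by hand are illustrative assumptions, not anything Cassandra provides.

```python
import re


def parse_vm_rss_kb(status_text):
    """Return VmRSS in kB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def rss_excess_kb(status_text, max_heap_kb, mmapped_kb=0):
    """RSS minus (max heap + mmapped segments); a large positive value
    that keeps growing is the symptom described above."""
    rss = parse_vm_rss_kb(status_text)
    if rss is None:
        raise ValueError("no VmRSS line found")
    return rss - (max_heap_kb + mmapped_kb)


def read_status(pid):
    """Read /proc/<pid>/status for a live process (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return f.read()
```

For a live process one would call `rss_excess_kb(read_status(pid), max_heap_kb)` with the `-Xmx` value converted to kB, sampling it over time to see whether the excess keeps growing.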


Some example reports:
 - http://www.mail-archive.com/user@cassandra.apache.org/msg09279.html
 - http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Very-high-memory-utilization-not-caused-by-mmap-on-sstables-td5840777.html
 - https://issues.apache.org/jira/browse/CASSANDRA-2868
 - http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/OOM-or-what-settings-to-use-on-AWS-large-td6504060.html
 - http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-memory-problem-td6545642.html

For reference, theories include (in no particular order):
 - memory fragmentation
 - JVM bug
 - OS/glibc bug
 - direct memory
 - swap-induced fragmentation
 - some other bad interaction of cassandra/jdk/jvm/os/nio-insanity.

### Survey

1. Do you think you are experiencing this problem?

2.  Why? (This is a good time to share a graph like
http://www.twitpic.com/5fdabn or
http://img24.imageshack.us/img24/1754/cassandrarss.png)

2. Are you using mmap? (If yes be sure to have read
http://wiki.apache.org/cassandra/FAQ#mmap , and explain how you have
used pmap [or another tool] to rule out mmap and top deceiving you.)

3. Are you using JNA?  Was mlockall successful (it's in the logs on startup)?

4. Is swap enabled? Are you swapping?

5. What version of Apache Cassandra are you using?

6. What is the earliest version of Apache Cassandra you recall seeing
this problem with?

7. Have you tried the patch from CASSANDRA-2654 ?

8. What jvm and version are you using?

9. What OS and version are you using?

10. What are your jvm flags?

11. Have you tried limiting direct memory (-XX:MaxDirectMemorySize)?

12. Can you characterise how much GC your cluster is doing?

13. Approximately how many reads/writes per unit time is your cluster
doing (per node or the whole cluster)?

14.  How are your column families configured (key cache size, row cache
size, etc.)?
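
For the swap question (4), a minimal Python sketch, assuming a Linux host with the usual `SwapTotal` and `SwapFree` lines in `/proc/meminfo`; the function names are hypothetical. Note that non-zero used swap only hints at swapping; the `si`/`so` columns of `vmstat` show whether pages are actively moving.

```python
def parse_meminfo_kb(meminfo_text):
    """Map field name -> numeric value from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_summary(meminfo_text):
    """Report whether swap is configured and how much of it is in use."""
    info = parse_meminfo_kb(meminfo_text)
    total = info.get("SwapTotal", 0)
    free = info.get("SwapFree", 0)
    return {"enabled": total > 0, "used_kb": total - free}
```

On a live host this would be fed with `open("/proc/meminfo").read()`.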


Re: Survey: Cassandra/JVM Resident Set Size increase

Posted by Zhu Han <sc...@gmail.com>.
Chris,

I've deployed the patch to the cluster for two days. Everything is quite
good since then.

Thank you!

best regards,
韩竹(Zhu Han)



On Sat, Jul 30, 2011 at 3:52 AM, Chris Burroughs
<ch...@gmail.com> wrote:

> Thanks to everyone who responded (I think I learned a few new tricks
> from seeing what you tried and how you monitor).  I didn't see any
> patterns in JVM, OS, Cassandra versions, etc.
>
> At this time I'm confident in saying CASSANDRA-2868 (and thus really
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7066129) is the
> culprit.

Re: Survey: Cassandra/JVM Resident Set Size increase

Posted by Chris Burroughs <ch...@gmail.com>.
Thanks to everyone who responded (I think I learned a few new tricks
from seeing what you tried and how you monitor).  I didn't see any
patterns in JVM, OS, Cassandra versions, etc.

At this time I'm confident in saying CASSANDRA-2868 (and thus really
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7066129) is the culprit.


Re: Survey: Cassandra/JVM Resident Set Size increase

Posted by William Oberman <ob...@civicscience.com>.
I finally upgraded from 0.7.4 to 0.8.0 (using riptano packages) 2 days ago.
Before, my resident memory (for the java process) would slowly grow without
bound and the OS would kill the process.  But, over the last 2 days, I
_think_ it's been stable.  I'll let you know in a week :-)

My other stats:
AWS large (64 bit, 7.5GB, 4 "compute units", no swap by default and I didn't
enable it manually)
Centos 5.6
Sun  1.6.0_24-b07
2 column families
4 machine cluster with RF=3
Mostly balanced write/read load (usually more writes)
Not quite "big data" volumes, large 10^6 or small 10^7 ops/day
No deletes or mutations, I only add or read

Everything else is stock; I haven't tuned anything as performance was ok.
No JVM options other than what was in the package.  No JNA.  Not sure about
the GC patterns.

will


Re: Survey: Cassandra/JVM Resident Set Size increase

Posted by Zhu Han <sc...@gmail.com>.
On Wed, Jul 13, 2011 at 9:45 PM, Konstantin Naryshkin
<ko...@a-bb.net> wrote:

> Do you mean that it is using all of the available heap? That is the
> expected behavior of most long-running Java applications. The JVM will not
> GC until it needs memory (or you explicitly ask it to) and will only free up
> a bit of memory at a time. That is very good behavior from a performance
> standpoint since frequent, large GCs would make your application very
> unresponsive. It also makes Java applications take up all the memory you
> give them.
>
> ----- Original Message -----
> From: "Sasha Dolgy" <sd...@gmail.com>
> To: user@cassandra.apache.org
> Sent: Tuesday, July 12, 2011 10:23:02 PM
> Subject: Re: Survey: Cassandra/JVM Resident Set Size increase
>
> I'll post more tomorrow ... However, we set up one node in a single node
> cluster and have left it with no data. Reviewing memory consumption
> graphs, it increased daily until it gobbled (highly technical term) all
> memory. The system is now running just below 100% memory usage, which I
> find peculiar seeing that it is doing nothing, with no data and
> no peers.
> On Jul 12, 2011 3:29 PM, "Chris Burroughs" <ch...@gmail.com>
> wrote:
> > ### Survey
> >
> > 1. Do you think you are experiencing this problem?
>

Yes.


> >
> > 2. Why? (This is a good time to share a graph like
> > http://www.twitpic.com/5fdabn or
> > http://img24.imageshack.us/img24/1754/cassandrarss.png)
>

I observe that the RSS of the cassandra process keeps growing to dozens of
gigabytes, even though the dataset (sstables) is just hundreds of megabytes.

> >
> > 2. Are you using mmap? (If yes be sure to have read
> > http://wiki.apache.org/cassandra/FAQ#mmap , and explain how you have
> > used pmap [or another tool] to rule out mmap and top deceiving you.)
>

Yes. pmap tells me a lot of anonymous regions are created and expanded
during the life cycle of the cassandra process. That is the primary reason
for the RSS growth. I'm pretty sure these anonymous regions are not the Java
heap used by the JVM, as they are not contiguous.
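
A minimal Python sketch of the same check without pmap, assuming the kernel exposes `/proc/<pid>/smaps` in the usual Linux format (a mapping header line followed by `Rss:` lines in kB). The function name and the three-way split are illustrative choices: mappings with no path are counted as anonymous, bracketed names such as `[heap]` or `[stack]` as special, and everything else (including mmapped sstables) as file-backed.

```python
import re

# A mapping header looks like:
# 7f0000000000-7f0004000000 rw-p 00000000 00:00 0   [optional path]
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")


def rss_by_kind_kb(smaps_text):
    """Sum resident kB per mapping kind from the text of /proc/<pid>/smaps."""
    totals = {"anonymous": 0, "file": 0, "special": 0}
    kind = None
    for line in smaps_text.splitlines():
        if HEADER.match(line):
            fields = line.split(None, 5)
            path = fields[5] if len(fields) > 5 else ""
            if not path:
                kind = "anonymous"
            elif path.startswith("["):
                kind = "special"
            else:
                kind = "file"
        elif line.startswith("Rss:") and kind is not None:
            totals[kind] += int(line.split()[1])
    return totals
```

If the anonymous total is far above the configured heap size, the growth is coming from outside both the Java heap and the mmapped files.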

>
> > 3. Are you using JNA? Was mlockall successful (it's in the logs on
> startup)?
>

Yes, and mlockall was successful as well. I have not tried other settings.


> >
> > 4. Is swap enabled? Are you swapping?
>

No. Swap is disabled.


> >
> > 5. What version of Apache Cassandra are you using?
>

0.6.13


> >
> > 6. What is the earliest version of Apache Cassandra you recall seeing
> > this problem with?
>

Earlier versions of the 0.6.x branch.


> >
> > 7. Have you tried the patch from CASSANDRA-2654 ?
>

Not yet, as I do not query large datasets.


> >
> > 8. What jvm and version are you using?
>

"java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)"

I also tried openJDK.


>
> > 9. What OS and version are you using?
>

The kernel version is "2.6.18-194.26.1.el5.028stab079.2", which is from
CentOS 5.4.

The user-level environment is Ubuntu 10.04 (Lucid) server edition.  This
strange combination is because cassandra runs inside an OpenVZ container
(Ubuntu 10.04) on top of a CentOS host.

I am afraid the old kernel causes memory fragmentation in the cassandra
process, but I cannot prove it as I have not tried a newer kernel.

>
> > 10. What are your jvm flags?
>

The problem can be observed with both CMS and parallel old GC. These are the
flags used:

"-ea -Xms3G -Xmx3G -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError"

"-ea -Xms3G -Xmx3G -XX:+UseParallelOldGC -XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1 -XX:+HeapDumpOnOutOfMemoryError"



> >
> > 11. Have you tried limiting direct memory (-XX:MaxDirectMemorySize)?
> >
>

Not yet. Is it helpful?


> > 12. Can you characterise how much GC your cluster is doing?
>

This is one node of the test cluster. It has been idle most of the time
since it was restarted 12 days ago.

$ sudo jstat -gcutil 26166
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
 11.92   0.00   4.60  78.92  49.76    282    5.756     1    0.639    6.395
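
To compare nodes, the two jstat lines above can be turned into named fields. This is a minimal Python sketch; the function name is hypothetical and it assumes the input is exactly the header row plus one row of values, as printed by `jstat -gcutil` above.

```python
def parse_gcutil(output):
    """Map each jstat -gcutil column name to its value as a float."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split()
    values = [float(v) for v in lines[1].split()]
    if len(header) != len(values):
        raise ValueError("header and value rows have different lengths")
    return dict(zip(header, values))


sample = """\
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
 11.92   0.00   4.60  78.92  49.76    282    5.756     1    0.639    6.395
"""
stats = parse_gcutil(sample)
# Total GC time (seconds) spent since the JVM started.
print(stats["GCT"])
```

With roughly 6.4 seconds of total GC time over 12 days of uptime, GC activity on this node is negligible.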



> >
> > 13. Approximately how many read/writes per unit time is your cluster
> > doing (per node or the whole cluster)?
>

The load is very light as it is a test cluster.

>
> > 14. How are your column families configured (key cache size, row cache
> > size, etc.)?
> >
>

Row cache is disabled entirely.

Key cache is disabled for the two largest CFs. For the other CFs, it
is enabled. However, the total size of the SSTables with key cache enabled
is just 30 MB.

Re: Survey: Cassandra/JVM Resident Set Size increase

Posted by Konstantin Naryshkin <ko...@a-bb.net>.
Do you mean that it is using all of the available heap? That is the expected behavior of most long-running Java applications. The JVM will not GC until it needs memory (or you explicitly ask it to) and will only free up a bit of memory at a time. That is very good behavior from a performance standpoint since frequent, large GCs would make your application very unresponsive. It also makes Java applications take up all the memory you give them.

----- Original Message -----
From: "Sasha Dolgy" <sd...@gmail.com>
To: user@cassandra.apache.org
Sent: Tuesday, July 12, 2011 10:23:02 PM
Subject: Re: Survey: Cassandra/JVM Resident Set Size increase

I'll post more tomorrow ... However, we set up one node in a single node
cluster and have left it with no data. Reviewing memory consumption
graphs, it increased daily until it gobbled (highly technical term) all
memory. The system is now running just below 100% memory usage, which I
find peculiar seeing that it is doing nothing, with no data and
no peers.

Re: Survey: Cassandra/JVM Resident Set Size increase

Posted by Sasha Dolgy <sd...@gmail.com>.
I'll post more tomorrow ... However, we set up one node in a single node
cluster and have left it with no data. Reviewing memory consumption
graphs, it increased daily until it gobbled (highly technical term) all
memory. The system is now running just below 100% memory usage, which I
find peculiar seeing that it is doing nothing, with no data and
no peers.