Posted to user@cassandra.apache.org by "olek.stasiak@gmail.com" <ol...@gmail.com> on 2013/11/08 11:31:53 UTC

OOM while reading key cache

Hello,
I'm hitting an OOM while Cassandra reads the saved key cache.
The cluster configuration is as follows:
- 6 machines with 8 GB RAM each and three 150 GB disks each
- default heap configuration
- default key cache configuration
- the biggest keyspace is about 500 GB in size (RF: 2, so in fact there is
250 GB of raw data).

After upgrading the first of the machines from 1.2.11 to 2.0.2, I received this error:
 INFO [main] 2013-11-08 10:53:16,716 AutoSavingCache.java (line 114) reading saved cache /home/synat/nosql_filesystem/cassandra/data/saved_caches/production_storage-METADATA-KeyCache-b.db
ERROR [main] 2013-11-08 10:53:16,895 CassandraDaemon.java (line 478) Exception encountered during startup
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:394)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
        at org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:352)
        at org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:119)
        at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:264)
        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:409)
        at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:381)
        at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:314)
        at org.apache.cassandra.db.Keyspace.<init>(Keyspace.java:268)
        at org.apache.cassandra.db.Keyspace.open(Keyspace.java:110)
        at org.apache.cassandra.db.Keyspace.open(Keyspace.java:88)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:274)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:461)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:504)


The error appears on every start, so I tried disabling the key cache (this
did not help) and then temporarily moved the key cache file out of the
saved_caches folder (the file was 13M in size). That lets the node start, but
it is only a workaround, not the configuration I want. Does anyone have any
idea what the real cause of the OOM is?
Best regards,
Aleksander
PS. I still have 5 nodes to upgrade; I'll report back if the problem appears on the rest.
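
The workaround described above (moving the saved key cache out of the way before starting the node) could be sketched roughly as follows. This is only a sketch: the function name move_saved_caches_aside and the backup directory are made up for illustration, and it assumes the node is stopped and that key cache files follow the *-KeyCache-*.db naming seen in the log line above.

```shell
#!/bin/sh
# Move saved key cache files out of the saved_caches directory so that the
# next startup does not try to load them. Run only while Cassandra is stopped.
# The function name and backup directory layout are illustrative assumptions.
move_saved_caches_aside() {
    cache_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for f in "$cache_dir"/*-KeyCache-*.db; do
        [ -e "$f" ] || continue   # the glob matched nothing
        mv -- "$f" "$backup_dir"/ || return 1
    done
}

# Example (source path taken from the log above, backup path is hypothetical):
# move_saved_caches_aside /home/synat/nosql_filesystem/cassandra/data/saved_caches /tmp/saved_caches.bak
```

Moving rather than deleting keeps the file available for later inspection, for example when attaching it to a bug report.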

Re: making sense of output from Eclipse Memory Analyzer tool taken from .hprof file

Posted by Aaron Morton <aa...@thelastpickle.com>.
What version of Cassandra are you using?
What are the JVM settings? (check with ps aux | grep cassandra)
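
To narrow the ps output down to just the heap settings, something like the following could be used. It is a sketch only: heap_flags is a made-up helper name, and it assumes the heap is configured through the standard -Xms/-Xmx/-Xmn JVM options.

```shell
#!/bin/sh
# Print only the JVM heap flags (-Xms, -Xmx, -Xmn) found in a command line.
# heap_flags is a hypothetical helper name, not part of Cassandra.
heap_flags() {
    # Put each whitespace-separated argument on its own line, then keep
    # only the arguments that start with one of the heap options.
    printf '%s\n' "$1" | tr -s ' \t' '\n\n' | grep -E '^-Xm[sxn]'
}

# Example against the running node; the [c] keeps grep from matching itself:
# heap_flags "$(ps aux | grep '[c]assandra')"
```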


OOM in Cassandra 1.2+ is rare, but see also https://issues.apache.org/jira/browse/CASSANDRA-5706 and https://issues.apache.org/jira/browse/CASSANDRA-6087

> One instance of "org.apache.cassandra.db.ColumnFamilyStore" loaded by "sun.misc.Launcher$AppClassLoader @ 0x613e1bdc8" occupies 984,094,664 (11.64%) bytes.
938MB is a fair bit of memory. The CFS and data tracker deal with the memtables, so this may indicate that things are not being flushed from memory correctly.

> •java.lang.Thread @ 0x73e1f74c8 CompactionExecutor:158 - 839,225,000 (9.92%) bytes.
> •java.lang.Thread @ 0x717f08178 MutationStage:31 - 809,909,192 (9.58%) bytes.
> •java.lang.Thread @ 0x717f082c8 MutationStage:5 - 649,667,472 (7.68%) bytes.
> •java.lang.Thread @ 0x717f083a8 MutationStage:21 - 498,081,544 (5.89%) bytes.
> •java.lang.Thread @ 0x71b357e70 MutationStage:11 - 444,931,288 (5.26%) bytes.
These may point to very big rows and/or very big mutations.
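
For comparison, the byte counts from the report can be converted to MiB with plain shell arithmetic. The helper name to_mib is made up for this sketch; the numbers are the ones quoted above.

```shell
#!/bin/sh
# Convert retained sizes reported by Memory Analyzer from bytes to MiB.
# to_mib is a hypothetical helper; integer division truncates the result.
to_mib() {
    echo $(( $1 / 1048576 ))
}

# The ColumnFamilyStore instance
to_mib 984094664    # prints 938

# Sum of the five thread entries (one CompactionExecutor, four MutationStage)
to_mib $(( 839225000 + 809909192 + 649667472 + 498081544 + 444931288 ))    # prints 3091
```

Going by the report's own percentages, the five threads together account for 38.33% of the heap, against 11.64% for the ColumnFamilyStore instance.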

hope that helps. 

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 15/11/2013, at 12:34 pm, Mike Koh <de...@gmail.com> wrote:

> I am investigating Java heap out-of-memory errors, so I created an .hprof file and loaded it into the Eclipse Memory Analyzer Tool, which gave some "Problem Suspects".
> 
> First one looks like:
> ----
> One instance of "org.apache.cassandra.db.ColumnFamilyStore" loaded by "sun.misc.Launcher$AppClassLoader @ 0x613e1bdc8" occupies 984,094,664 (11.64%) bytes. The memory is accumulated in one instance of "org.apache.cassandra.db.DataTracker$View" loaded by "sun.misc.Launcher$AppClassLoader @ 0x613e1bdc8".
> ----
> 
> If I click around into the verbiage, I believe I can pick out the name of a column family but that is about it. Can someone explain what the above means in more detail and if it is indicative of a problem?
> 
> 
> Next one looks like:
> -----
> •java.lang.Thread @ 0x73e1f74c8 CompactionExecutor:158 - 839,225,000 (9.92%) bytes.
> •java.lang.Thread @ 0x717f08178 MutationStage:31 - 809,909,192 (9.58%) bytes.
> •java.lang.Thread @ 0x717f082c8 MutationStage:5 - 649,667,472 (7.68%) bytes.
> •java.lang.Thread @ 0x717f083a8 MutationStage:21 - 498,081,544 (5.89%) bytes.
> •java.lang.Thread @ 0x71b357e70 MutationStage:11 - 444,931,288 (5.26%) bytes.
> ------
> If I click into the verbiage, the above compaction and mutation threads all seem to be referencing the same column family. Are they related? Is there a way I can tell what is being compacted and/or mutated more specifically than just which column family?


making sense of output from Eclipse Memory Analyzer tool taken from .hprof file

Posted by Mike Koh <de...@gmail.com>.
I am investigating Java heap out-of-memory errors, so I created an .hprof
file and loaded it into the Eclipse Memory Analyzer Tool, which gave some
"Problem Suspects".

First one looks like:
----
One instance of "org.apache.cassandra.db.ColumnFamilyStore" loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x613e1bdc8" occupies 984,094,664 
(11.64%) bytes. The memory is accumulated in one instance of 
"org.apache.cassandra.db.DataTracker$View" loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x613e1bdc8".
----

If I click around into the verbiage, I believe I can pick out the name of 
a column family but that is about it. Can someone explain what the above 
means in more detail and if it is indicative of a problem?


Next one looks like:
-----
•java.lang.Thread @ 0x73e1f74c8 CompactionExecutor:158 - 839,225,000 (9.92%) bytes.
•java.lang.Thread @ 0x717f08178 MutationStage:31 - 809,909,192 (9.58%) bytes.
•java.lang.Thread @ 0x717f082c8 MutationStage:5 - 649,667,472 (7.68%) bytes.
•java.lang.Thread @ 0x717f083a8 MutationStage:21 - 498,081,544 (5.89%) bytes.
•java.lang.Thread @ 0x71b357e70 MutationStage:11 - 444,931,288 (5.26%) bytes.
------
If I click into the verbiage, the above compaction and mutation threads all
seem to be referencing the same column family. Are they related? Is there
a way I can tell what is being compacted and/or mutated more specifically
than just which column family?

Re: OOM while reading key cache

Posted by Fabien Rousseau <fa...@yakaz.com>.
A few months ago, we had a similar issue on 1.2.6:
https://issues.apache.org/jira/browse/CASSANDRA-5706

But it has been fixed, and we have not encountered this issue since (we're
also on 1.2.10).


2013/11/14 olek.stasiak@gmail.com <ol...@gmail.com>

> Yes, as I wrote in first e-mail.  When I removed key cache file
> cassandra started without further problems.
> regards
> Olek



-- 
Fabien Rousseau


 <au...@yakaz.com>www.yakaz.com

Re: OOM while reading key cache

Posted by "olek.stasiak@gmail.com" <ol...@gmail.com>.
Yes, as I wrote in my first e-mail: when I removed the key cache file,
Cassandra started without further problems.
regards
Olek

2013/11/13 Robert Coli <rc...@eventbrite.com>:
>
> On Wed, Nov 13, 2013 at 12:35 AM, Tom van den Berge <to...@drillster.com>
> wrote:
>>
>> I'm having the same problem, after upgrading from 1.2.3 to 1.2.10.
>>
>> I can remember this was a bug that was solved in the 1.0 or 1.1 version
>> some time ago, but apparently it got back.
>> A workaround is to delete the contents of the saved_caches directory
>> before starting up.
>
>
> Yours is not the first report of this I've heard resulting from a 1.2.x to
> 1.2.x upgrade. Reports are of the form "I had to nuke my saved_caches or <I
> couldn't start my node, it OOMED, etc.>".
>
> https://issues.apache.org/jira/browse/CASSANDRA-6325
>
> Exists, but doesn't seem  to be the same issue.
>
> https://issues.apache.org/jira/browse/CASSANDRA-5986
>
> Similar, doesn't seem to be an issue triggered by upgrade..
>
> If I were one of the posters on this thread, I would strongly consider
> filing a JIRA on point.
>
> @OP (olek) : did removing the saved_caches also fix your problem?
>
> =Rob
>
>

Re: OOM while reading key cache

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Nov 13, 2013 at 12:35 AM, Tom van den Berge <to...@drillster.com> wrote:

> I'm having the same problem, after upgrading from 1.2.3 to 1.2.10.
>
> I can remember this was a bug that was solved in the 1.0 or 1.1 version
> some time ago, but apparently it got back.
> A workaround is to delete the contents of the saved_caches directory
> before starting up.
>

Yours is not the first report of this I've heard resulting from a 1.2.x to
1.2.x upgrade. Reports are of the form "I had to nuke my saved_caches or <I
couldn't start my node, it OOMED, etc.>".

https://issues.apache.org/jira/browse/CASSANDRA-6325

Exists, but doesn't seem to be the same issue.

https://issues.apache.org/jira/browse/CASSANDRA-5986

Similar, but doesn't seem to be an issue triggered by upgrade.

If I were one of the posters on this thread, I would strongly consider
filing a JIRA on point.

@OP (olek) : did removing the saved_caches also fix your problem?

=Rob

Re: OOM while reading key cache

Posted by Tom van den Berge <to...@drillster.com>.
I'm having the same problem, after upgrading from 1.2.3 to 1.2.10.

I remember this was a bug that was fixed in version 1.0 or 1.1
some time ago, but apparently it has come back.
A workaround is to delete the contents of the saved_caches directory before
starting up.


Tom


On Tue, Nov 12, 2013 at 5:15 AM, Aaron Morton <aa...@thelastpickle.com> wrote:

> -6 machines which 8gb RAM each and three 150GB disks each
> -default heap configuration
>
> With 8GB the default heap is 2GB, try kicking that up to 4GB and a 600 to
> 800 MB new heap.
>
> I would guess for the data load  you have 2GB is not enough.
>
> hope that helps.
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>

Re: OOM while reading key cache

Posted by Aaron Morton <aa...@thelastpickle.com>.
> -6 machines which 8gb RAM each and three 150GB disks each
> -default heap configuration
With 8GB of RAM the default heap is 2GB; try kicking that up to 4GB, with a 600 to 800 MB new heap.

I would guess that 2GB is not enough for the data load you have.
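
The 2GB figure can be reproduced with a small sketch of the default heap rule, as I understand it from the calculate_heap_sizes logic in cassandra-env.sh: max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB)). The helper name default_heap_mb is made up here, and the rule should be checked against the cassandra-env.sh shipped with your version.

```shell
#!/bin/sh
# Sketch of the default max-heap rule, assumed to be:
#   max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))
# default_heap_mb is a hypothetical helper; input and output are in MB.
default_heap_mb() {
    ram_mb=$1
    half=$(( ram_mb / 2 ))
    quarter=$(( ram_mb / 4 ))
    if [ "$half" -gt 1024 ]; then half=1024; fi
    if [ "$quarter" -gt 8192 ]; then quarter=8192; fi
    if [ "$half" -gt "$quarter" ]; then
        echo "$half"
    else
        echo "$quarter"
    fi
}

default_heap_mb 8192    # prints 2048, i.e. the 2GB default for an 8GB machine

# Overriding the default in cassandra-env.sh with values in the suggested
# range (the two variables are meant to be set as a pair):
# MAX_HEAP_SIZE="4G"
# HEAP_NEWSIZE="800M"
```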

hope that helps. 

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
