Posted to user@cassandra.apache.org by Jedd Rashbrooke <je...@imagini.net> on 2010/12/10 18:29:58 UTC

Memory leak with Sun Java 1.6 ?

 Howdi,

 We're using Cassandra 0.6.6 - intending to wait until 0.7 before
 we do any more upgrades.

 We're running a cluster of 16 boxes with 7.1GB of RAM each, on Amazon EC2
 using Ubuntu 10.04 (LTS).

 Today we saw one box kick its little feet up, and after investigating
 the other machines, it looks like they're all approaching the same fate.

 Over the past month or so, it looks like memory has slowly
 been exhausted.  Neither nodetool drain nor jmap can run; both
 produce this error:

     Error occurred during initialization of VM
     Could not reserve enough space for object heap

 We've got Xmx/Xms set to 4GB.

 top shows free memory around 50-80MB, file cache under
 10MB, and the java process at 12+GB virt and 7.1GB res.
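
 For what it's worth, a quick way to compare what the JVM itself thinks
 it is using against what top reports is something like the following -
 a minimal, hypothetical sketch using only the standard
 java.lang.management API, nothing Cassandra-specific:

     import java.lang.management.ManagementFactory;
     import java.lang.management.MemoryMXBean;
     import java.lang.management.MemoryUsage;

     public class HeapVsNative {
         public static void main(String[] args) {
             MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
             MemoryUsage heap = mem.getHeapMemoryUsage();       // bounded by -Xmx
             MemoryUsage nonHeap = mem.getNonHeapMemoryUsage(); // perm gen, code cache, ...

             // If heap usage stays well below -Xmx while the process RSS keeps
             // growing, the growth is outside the Java heap (mmap'd files,
             // native buffers, thread stacks, etc.) rather than a heap leak.
             System.out.printf("heap     used=%dMB committed=%dMB max=%dMB%n",
                     heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
             System.out.printf("non-heap used=%dMB committed=%dMB%n",
                     nonHeap.getUsed() >> 20, nonHeap.getCommitted() >> 20);
         }
     }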

 This feels like a Java problem, not a Cassandra one, but I'm
 open to suggestions.  To ensure I don't get bothered over
 the weekend we're doing a rolling restart of Cassandra on
 each of the boxes now.  The last time they were restarted
 was just over a month ago.  Now I'm wondering whether I
 should (until 0.7.1 is available) schedule in a slower rolling
 restart over several days, every few weeks.

 I've shared a Zabbix graph of system memory at:

     http://www.imagebam.com/image/3b4213110283969

 cheers,
 Jedd.

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
> If it helps, I also found quite a few of these in the logs:
>
> org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=224101
>
> However, a single cassandra instance running locally (OSX, 1.6.0_22, mmap) runs just fine for
> hours. No exceptions, no OOM.

Given that these exceptions show up on the receiving end of inter-node
communication, it seems reasonable that you don't see them on a
local instance. Since the bug is triggered by concurrent use of the
ByteBuffers during inter-node communication, that trigger point
simply doesn't exist on a single local node.
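
To make the failure mode concrete, here is a deterministic, single-threaded
sketch of what such interleaving does (illustrative only, not Cassandra's
actual code): a ByteBuffer carries one shared read position, so a second
reader that cuts in mid-value ends up interpreting payload bytes as a
length prefix.

    import java.nio.ByteBuffer;

    public class SharedBufferSketch {
        public static void main(String[] args) {
            // Two length-prefixed values on the "wire": [3]["foo"] [3]["bar"]
            ByteBuffer buf = ByteBuffer.allocate(64);
            buf.putInt(3).put("foo".getBytes());
            buf.putInt(3).put("bar".getBytes());
            buf.flip();

            int lenA = buf.getInt();    // reader A consumes the first length (3) ...

            // ... and before A reads its 3 payload bytes, reader B (sharing the
            // SAME buffer, hence the same position) reads a "length" that is
            // really payload: the bytes 'f','o','o' plus the next length's high byte.
            int bogusLen = buf.getInt();

            System.out.println("reader A length : " + lenA);      // 3
            System.out.println("reader B length : " + bogusLen);  // 0x666f6f00 ~= 1.7 billion
            // Allocating a buffer of bogusLen bytes is exactly the
            // OutOfMemoryError-inside-allocate pattern from the stack traces.
        }
    }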

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Timo Nentwig <ti...@toptarif.de>.
On Dec 14, 2010, at 23:23, Timo Nentwig wrote:

> On Dec 14, 2010, at 21:07, Peter Schuller wrote:
> 
>> In that case, based on the stack trace, I wonder if you're hitting
>> what I was hitting just yesterday/earlier today:
>> 
>>  https://issues.apache.org/jira/browse/CASSANDRA-1860
>> 
>> This is suspected (currently being verified as fixed with an updated
>> 0.7 build) to be due to:
>> 
>>  https://issues.apache.org/jira/browse/CASSANDRA-1847
>> 
>> If the corruption happens to be slightly different in your case, maybe
>> it just ends up trying to allocate a large array instead of the
>> failure mode I got (EOF or bad cfId) on the receiving end.
>> 
>> Can you try to reproduce with the latest 0.7 branch as of today?
> 
> I'll try to do so tomorrow.

So I checked out 0.7.0-rc2-SNAPSHOT from git at 9:30 UTC+2, and it does indeed seem to run much more stably (still disk_access: standard). No problems so far (it has been running for roughly half an hour, 7.5M INSERTs, with nodetool cleanup running concurrently on one node), and it's also significantly faster/back to the usual speed. Hooray! :) When is rc3 going to be released?

Thanks for the support!

Re: Memory leak with Sun Java 1.6 ?

Posted by Timo Nentwig <ti...@toptarif.de>.
On Dec 14, 2010, at 21:07, Peter Schuller wrote:

> In that case, based on the stack trace, I wonder if you're hitting
> what I was hitting just yesterday/earlier today:
> 
>   https://issues.apache.org/jira/browse/CASSANDRA-1860
> 
> This is suspected (currently being verified as fixed with an updated
> 0.7 build) to be due to:
> 
>   https://issues.apache.org/jira/browse/CASSANDRA-1847
> 
> If the corruption happens to be slightly different in your case, maybe
> it just ends up trying to allocate a large array instead of the
> failure mode I got (EOF or bad cfId) on the receiving end.
> 
> Can you try to reproduce with the latest 0.7 branch as of today?

I'll try to do so tomorrow.

> It seems very similar in nature at least based on the latest stack trace.

If it helps, I also found quite a few of these in the logs:

org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=224101

However, a single cassandra instance running locally (OSX, 1.6.0_22, mmap) runs just fine for
hours. No exceptions, no OOM.

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
>> The stack trace doesn't make sense relative to what I get checking out
>> 0.6.6. Are you *sure* this is 0.6.6, without patches or other changes?
>
> Oh, sorry, the original poster of this thread was/is using 0.6; I am (as mentioned in other posts) actually on 0.7rc2. Sorry for not pointing this out again.

Sorry, my bad. I tend to catch up on the MLs during my commute
to/from work and am in 'fast sifting' mode ;) Sometimes multiple
people from the same group/org/company post in the same thread about
the same issue, and I thought this was the case here.

In that case, based on the stack trace, I wonder if you're hitting
what I was hitting just yesterday/earlier today:

   https://issues.apache.org/jira/browse/CASSANDRA-1860

This is suspected (currently being verified as fixed with an updated
0.7 build) to be due to:

   https://issues.apache.org/jira/browse/CASSANDRA-1847

If the corruption happens to be slightly different in your case, maybe
it just ends up trying to allocate a large array instead of the
failure mode I got (EOF or bad cfId) on the receiving end.

Can you try to reproduce with the latest 0.7 branch as of today?

It seems very similar in nature at least based on the latest stack trace.

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Timo Nentwig <ti...@toptarif.de>.
On Dec 14, 2010, at 19:45, Peter Schuller wrote:

>> java.lang.OutOfMemoryError: Java heap space
>>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>>        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
>>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>>        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>>        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
>>        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
>>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
>>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
>>        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
>>        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:662)
> 
> Still 0.6.6 right?
> 
> The stack trace doesn't make sense relative to what I get checking out
> 0.6.6. Are you *sure* this is 0.6.6, without patches or other changes?

Oh, sorry, the original poster of this thread was/is using 0.6; I am (as mentioned in other posts) actually on 0.7rc2. Sorry for not pointing this out again.

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
> java.lang.OutOfMemoryError: Java heap space
>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
>        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
>        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
>        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>        at java.lang.Thread.run(Thread.java:662)

Still 0.6.6 right?

The stack trace doesn't make sense relative to what I get checking out
0.6.6. Are you *sure* this is 0.6.6, without patches or other changes?

Maybe I'm making some kind of mistake, but on "my" 0.6.6 the above
doesn't add up. E.g. ColumnFamilySerializer.java:129 is an empty line
between two try blocks.

I want to see what buffer allocation is triggering this to judge
whether a large allocation could be driven by a legitimate large
value, or whether it is something completely different that might
indicate some kind of on-the-wire or in-memory corruption.
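
For reference, the pattern implied by the trace is a plain length-prefixed
read: an int length comes off the stream and a buffer of that size is
allocated before anything else is validated. A simplified, hypothetical
sketch of that pattern (not the actual FBUtilities code) shows why corrupted
framing turns into a giant allocation instead of a clean parse error:

    import java.io.DataInput;
    import java.io.IOException;
    import java.nio.ByteBuffer;

    public class LengthPrefixedRead {
        // Whatever int happens to sit at the current stream position becomes
        // the allocation size, so a corrupted stream (or a reader that starts
        // mid-value) yields a huge length and the failure surfaces as an
        // OutOfMemoryError inside allocate() (or an IllegalArgumentException
        // if the bogus length happens to be negative).
        static ByteBuffer readByteArray(DataInput in) throws IOException {
            int length = in.readInt();                // trusted as-is
            ByteBuffer value = ByteBuffer.allocate(length);
            in.readFully(value.array(), 0, length);
            return value;
        }
    }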

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Timo Nentwig <ti...@toptarif.de>.
On Dec 14, 2010, at 18:45, Nate McCall wrote:

> Timo,
> Apologies if I missed it above, but how big is the batch size you are
> sending to batch_mutate?

Actually only a single row at a time for now (hector 0.7-21):

	final Cluster cluster = HFactory.getOrCreateCluster("Test", cassandraHostConfigurator);
	final Keyspace ks = HFactory.createKeyspace("test", cluster);

	final Mutator<String> mutator = HFactory.createMutator(ks, StringSerializer.get());

	mutator.addInsertion(id, "tracking", HFactory.createStringColumn("key", name))
	.addInsertion(id, "tracking", HFactory.createStringColumn("value", value))
	.execute();

Thinking of CASSANDRA-475 (which was also an OOM, IIRC), could this possibly be a Hector/Thrift bug?
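
(For comparison, if batch size does turn out to matter: the same Mutator can
accumulate several rows and push them in one batch_mutate call. A rough sketch
against the snippet above - SourceRow/rowsFromRdbms are hypothetical
placeholders for my import loop, and it assumes the Mutator can be reused
after execute(); otherwise create a fresh one per batch:)

	final Mutator<String> m = HFactory.createMutator(ks, StringSerializer.get());
	int pending = 0;
	for (final SourceRow r : rowsFromRdbms) {   // hypothetical RDBMS cursor
		m.addInsertion(r.id, "tracking", HFactory.createStringColumn("key", r.name))
		 .addInsertion(r.id, "tracking", HFactory.createStringColumn("value", r.value));
		if (++pending % 500 == 0)
			m.execute();                        // one batch_mutate per 500 rows
	}
	m.execute();                                // flush the remainder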

> On Tue, Dec 14, 2010 at 9:15 AM, Timo Nentwig <ti...@toptarif.de> wrote:
>> On Dec 14, 2010, at 15:31, Timo Nentwig wrote:
>> 
>>> On Dec 14, 2010, at 14:41, Jonathan Ellis wrote:
>>> 
>>>> This is "A row has grown too large" section from that troubleshooting guide.
>>> 
>>> Why? This is what a typical "row" (?) looks like:
>>> 
>>> [default@test] list tracking limit 1;
>>> -------------------
>>> RowKey:  123
>>> => (column=key, value=foo, timestamp=1292238005886000)
>>> => (column=value, value=bar, timestamp=1292238005886000)
>>> 
>>> I'm importing the data from a RDBMS so I can say for sure that none of the 2 columns will contain more than 255 chars.
>> 
>> Started all over again. Deleted /var/lib/cassandra/*, started 4 nodes (2x2G+2x1G heap, actual RAM twice as much, swap off) and my stress test: first node (1GB) crashed after 212s (!). Wrote roughly 200k rows like the one above (data wasn't even flushed to disk!).
>> 
>>  INFO [main] 2010-12-14 15:58:23,938 CassandraDaemon.java (line 119) Listening for thrift clients...
>>  INFO [GossipStage:1] 2010-12-14 15:58:24,410 Gossiper.java (line 569) InetAddress /192.168.68.80 is now UP
>>  INFO [HintedHandoff:1] 2010-12-14 15:58:24,410 HintedHandOffManager.java (line 191) Started hinted handoff for endpoint /192.168.68.80
>>  INFO [HintedHandoff:1] 2010-12-14 15:58:24,412 HintedHandOffManager.java (line 247) Finished hinted handoff of 0 rows to endpoint /192.168.68.80
>>  INFO [GossipStage:1] 2010-12-14 15:58:26,335 Gossiper.java (line 577) Node /192.168.68.69 is now part of the cluster
>>  INFO [GossipStage:1] 2010-12-14 15:58:27,319 Gossiper.java (line 569) InetAddress /192.168.68.69 is now UP
>>  INFO [HintedHandoff:1] 2010-12-14 15:58:27,319 HintedHandOffManager.java (line 191) Started hinted handoff for endpoint /192.168.68.69
>>  INFO [HintedHandoff:1] 2010-12-14 15:58:27,320 HintedHandOffManager.java (line 247) Finished hinted handoff of 0 rows to endpoint /192.168.68.69
>>  INFO [GossipStage:1] 2010-12-14 15:58:29,446 Gossiper.java (line 577) Node /192.168.68.70 is now part of the cluster
>>  INFO [GossipStage:1] 2010-12-14 15:58:29,620 Gossiper.java (line 569) InetAddress /192.168.68.70 is now UP
>>  INFO [HintedHandoff:1] 2010-12-14 15:58:29,621 HintedHandOffManager.java (line 191) Started hinted handoff for endpoint /192.168.68.70
>>  INFO [HintedHandoff:1] 2010-12-14 15:58:29,621 HintedHandOffManager.java (line 247) Finished hinted handoff of 0 rows to endpoint /192.168.68.70
>>  INFO [MigrationStage:1] 2010-12-14 16:01:16,535 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Migrations at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=15059)
>>  INFO [MigrationStage:1] 2010-12-14 16:01:16,536 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Migrations@768974140(9092 bytes, 1 operations)
>>  INFO [MigrationStage:1] 2010-12-14 16:01:16,536 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Schema at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=15059)
>>  INFO [MigrationStage:1] 2010-12-14 16:01:16,536 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Schema@591783334(3765 bytes, 3 operations)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,541 Memtable.java (line 155) Writing Memtable-Migrations@768974140(9092 bytes, 1 operations)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,633 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Migrations-e-1-Data.db (9225 bytes)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,634 Memtable.java (line 155) Writing Memtable-Schema@591783334(3765 bytes, 3 operations)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,706 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Schema-e-1-Data.db (4053 bytes)
>>  INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,718 ColumnFamilyStore.java (line 325) Creating index org.apache.cassandra.db.Table@354d581b.Indexed1.626972746864617465
>>  INFO [MigrationStage:1] 2010-12-14 16:01:16,725 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Migrations at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=27559)
>>  INFO [MigrationStage:1] 2010-12-14 16:01:16,726 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Migrations@1922556837(7973 bytes, 1 operations)
>>  INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,729 ColumnFamilyStore.java (line 339) Index Indexed1.626972746864617465 complete
>>  INFO [MigrationStage:1] 2010-12-14 16:01:16,730 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Schema at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=27559)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,730 Memtable.java (line 155) Writing Memtable-Migrations@1922556837(7973 bytes, 1 operations)
>>  INFO [MigrationStage:1] 2010-12-14 16:01:16,730 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Schema@1373806697(4029 bytes, 4 operations)
>>  INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,733 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for IndexInfo at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=27707)
>>  INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,734 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-IndexInfo@1972467582(42 bytes, 1 operations)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,808 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Migrations-e-2-Data.db (8106 bytes)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,816 Memtable.java (line 155) Writing Memtable-Schema@1373806697(4029 bytes, 4 operations)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,902 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Schema-e-2-Data.db (4317 bytes)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,904 Memtable.java (line 155) Writing Memtable-IndexInfo@1972467582(42 bytes, 1 operations)
>>  INFO [FlushWriter:1] 2010-12-14 16:01:16,968 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/IndexInfo-e-1-Data.db (170 bytes)
>> ERROR [MutationStage:22] 2010-12-14 16:02:00,283 AbstractCassandraDaemon.java (line 90) Fatal exception in thread Thread[MutationStage:22,5,main]
>> java.lang.OutOfMemoryError: Java heap space
>>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>>        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
>>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>>        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>>        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
>>        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
>>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
>>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
>>        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
>>        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:662)
>> 
>> gc:
>> 
>> 210.919: [GC 210.919: [ParNew
>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>> - age   1:    1833480 bytes,    1833480 total
>> : 16139K->4221K(38336K), 0.0377710 secs]210.957: [CMS: 690490K->171727K(1006016K), 1.0604710 secs] 702251K->171727K(1044352K), [CMS Perm : 17750K->17734K(21248K)], 1.0985490 secs] [Times: user=1.11 sys=0.01, real=1.10 secs]
>> 212.017: [Full GC 212.017: [CMS: 171727K->171315K(1006016K), 1.0312260 secs] 171727K->171315K(1044352K), [CMS Perm : 17734K->17504K(29560K)], 1.0314660 secs] [Times: user=1.03 sys=0.00, real=1.03 secs]
>> Total time for which application threads were stopped: 2.1309030 seconds
>> Total time for which application threads were stopped: 4.4463770 seconds
>> Heap
>>  par new generation   total 38336K, used 15546K [0x00000000bae00000, 0x00000000bd790000, 0x00000000bd790000)
>>  eden space 34112K,  45% used [0x00000000bae00000, 0x00000000bbd2ebb8, 0x00000000bcf50000)
>>  from space 4224K,   0% used [0x00000000bd370000, 0x00000000bd370000, 0x00000000bd790000)
>>  to   space 4224K,   0% used [0x00000000bcf50000, 0x00000000bcf50000, 0x00000000bd370000)
>>  concurrent mark-sweep generation total 1006016K, used 171315K [0x00000000bd790000, 0x00000000fae00000, 0x00000000fae00000)
>>  concurrent-mark-sweep perm gen total 29560K, used 17572K [0x00000000fae00000, 0x00000000fcade000, 0x0000000100000000)
>> 
>> 
>> During this short time I hit the following exception 4 times (I mentioned this already on the dev list):
>> 
>> me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra encountered an internal error processing this request: TApplicationError type: 6 message:Internal error processing batch_mutate
>>        at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:27)
>>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
>>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:1)
>>        at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:89)
>>        at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:142)
>>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
>>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
>>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
>>        at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:149)
>>        at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:1)
>>        at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
>>        at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65)
>>        at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:146)
>>        at Trackr$3.run(Trackr.java:133)
>>        at Trackr$BoundedExecutor$1.run(Trackr.java:167)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:680)
>> 
>>> That row warning seems to be a 0.6 option; I don't find it in the 0.7 docs. I also didn't find any related WARNs about some default value in the log.
>>> 
>>>> On Tue, Dec 14, 2010 at 5:27 AM, Timo Nentwig <ti...@toptarif.de> wrote:
>>>> 
>>>> On Dec 12, 2010, at 17:21, Jonathan Ellis wrote:
>>>> 
>>>>> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
>>>> 
>>>> I can rule out the first 3. I was running cassandra with default settings, i.e. 1GB heap and a 256M memtable. So, with 3 memtables + 1GB, the JVM should be run with >1.75G (although http://wiki.apache.org/cassandra/MemtableThresholds suggests increasing the heap size only gradually). Did so. A 4GB machine with a 2GB 64-bit JVM seemed to run stably for quite some time but then also crashed with OOM. Looking at the heap dump it's always the same: nearly all memory is bound up in the CompactionExecutor (ColumnFamilyStore/ConcurrentSkipListMap, respectively).
>>>> 
>>>> This looks like somebody else recently had a similar problem (bottom line: more heap, which is okay, but I'd like to understand why):
>>>> http://www.mail-archive.com/user@cassandra.apache.org/msg07516.html
>>>> 
>>>> This is my only CF currently in use (via JMX):
>>>> 
>>>> - column_families:
>>>> - column_type: Standard
>>>>   comment: tracking column family
>>>>   compare_with: org.apache.cassandra.db.marshal.UTF8Type
>>>>   default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
>>>>   gc_grace_seconds: 864000
>>>>   key_cache_save_period_in_seconds: 3600
>>>>   keys_cached: 200000.0
>>>>   max_compaction_threshold: 32
>>>>   memtable_flush_after_mins: 60
>>>>   min_compaction_threshold: 4
>>>>   name: tracking
>>>>   read_repair_chance: 1.0
>>>>   row_cache_save_period_in_seconds: 0
>>>>   rows_cached: 0.0
>>>> name: test
>>>> replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>>>> replication_factor: 3
>>>> 
>>>> 
>>>> In addition...actually there is plenty of free memory on the heap (?):
>>>> 
>>>> 3605.478: [GC 3605.478: [ParNew
>>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>>> - age   1:     416112 bytes,     416112 total
>>>> : 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS: 1145267K->447565K(2054592K), 1.9143630 secs] 1161938K->447565K(2092928K), [CMS Perm : 18186K->18158K(30472K)], 1.9355340 secs] [Times: user=1.95 sys=0.00, real=1.94 secs]
>>>> 3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960 secs] 447565K->447453K(2092928K), [CMS Perm : 18158K->18025K(30472K)], 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97 secs]
>>>> Total time for which application threads were stopped: 3.9070380 seconds
>>>> Total time for which application threads were stopped: 7.3388640 seconds
>>>> Total time for which application threads were stopped: 0.0560610 seconds
>>>> 3616.931: [GC 3616.931: [ParNew
>>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>>> - age   1:     474264 bytes,     474264 total
>>>> : 34112K->747K(38336K), 0.0098680 secs] 481565K->448201K(2092928K), 0.0099690 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
>>>> Total time for which application threads were stopped: 0.0108670 seconds
>>>> 3617.035: [GC 3617.035: [ParNew
>>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>>> - age   1:      63040 bytes,      63040 total
>>>> : 34859K->440K(38336K), 0.0065950 secs] 482313K->448455K(2092928K), 0.0066880 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>>>> Total time for which application threads were stopped: 0.0075850 seconds
>>>> 3617.133: [GC 3617.133: [ParNew
>>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>>> - age   1:      23016 bytes,      23016 total
>>>> : 34552K->121K(38336K), 0.0042920 secs] 482567K->448193K(2092928K), 0.0043650 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>>>> Total time for which application threads were stopped: 0.0049630 seconds
>>>> 3617.228: [GC 3617.228: [ParNew
>>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>>> - age   1:      16992 bytes,      16992 total
>>>> : 34233K->34K(38336K), 0.0043180 secs] 482305K->448122K(2092928K), 0.0043910 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>>>> Total time for which application threads were stopped: 0.0049150 seconds
>>>> 3617.323: [GC 3617.323: [ParNew
>>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>>> - age   1:      18456 bytes,      18456 total
>>>> : 34146K->29K(38336K), 0.0038930 secs] 482234K->448127K(2092928K), 0.0039810 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>>>> Total time for which application threads were stopped: 0.0055390 seconds
>>>> Heap
>>>> par new generation   total 38336K, used 17865K [0x000000077ae00000, 0x000000077d790000, 0x000000077d790000)
>>>> eden space 34112K,  52% used [0x000000077ae00000, 0x000000077bf6afb0, 0x000000077cf50000)
>>>> from space 4224K,   0% used [0x000000077cf50000, 0x000000077cf57720, 0x000000077d370000)
>>>> to   space 4224K,   0% used [0x000000077d370000, 0x000000077d370000, 0x000000077d790000)
>>>> concurrent mark-sweep generation total 2054592K, used 448097K [0x000000077d790000, 0x00000007fae00000, 0x00000007fae00000)
>>>> concurrent-mark-sweep perm gen total 30472K, used 18125K [0x00000007fae00000, 0x00000007fcbc2000, 0x0000000800000000)
>>>> 
>>>>> 
>>>>> On Sun, Dec 12, 2010 at 9:52 AM, Timo Nentwig <ti...@toptarif.de> wrote:
>>>>> 
>>>>> On Dec 10, 2010, at 19:37, Peter Schuller wrote:
>>>>> 
>>>>>> To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
>>>>>> in lenny or some such.) If it is a JVM issue, ensuring you're using a
>>>>>> reasonably recent JVM is probably much easier than to start tracking
>>>>>> it down...
>>>>> 
>>>>> I had OOM problems with OpenJDK, switched to Sun/Oracle's recent 1.6.0_23 and...still have the same problem :-\ Stack trace always looks the same:
>>>>> 
>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>       at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>>>>>       at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
>>>>>       at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
>>>>>       at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>>>>>       at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>>>>>       at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>>>>>       at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
>>>>>       at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
>>>>>       at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
>>>>>       at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
>>>>>       at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
>>>>>       at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
>>>>>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>>>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>>>       at java.lang.Thread.run(Thread.java:636)
>>>>> 
>>>>> I'm writing from 1 client with 50 threads to a cluster of 4 machines (with hector). With both QUORUM and ONE, 2 machines quite reliably die with OOM soon after. What may cause this? Shouldn't cassandra block/reject writes when a memtable is full and being flushed to disk, rather than keep growing and, if flushing to disk isn't fast enough, run out of memory?
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Jonathan Ellis
>>>>> Project Chair, Apache Cassandra
>>>>> co-founder of Riptano, the source for professional Cassandra support
>>>>> http://riptano.com
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Jonathan Ellis
>>>> Project Chair, Apache Cassandra
>>>> co-founder of Riptano, the source for professional Cassandra support
>>>> http://riptano.com
>>> 
>> 
>> 


Re: Memory leak with Sun Java 1.6 ?

Posted by Nate McCall <na...@riptano.com>.
Timo,
Apologies if I missed it above, but how big is the batch size you are
sending to batch_mutate?

On Tue, Dec 14, 2010 at 9:15 AM, Timo Nentwig <ti...@toptarif.de> wrote:
> On Dec 14, 2010, at 15:31, Timo Nentwig wrote:
>
>> On Dec 14, 2010, at 14:41, Jonathan Ellis wrote:
>>
>>> This is "A row has grown too large" section from that troubleshooting guide.
>>
>> Why? This is what a typical "row" (?) looks like:
>>
>> [default@test] list tracking limit 1;
>> -------------------
>> RowKey:  123
>> => (column=key, value=foo, timestamp=1292238005886000)
>> => (column=value, value=bar, timestamp=1292238005886000)
>>
>> I'm importing the data from a RDBMS so I can say for sure that none of the 2 columns will contain more than 255 chars.
>
> Started all over again. Deleted /var/lib/cassandra/*, started 4 nodes (2x2G+2x1G heap, actual RAM twice as much, swap off) and my stress test: first node (1GB) crashed after 212s (!). Wrote roughly 200k rows like the one above (data wasn't even flushed to disk!).
>
>  INFO [main] 2010-12-14 15:58:23,938 CassandraDaemon.java (line 119) Listening for thrift clients...
>  INFO [GossipStage:1] 2010-12-14 15:58:24,410 Gossiper.java (line 569) InetAddress /192.168.68.80 is now UP
>  INFO [HintedHandoff:1] 2010-12-14 15:58:24,410 HintedHandOffManager.java (line 191) Started hinted handoff for endpoint /192.168.68.80
>  INFO [HintedHandoff:1] 2010-12-14 15:58:24,412 HintedHandOffManager.java (line 247) Finished hinted handoff of 0 rows to endpoint /192.168.68.80
>  INFO [GossipStage:1] 2010-12-14 15:58:26,335 Gossiper.java (line 577) Node /192.168.68.69 is now part of the cluster
>  INFO [GossipStage:1] 2010-12-14 15:58:27,319 Gossiper.java (line 569) InetAddress /192.168.68.69 is now UP
>  INFO [HintedHandoff:1] 2010-12-14 15:58:27,319 HintedHandOffManager.java (line 191) Started hinted handoff for endpoint /192.168.68.69
>  INFO [HintedHandoff:1] 2010-12-14 15:58:27,320 HintedHandOffManager.java (line 247) Finished hinted handoff of 0 rows to endpoint /192.168.68.69
>  INFO [GossipStage:1] 2010-12-14 15:58:29,446 Gossiper.java (line 577) Node /192.168.68.70 is now part of the cluster
>  INFO [GossipStage:1] 2010-12-14 15:58:29,620 Gossiper.java (line 569) InetAddress /192.168.68.70 is now UP
>  INFO [HintedHandoff:1] 2010-12-14 15:58:29,621 HintedHandOffManager.java (line 191) Started hinted handoff for endpoint /192.168.68.70
>  INFO [HintedHandoff:1] 2010-12-14 15:58:29,621 HintedHandOffManager.java (line 247) Finished hinted handoff of 0 rows to endpoint /192.168.68.70
>  INFO [MigrationStage:1] 2010-12-14 16:01:16,535 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Migrations at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=15059)
>  INFO [MigrationStage:1] 2010-12-14 16:01:16,536 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Migrations@768974140(9092 bytes, 1 operations)
>  INFO [MigrationStage:1] 2010-12-14 16:01:16,536 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Schema at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=15059)
>  INFO [MigrationStage:1] 2010-12-14 16:01:16,536 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Schema@591783334(3765 bytes, 3 operations)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,541 Memtable.java (line 155) Writing Memtable-Migrations@768974140(9092 bytes, 1 operations)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,633 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Migrations-e-1-Data.db (9225 bytes)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,634 Memtable.java (line 155) Writing Memtable-Schema@591783334(3765 bytes, 3 operations)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,706 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Schema-e-1-Data.db (4053 bytes)
>  INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,718 ColumnFamilyStore.java (line 325) Creating index org.apache.cassandra.db.Table@354d581b.Indexed1.626972746864617465
>  INFO [MigrationStage:1] 2010-12-14 16:01:16,725 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Migrations at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=27559)
>  INFO [MigrationStage:1] 2010-12-14 16:01:16,726 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Migrations@1922556837(7973 bytes, 1 operations)
>  INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,729 ColumnFamilyStore.java (line 339) Index Indexed1.626972746864617465 complete
>  INFO [MigrationStage:1] 2010-12-14 16:01:16,730 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Schema at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=27559)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,730 Memtable.java (line 155) Writing Memtable-Migrations@1922556837(7973 bytes, 1 operations)
>  INFO [MigrationStage:1] 2010-12-14 16:01:16,730 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Schema@1373806697(4029 bytes, 4 operations)
>  INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,733 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for IndexInfo at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=27707)
>  INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,734 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-IndexInfo@1972467582(42 bytes, 1 operations)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,808 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Migrations-e-2-Data.db (8106 bytes)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,816 Memtable.java (line 155) Writing Memtable-Schema@1373806697(4029 bytes, 4 operations)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,902 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Schema-e-2-Data.db (4317 bytes)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,904 Memtable.java (line 155) Writing Memtable-IndexInfo@1972467582(42 bytes, 1 operations)
>  INFO [FlushWriter:1] 2010-12-14 16:01:16,968 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/IndexInfo-e-1-Data.db (170 bytes)
> ERROR [MutationStage:22] 2010-12-14 16:02:00,283 AbstractCassandraDaemon.java (line 90) Fatal exception in thread Thread[MutationStage:22,5,main]
> java.lang.OutOfMemoryError: Java heap space
>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
>        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
>        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
>        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>        at java.lang.Thread.run(Thread.java:662)
>
> gc:
>
> 210.919: [GC 210.919: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:    1833480 bytes,    1833480 total
> : 16139K->4221K(38336K), 0.0377710 secs]210.957: [CMS: 690490K->171727K(1006016K), 1.0604710 secs] 702251K->171727K(1044352K), [CMS Perm : 17750K->17734K(21248K)], 1.0985490 secs] [Times: user=1.11 sys=0.01, real=1.10 secs]
> 212.017: [Full GC 212.017: [CMS: 171727K->171315K(1006016K), 1.0312260 secs] 171727K->171315K(1044352K), [CMS Perm : 17734K->17504K(29560K)], 1.0314660 secs] [Times: user=1.03 sys=0.00, real=1.03 secs]
> Total time for which application threads were stopped: 2.1309030 seconds
> Total time for which application threads were stopped: 4.4463770 seconds
> Heap
>  par new generation   total 38336K, used 15546K [0x00000000bae00000, 0x00000000bd790000, 0x00000000bd790000)
>  eden space 34112K,  45% used [0x00000000bae00000, 0x00000000bbd2ebb8, 0x00000000bcf50000)
>  from space 4224K,   0% used [0x00000000bd370000, 0x00000000bd370000, 0x00000000bd790000)
>  to   space 4224K,   0% used [0x00000000bcf50000, 0x00000000bcf50000, 0x00000000bd370000)
>  concurrent mark-sweep generation total 1006016K, used 171315K [0x00000000bd790000, 0x00000000fae00000, 0x00000000fae00000)
>  concurrent-mark-sweep perm gen total 29560K, used 17572K [0x00000000fae00000, 0x00000000fcade000, 0x0000000100000000)
>
>
> During this short time I hit the following exception 4 times (I mentioned this already on the dev list):
>
> me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra encountered an internal error processing this request: TApplicationError type: 6 message:Internal error processing batch_mutate
>        at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:27)
>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:1)
>        at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:89)
>        at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:142)
>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
>        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
>        at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:149)
>        at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:1)
>        at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
>        at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65)
>        at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:146)
>        at Trackr$3.run(Trackr.java:133)
>        at Trackr$BoundedExecutor$1.run(Trackr.java:167)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>        at java.lang.Thread.run(Thread.java:680)
>
>> That row warning seems to be a 0.6 option; I don't find it in the 0.7 docs. I also didn't find any related WARNs about some default value in the log.
>>
>>> On Tue, Dec 14, 2010 at 5:27 AM, Timo Nentwig <ti...@toptarif.de> wrote:
>>>
>>> On Dec 12, 2010, at 17:21, Jonathan Ellis wrote:
>>>
>>>> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
>>>
>>> I can rule out the first 3. I was running cassandra with default settings, i.e. 1GB heap and a 256M memtable. So, with 3 memtables + 1GB, the JVM should be run with >1.75G (although http://wiki.apache.org/cassandra/MemtableThresholds suggests increasing the heap size only gradually). Did so. A 4GB machine with a 2GB 64-bit JVM seemed to run stably for quite some time but then also crashed with OOM. Looking at the heap dump it's always the same: nearly all memory is bound up in the CompactionExecutor (ColumnFamilyStore/ConcurrentSkipListMap, respectively).
>>>
>>> This looks like somebody else recently had a similar problem (bottom line: more heap, which is okay, but I'd like to understand why):
>>> http://www.mail-archive.com/user@cassandra.apache.org/msg07516.html
>>>
>>> This is my only CF currently in use (via JMX):
>>>
>>> - column_families:
>>> - column_type: Standard
>>>   comment: tracking column family
>>>   compare_with: org.apache.cassandra.db.marshal.UTF8Type
>>>   default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
>>>   gc_grace_seconds: 864000
>>>   key_cache_save_period_in_seconds: 3600
>>>   keys_cached: 200000.0
>>>   max_compaction_threshold: 32
>>>   memtable_flush_after_mins: 60
>>>   min_compaction_threshold: 4
>>>   name: tracking
>>>   read_repair_chance: 1.0
>>>   row_cache_save_period_in_seconds: 0
>>>   rows_cached: 0.0
>>> name: test
>>> replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>>> replication_factor: 3
>>>
>>>
>>> In addition...actually there is plenty of free memory on the heap (?):
>>>
>>> 3605.478: [GC 3605.478: [ParNew
>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>> - age   1:     416112 bytes,     416112 total
>>> : 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS: 1145267K->447565K(2054592K), 1.9143630 secs] 1161938K->447565K(2092928K), [CMS Perm : 18186K->18158K(30472K)], 1.9355340 secs] [Times: user=1.95 sys=0.00, real=1.94 secs]
>>> 3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960 secs] 447565K->447453K(2092928K), [CMS Perm : 18158K->18025K(30472K)], 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97 secs]
>>> Total time for which application threads were stopped: 3.9070380 seconds
>>> Total time for which application threads were stopped: 7.3388640 seconds
>>> Total time for which application threads were stopped: 0.0560610 seconds
>>> 3616.931: [GC 3616.931: [ParNew
>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>> - age   1:     474264 bytes,     474264 total
>>> : 34112K->747K(38336K), 0.0098680 secs] 481565K->448201K(2092928K), 0.0099690 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
>>> Total time for which application threads were stopped: 0.0108670 seconds
>>> 3617.035: [GC 3617.035: [ParNew
>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>> - age   1:      63040 bytes,      63040 total
>>> : 34859K->440K(38336K), 0.0065950 secs] 482313K->448455K(2092928K), 0.0066880 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>>> Total time for which application threads were stopped: 0.0075850 seconds
>>> 3617.133: [GC 3617.133: [ParNew
>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>> - age   1:      23016 bytes,      23016 total
>>> : 34552K->121K(38336K), 0.0042920 secs] 482567K->448193K(2092928K), 0.0043650 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>>> Total time for which application threads were stopped: 0.0049630 seconds
>>> 3617.228: [GC 3617.228: [ParNew
>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>> - age   1:      16992 bytes,      16992 total
>>> : 34233K->34K(38336K), 0.0043180 secs] 482305K->448122K(2092928K), 0.0043910 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>>> Total time for which application threads were stopped: 0.0049150 seconds
>>> 3617.323: [GC 3617.323: [ParNew
>>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>>> - age   1:      18456 bytes,      18456 total
>>> : 34146K->29K(38336K), 0.0038930 secs] 482234K->448127K(2092928K), 0.0039810 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>>> Total time for which application threads were stopped: 0.0055390 seconds
>>> Heap
>>> par new generation   total 38336K, used 17865K [0x000000077ae00000, 0x000000077d790000, 0x000000077d790000)
>>> eden space 34112K,  52% used [0x000000077ae00000, 0x000000077bf6afb0, 0x000000077cf50000)
>>> from space 4224K,   0% used [0x000000077cf50000, 0x000000077cf57720, 0x000000077d370000)
>>> to   space 4224K,   0% used [0x000000077d370000, 0x000000077d370000, 0x000000077d790000)
>>> concurrent mark-sweep generation total 2054592K, used 448097K [0x000000077d790000, 0x00000007fae00000, 0x00000007fae00000)
>>> concurrent-mark-sweep perm gen total 30472K, used 18125K [0x00000007fae00000, 0x00000007fcbc2000, 0x0000000800000000)
>>>
>>>>
>>>> On Sun, Dec 12, 2010 at 9:52 AM, Timo Nentwig <ti...@toptarif.de> wrote:
>>>>
>>>> On Dec 10, 2010, at 19:37, Peter Schuller wrote:
>>>>
>>>>> To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
>>>>> in lenny or some such.) If it is a JVM issue, ensuring you're using a
>>>>> reasonably recent JVM is probably much easier than to start tracking
>>>>> it down...
>>>>
>>>> I had OOM problems with OpenJDK, switched to Sun/Oracle's recent 1.6.0_23 and...still have the same problem :-\ Stack trace always looks the same:
>>>>
>>>> java.lang.OutOfMemoryError: Java heap space
>>>>       at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>>>>       at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
>>>>       at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
>>>>       at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>>>>       at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>>>>       at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>>>>       at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
>>>>       at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
>>>>       at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
>>>>       at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
>>>>       at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
>>>>       at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
>>>>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>>       at java.lang.Thread.run(Thread.java:636)
>>>>
>>>> I'm writing from 1 client with 50 threads to a cluster of 4 machines (with hector). With both QUORUM and ONE, 2 machines quite reliably die with OOM soon after. What may cause this? Shouldn't cassandra block/reject writes when a memtable is full and being flushed to disk, rather than keep growing and, if flushing to disk isn't fast enough, run out of memory?
>>>>
>>>>
>>>>
>>>> --
>>>> Jonathan Ellis
>>>> Project Chair, Apache Cassandra
>>>> co-founder of Riptano, the source for professional Cassandra support
>>>> http://riptano.com
>>>
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
>>
>
>

Re: Memory leak with Sun Java 1.6 ?

Posted by Timo Nentwig <ti...@toptarif.de>.
On Dec 14, 2010, at 15:31, Timo Nentwig wrote:

> On Dec 14, 2010, at 14:41, Jonathan Ellis wrote:
> 
>> This is "A row has grown too large" section from that troubleshooting guide.
> 
> Why? This is what a typical "row" (?) looks like:
> 
> [default@test] list tracking limit 1;
> -------------------
> RowKey:  123
> => (column=key, value=foo, timestamp=1292238005886000)
> => (column=value, value=bar, timestamp=1292238005886000)
> 
> I'm importing the data from a RDBMS so I can say for sure that none of the 2 columns will contain more than 255 chars.

Started all over again. Deleted /var/lib/cassandra/*, started 4 nodes (2x2G+2x1G heap, actual RAM twice as much, swap off) and my stress test: first node (1GB) crashed after 212s (!). Wrote roughly 200k rows like the one above (data wasn't even flushed to disk!).

 INFO [main] 2010-12-14 15:58:23,938 CassandraDaemon.java (line 119) Listening for thrift clients...
 INFO [GossipStage:1] 2010-12-14 15:58:24,410 Gossiper.java (line 569) InetAddress /192.168.68.80 is now UP
 INFO [HintedHandoff:1] 2010-12-14 15:58:24,410 HintedHandOffManager.java (line 191) Started hinted handoff for endpoint /192.168.68.80
 INFO [HintedHandoff:1] 2010-12-14 15:58:24,412 HintedHandOffManager.java (line 247) Finished hinted handoff of 0 rows to endpoint /192.168.68.80
 INFO [GossipStage:1] 2010-12-14 15:58:26,335 Gossiper.java (line 577) Node /192.168.68.69 is now part of the cluster
 INFO [GossipStage:1] 2010-12-14 15:58:27,319 Gossiper.java (line 569) InetAddress /192.168.68.69 is now UP
 INFO [HintedHandoff:1] 2010-12-14 15:58:27,319 HintedHandOffManager.java (line 191) Started hinted handoff for endpoint /192.168.68.69
 INFO [HintedHandoff:1] 2010-12-14 15:58:27,320 HintedHandOffManager.java (line 247) Finished hinted handoff of 0 rows to endpoint /192.168.68.69
 INFO [GossipStage:1] 2010-12-14 15:58:29,446 Gossiper.java (line 577) Node /192.168.68.70 is now part of the cluster
 INFO [GossipStage:1] 2010-12-14 15:58:29,620 Gossiper.java (line 569) InetAddress /192.168.68.70 is now UP
 INFO [HintedHandoff:1] 2010-12-14 15:58:29,621 HintedHandOffManager.java (line 191) Started hinted handoff for endpoint /192.168.68.70
 INFO [HintedHandoff:1] 2010-12-14 15:58:29,621 HintedHandOffManager.java (line 247) Finished hinted handoff of 0 rows to endpoint /192.168.68.70
 INFO [MigrationStage:1] 2010-12-14 16:01:16,535 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Migrations at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=15059)
 INFO [MigrationStage:1] 2010-12-14 16:01:16,536 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Migrations@768974140(9092 bytes, 1 operations)
 INFO [MigrationStage:1] 2010-12-14 16:01:16,536 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Schema at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=15059)
 INFO [MigrationStage:1] 2010-12-14 16:01:16,536 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Schema@591783334(3765 bytes, 3 operations)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,541 Memtable.java (line 155) Writing Memtable-Migrations@768974140(9092 bytes, 1 operations)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,633 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Migrations-e-1-Data.db (9225 bytes)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,634 Memtable.java (line 155) Writing Memtable-Schema@591783334(3765 bytes, 3 operations)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,706 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Schema-e-1-Data.db (4053 bytes)
 INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,718 ColumnFamilyStore.java (line 325) Creating index org.apache.cassandra.db.Table@354d581b.Indexed1.626972746864617465
 INFO [MigrationStage:1] 2010-12-14 16:01:16,725 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Migrations at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=27559)
 INFO [MigrationStage:1] 2010-12-14 16:01:16,726 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Migrations@1922556837(7973 bytes, 1 operations)
 INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,729 ColumnFamilyStore.java (line 339) Index Indexed1.626972746864617465 complete
 INFO [MigrationStage:1] 2010-12-14 16:01:16,730 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for Schema at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=27559)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,730 Memtable.java (line 155) Writing Memtable-Migrations@1922556837(7973 bytes, 1 operations)
 INFO [MigrationStage:1] 2010-12-14 16:01:16,730 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-Schema@1373806697(4029 bytes, 4 operations)
 INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,733 ColumnFamilyStore.java (line 639) switching in a fresh Memtable for IndexInfo at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1292338703483.log', position=27707)
 INFO [Create index Indexed1.626972746864617465] 2010-12-14 16:01:16,734 ColumnFamilyStore.java (line 943) Enqueuing flush of Memtable-IndexInfo@1972467582(42 bytes, 1 operations)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,808 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Migrations-e-2-Data.db (8106 bytes)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,816 Memtable.java (line 155) Writing Memtable-Schema@1373806697(4029 bytes, 4 operations)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,902 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/Schema-e-2-Data.db (4317 bytes)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,904 Memtable.java (line 155) Writing Memtable-IndexInfo@1972467582(42 bytes, 1 operations)
 INFO [FlushWriter:1] 2010-12-14 16:01:16,968 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/system/IndexInfo-e-1-Data.db (170 bytes)
ERROR [MutationStage:22] 2010-12-14 16:02:00,283 AbstractCassandraDaemon.java (line 90) Fatal exception in thread Thread[MutationStage:22,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

gc:

210.919: [GC 210.919: [ParNew
Desired survivor size 2162688 bytes, new threshold 1 (max 1)
- age   1:    1833480 bytes,    1833480 total
: 16139K->4221K(38336K), 0.0377710 secs]210.957: [CMS: 690490K->171727K(1006016K), 1.0604710 secs] 702251K->171727K(1044352K), [CMS Perm : 17750K->17734K(21248K)], 1.0985490 secs] [Times: user=1.11 sys=0.01, real=1.10 secs] 
212.017: [Full GC 212.017: [CMS: 171727K->171315K(1006016K), 1.0312260 secs] 171727K->171315K(1044352K), [CMS Perm : 17734K->17504K(29560K)], 1.0314660 secs] [Times: user=1.03 sys=0.00, real=1.03 secs] 
Total time for which application threads were stopped: 2.1309030 seconds
Total time for which application threads were stopped: 4.4463770 seconds
Heap
 par new generation   total 38336K, used 15546K [0x00000000bae00000, 0x00000000bd790000, 0x00000000bd790000)
  eden space 34112K,  45% used [0x00000000bae00000, 0x00000000bbd2ebb8, 0x00000000bcf50000)
  from space 4224K,   0% used [0x00000000bd370000, 0x00000000bd370000, 0x00000000bd790000)
  to   space 4224K,   0% used [0x00000000bcf50000, 0x00000000bcf50000, 0x00000000bd370000)
 concurrent mark-sweep generation total 1006016K, used 171315K [0x00000000bd790000, 0x00000000fae00000, 0x00000000fae00000)
 concurrent-mark-sweep perm gen total 29560K, used 17572K [0x00000000fae00000, 0x00000000fcade000, 0x0000000100000000)


During this short run I hit the following exception 4 times (I mentioned this already on the dev list):

me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra encountered an internal error processing this request: TApplicationError type: 6 message:Internal error processing batch_mutate
	at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:27)
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95)
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:1)
	at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:89)
	at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:142)
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129)
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:100)
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:106)
	at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:149)
	at me.prettyprint.cassandra.model.MutatorImpl$2.doInKeyspace(MutatorImpl.java:1)
	at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
	at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65)
	at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:146)
	at Trackr$3.run(Trackr.java:133)
	at Trackr$BoundedExecutor$1.run(Trackr.java:167)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:680)

> That row warning seems to be a 0.6 option, don't find it in 0.7 docs. Didn't find any related WARNs for some default value in the log also.
> 
>> On Tue, Dec 14, 2010 at 5:27 AM, Timo Nentwig <ti...@toptarif.de> wrote:
>> 
>> On Dec 12, 2010, at 17:21, Jonathan Ellis wrote:
>> 
>>> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
>> 
>> I can rule out the first 3. I was running cassandra with default settings, i.e. 1GB heap and 256M memtable. So, with 3 memtables+1GB the JVM should run with >1.75G (although http://wiki.apache.org/cassandra/MemtableThresholds considers to increase heap size only gently). Did so. 4GB machine with 2GB 64bit-JVM seemed to run stable for quite some time but then also crashed with OOM. Looking at the heap dump it's always the same: all memory nearly always bound in CompactionExecutor (ColumnFamilyStore/ConcurrentSkipListMap, respectively).
>> 
>> This looks like somebody else recently have had a similar problem (->Bottom line: more heap - which is okay, but I'd like to understand why):
>> http://www.mail-archive.com/user@cassandra.apache.org/msg07516.html
>> 
>> This is my only CF currently in use (via JMX):
>> 
>> - column_families:
>> - column_type: Standard
>>   comment: tracking column family
>>   compare_with: org.apache.cassandra.db.marshal.UTF8Type
>>   default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
>>   gc_grace_seconds: 864000
>>   key_cache_save_period_in_seconds: 3600
>>   keys_cached: 200000.0
>>   max_compaction_threshold: 32
>>   memtable_flush_after_mins: 60
>>   min_compaction_threshold: 4
>>   name: tracking
>>   read_repair_chance: 1.0
>>   row_cache_save_period_in_seconds: 0
>>   rows_cached: 0.0
>> name: test
>> replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>> replication_factor: 3
>> 
>> 
>> In addition...actually there is plenty of free memory on the heap (?):
>> 
>> 3605.478: [GC 3605.478: [ParNew
>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>> - age   1:     416112 bytes,     416112 total
>> : 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS: 1145267K->447565K(2054592K), 1.9143630 secs] 1161938K->447565K(2092928K), [CMS Perm : 18186K->18158K(30472K)], 1.9355340 secs] [Times: user=1.95 sys=0.00, real=1.94 secs]
>> 3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960 secs] 447565K->447453K(2092928K), [CMS Perm : 18158K->18025K(30472K)], 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97 secs]
>> Total time for which application threads were stopped: 3.9070380 seconds
>> Total time for which application threads were stopped: 7.3388640 seconds
>> Total time for which application threads were stopped: 0.0560610 seconds
>> 3616.931: [GC 3616.931: [ParNew
>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>> - age   1:     474264 bytes,     474264 total
>> : 34112K->747K(38336K), 0.0098680 secs] 481565K->448201K(2092928K), 0.0099690 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
>> Total time for which application threads were stopped: 0.0108670 seconds
>> 3617.035: [GC 3617.035: [ParNew
>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>> - age   1:      63040 bytes,      63040 total
>> : 34859K->440K(38336K), 0.0065950 secs] 482313K->448455K(2092928K), 0.0066880 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>> Total time for which application threads were stopped: 0.0075850 seconds
>> 3617.133: [GC 3617.133: [ParNew
>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>> - age   1:      23016 bytes,      23016 total
>> : 34552K->121K(38336K), 0.0042920 secs] 482567K->448193K(2092928K), 0.0043650 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
>> Total time for which application threads were stopped: 0.0049630 seconds
>> 3617.228: [GC 3617.228: [ParNew
>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>> - age   1:      16992 bytes,      16992 total
>> : 34233K->34K(38336K), 0.0043180 secs] 482305K->448122K(2092928K), 0.0043910 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>> Total time for which application threads were stopped: 0.0049150 seconds
>> 3617.323: [GC 3617.323: [ParNew
>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>> - age   1:      18456 bytes,      18456 total
>> : 34146K->29K(38336K), 0.0038930 secs] 482234K->448127K(2092928K), 0.0039810 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
>> Total time for which application threads were stopped: 0.0055390 seconds
>> Heap
>> par new generation   total 38336K, used 17865K [0x000000077ae00000, 0x000000077d790000, 0x000000077d790000)
>> eden space 34112K,  52% used [0x000000077ae00000, 0x000000077bf6afb0, 0x000000077cf50000)
>> from space 4224K,   0% used [0x000000077cf50000, 0x000000077cf57720, 0x000000077d370000)
>> to   space 4224K,   0% used [0x000000077d370000, 0x000000077d370000, 0x000000077d790000)
>> concurrent mark-sweep generation total 2054592K, used 448097K [0x000000077d790000, 0x00000007fae00000, 0x00000007fae00000)
>> concurrent-mark-sweep perm gen total 30472K, used 18125K [0x00000007fae00000, 0x00000007fcbc2000, 0x0000000800000000)
>> 
>>> 
>>> On Sun, Dec 12, 2010 at 9:52 AM, Timo Nentwig <ti...@toptarif.de> wrote:
>>> 
>>> On Dec 10, 2010, at 19:37, Peter Schuller wrote:
>>> 
>>>> To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
>>>> in lenny or some such.) If it is a JVM issue, ensuring you're using a
>>>> reasonably recent JVM is probably much easier than to start tracking
>>>> it down...
>>> 
>>> I had OOM problems with OpenJDK, switched to Sun/Oracle's recent 1.6.0_23 and...still have the same problem :-\ Stack trace always looks the same:
>>> 
>>> java.lang.OutOfMemoryError: Java heap space
>>>       at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>>>       at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
>>>       at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
>>>       at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>>>       at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>>>       at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>>>       at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
>>>       at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
>>>       at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
>>>       at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
>>>       at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
>>>       at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
>>>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>       at java.lang.Thread.run(Thread.java:636)
>>> 
>>> I'm writing from 1 client with 50 threads to a cluster of 4 machines (with hector). With QUORUM and ONE 2 machines quite reliably will soon die with OOM. What may cause this? Won't cassandra block/reject when memtable is full and being flushed to disk but grow and if flushing to disk isn't fast enough will run out of memory?
>>> 
>>> 
>>> 
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
>> 
>> 
>> 
>> 
>> -- 
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
> 


Re: Memory leak with Sun Java 1.6 ?

Posted by Timo Nentwig <ti...@toptarif.de>.
On Dec 14, 2010, at 14:41, Jonathan Ellis wrote:

> This is "A row has grown too large" section from that troubleshooting guide.

Why? This is what a typical "row" (?) looks like:

[default@test] list tracking limit 1;
-------------------
RowKey:  123
=> (column=key, value=foo, timestamp=1292238005886000)
=> (column=value, value=bar, timestamp=1292238005886000)

1 Row Returned.

I'm importing the data from an RDBMS, so I can say for sure that neither of the two columns will contain more than 255 chars.

That row-size warning seems to be a 0.6 option; I can't find it in the 0.7 docs. I also didn't find any related WARNs for some default value in the log.

> On Tue, Dec 14, 2010 at 5:27 AM, Timo Nentwig <ti...@toptarif.de> wrote:
> 
> On Dec 12, 2010, at 17:21, Jonathan Ellis wrote:
> 
> > http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
> 
> I can rule out the first 3. I was running cassandra with default settings, i.e. 1GB heap and 256M memtable. So, with 3 memtables+1GB the JVM should run with >1.75G (although http://wiki.apache.org/cassandra/MemtableThresholds considers to increase heap size only gently). Did so. 4GB machine with 2GB 64bit-JVM seemed to run stable for quite some time but then also crashed with OOM. Looking at the heap dump it's always the same: all memory nearly always bound in CompactionExecutor (ColumnFamilyStore/ConcurrentSkipListMap, respectively).
> 
> This looks like somebody else recently have had a similar problem (->Bottom line: more heap - which is okay, but I'd like to understand why):
> http://www.mail-archive.com/user@cassandra.apache.org/msg07516.html
> 
> This is my only CF currently in use (via JMX):
> 
> - column_families:
>  - column_type: Standard
>    comment: tracking column family
>    compare_with: org.apache.cassandra.db.marshal.UTF8Type
>    default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
>    gc_grace_seconds: 864000
>    key_cache_save_period_in_seconds: 3600
>    keys_cached: 200000.0
>    max_compaction_threshold: 32
>    memtable_flush_after_mins: 60
>    min_compaction_threshold: 4
>    name: tracking
>    read_repair_chance: 1.0
>    row_cache_save_period_in_seconds: 0
>    rows_cached: 0.0
>  name: test
>  replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>  replication_factor: 3
> 
> 
> In addition...actually there is plenty of free memory on the heap (?):
> 
> 3605.478: [GC 3605.478: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:     416112 bytes,     416112 total
> : 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS: 1145267K->447565K(2054592K), 1.9143630 secs] 1161938K->447565K(2092928K), [CMS Perm : 18186K->18158K(30472K)], 1.9355340 secs] [Times: user=1.95 sys=0.00, real=1.94 secs]
> 3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960 secs] 447565K->447453K(2092928K), [CMS Perm : 18158K->18025K(30472K)], 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97 secs]
> Total time for which application threads were stopped: 3.9070380 seconds
> Total time for which application threads were stopped: 7.3388640 seconds
> Total time for which application threads were stopped: 0.0560610 seconds
> 3616.931: [GC 3616.931: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:     474264 bytes,     474264 total
> : 34112K->747K(38336K), 0.0098680 secs] 481565K->448201K(2092928K), 0.0099690 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
> Total time for which application threads were stopped: 0.0108670 seconds
> 3617.035: [GC 3617.035: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      63040 bytes,      63040 total
> : 34859K->440K(38336K), 0.0065950 secs] 482313K->448455K(2092928K), 0.0066880 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0075850 seconds
> 3617.133: [GC 3617.133: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      23016 bytes,      23016 total
> : 34552K->121K(38336K), 0.0042920 secs] 482567K->448193K(2092928K), 0.0043650 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0049630 seconds
> 3617.228: [GC 3617.228: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      16992 bytes,      16992 total
> : 34233K->34K(38336K), 0.0043180 secs] 482305K->448122K(2092928K), 0.0043910 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0049150 seconds
> 3617.323: [GC 3617.323: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      18456 bytes,      18456 total
> : 34146K->29K(38336K), 0.0038930 secs] 482234K->448127K(2092928K), 0.0039810 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0055390 seconds
> Heap
>  par new generation   total 38336K, used 17865K [0x000000077ae00000, 0x000000077d790000, 0x000000077d790000)
>  eden space 34112K,  52% used [0x000000077ae00000, 0x000000077bf6afb0, 0x000000077cf50000)
>  from space 4224K,   0% used [0x000000077cf50000, 0x000000077cf57720, 0x000000077d370000)
>  to   space 4224K,   0% used [0x000000077d370000, 0x000000077d370000, 0x000000077d790000)
>  concurrent mark-sweep generation total 2054592K, used 448097K [0x000000077d790000, 0x00000007fae00000, 0x00000007fae00000)
>  concurrent-mark-sweep perm gen total 30472K, used 18125K [0x00000007fae00000, 0x00000007fcbc2000, 0x0000000800000000)
> 
> >
> > On Sun, Dec 12, 2010 at 9:52 AM, Timo Nentwig <ti...@toptarif.de> wrote:
> >
> > On Dec 10, 2010, at 19:37, Peter Schuller wrote:
> >
> > > To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
> > > in lenny or some such.) If it is a JVM issue, ensuring you're using a
> > > reasonably recent JVM is probably much easier than to start tracking
> > > it down...
> >
> > I had OOM problems with OpenJDK, switched to Sun/Oracle's recent 1.6.0_23 and...still have the same problem :-\ Stack trace always looks the same:
> >
> > java.lang.OutOfMemoryError: Java heap space
> >        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> >        at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
> >        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
> >        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
> >        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
> >        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
> >        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
> >        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
> >        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
> >        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
> >        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
> >        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
> >        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> >        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> >        at java.lang.Thread.run(Thread.java:636)
> >
> > I'm writing from 1 client with 50 threads to a cluster of 4 machines (with hector). With QUORUM and ONE 2 machines quite reliably will soon die with OOM. What may cause this? Won't cassandra block/reject when memtable is full and being flushed to disk but grow and if flushing to disk isn't fast enough will run out of memory?
> >
> >
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder of Riptano, the source for professional Cassandra support
> > http://riptano.com
> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com


Re: Memory leak with Sun Java 1.6 ?

Posted by Jonathan Ellis <jb...@gmail.com>.
This is "A row has grown too large" section from that troubleshooting guide.

On Tue, Dec 14, 2010 at 5:27 AM, Timo Nentwig <ti...@toptarif.de> wrote:

>
> On Dec 12, 2010, at 17:21, Jonathan Ellis wrote:
>
> >
> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
>
> I can rule out the first 3. I was running cassandra with default settings,
> i.e. 1GB heap and 256M memtable. So, with 3 memtables+1GB the JVM should run
> with >1.75G (although http://wiki.apache.org/cassandra/MemtableThresholds considers to increase heap size only gently). Did so. 4GB machine with 2GB
> 64bit-JVM seemed to run stable for quite some time but then also crashed
> with OOM. Looking at the heap dump it's always the same: all memory nearly
> always bound in CompactionExecutor (ColumnFamilyStore/ConcurrentSkipListMap,
> respectively).
>
> This looks like somebody else recently have had a similar problem (->Bottom
> line: more heap - which is okay, but I'd like to understand why):
> http://www.mail-archive.com/user@cassandra.apache.org/msg07516.html
>
> This is my only CF currently in use (via JMX):
>
> - column_families:
>  - column_type: Standard
>    comment: tracking column family
>    compare_with: org.apache.cassandra.db.marshal.UTF8Type
>    default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
>    gc_grace_seconds: 864000
>    key_cache_save_period_in_seconds: 3600
>    keys_cached: 200000.0
>    max_compaction_threshold: 32
>    memtable_flush_after_mins: 60
>    min_compaction_threshold: 4
>    name: tracking
>    read_repair_chance: 1.0
>    row_cache_save_period_in_seconds: 0
>    rows_cached: 0.0
>  name: test
>  replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>  replication_factor: 3
>
>
> In addition...actually there is plenty of free memory on the heap (?):
>
> 3605.478: [GC 3605.478: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:     416112 bytes,     416112 total
> : 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS:
> 1145267K->447565K(2054592K), 1.9143630 secs] 1161938K->447565K(2092928K),
> [CMS Perm : 18186K->18158K(30472K)], 1.9355340 secs] [Times: user=1.95
> sys=0.00, real=1.94 secs]
> 3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960
> secs] 447565K->447453K(2092928K), [CMS Perm : 18158K->18025K(30472K)],
> 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97 secs]
> Total time for which application threads were stopped: 3.9070380 seconds
> Total time for which application threads were stopped: 7.3388640 seconds
> Total time for which application threads were stopped: 0.0560610 seconds
> 3616.931: [GC 3616.931: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:     474264 bytes,     474264 total
> : 34112K->747K(38336K), 0.0098680 secs] 481565K->448201K(2092928K),
> 0.0099690 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
> Total time for which application threads were stopped: 0.0108670 seconds
> 3617.035: [GC 3617.035: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      63040 bytes,      63040 total
> : 34859K->440K(38336K), 0.0065950 secs] 482313K->448455K(2092928K),
> 0.0066880 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0075850 seconds
> 3617.133: [GC 3617.133: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      23016 bytes,      23016 total
> : 34552K->121K(38336K), 0.0042920 secs] 482567K->448193K(2092928K),
> 0.0043650 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0049630 seconds
> 3617.228: [GC 3617.228: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      16992 bytes,      16992 total
> : 34233K->34K(38336K), 0.0043180 secs] 482305K->448122K(2092928K),
> 0.0043910 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0049150 seconds
> 3617.323: [GC 3617.323: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      18456 bytes,      18456 total
> : 34146K->29K(38336K), 0.0038930 secs] 482234K->448127K(2092928K),
> 0.0039810 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0055390 seconds
> Heap
>  par new generation   total 38336K, used 17865K [0x000000077ae00000,
> 0x000000077d790000, 0x000000077d790000)
>  eden space 34112K,  52% used [0x000000077ae00000, 0x000000077bf6afb0,
> 0x000000077cf50000)
>  from space 4224K,   0% used [0x000000077cf50000, 0x000000077cf57720,
> 0x000000077d370000)
>  to   space 4224K,   0% used [0x000000077d370000, 0x000000077d370000,
> 0x000000077d790000)
>  concurrent mark-sweep generation total 2054592K, used 448097K
> [0x000000077d790000, 0x00000007fae00000, 0x00000007fae00000)
>  concurrent-mark-sweep perm gen total 30472K, used 18125K
> [0x00000007fae00000, 0x00000007fcbc2000, 0x0000000800000000)
>
> >
> > On Sun, Dec 12, 2010 at 9:52 AM, Timo Nentwig <ti...@toptarif.de>
> wrote:
> >
> > On Dec 10, 2010, at 19:37, Peter Schuller wrote:
> >
> > > To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
> > > in lenny or some such.) If it is a JVM issue, ensuring you're using a
> > > reasonably recent JVM is probably much easier than to start tracking
> > > it down...
> >
> > I had OOM problems with OpenJDK, switched to Sun/Oracle's recent 1.6.0_23
> and...still have the same problem :-\ Stack trace always looks the same:
> >
> > java.lang.OutOfMemoryError: Java heap space
> >        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> >        at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
> >        at
> org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
> >        at
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
> >        at
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
> >        at
> org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
> >        at
> org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
> >        at
> org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
> >        at
> org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
> >        at
> org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
> >        at
> org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
> >        at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
> >        at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> >        at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> >        at java.lang.Thread.run(Thread.java:636)
> >
> > I'm writing from 1 client with 50 threads to a cluster of 4 machines
> (with hector). With QUORUM and ONE 2 machines quite reliably will soon die
> with OOM. What may cause this? Won't cassandra block/reject when memtable is
> full and being flushed to disk but grow and if flushing to disk isn't fast
> enough will run out of memory?
> >
> >
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder of Riptano, the source for professional Cassandra support
> > http://riptano.com
>
>


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
>> For debugging purposes you may want to switch Cassandra to "standard"
>> IO mode instead of mmap. This will have a performance-penalty, but the
>> virtual/resident sizes won't be polluted with mmap():ed data.
>
> Already did so. It *seems* to run more stable, but it's still far off from being stable.

Sorry, that was written under confusion. I meant for the original
poster and his monitoring of virtual/resident memory and the fact that
he seemed to have a legitimate OS level out of memory condition. In
other words, I recommend that to the OP just to remove one variable in
terms of monitoring what is consuming memory.

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
> I just uncommented the GC JVMOPTS from the shipped cassandra start script and use Sun JVM 1.6.0_23. Hmm, but these "GC tuning options" are also uncommented. I'll comment them again and try again.

Maybe I was just too quick trying to mentally parse it, given the
jumbled line endings. You're right; standard output...

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Timo Nentwig <ti...@toptarif.de>.
On Dec 14, 2010, at 19:38, Peter Schuller wrote:

> For debugging purposes you may want to switch Cassandra to "standard"
> IO mode instead of mmap. This will have a performance-penalty, but the
> virtual/resident sizes won't be polluted with mmap():ed data.

Already did so. It *seems* to run more stably, but it's still far from stable.

I actually already put 100 million rows into a local cassandra instance (on OSX [and on rc1], not xen'ed Linux),
so this is unlikely to be a problem in Cassandra's Java code but rather something native-code/platform related.

> In general, unless you're hitting something particularly strange or
> just a bug in Cassandra, you shouldn't be randomly getting OOM:s
> unless you are truly using that heap space. What do you mean by
> "always bound in compactionexecutor" - by what method did you
> determine this to be the case?

heap dumps -> MAT (http://www.eclipse.org/mat/)
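
(For reference, a dump like those can be produced with jmap and then loaded into MAT; the pid and output path below are placeholders, and on an actual OOM the dump usually comes from -XX:+HeapDumpOnOutOfMemoryError if that flag is in your JVM_OPTS.)

    jmap -dump:live,format=b,file=/tmp/cassandra-heap.hprof <cassandra-pid>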

> There should be no magic need for CPU. Unless you are severely taxing
> it in terms of very high write load or similar, an out-of-the-box
> configured cassandra should be needing limited amounts of memory. Did
> you run with default memtable thresholds (memtable_throughput_in_mb i

Yes

>> This is my only CF currently in use (via JMX):
>> 
>> - column_families:
>>  - column_type: Standard
>>    comment: tracking column family
>>    compare_with: org.apache.cassandra.db.marshal.UTF8Type
>>    default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
>>    gc_grace_seconds: 864000
>>    key_cache_save_period_in_seconds: 3600
>>    keys_cached: 200000.0
>>    max_compaction_threshold: 32
>>    memtable_flush_after_mins: 60
>>    min_compaction_threshold: 4
>>    name: tracking
>>    read_repair_chance: 1.0
>>    row_cache_save_period_in_seconds: 0
>>    rows_cached: 0.0
>>  name: test
>>  replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>>  replication_factor: 3
> 
> This is the only column family being used?

Currently, for testing, yes.

>> In addition...actually there is plenty of free memory on the heap (?):
>> 
>> 3605.478: [GC 3605.478: [ParNew
>> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
>> - age   1:     416112 bytes,     416112 total
>> : 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS: 1145267K->447565K(2054592K), 1.9143630 secs] 1161938K->447565K(2092928K), [CMS Perm : 18186K->18158K(30472K)], 1.9355340 secs] [Times: user=1.95 sys=0.00, real=1.94 secs]
>> 3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960 secs] 447565K->447453K(2092928K), [CMS Perm : 18158K->18025K(30472K)], 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97 secs]
> 
> 1.9 seconds to do [CMS: 1145267K->447565K(2054592K) is completely
> abnormal if that represents a pause (but not if it's just concurrent
> mark/sweep time). I don't quite recognize the format of this log...
> I'm suddenly unsure what this log output is coming from. A normal
> -XX:+PrintGC and -XX:+PrintGCDetails should yield stuff like:

I just uncommented the GC logging JVM_OPTS in the shipped cassandra start script and use Sun JVM 1.6.0_23. Hmm, but those "GC tuning options" are also uncommented. I'll comment them out again and try again.
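
For reference, the GC logging lines in that script look roughly like the following - this is a sketch, and the exact flags and paths differ between the 0.6 and 0.7 scripts. The -XX:+PrintTenuringDistribution and -XX:+PrintGCApplicationStoppedTime flags are what produce the "Desired survivor size" and "Total time for which application threads were stopped" lines quoted above:

    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps"
    JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
    JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"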


Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
> I can rule out the first 3. I was running cassandra with default settings, i.e. 1GB heap and 256M memtable. So, with 3 memtables+1GB the JVM should run with >1.75G (although http://wiki.apache.org/cassandra/MemtableThresholds considers to increase heap size only gently). Did so. 4GB machine with 2GB 64bit-JVM seemed to run stable for quite some time but then also crashed with OOM. Looking at the heap dump it's always the same: all memory nearly always bound in CompactionExecutor (ColumnFamilyStore/ConcurrentSkipListMap, respectively).

Note that the memtables are just data structures in memory and are in
the heap. They are not in addition to the heap. mmap():ed files will
be though.

For debugging purposes you may want to switch Cassandra to "standard"
IO mode instead of mmap. This will have a performance-penalty, but the
virtual/resident sizes won't be polluted with mmap():ed data.
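
(For anyone following along, the switch in question is the disk access mode setting, shown below for both branches; "standard" is the debugging value being suggested here, not a general recommendation.)

    # conf/cassandra.yaml (0.7)
    disk_access_mode: standard

    <!-- conf/storage-conf.xml (0.6) -->
    <DiskAccessMode>standard</DiskAccessMode>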

In general, unless you're hitting something particularly strange or
just a bug in Cassandra, you shouldn't be randomly getting OOM:s
unless you are truly using that heap space. What do you mean by
"always bound in compactionexecutor" - by what method did you
determine this to be the case?

Note that memtables are ConcurrentSkipListMap:s, so they are expected
to contain a lot of data (and maybe be significant themselves if lots
of small items are stored).

> This looks like somebody else recently have had a similar problem (->Bottom line: more heap - which is okay, but I'd like to understand why):
> http://www.mail-archive.com/user@cassandra.apache.org/msg07516.html

There should be no magic need for memory. Unless you are severely taxing
it in terms of very high write load or similar, an out-of-the-box
configured cassandra should be needing limited amounts of memory. Did
you run with default memtable thresholds (memtable_throughput_in_mb in
particular)? How many column families do you have? You need memory for
*each* column family, taking into account its memtable thresholds.
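
(As a rough illustration, the per-CF thresholds in question look like this in a 0.7 schema - the numbers here are invented for the example, not tuning advice:)

    - column_families:
      - name: tracking
        memtable_throughput_in_mb: 64
        memtable_operations_in_millions: 0.3
        memtable_flush_after_mins: 60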

> This is my only CF currently in use (via JMX):
>
> - column_families:
>  - column_type: Standard
>    comment: tracking column family
>    compare_with: org.apache.cassandra.db.marshal.UTF8Type
>    default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
>    gc_grace_seconds: 864000
>    key_cache_save_period_in_seconds: 3600
>    keys_cached: 200000.0
>    max_compaction_threshold: 32
>    memtable_flush_after_mins: 60
>    min_compaction_threshold: 4
>    name: tracking
>    read_repair_chance: 1.0
>    row_cache_save_period_in_seconds: 0
>    rows_cached: 0.0
>  name: test
>  replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>  replication_factor: 3

This is the only column family being used?

> In addition...actually there is plenty of free memory on the heap (?):
>
> 3605.478: [GC 3605.478: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:     416112 bytes,     416112 total
> : 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS: 1145267K->447565K(2054592K), 1.9143630 secs] 1161938K->447565K(2092928K), [CMS Perm : 18186K->18158K(30472K)], 1.9355340 secs] [Times: user=1.95 sys=0.00, real=1.94 secs]
> 3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960 secs] 447565K->447453K(2092928K), [CMS Perm : 18158K->18025K(30472K)], 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97 secs]

1.9 seconds to do [CMS: 1145267K->447565K(2054592K) is completely
abnormal if that represents a pause (but not if it's just concurrent
mark/sweep time). I don't quite recognize the format of this log...
I'm suddenly unsure what this log output is coming from. A normal
-XX:+PrintGC and -XX:+PrintGCDetails should yield stuff like:

22.634: [GC 22.634: [ParNew: 19136K->2112K(19136K), 0.0058490 secs]
123155K->108012K(202688K), 0.0059050 secs] [Times: user=0.02 sys=0.00,
real=0.00 secs]
22.660: [GC 22.660: [ParNew: 19136K->2112K(19136K), 0.0063220 secs]
125036K->109893K(202688K), 0.0063910 secs] [Times: user=0.02 sys=0.00,
real=0.01 secs]
22.686: [GC 22.686: [ParNew: 19136K->2112K(19136K), 0.0067150 secs]
126917K->111864K(202688K), 0.0067730 secs] [Times: user=0.02 sys=0.00,
real=0.01 secs]
22.724: [CMS-concurrent-abortable-preclean: 0.041/0.257 secs] [Times:
user=0.24 sys=0.01, real=0.26 secs]
22.724: [GC[YG occupancy: 19037 K (19136 K)]22.724: [Rescan (parallel)
, 0.0085230 secs]22.732: [weak refs processing, 0.0142440 secs] [1
CMS-remark: 109752K(183552K)] 128790K(202688K), 0.0228750 secs]
[Times: user=0.03 sys=0.00, real=0.02 secs]
22.747: [CMS-concurrent-sweep-start]
22.749: [GC 22.749: [ParNew: 19136K->2112K(19136K), 0.0082060 secs]
128813K->113837K(202688K), 0.0082620 secs] [Times: user=0.02 sys=0.00,
real=0.01 secs]
22.772: [GC 22.772: [ParNew: 19136K->1500K(19136K), 0.0064660 secs]
116825K->101174K(202688K), 0.0065400 secs] [Times: user=0.02 sys=0.00,
real=0.01 secs]
22.804: [CMS-concurrent-sweep: 0.041/0.057 secs] [Times: user=0.12
sys=0.00, real=0.05 secs]
22.804: [CMS-concurrent-reset-start]
22.816: [CMS-concurrent-reset: 0.012/0.012 secs] [Times: user=0.02
sys=0.00, real=0.02 secs]
22.910: [GC 22.910: [ParNew: 18524K->2112K(19136K), 0.0024290 secs]
98398K->81987K(202688K), 0.0024860 secs] [Times: user=0.01 sys=0.00,
real=0.00 secs]
22.938: [GC 22.938: [ParNew: 19136K->2112K(19136K), 0.0077940 secs]
99011K->83980K(202688K), 0.0078510 secs] [Times: user=0.02 sys=0.00,
real=0.01 secs]
22.965: [GC 22.965: [ParNew: 19136K->2112K(19136K), 0.0063160 secs]
101004K->85959K(202688K), 0.0063760 secs] [Times: user=0.02 sys=0.00,
real=0.01 secs]

Where you can clearly see what constitutes a stop time and what is just
timing of CMS activity in the background. Yours seems to be somewhere
in between. I vaguely recognize it but I don't remember what produces
that...

Anyways it certainly looks like you have reasonable heap usage going
on as expected.

I'm concerned since you reported that independent JVM:s like nodetool
suffered OOM:s as well; that really indicates there is a real problem
of some kind.

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Timo Nentwig <ti...@toptarif.de>.
On Dec 12, 2010, at 17:21, Jonathan Ellis wrote:

> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors

I can rule out the first 3. I was running cassandra with default settings, i.e. 1GB heap and 256M memtable. So, with 3 memtables+1GB the JVM should run with >1.75G (although http://wiki.apache.org/cassandra/MemtableThresholds suggests increasing heap size only gently). Did so. A 4GB machine with a 2GB 64bit-JVM seemed to run stably for quite some time but then also crashed with OOM. Looking at the heap dumps it's always the same: nearly all memory is bound in the CompactionExecutor (ColumnFamilyStore/ConcurrentSkipListMap, respectively).

It looks like somebody else recently had a similar problem (bottom line: more heap - which is okay, but I'd like to understand why):
http://www.mail-archive.com/user@cassandra.apache.org/msg07516.html

This is my only CF currently in use (via JMX):

- column_families:
  - column_type: Standard
    comment: tracking column family
    compare_with: org.apache.cassandra.db.marshal.UTF8Type
    default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
    gc_grace_seconds: 864000
    key_cache_save_period_in_seconds: 3600
    keys_cached: 200000.0
    max_compaction_threshold: 32
    memtable_flush_after_mins: 60
    min_compaction_threshold: 4
    name: tracking
    read_repair_chance: 1.0
    row_cache_save_period_in_seconds: 0
    rows_cached: 0.0
  name: test
  replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
  replication_factor: 3


In addition...actually there is plenty of free memory on the heap (?):

3605.478: [GC 3605.478: [ParNew
Desired survivor size 2162688 bytes, new threshold 1 (max 1)
- age   1:     416112 bytes,     416112 total
: 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS: 1145267K->447565K(2054592K), 1.9143630 secs] 1161938K->447565K(2092928K), [CMS Perm : 18186K->18158K(30472K)], 1.9355340 secs] [Times: user=1.95 sys=0.00, real=1.94 secs] 
3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960 secs] 447565K->447453K(2092928K), [CMS Perm : 18158K->18025K(30472K)], 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97 secs] 
Total time for which application threads were stopped: 3.9070380 seconds
Total time for which application threads were stopped: 7.3388640 seconds
Total time for which application threads were stopped: 0.0560610 seconds
3616.931: [GC 3616.931: [ParNew
Desired survivor size 2162688 bytes, new threshold 1 (max 1)
- age   1:     474264 bytes,     474264 total
: 34112K->747K(38336K), 0.0098680 secs] 481565K->448201K(2092928K), 0.0099690 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] 
Total time for which application threads were stopped: 0.0108670 seconds
3617.035: [GC 3617.035: [ParNew
Desired survivor size 2162688 bytes, new threshold 1 (max 1)
- age   1:      63040 bytes,      63040 total
: 34859K->440K(38336K), 0.0065950 secs] 482313K->448455K(2092928K), 0.0066880 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
Total time for which application threads were stopped: 0.0075850 seconds
3617.133: [GC 3617.133: [ParNew
Desired survivor size 2162688 bytes, new threshold 1 (max 1)
- age   1:      23016 bytes,      23016 total
: 34552K->121K(38336K), 0.0042920 secs] 482567K->448193K(2092928K), 0.0043650 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
Total time for which application threads were stopped: 0.0049630 seconds
3617.228: [GC 3617.228: [ParNew
Desired survivor size 2162688 bytes, new threshold 1 (max 1)
- age   1:      16992 bytes,      16992 total
: 34233K->34K(38336K), 0.0043180 secs] 482305K->448122K(2092928K), 0.0043910 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
Total time for which application threads were stopped: 0.0049150 seconds
3617.323: [GC 3617.323: [ParNew
Desired survivor size 2162688 bytes, new threshold 1 (max 1)
- age   1:      18456 bytes,      18456 total
: 34146K->29K(38336K), 0.0038930 secs] 482234K->448127K(2092928K), 0.0039810 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
Total time for which application threads were stopped: 0.0055390 seconds
Heap
 par new generation   total 38336K, used 17865K [0x000000077ae00000, 0x000000077d790000, 0x000000077d790000)
  eden space 34112K,  52% used [0x000000077ae00000, 0x000000077bf6afb0, 0x000000077cf50000)
  from space 4224K,   0% used [0x000000077cf50000, 0x000000077cf57720, 0x000000077d370000)
  to   space 4224K,   0% used [0x000000077d370000, 0x000000077d370000, 0x000000077d790000)
 concurrent mark-sweep generation total 2054592K, used 448097K [0x000000077d790000, 0x00000007fae00000, 0x00000007fae00000)
 concurrent-mark-sweep perm gen total 30472K, used 18125K [0x00000007fae00000, 0x00000007fcbc2000, 0x0000000800000000)

> 
> On Sun, Dec 12, 2010 at 9:52 AM, Timo Nentwig <ti...@toptarif.de> wrote:
> 
> On Dec 10, 2010, at 19:37, Peter Schuller wrote:
> 
> > To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
> > in lenny or some such.) If it is a JVM issue, ensuring you're using a
> > reasonably recent JVM is probably much easier than to start tracking
> > it down...
> 
> I had OOM problems with OpenJDK, switched to Sun/Oracle's recent 1.6.0_23 and...still have the same problem :-\ Stack trace always looks the same:
> 
> java.lang.OutOfMemoryError: Java heap space
>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
>        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
>        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
>        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
>        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>        at java.lang.Thread.run(Thread.java:636)
> 
> I'm writing from 1 client with 50 threads to a cluster of 4 machines (with hector). With QUORUM and ONE 2 machines quite reliably will soon die with OOM. What may cause this? Won't cassandra block/reject when memtable is full and being flushed to disk but grow and if flushing to disk isn't fast enough will run out of memory?
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com


Re: Memory leak with Sun Java 1.6 ?

Posted by Jonathan Ellis <jb...@gmail.com>.
http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors

On Sun, Dec 12, 2010 at 9:52 AM, Timo Nentwig <ti...@toptarif.de> wrote:

>
> On Dec 10, 2010, at 19:37, Peter Schuller wrote:
>
> > To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
> > in lenny or some such.) If it is a JVM issue, ensuring you're using a
> > reasonably recent JVM is probably much easier than to start tracking
> > it down...
>
> I had OOM problems with OpenJDK, switched to Sun/Oracle's recent 1.6.0_23
> and...still have the same problem :-\ Stack trace always looks the same:
>
> java.lang.OutOfMemoryError: Java heap space
>        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>        at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
>        at
> org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
>        at
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>        at
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>        at
> org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
>        at
> org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
>        at
> org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
>        at
> org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
>        at
> org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
>        at
> org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
>        at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
>        at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>        at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>        at java.lang.Thread.run(Thread.java:636)
>
> I'm writing from 1 client with 50 threads to a cluster of 4 machines (with
> hector). With QUORUM and ONE 2 machines quite reliably will soon die with
> OOM. What may cause this? Won't cassandra block/reject when memtable is full
> and being flushed to disk but grow and if flushing to disk isn't fast enough
> will run out of memory?




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Memory leak with Sun Java 1.6 ?

Posted by Timo Nentwig <ti...@toptarif.de>.
On Dec 10, 2010, at 19:37, Peter Schuller wrote:

> To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
> in lenny or some such.) If it is a JVM issue, ensuring you're using a
> reasonably recent JVM is probably much easier than to start tracking
> it down...

I had OOM problems with OpenJDK, switched to Sun/Oracle's recent 1.6.0_23 and...still have the same problem :-\ Stack trace always looks the same:

java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)

I'm writing from 1 client with 50 threads to a cluster of 4 machines (with hector). With QUORUM and ONE 2 machines quite reliably will soon die with OOM. What may cause this? Won't cassandra block/reject when memtable is full and being flushed to disk but grow and if flushing to disk isn't fast enough will run out of memory?

Re: Memory leak with Sun Java 1.6 ?

Posted by Clint Byrum <cl...@ubuntu.com>.
On Tue, 2010-12-14 at 11:06 +0000, Jedd Rashbrooke wrote:
>  JNA is something I'd read briefly about a while back, but now
>  it might be something I need to explore further.  We're using
>  Cassandra 0.6.6, and our Ubuntu version offers a packaged
>  release of libjna 3.2.3-1 .. rumours on the Internets suggest
>  a more recent version of jna may be required here.  Though

I saw crashes running tests with the packaged jna in Ubuntu, so the
0.6.8 Ubuntu packages that we build bundle jna 3.2.7.

https://launchpad.net/~cassandra-ubuntu/+archive/stable
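
(For reference, pulling those packages in on Ubuntu 10.04 looks roughly like the following, assuming the package is published as "cassandra" on that PPA:)

    sudo apt-get install python-software-properties
    sudo add-apt-repository ppa:cassandra-ubuntu/stable
    sudo apt-get update
    sudo apt-get install cassandra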




Re: Memory leak with Sun Java 1.6 ?

Posted by Jedd Rashbrooke <je...@imagini.net>.
On 10 December 2010 18:37, Peter Schuller <pe...@infidyne.com> wrote:
> Memory-mapped files will account for both virtual and, to the extent
> that they are resident in memory, to the resident size of the process.

 To clarify - in our storage-conf we have:

   <!-- This should consume a LOT less memory, and stop the JVM from
ballooning immensely -->
   <DiskAccessMode>mmap_index_only</DiskAccessMode>

 I know it's a matter of degree, but 3GB in a month feels like
 a decent-sized ballooning.

 JNA is something I'd read briefly about a while back, but now
 it might be something I need to explore further.  We're using
 Cassandra 0.6.6, and our Ubuntu version offers a packaged
 release of libjna 3.2.3-1 .. rumours on the Internets suggest
 a more recent version of jna may be required here.  Though
 I'm also considering whether we just wear this as a known
 issue, do a rolling restart every 25 days (very low impact for
 us) and hold out until 0.7.  We have a testing environment,
 but it's a pain to test solutions to problems that take a month
 to manifest.

 cheers,
 Jedd.

Re: Memory leak with Sun Java 1.6 ?

Posted by Zhu Han <sc...@gmail.com>.
This bug is present in both the Sun JDK and OpenJDK because they share the same
HotSpot VM. The Sun JDK got the fix earlier than OpenJDK.

best regards,
hanzhu


On Thu, Dec 16, 2010 at 6:43 PM, Jedd Rashbrooke <
jedd.rashbrooke@imagini.net> wrote:

>  Hi Peter,
>
>  I've read through the "Very high memory utilization (not caused by mmap
>  on sstables)" and the impact sounds similar .. alas they're using openJDK
>  and I'm using Sun's.  Of course, it's not impossible that the same bug is
>  present in both JVM's.
>
>  j.
>

Re: Memory leak with Sun Java 1.6 ?

Posted by Jedd Rashbrooke <je...@imagini.net>.
 Hi Peter,

 I've read through the "Very high memory utilization (not caused by mmap
 on sstables)" and the impact sounds similar .. alas they're using openJDK
 and I'm using Sun's.  Of course, it's not impossible that the same bug is
 present in both JVM's.

 j.

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
Didn't have time yet to properly respond to this one, but check out
the "Very high memory utilization (not caused by mmap on sstables)"
thread from today. You're using the same Ubuntu/JVM, right? Seems to
match your observations.

(Sorry, no ML archive link; the ML archives for Apache projects have
never worked properly for me in my browsers... too broken-ajaxy.)

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Jedd Rashbrooke <je...@imagini.net>.
 Hey Peter,

On 14 December 2010 20:19, Peter Schuller <pe...@infidyne.com> wrote:
> So, now that I get that we have two different cases ;)

 Yup.  My problem is java / environment based, occurring
 after several weeks, using 0.6.6 instances.  Our thread
 hijacking friend / learned colleague's problem is 0.7 based
 and occurs within a few minutes, and appears to be related
 to something within the JVM.  (And no .. in this case, it's
 not someone that works with me, but an understandable
 assumption given the context.)

 Flipping to standard IO mode .. might be too big a cost for
 us at the moment.  This particular system is in production,
 and it's probably too expensive to set up a parallel system,
 and run it long enough to determine if we see the same
 trends.  I'm increasingly inclined to hang out for a stable
 0.7.x release before fiddling with our production system.


> Re-reading your part of this thread, it seems you're not even having
> OOM issues in Cassandra itself, right? You're just making (1) the
> observation that memory graphs show little left, and (2) that nodetool
> etc start failing on start with OOM?

 Pretty much, yes.

 I've got two new graphs that may be of casual interest.  The first
 shows memory usage over the past two months or so, including
 the recent Cassandra restart a week ago:

 http://www.imagebam.com/image/aed254111011473

 The other graph (see below) might be more insightful.


> (2) Enable GC logging with e.g. -Xloggc:/path/to/log and -XX:+PrintGC
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps.

 Perhaps unrelated - though I don't suspect it was a GC problem
 (having suffered several weeks of GC pain) - Cassandra seemed
 to be happy to still do actual reads & writes while it was in this
 state - this state being the one where nodetool and jmap refuse to start.


> (3) Observe top to confirm JVM memory use.

 I can confirm this was showing as 7 / 4.1 for V / RES across
 all 16 boxes .. up until we did the rolling restart of Cassandra.

> (4) Observe the +/- buffers in 'free' output. For example:

 Okay - here's the output of one of our 16 boxes right now:

root@prawn-07:~# free
             total       used       free     shared    buffers     cached
Mem:       7864548    7813220      51328          0      77064    2683440
-/+ buffers/cache:    5052716    2811832
Swap:            0          0          0

 Apologies for the lack of granularity in this graph, but the
 numbers at the bottom will indicate what matches what
 fairly easily.   Memory usage graph for past hour:

 http://www.imagebam.com/image/b06921111011462

 This is more for your interest than anything else - I'm
 not sure there's a 'solution' for the problem as it stands,
 and I'm aware that I'm on a 2-rev old release that's about
 to be major-numbered into obscurity in the next month or
 two anyway.


> One hypothesis: Is your data set slowly increasing, and the growth in
> memory use over time is just a result of the data set  being larger,
> which would be reflected in the amount of data in page cache?

 Our data is increasing, certainly.  Much of that is overwrites, and
 we purge those after three days via GCGraceSeconds, but I don't
 have any stats on ratio or indeed raw growth rate.

> If you
> have no *other* significant activity that fills page cache (including
> log files etc), page cache would be unused otherwise and then grow in
> occupancy as your data set increases.

 Just checked, and no - /var/log is a modest 14MB, and we run nothing
 but Cassandra on these boxes.  Well, a monitoring agent, but it doesn't
 do any local logging, just some network activity.

> And to be clear: What is the actual negative observation that you have
> made - do you see high latencies, saturation on disk I/O etc or is
> this purely about seeming memory growth on graphs and observations of
> memory usage by the JVM in top?

 Ah, okay, sorry.  To reiterate:
 o  nodetool ceases to run - won't even let me drain that instance
 o  jmap won't run either
 o  a consequence of nodetool not working is that we can't monitor
     the health of our cluster, which is a scary place to be for us
 o  all 16 boxes were in the same state last Friday, as they'd all
     restarted around the same time (say 6 weeks earlier)
 o  performance on boxes approaching this state seems to be 'ok',
     but I can't give anything quantitative on this front (we don't do
     any performance monitoring calls at the moment)

 My big concern here is that because these are swapless boxes, and
 the actual resident size of the java process matched (to within some
 tens of MB) the physical memory, I was very close to having all the
 boxes OOM'ing - at an OS level, I mean, not within the JVM.

 I thought (hoped?) that because we're on 7GB machines, using
 4GB Xmx, and running 0.6.6 instances for several weeks - this
 would be a common enough scenario that others might have seen
 something similar.  Alas . . .    ;)

 j.

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
>  I posted mostly as a heads up for others using similar profiles (4GB
>  heap on ~8GB boxes) to keep an eye out for.  I expect a few people,
>  particularly if they're on Amazon EC2, are running this type of setup.
>
>  On the other hand, mum always said I was unique.  ;)

So, now that I get that we have two different cases ;)

Re-reading your part of this thread, it seems you're not even having
OOM issues in Cassandra itself, right? You're just making (1) the
observation that memory graphs show little left, and (2) that nodetool
etc start failing on start with OOM?

So, to re-cap and/or add:

(1) Assuming you can live with the performance implications at least
temporarily, I suggest reconfiguring Cassandra for standard I/O mode
just to confirm that the JVM isn't significantly more 'resident' than
the maximum heap size you have configured.

(2) Enable GC logging with e.g. -Xloggc:/path/to/log and -XX:+PrintGC
-XX:+PrintGCDetails -XX:+PrintGCTimestamps.

(3) Observe top to confirm JVM memory use.

(4) Observe the +/- buffers in 'free' output. For example:

             total       used       free     shared    buffers     cached
Mem:       3993448    3814416     179032          0     265380    2013404
-/+ buffers/cache:    1535632    2457816
Swap:      9928696     114248    9814448

The number I'm after above is 2457816, which is the amount of "free
memory if we pretend there is no page cache" (it is simply 179032 free
+ 265380 buffers + 2013404 cached). Please make sure this is the value
you're getting, as some graphs might have weird definitions of "free"
memory or "cached"; the 'free' command line tool is a known entity.
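
If the graphs are doing something odd, you can also pull the same
figure straight out of /proc/meminfo. A rough sketch, assuming the
standard Linux /proc/meminfo layout (the class name is made up for
illustration):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.HashMap;
    import java.util.Map;

    public class FreeMinusCache {
        public static void main(String[] args) throws Exception {
            // Read /proc/meminfo into a map of field name -> value in kB.
            Map<String, Long> mem = new HashMap<String, Long>();
            BufferedReader r = new BufferedReader(new FileReader("/proc/meminfo"));
            String line;
            while ((line = r.readLine()) != null) {
                String[] parts = line.split("\\s+");    // e.g. "MemFree:  179032 kB"
                mem.put(parts[0].replace(":", ""), Long.parseLong(parts[1]));
            }
            r.close();
            // Same number as the '-/+ buffers/cache' free column reports.
            long freeNoCache = mem.get("MemFree") + mem.get("Buffers") + mem.get("Cached");
            System.out.println("free -/+ buffers/cache: " + freeNoCache + " kB");
        }
    }

Run that on one of the boxes and it should agree with 'free' to within
a little jitter.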

One hypothesis: Is your data set slowly increasing, and the growth in
memory use over time is just a result of the data set  being larger,
which would be reflected in the amount of data in page cache? If you
have no *other* significant activity that fills page cache (including
log files etc), page cache would be unused otherwise and then grow in
occupancy as your data set increases.

And to be clear: What is the actual negative observation that you have
made - do you see high latencies, saturation on disk I/O etc or is
this purely about seeming memory growth on graphs and observations of
memory usage by the JVM in top?

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
>> Memory-mapped files will account for both virtual and, to the extent
>> that they are resident in memory, to the resident size of the process.
>
>  This bears further investigation.  Would you consider a 3GB overhead
>  on a 4GB heap a possibility?  (From a position of some naivety, this
>  seems a bit extreme, though I'm continually surprised by Java.)

The point is that you might see 200 GB if you have 200 GB of files on
disk that are mapped, so when using mmap the resident size is
misleading.

>  This was my concern.  As mentioned, we're on 7GB machines,
>  4GB JVM heap, and an expectation that we'll get some gain
>  from the remainder being used sensibly by Linux as cache.

Data mmap():ed from a file and resident in memory is essentially Linux
page cache - it is just also mapped into the process.
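
If you want to see this in isolation, here is a contrived sketch (the
1 GB size, the sleep and the class name are all made up - adjust to
taste) that maps a scratch file, touches every page and then sits
there. While it sleeps, VIRT and RES for the process should both be up
by roughly a gigabyte, and so should 'cached' in free, while the JVM
heap counters stay flat:

    import java.io.File;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MmapResidentSketch {
        public static void main(String[] args) throws Exception {
            // Create a 1 GB scratch file and map the whole thing.
            File f = File.createTempFile("mmap-sketch", ".bin");
            f.deleteOnExit();
            int size = 1 << 30;
            RandomAccessFile raf = new RandomAccessFile(f, "rw");
            raf.setLength(size);
            FileChannel ch = raf.getChannel();
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);

            // Touch one byte per 4 kB page so every page becomes resident.
            for (int i = 0; i < size; i += 4096) {
                map.put(i, (byte) 1);
            }

            Runtime rt = Runtime.getRuntime();
            System.out.println("heap used (MB): "
                    + (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024));
            Thread.sleep(120000);   // time to inspect top and free before we exit

            ch.close();
            raf.close();
        }
    }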

>> Just to confirm, what does the free +/- buffers show if you run
>> 'free'? (I.e., middle line, under 'free' column)
>
>  Dang.  I thought I'd kept a cap of top and free on those boxes
>  before restarting Cassandra.  I'm confident that the item you're
>  talking about matches top's 'cached' entry - which at the time
>  was showing ~ 10MB.

free +/- buffers shows what the actual free memory is when discounting
the page cache. Since on most systems you'll always have page cache
filling up most memory (after some uptime), this is the only relevant
number to look at to answer the general question of 'how much memory
is left'.

It gets complicated when your software actually depends on the page
cache though.

>> A Java memory leak would likely indicate non-heap managed memory
>> (since I think it's unlikely that the JVM fails to limit the actual
>> heap size). The question is what....
>
>  I guess this comes back to my first question - (how) can a JVM with
>  a heap of 4GB get to 7.1GB of resident?  I understand a process can
>  blow out its virtual footprint, and despite a lack of swap on those
>  boxes, that bit isn't in itself a massive concern.

mmap():ed data that's resident gets counted.

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Jedd Rashbrooke <je...@imagini.net>.
 Peter, Jonathan - thank you for your replies.

 I should probably have repeated myself in the body, but as I
 mentioned in the subject line, we're running Sun Java 1.6.


On 10 December 2010 18:37, Peter Schuller <pe...@infidyne.com> wrote:
> Memory-mapped files will account for both virtual and, to the extent
> that they are resident in memory, to the resident size of the process.

 This bears further investigation.  Would you consider a 3GB overhead
 on a 4GB heap a possibility?  (From a position of some naivety, this
 seems a bit extreme, though I'm continually surprised by Java.)


> However, your graph:
>
>>  I've shared a Zabbix graph of system memory at:
>>
>>     http://www.imagebam.com/image/3b4213110283969
>
> Certainly indicates that it is not the explanation since you should be
> seeing cached occupy the remainder of memory above heap size. In
> addition the allocation failures from jmap indicates memory is truly
> short.

 This was my concern.  As mentioned, we're on 7GB machines,
 4GB JVM heap, and an expectation that we'll get some gain
 from the remainder being used sensibly by Linux as cache.

> Just to confirm, what does the free +/- buffers show if you run
> 'free'? (I.e., middle line, under 'free' column)

 Dang.  I thought I'd kept a cap of top and free on those boxes
 before restarting Cassandra.  I'm confident that the item you're
 talking about matches top's 'cached' entry - which at the time
 was showing ~ 10MB.

> A Java memory leak would likely indicate non-heap managed memory
> (since I think it's unlikely that the JVM fails to limit the actual
> heap size). The question is what....

 I guess this comes back to my first question - (how) can a JVM with
 a heap of 4GB get to 7.1GB of resident?  I understand a process can
 blow out its virtual footprint, and despite a lack of swap on those
 boxes, that bit isn't in itself a massive concern.



On 12 December 2010 16:21, Jonathan Ellis <jb...@gmail.com> wrote:
> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors

 Jonathan - to clarify - this seems to be happening *to* Java, not
 within the JVM.


 I posted mostly as a heads up for others using similar profiles (4GB
 heap on ~8GB boxes) to keep an eye out for.  I expect a few people,
 particularly if they're on Amazon EC2, are running this type of setup.

 On the other hand, mum always said I was unique.  ;)

 cheers,
 Jedd.

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
> Memory-mapped files will account for both virtual and, to the extent
> that they are resident in memory, to the resident size of the process.
> However, your graph:

Correcting myself in the interest of providing correct information:
this doesn't seem to be true - at least not always. I don't know
whether it is a matter of different kernel versions or something else.
I have definitely seen Python processes accessing CDB files show a
high resident size and then seen it creep back down as that data is
replaced by other data being paged in, but some Cassandra nodes now
running on 2.6.32 don't exhibit this effect.

-- 
/ Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

Posted by Peter Schuller <pe...@infidyne.com>.
>  Over the past month or so, it looks like memory has slowly
>  been exhausted.  Both nodetool drain and jmap can't run, and
>  produce this error:
>
>     Error occurred during initialization of VM
>     Could not reserve enough space for object heap
>
>  We've got Xmx/Xms set to 4GB.
>
>  top shows free memory around 50-80MB, file cache under
>  10MB, and the java process at 12+GB virt and 7.1GB res.
>
>  This feels like a Java problem, not a Cassandra one, but I'm
>  open to suggestions.  To ensure I don't get bothered over
>  the weekend we're doing a rolling restart of Cassandra on
>  each of the boxes now.  The last time they were restarted
>  was just over a month ago.  Now I'm wondering whether I
>  should (until 0.7.1 is available) schedule in a slower rolling
>  restart over several days, every few weeks.

Memory-mapped files will account for both virtual and, to the extent
that they are resident in memory, to the resident size of the process.
However, your graph:

>  I've shared a Zabbix graph of system memory at:
>
>     http://www.imagebam.com/image/3b4213110283969

Certainly indicates that it is not the explanation since you should be
seeing cached occupy the remainder of memory above heap size. In
addition the allocation failures from jmap indicates memory is truly
short.

Just to confirm, what does the free +/- buffers show if you run
'free'? (I.e., middle line, under 'free' column)

A Java memory leak would likely indicate non-heap managed memory
(since I think it's unlikely that the JVM fails to limit the actual
heap size). The question is what....
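
As one example of what non-heap memory use can look like - purely an
illustration, not a claim about what Cassandra is doing - direct
ByteBuffers are allocated outside the Java heap, so the process RES
climbs while the heap counters barely move. A contrived sketch (sizes
and class name are made up; depending on JVM defaults you may need
-XX:MaxDirectMemorySize for it to run):

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    public class DirectBufferSketch {
        public static void main(String[] args) throws Exception {
            // Allocate 256 MB of direct (off-heap) buffers in 8 MB slices,
            // keeping references so they can't be collected.
            List<ByteBuffer> pinned = new ArrayList<ByteBuffer>();
            Runtime rt = Runtime.getRuntime();
            for (int i = 1; i <= 32; i++) {
                pinned.add(ByteBuffer.allocateDirect(8 * 1024 * 1024));
                System.out.println("direct MB: " + (i * 8)
                        + "  heap used MB: "
                        + (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024));
            }
            Thread.sleep(60000);   // watch RES in top while this sleeps
        }
    }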

To cargo-cult it: are you running a modern JVM? (Not e.g. openjdk b17
in lenny or some such.) If it is a JVM issue, ensuring you're on a
reasonably recent JVM is probably much easier than trying to track
it down...

-- 
/ Peter Schuller