You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by 박상길 <ha...@kthcorp.com> on 2011/06/24 12:15:17 UTC

Constrantly increasing memory.

When I set the max heap size below 12GB in this node, the cassandra was down by OOM, or nearly down - process is not dead but sending with thrift or gossip resulted in timeout.
When I increased the max heap to 25GB (the node has 32GB of physical memory), it did not go to OOM, but still got timeout in thrift and gossip. 

After restarting node, the GC occured in every second, and heap increased soon. No further insert of updated came for that time.So restarting node doesn't help.

There was a really big super column, which has more than a millons of sub columns(I could not count actual size)-,  in a single row in directory_icf CF. See part of log below.

When I made cassandra performance test, I mistakenly set row key with constant string, so too many subcolumns was inserted in one supercolumn of one row.

I'm not sure that the HUGE supercolumn caused the OOM, but I think that no other special situations in that node. 

It happened in 0.7.5 and 0.7.6-2.

============================
 INFO [ScheduledTasks:1] 2011-06-24 17:40:06,732 GCInspector.java (line 128) GC for ParNew: 275 ms, 1487160 reclaimed leaving 7500031952 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:07,735 GCInspector.java (line 128) GC for ParNew: 358 ms, 354600 reclaimed leaving 8573796544 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:08,738 GCInspector.java (line 128) GC for ParNew: 283 ms, 34336 reclaimed leaving 9647538928 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:09,740 GCInspector.java (line 128) GC for ParNew: 298 ms, 40984 reclaimed leaving 10721288480 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:10,937 GCInspector.java (line 128) GC for ParNew: 341 ms, 17840 reclaimed leaving 12868761576 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:11,940 GCInspector.java (line 128) GC for ParNew: 341 ms, 27264 reclaimed leaving 13942521248 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:12,942 GCInspector.java (line 128) GC for ParNew: 340 ms, 50152 reclaimed leaving 15016258744 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:13,944 GCInspector.java (line 128) GC for ParNew: 306 ms, 60008 reclaimed leaving 16090008328 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:14,947 GCInspector.java (line 128) GC for ParNew: 303 ms, 60952 reclaimed leaving 17163747752 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:15,949 GCInspector.java (line 128) GC for ParNew: 303 ms, 41624 reclaimed leaving 18237495648 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:16,952 GCInspector.java (line 128) GC for ParNew: 305 ms, 72184 reclaimed leaving 19311237008 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:36,207 GCInspector.java (line 128) GC for ParNew: 310 ms, 14983048 reclaimed leaving 7500171776 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:37,210 GCInspector.java (line 128) GC for ParNew: 311 ms, 4999176 reclaimed leaving 8573947928 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:38,213 GCInspector.java (line 128) GC for ParNew: 344 ms, 35560 reclaimed leaving 9647707616 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:39,396 GCInspector.java (line 128) GC for ParNew: 323 ms, 11592 reclaimed leaving 11795184872 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:40,398 GCInspector.java (line 128) GC for ParNew: 298 ms, 30816 reclaimed leaving 12868932272 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:41,402 GCInspector.java (line 128) GC for ParNew: 318 ms, 46496 reclaimed leaving 13942685912 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:40:42,602 GCInspector.java (line 128) GC for ParNew: 297 ms, 21728 reclaimed leaving 16090166416 used; max is 21443379200
 INFO [CompactionExecutor:1] 2011-06-24 17:40:43,440 CompactionManager.java (line 750) Cleaned up to /var/lib/cassandra/data/syncserver/directory_icf-tmp-f-79-Data.db.  751,239,333 to 751,239,333 (~100% of original) bytes for 1 keys.  Time: 161,825ms.
 INFO [CompactionExecutor:1] 2011-06-24 17:40:43,440 CompactionManager.java (line 692) Cleaning up SSTableReader(path='/var/lib/cassandra/data/syncserver/directory_icf-f-56-Data.db')
 INFO [ScheduledTasks:1] 2011-06-24 17:40:43,604 GCInspector.java (line 128) GC for ParNew: 371 ms, 66720 reclaimed leaving 17163918296 used; max is 21443379200
 INFO [CompactionExecutor:1] 2011-06-24 17:40:44,558 CompactionManager.java (line 750) Cleaned up to /var/lib/cassandra/data/syncserver/directory_icf-tmp-f-80-Data.db.  23,409,928 to 23,409,928 (~100% of original) bytes for 1 keys.  Time: 1,117ms.
 INFO [CompactionExecutor:1] 2011-06-24 17:40:44,559 CompactionManager.java (line 692) Cleaning up SSTableReader(path='/var/lib/cassandra/data/syncserver/directory_icf-f-58-Data.db')
 INFO [ScheduledTasks:1] 2011-06-24 17:40:44,607 GCInspector.java (line 128) GC for ParNew: 340 ms, 192440 reclaimed leaving 18247168040 used; max is 21443379200
 INFO [MigrationStage:1] 2011-06-24 17:40:45,303 Migration.java (line 118) Applying migration a6fe6bc1-9e3d-11e0-a4fc-c259afdc7963 Drop keyspace: syncserver
 INFO [MigrationStage:1] 2011-06-24 17:40:45,304 ColumnFamilyStore.java (line 1065) Enqueuing flush of Memtable-Migrations@2075275296(5824 bytes, 1 operations)
 INFO [FlushWriter:7] 2011-06-24 17:40:45,304 Memtable.java (line 157) Writing Memtable-Migrations@2075275296(5824 bytes, 1 operations)
 INFO [MigrationStage:1] 2011-06-24 17:40:45,305 ColumnFamilyStore.java (line 1065) Enqueuing flush of Memtable-Schema@1454862082(2113 bytes, 2 operations)
 INFO [FlushWriter:7] 2011-06-24 17:40:45,326 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/Migrations-f-152-Data.db (5888 bytes)
 INFO [FlushWriter:7] 2011-06-24 17:40:45,327 Memtable.java (line 157) Writing Memtable-Schema@1454862082(2113 bytes, 2 operations)
 INFO [FlushWriter:7] 2011-06-24 17:40:45,336 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/Schema-f-152-Data.db (2263 bytes)
 INFO [GossipTasks:1] 2011-06-24 17:41:08,177 Gossiper.java (line 228) InetAddress /112.175.18.115 is now dead.
 INFO [GossipStage:1] 2011-06-24 17:41:08,179 Gossiper.java (line 609) InetAddress /112.175.18.115 is now UP
 INFO [ScheduledTasks:1] 2011-06-24 17:41:09,180 GCInspector.java (line 128) GC for ParNew: 333 ms, 13918216 reclaimed leaving 7500325792 used; max is 21443379200
 INFO [ScheduledTasks:1] 2011-06-24 17:41:10,184 GCInspector.java (line 128) GC for ParNew: 303 ms, 36216 reclaimed leaving 8574088880 used; max is 21443379200


Re: Constrantly increasing memory.

Posted by Jonathan Ellis <jb...@gmail.com>.
Yes, attempting to read a huge supercolumn would do it.  I'd delete
the supercolumn and compact.  You'll need your large heap to do the compaction.

2011/6/24 박상길 <ha...@kthcorp.com>:
> When I set the max heap size below 12GB in this node, the cassandra was down by OOM, or nearly down - process is not dead but sending with thrift or gossip resulted in timeout.
> When I increased the max heap to 25GB (the node has 32GB of physical memory), it did not go to OOM, but still got timeout in thrift and gossip.
>
> After restarting node, the GC occured in every second, and heap increased soon. No further insert of updated came for that time.So restarting node doesn't help.
>
> There was a really big super column, which has more than a millons of sub columns(I could not count actual size)-,  in a single row in directory_icf CF. See part of log below.
>
> When I made cassandra performance test, I mistakenly set row key with constant string, so too many subcolumns was inserted in one supercolumn of one row.
>
> I'm not sure that the HUGE supercolumn caused the OOM, but I think that no other special situations in that node.
>
> It happened in 0.7.5 and 0.7.6-2.
>
> ============================
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:06,732 GCInspector.java (line 128) GC for ParNew: 275 ms, 1487160 reclaimed leaving 7500031952 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:07,735 GCInspector.java (line 128) GC for ParNew: 358 ms, 354600 reclaimed leaving 8573796544 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:08,738 GCInspector.java (line 128) GC for ParNew: 283 ms, 34336 reclaimed leaving 9647538928 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:09,740 GCInspector.java (line 128) GC for ParNew: 298 ms, 40984 reclaimed leaving 10721288480 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:10,937 GCInspector.java (line 128) GC for ParNew: 341 ms, 17840 reclaimed leaving 12868761576 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:11,940 GCInspector.java (line 128) GC for ParNew: 341 ms, 27264 reclaimed leaving 13942521248 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:12,942 GCInspector.java (line 128) GC for ParNew: 340 ms, 50152 reclaimed leaving 15016258744 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:13,944 GCInspector.java (line 128) GC for ParNew: 306 ms, 60008 reclaimed leaving 16090008328 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:14,947 GCInspector.java (line 128) GC for ParNew: 303 ms, 60952 reclaimed leaving 17163747752 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:15,949 GCInspector.java (line 128) GC for ParNew: 303 ms, 41624 reclaimed leaving 18237495648 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:16,952 GCInspector.java (line 128) GC for ParNew: 305 ms, 72184 reclaimed leaving 19311237008 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:36,207 GCInspector.java (line 128) GC for ParNew: 310 ms, 14983048 reclaimed leaving 7500171776 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:37,210 GCInspector.java (line 128) GC for ParNew: 311 ms, 4999176 reclaimed leaving 8573947928 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:38,213 GCInspector.java (line 128) GC for ParNew: 344 ms, 35560 reclaimed leaving 9647707616 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:39,396 GCInspector.java (line 128) GC for ParNew: 323 ms, 11592 reclaimed leaving 11795184872 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:40,398 GCInspector.java (line 128) GC for ParNew: 298 ms, 30816 reclaimed leaving 12868932272 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:41,402 GCInspector.java (line 128) GC for ParNew: 318 ms, 46496 reclaimed leaving 13942685912 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:42,602 GCInspector.java (line 128) GC for ParNew: 297 ms, 21728 reclaimed leaving 16090166416 used; max is 21443379200
>  INFO [CompactionExecutor:1] 2011-06-24 17:40:43,440 CompactionManager.java (line 750) Cleaned up to /var/lib/cassandra/data/syncserver/directory_icf-tmp-f-79-Data.db.  751,239,333 to 751,239,333 (~100% of original) bytes for 1 keys.  Time: 161,825ms.
>  INFO [CompactionExecutor:1] 2011-06-24 17:40:43,440 CompactionManager.java (line 692) Cleaning up SSTableReader(path='/var/lib/cassandra/data/syncserver/directory_icf-f-56-Data.db')
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:43,604 GCInspector.java (line 128) GC for ParNew: 371 ms, 66720 reclaimed leaving 17163918296 used; max is 21443379200
>  INFO [CompactionExecutor:1] 2011-06-24 17:40:44,558 CompactionManager.java (line 750) Cleaned up to /var/lib/cassandra/data/syncserver/directory_icf-tmp-f-80-Data.db.  23,409,928 to 23,409,928 (~100% of original) bytes for 1 keys.  Time: 1,117ms.
>  INFO [CompactionExecutor:1] 2011-06-24 17:40:44,559 CompactionManager.java (line 692) Cleaning up SSTableReader(path='/var/lib/cassandra/data/syncserver/directory_icf-f-58-Data.db')
>  INFO [ScheduledTasks:1] 2011-06-24 17:40:44,607 GCInspector.java (line 128) GC for ParNew: 340 ms, 192440 reclaimed leaving 18247168040 used; max is 21443379200
>  INFO [MigrationStage:1] 2011-06-24 17:40:45,303 Migration.java (line 118) Applying migration a6fe6bc1-9e3d-11e0-a4fc-c259afdc7963 Drop keyspace: syncserver
>  INFO [MigrationStage:1] 2011-06-24 17:40:45,304 ColumnFamilyStore.java (line 1065) Enqueuing flush of Memtable-Migrations@2075275296(5824 bytes, 1 operations)
>  INFO [FlushWriter:7] 2011-06-24 17:40:45,304 Memtable.java (line 157) Writing Memtable-Migrations@2075275296(5824 bytes, 1 operations)
>  INFO [MigrationStage:1] 2011-06-24 17:40:45,305 ColumnFamilyStore.java (line 1065) Enqueuing flush of Memtable-Schema@1454862082(2113 bytes, 2 operations)
>  INFO [FlushWriter:7] 2011-06-24 17:40:45,326 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/Migrations-f-152-Data.db (5888 bytes)
>  INFO [FlushWriter:7] 2011-06-24 17:40:45,327 Memtable.java (line 157) Writing Memtable-Schema@1454862082(2113 bytes, 2 operations)
>  INFO [FlushWriter:7] 2011-06-24 17:40:45,336 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/Schema-f-152-Data.db (2263 bytes)
>  INFO [GossipTasks:1] 2011-06-24 17:41:08,177 Gossiper.java (line 228) InetAddress /112.175.18.115 is now dead.
>  INFO [GossipStage:1] 2011-06-24 17:41:08,179 Gossiper.java (line 609) InetAddress /112.175.18.115 is now UP
>  INFO [ScheduledTasks:1] 2011-06-24 17:41:09,180 GCInspector.java (line 128) GC for ParNew: 333 ms, 13918216 reclaimed leaving 7500325792 used; max is 21443379200
>  INFO [ScheduledTasks:1] 2011-06-24 17:41:10,184 GCInspector.java (line 128) GC for ParNew: 303 ms, 36216 reclaimed leaving 8574088880 used; max is 21443379200
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com