Posted to user@cassandra.apache.org by Shammi Jayasinghe <sh...@wso2.com> on 2013/12/19 03:52:20 UTC

How to tune cassandra to avoid OOM

Hi,


We are facing a problem with Cassandra tuning: the node hit the OOM scenario
below [1] after running for 6 days. We had tuned Cassandra with the following
values, which we arrived at over a large number of test cycles, but it still
went OOM. I would appreciate help identifying better tuning parameters.

On this server we have set Xmx to 6GB, and the total memory on the server
is 8GB. The Cassandra version is apache-cassandra-1.2.4.
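
(For reference, the heap is set in conf/cassandra-env.sh roughly as below;
the HEAP_NEWSIZE value here is illustrative:)

MAX_HEAP_SIZE="6G"
HEAP_NEWSIZE="400M"   # young generation; the stock file suggests ~100MB per core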

Tuning parameters:

flush_largest_memtables_at: 0.5
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
commitlog_total_space_in_mb: 16
commitlog_segment_size_in_mb: 16



Regarding flush_largest_memtables_at: 0.5 above, I suspect that the setting
is not actually taking effect on the server. Is there any way to check
whether it is being applied as expected?
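
(One way to check, assuming the default log location: when the threshold
trips, Cassandra logs the "Heap is ... full" and "Flushing CFS ... to relieve
memory pressure" lines shown in [1], so you can count those, and watch heap
usage directly with nodetool:)

grep -c "to relieve memory pressure" /var/log/cassandra/system.log
nodetool -h localhost info | grep "Heap Memory"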



[1] WARN 19:16:50,355 Heap is 0.9971737408184552 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 WARN 19:18:19,784 Flushing CFS(Keyspace='QpidKeySpace', ColumnFamily='DestinationSubscriptionsCountRow') to relieve memory pressure
ERROR 19:20:50,316 Exception in thread Thread[ReadStage:63,5,main]
java.lang.OutOfMemoryError: Java heap space
        at java.nio.ByteBuffer.wrap(ByteBuffer.java:350)
        at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
        at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:391)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:371)
        at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:84)
        at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:73)
        at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:370)
        at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:325)
        at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:151)
        at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:48)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:90)
        at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:171)
        at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
        at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
        at org.apache.cassandra.db.Table.getRow(Table.java:347)
        at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
        at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
 INFO 19:20:51,397 Stop listening to thrift clients

-- 
Best Regards,

Shammi Jayasinghe
Associate Tech Lead
WSO2, Inc.; http://wso2.com,
mobile: +94 71 4493085

Re: How to tune cassandra to avoid OOM

Posted by Aaron Morton <aa...@thelastpickle.com>.
> Cassandra version is : apache-cassandra-1.2.4
The latest 1.2 release is 1.2.13; you really should be on that.

> commitlog_total_space_in_mb: 16
> commitlog_segment_size_in_mb: 16
Reducing the total commit log space to 16 MB is a very bad idea; you should return it to 4096 and the segment size to 32.
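
That is, restore these in cassandra.yaml:

commitlog_total_space_in_mb: 4096
commitlog_segment_size_in_mb: 32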

The commit log is kept on disk and has no impact on the memory footprint. Reducing the size will cause much more disk IO. 

It's kind of unusual to go OOM on 1.2+, but I've seen it happen with large numbers of SSTables (30k+) and LCS. Wide rows, lots of tombstones, and bad queries can also result in a lot of premature tenuring. Finally, custom comparators can create a lot of garbage, or a low-powered CPU may simply not be able to keep up.
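
(A quick way to check the SSTable counts mentioned above; the keyspace and
column family names will match those in the log from the original mail:)

nodetool -h localhost cfstats | grep -E "Keyspace|Column Family|SSTable count"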

How many cores do you have ? 

You may want to make these changes to reduce how quickly objects are tenured; also pay attention to how low total heap use gets after a CMS collection.

JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4" 
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=50”  

Hope that helps. 

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 19/12/2013, at 4:47 pm, Lee Mighdoll <le...@underneath.ca> wrote:

> [quoted message trimmed; see Lee Mighdoll's reply below]


Re: How to tune cassandra to avoid OOM

Posted by Lee Mighdoll <le...@underneath.ca>.
I'd suggest setting some Cassandra JVM parameters so that you can analyze a
heap dump and peek through the GC logs.  That'll give you some clues, e.g.
whether the memory problem is growing steadily or suddenly, and a peek at
which objects are using the memory.

-XX:+HeapDumpOnOutOfMemoryError

And if you don't want to wait six days for another failure, you can collect
a heap dump sooner with jmap -F.
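
(For example; the pid and output path here are placeholders:)

jmap -F -dump:format=b,file=/tmp/cassandra-heap.hprof <cassandra-pid>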

-Xloggc:/path/to/where/to/put/the/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
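
(These would also go in conf/cassandra-env.sh, one JVM_OPTS line per flag,
like the other options there. Once the log is accumulating, a quick scan for
trouble might be:)

grep -E "promotion failed|concurrent mode failure" /path/to/where/to/put/the/gc.log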

Cheers,
Lee



On Wed, Dec 18, 2013 at 6:52 PM, Shammi Jayasinghe <sh...@wso2.com> wrote:

> [original message trimmed; quoted in full at the top of this thread]