Posted to user@cassandra.apache.org by Віталій Тимчишин <ti...@gmail.com> on 2012/01/03 11:18:56 UTC

Cassandra OOM

Hello.

We have been using Cassandra in our project for some time. Currently we are on
the 1.1 trunk (it was an accidental migration, but since it's hard to migrate back
and it's performing nicely enough, we are staying on 1.1).
During the New Year holidays one of the servers produced a number of OOM
messages in the log.
According to the heap dump taken, most of the memory is held by the MutationStage
queue (over 2 million items).
So, I am curious whether Cassandra has any flow control for messages. We
are using QUORUM for writes, and it seems to me that one slow server may
start receiving more messages than it can consume. The writes will still
succeed, performed by the other servers in the replica set.
If there is no flow control, such a node should eventually OOM. Is that the case?
Are there any plans to handle this?
BTW: a lot of memory (about half) is taken by Inet4Address objects, so making a
cache of such objects would make this problem less likely.
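
For illustration, such a cache could be a simple interner keyed by the
textual address, so that millions of queued messages share one InetAddress
instance per endpoint. This is only a hypothetical sketch (the class name is
invented, and it assumes per-message address copies are indeed the culprit),
not Cassandra code:

import java.net.InetAddress;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class InetAddressInterner {
    private static final ConcurrentMap<String, InetAddress> CACHE =
            new ConcurrentHashMap<String, InetAddress>();

    // Return one shared instance per textual address instead of keeping
    // a separate Inet4Address object alive for every queued message.
    public static InetAddress intern(InetAddress address) {
        InetAddress existing = CACHE.putIfAbsent(address.getHostAddress(), address);
        return existing != null ? existing : address;
    }
}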

-- 
Best regards,
 Vitalii Tymchyshyn

Re: Cassandra OOM

Posted by Віталій Тимчишин <ti...@gmail.com>.
2012/1/4 Vitalii Tymchyshyn <ti...@gmail.com>

> On 04.01.12 14:25, Radim Kolar wrote:
>
>>> So, what are Cassandra's memory requirements? Is it 1% or 2% of the disk data?
>> It depends on the number of rows you have. If you have a lot of rows, then the
>> primary memory eaters are the index sampling data and the bloom filters. I use
>> an index sampling of 512 and bloom filters set to 4% to cut down the memory needed.
>>
> I've raised the index sampling, and the bloom filter setting does not seem to be
> on trunk yet. For me, memtables are what's eating the heap :(
>
>
Hello, all.

I've found and fixed the problem today (after one of my nodes kept OOMing
while replaying the log on start-up): full-key deletes are not accounted for,
so column families that receive only deletes are never flushed. Here is the
JIRA ticket: https://issues.apache.org/jira/browse/CASSANDRA-3741 and my pull
request to fix it: https://github.com/apache/cassandra/pull/5
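
The gist of the fix, very schematically (a toy sketch based on my reading of
the issue, not the actual patch; the class and method names are invented):

// If row-level deletes add nothing to the memtable's live size, a column
// family that only receives deletes never looks dirty, is never flushed,
// and its tombstones and commit log segments pile up until the node OOMs.
class ToyMemtable {
    private long liveBytes = 0;
    private final long flushThresholdBytes;

    ToyMemtable(long flushThresholdBytes) {
        this.flushThresholdBytes = flushThresholdBytes;
    }

    void applyColumnWrite(long serializedSize) {
        liveBytes += serializedSize;   // regular writes were always counted
    }

    void applyFullKeyDelete(long tombstoneSize) {
        liveBytes += tombstoneSize;    // the fix: account for deletes as well
    }

    boolean shouldFlush() {
        return liveBytes >= flushThresholdBytes;
    }
}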

Best regards, Vitalii Tymchyshyn

Re: Cassandra OOM

Posted by Vitalii Tymchyshyn <ti...@gmail.com>.
On 04.01.12 14:25, Radim Kolar wrote:
>> So, what are Cassandra's memory requirements? Is it 1% or 2% of the disk data?
> It depends on the number of rows you have. If you have a lot of rows, then the 
> primary memory eaters are the index sampling data and the bloom filters. I use 
> an index sampling of 512 and bloom filters set to 4% to cut down the memory needed.
I've raised the index sampling, and the bloom filter setting does not seem to be on 
trunk yet. For me, memtables are what's eating the heap :(

Best regards, Vitalii Tymchyshyn.

Re: Cassandra OOM

Posted by Radim Kolar <hs...@sendmail.cz>.
 > Looking at the heap dumps, a lot of memory is taken by memtables, much 
more than 1/3 of the heap. At the same time, the logs say there is nothing 
to flush since there are no dirty memtables.
I've seen this too.

 > So, what are Cassandra's memory requirements? Is it 1% or 2% of the disk data?
It depends on the number of rows you have. If you have a lot of rows, then the 
primary memory eaters are the index sampling data and the bloom filters. I use 
an index sampling of 512 and bloom filters set to 4% to cut down the memory needed.
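
To put very rough numbers on that, a back-of-envelope estimate like the one
below can help. The bloom filter sizing formula is the standard one; the row
count, key size and per-sample overhead are placeholders to replace with your
own, so treat the output as an order-of-magnitude guess rather than
Cassandra's exact accounting:

public class HeapEstimate {
    public static void main(String[] args) {
        long rows = 500000000L;        // keys on this node (pick your own number)
        int indexInterval = 512;       // one index sample per this many keys
        double bloomFpChance = 0.04;   // 4% false positives
        int avgKeyBytes = 32;          // assumed average row key size
        int perSampleOverhead = 64;    // assumed per-sample object/offset overhead

        // Standard bloom filter sizing: bits per element = -ln(p) / (ln 2)^2
        double bitsPerKey = -Math.log(bloomFpChance) / (Math.log(2) * Math.log(2));
        double bloomBytes = rows * bitsPerKey / 8.0;

        double samples = (double) rows / indexInterval;
        double indexSampleBytes = samples * (avgKeyBytes + perSampleOverhead);

        System.out.printf("bloom filters: ~%.0f MB%n", bloomBytes / (1 << 20));
        System.out.printf("index samples: ~%.0f MB%n", indexSampleBytes / (1 << 20));
    }
}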

Re: Cassandra OOM

Posted by Vitalii Tymchyshyn <ti...@gmail.com>.
Hello.

BTW: It would be great for Cassandra to shut down on Errors like OOM, 
because right now I am not sure whether the problem described in my 
previous email is the root cause, or whether one of the OOM errors found 
in the log made some "writer" stop.
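
For what it's worth, one generic way to get fail-fast behaviour from the JVM
side is to install a default uncaught-exception handler that halts the
process when an OutOfMemoryError reaches it. This is only a sketch of the
general technique, not a claim about what Cassandra's start-up code does:

public class FailFastOnOom {
    public static void install() {
        Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread t, Throwable e) {
                if (e instanceof OutOfMemoryError) {
                    // Halt instead of letting wounded writer threads limp on.
                    Runtime.getRuntime().halt(100);   // exit code is arbitrary
                }
            }
        });
    }
}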

I am now looking at different OOMs in my cluster. Currently each node 
has up to 300G of data in ~10 column families. The previous heap size of 
3G seems not to be enough, so I am raising it to 5G. Looking at the heap 
dumps, a lot of memory is taken by memtables, much more than 1/3 of the 
heap. At the same time, the logs say there is nothing to flush since 
there are no dirty memtables. So, what are Cassandra's memory 
requirements? Is it 1% or 2% of the disk data? Or maybe I am doing 
something wrong?
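
For reference on the 1/3 figure: as far as I understand, when
memtable_total_space_in_mb is left unset the global memtable budget defaults
to roughly one third of the heap, which is why memtables holding well over a
third of the heap without any flushing looks wrong. The arithmetic is trivial:

public class MemtableBudget {
    public static void main(String[] args) {
        // With -Xmx5G this prints a ceiling of roughly 1700 MB; memtables
        // sitting far above that suggest the flusher is not being triggered.
        long heapBytes = Runtime.getRuntime().maxMemory();
        long approxMemtableCeiling = heapBytes / 3;   // assumed default budget
        System.out.printf("heap: %d MB, approx memtable ceiling: %d MB%n",
                heapBytes >> 20, approxMemtableCeiling >> 20);
    }
}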

Best regards, Vitalii Tymchyshyn

On 03.01.12 20:58, aaron morton wrote:
> The DynamicSnitch can result in fewer read operations being sent to a 
> node, but as long as a node is marked as UP, mutations are sent to all 
> replicas. Nodes will shed load when they pull messages off the queue 
> that have expired past rpc_timeout, but they will not feed flow 
> control back to the other nodes, other than by going down or performing 
> slowly enough for the dynamic snitch to route reads around them.
>
> There are also safety valves in there to reduce the size of the 
> memtables and caches in response to low memory. Perhaps that process 
> could also shed messages from thread pools with a high number of 
> pending messages.
>
> **But** going OOM with 2M+ mutations in the thread pool sounds like 
> the server was going down anyway. Did you look into why all those 
> messages were there?
>
> Cheers
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/01/2012, at 11:18 PM, Віталій Тимчишин wrote:
>
>> Hello.
>>
>> We have been using Cassandra in our project for some time. Currently we 
>> are on the 1.1 trunk (it was an accidental migration, but since it's hard 
>> to migrate back and it's performing nicely enough, we are staying on 1.1).
>> During the New Year holidays one of the servers produced a number of 
>> OOM messages in the log.
>> According to the heap dump taken, most of the memory is held by the 
>> MutationStage queue (over 2 million items).
>> So, I am curious whether Cassandra has any flow control for messages. 
>> We are using QUORUM for writes, and it seems to me that one slow 
>> server may start receiving more messages than it can consume. The 
>> writes will still succeed, performed by the other servers in the 
>> replica set.
>> If there is no flow control, such a node should eventually OOM. Is that 
>> the case? Are there any plans to handle this?
>> BTW: a lot of memory (about half) is taken by Inet4Address objects, so 
>> making a cache of such objects would make this problem less likely.
>>
>> -- 
>> Best regards,
>>  Vitalii Tymchyshyn
>


Re: Cassandra OOM

Posted by aaron morton <aa...@thelastpickle.com>.
The DynamicSnitch can result in fewer read operations being sent to a node, but as long as a node is marked as UP, mutations are sent to all replicas. Nodes will shed load when they pull messages off the queue that have expired past rpc_timeout, but they will not feed flow control back to the other nodes, other than by going down or performing slowly enough for the dynamic snitch to route reads around them.
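
Very schematically, that shedding behaviour looks like the toy consumer below
(illustration only, not the actual MessagingService code): expired mutations
are dropped at dequeue time, but nothing slows the coordinators down, so a
slow replica's unbounded queue can keep growing.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class MutationStageSketch implements Runnable {
    static final long RPC_TIMEOUT_MS = 10000;   // rpc_timeout_in_ms

    // Each entry stands for a queued mutation; only its enqueue time matters here.
    // The queue is unbounded, which is the crux of the OOM scenario.
    final BlockingQueue<Long> pending = new LinkedBlockingQueue<Long>();

    // A coordinator "sends" a mutation by enqueueing its arrival time.
    void enqueueMutation() {
        pending.offer(Long.valueOf(System.currentTimeMillis()));
    }

    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                long enqueuedAt = pending.take();
                if (System.currentTimeMillis() - enqueuedAt > RPC_TIMEOUT_MS) {
                    continue;   // expired: dropped silently, no backpressure to the sender
                }
                applyMutation();   // write to memtable + commit log
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void applyMutation() { /* omitted */ }
}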

There are also safety valves in there to reduce the size of the memtables and caches in response to low memory. Perhaps that process could also shed messages from thread pools with a high number of pending messages. 

**But** going OOM with 2M+ mutations in the thread pool sounds like the server was going down anyway. Did you look into why all those messages were there? 

Cheers
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 3/01/2012, at 11:18 PM, Віталій Тимчишин wrote:

> Hello.
> 
> We have been using Cassandra in our project for some time. Currently we are on the 1.1 trunk (it was an accidental migration, but since it's hard to migrate back and it's performing nicely enough, we are staying on 1.1).
> During the New Year holidays one of the servers produced a number of OOM messages in the log.
> According to the heap dump taken, most of the memory is held by the MutationStage queue (over 2 million items).
> So, I am curious whether Cassandra has any flow control for messages. We are using QUORUM for writes, and it seems to me that one slow server may start receiving more messages than it can consume. The writes will still succeed, performed by the other servers in the replica set.
> If there is no flow control, such a node should eventually OOM. Is that the case? Are there any plans to handle this?
> BTW: a lot of memory (about half) is taken by Inet4Address objects, so making a cache of such objects would make this problem less likely. 
> 
> -- 
> Best regards,
>  Vitalii Tymchyshyn