Posted to user@cassandra.apache.org by Andrey Stepachev <oc...@gmail.com> on 2011/07/15 10:21:00 UTC

Cassandra OOM on repair.

Hi all.

Cassandra constantly OOMs during repair or compaction. Increasing memory (6G)
doesn't help. I can give it more, but I don't think this is a normal situation.
The cluster has 4 nodes, RF=3.
Cassandra version: 0.8.1

Ring looks like this:

Address         DC          Rack   Status  State   Load       Owns    Token
                                                                      127605887595351923798765477786913079296
xxx.xxx.xxx.66  datacenter1 rack1  Up      Normal  176.96 GB  25.00%  0
xxx.xxx.xxx.69  datacenter1 rack1  Up      Normal  178.19 GB  25.00%  42535295865117307932921825928971026432
xxx.xxx.xxx.67  datacenter1 rack1  Up      Normal  178.26 GB  25.00%  85070591730234615865843651857942052864
xxx.xxx.xxx.68  datacenter1 rack1  Up      Normal  175.2 GB   25.00%  127605887595351923798765477786913079296

About the schema:
I have big rows (>100k columns, up to several million). But as far as I know,
that is normal for Cassandra.
Everything works relatively well until I start long-running pre-production
tests: I load data, and after a while (~4 hours) the cluster begins to time
out, and then some nodes die with OOM.
My app retries its writes, so after a short period all the nodes are down.
Very nasty.

But now I can OOM nodes simply by calling nodetool repair.
In the logs (http://paste.kde.org/96811/) it is clear how the heap rockets to
its upper limit.
cfstats shows: http://paste.kde.org/96817/
config is: http://paste.kde.org/96823/
The question is: does anybody know what this means? Why does Cassandra try to
load something big into memory at once?

A.

Re: Cassandra OOM on repair.

Posted by Jonathan Ellis <jb...@gmail.com>.
Can't think of any.

On Sun, Jul 17, 2011 at 1:27 PM, Andrey Stepachev <oc...@gmail.com> wrote:
> A question:
> Are there any drawbacks to using a different indexInterval for the column
> families in a keyspace? (Suppose I write a patch.)



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Cassandra OOM on repair.

Posted by Andrey Stepachev <oc...@gmail.com>.
Looks like the problem is in this code:

    public IndexSummary(long expectedKeys)
    {
        long expectedEntries = expectedKeys / DatabaseDescriptor.getIndexInterval();
        if (expectedEntries > Integer.MAX_VALUE)
            // TODO: that's a _lot_ of keys, or a very low interval
            throw new RuntimeException("Cannot use index_interval of "
                                       + DatabaseDescriptor.getIndexInterval()
                                       + " with " + expectedKeys + " (expected) keys.");
        indexPositions = new ArrayList<KeyPosition>((int)expectedEntries);
    }

I have too many keys and too small an index interval.

To fix this, I can:
1) reduce the number of keys - rewrite the app and sacrifice balance
2) increase index_interval - hurt the other column families

A question:
Are there any drawbacks to using a different indexInterval for the column
families in a keyspace? (Suppose I write a patch.)
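For reference, a back-of-the-envelope sketch of how large the in-heap index
summary gets (my own illustration, not Cassandra code; the ~100 bytes/entry
figure is an assumption). One key per index_interval (default 128) is sampled
into the ArrayList in the constructor above, so summary memory grows linearly
with the key count:

```java
// Rough IndexSummary sizing estimate (illustrative only).
public class IndexSummaryEstimate {
    // One summary entry is kept per index_interval keys.
    public static long summaryEntries(long keysPerSSTable, int indexInterval) {
        return keysPerSSTable / indexInterval;
    }

    public static void main(String[] args) {
        long keys = 100_000_000L; // hypothetical key count for one SSTable
        int interval = 128;       // default index_interval
        long entries = summaryEntries(keys, interval);
        long approxBytes = entries * 100; // assumed ~100 bytes per sampled entry
        System.out.println(entries + " entries, ~" + approxBytes / (1024 * 1024) + " MB");
        // prints: 781250 entries, ~74 MB
    }
}
```

Multiply an estimate like that by ~1000 sstables and the heap pressure adds up
quickly.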


Re: Cassandra OOM on repair.

Posted by Andrey Stepachev <oc...@gmail.com>.
Looks like the key indexes eat all the memory:

http://paste.kde.org/97213/



Re: Cassandra OOM on repair.

Posted by Andrey Stepachev <oc...@gmail.com>.
UPDATE:

I found that:
a) with at least 10G of heap, Cassandra survives.
b) I have ~1000 sstables.
c) CompactionManager uses PrecompactedRow instead of LazilyCompactedRow.

So I have a question:
a) if a row is bigger than 64mb before compaction, why is it compacted in memory?
b) if it is smaller, what eats so much memory?
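For what it's worth, a minimal sketch of the choice I would expect here, based
on the documented in_memory_compaction_limit_in_mb setting (64 MB by default).
This is my simplified illustration, not the actual CompactionManager logic,
and the class/method names other than the two row classes are made up:

```java
// Illustrative sketch of the 0.8-era compaction path choice.
public class CompactionPathSketch {
    // cassandra.yaml: in_memory_compaction_limit_in_mb (default 64)
    static final long IN_MEMORY_LIMIT_BYTES = 64L * 1024 * 1024;

    static String pathFor(long rowSizeBytes) {
        return rowSizeBytes > IN_MEMORY_LIMIT_BYTES
                ? "LazilyCompactedRow"  // streamed; bounded memory use
                : "PrecompactedRow";    // whole row deserialized on the heap
    }

    public static void main(String[] args) {
        System.out.println(pathFor(10L * 1024 * 1024));  // prints: PrecompactedRow
        System.out.println(pathFor(200L * 1024 * 1024)); // prints: LazilyCompactedRow
    }
}
```

If the logs show only PrecompactedRow, either the rows really are under the
limit, or the limit is not being applied where I expect; either way, many
near-limit rows compacting at once could still add up on the heap.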
