You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@cassandra.apache.org by "Romain Hardouin (Created) (JIRA)" <ji...@apache.org> on 2012/04/12 09:26:23 UTC

[jira] [Created] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

OOM Exception during repair session with LeveledCompactionStrategy
------------------------------------------------------------------

Key: CASSANDRA-4142
URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
Project: Cassandra
Issue Type: Improvement
Components: Core
Affects Versions: 1.0.6
Environment: OS: Linux CentOs 6
JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)

Node configuration:
Quad-core
10 GB RAM
Xmx set to 2,5 GB (as computed by default).
Reporter: Romain Hardouin

We encountered an OOM Exception on 2 nodes during repair session.
Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
These two options used together maybe the key to the problem.

Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk.
This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.

Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.

Cluster information:
9 nodes
Each node handles 35 GB (RandomPartitioner)

This JIRA was created following this discussion:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4142:
----------------------------------------

    Attachment: 4142.txt

Attaching patch to only open 1 sstableScanner per level during repair. It only use the trick for repair because that is the only place where we open lots of sstableScanner, but I suppose we could apply the same to normal compaction and lower slightly the memory consumption there too.

Anyway, the patch includes a unit test to exercise the new code.
                
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.1
>
>         Attachments: 4142.txt
>
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252517#comment-13252517 ] 

Jonathan Ellis commented on CASSANDRA-4142:
-------------------------------------------

bq. Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.

One would be the 64KB buffer in RandomAccessReader; the other is the CRAR compression buffer.

The comments in CRAR say that it can't use super.read, so is the RAR buffer wasted?

bq. an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk

Not sure how badly validation needs all of these at once.  It definitely seems like we could limit it to at most the overlapping sstables for a single L1 target at a time, though, which would cut it by a factor of 10.
                
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.6
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-4142.
---------------------------------------

    Resolution: Fixed

Added a special case for L0 in 67ed39fa9bf71be4cfc13fccbdd7b76dcb46c062.  Still getting errors.  I think these are due to a pre-existing bug in LCS.  Opened CASSANDRA-4233 to follow up.
                
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.1
>
>         Attachments: 4142-v2.txt, 4142.txt
>
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4142:
--------------------------------------

    Attachment: 4142-v2.txt

v2 attached:

- Renames KeyScanner to o.a.c.db.compaction.ICompactionScanner
- Remove AbstractCompactionIterable.getScanners in favor of ACS.getScanners, which wires it in to normal compactions as well.  (This lets ParallelCompactionIterator be more agressive about how much memory it lets each scanner use, too.)
- Updated LCS.getScanners constructor to use a Multimap, which simplifies things a bit.
                
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.1
>
>         Attachments: 4142-v2.txt, 4142.txt
>
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Pavel Yaskevich (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252523#comment-13252523 ] 

Pavel Yaskevich commented on CASSANDRA-4142:
--------------------------------------------

bq. The comments in CRAR say that it can't use super.read, so is the RAR buffer wasted?

Buffer in CRAR used to read compressed data from disk (instead of creating separate buffer each time) and it uses RAR.buffer for decompression, so non of the buffers is wasted.
                
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.6
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Reopened] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reopened CASSANDRA-4142:
---------------------------------------


this caused CompactionsTest to start failing
                
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.1
>
>         Attachments: 4142-v2.txt, 4142.txt
>
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252526#comment-13252526 ] 

Sylvain Lebresne commented on CASSANDRA-4142:
---------------------------------------------

bq. The comments in CRAR say that it can't use super.read, so is the RAR buffer wasted?

The buffer in CRAR is for compressed data, the one in RAR is for uncompressed data, so we do need both.

bq. Not sure how badly validation needs all of these at once.

It doesn't. I haven't given it much though yet, but I think the easiest way/more efficient way to deal with that would be to have only have 1 sstable per level loaded (and all of L0), which should always be reasonable. 
                
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.6
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4142:
--------------------------------------

    Affects Version/s:     (was: 1.0.6)
                       1.0.0
        Fix Version/s: 1.1.1
             Assignee: Sylvain Lebresne
    
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.1
>
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266456#comment-13266456 ] 

Sylvain Lebresne commented on CASSANDRA-4142:
---------------------------------------------

lgtm, +1 on v2
                
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.1
>
>         Attachments: 4142-v2.txt, 4142.txt
>
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (CASSANDRA-4142) OOM Exception during repair session with LeveledCompactionStrategy

Posted by "Pavel Yaskevich (Issue Comment Edited) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252523#comment-13252523 ] 

Pavel Yaskevich edited comment on CASSANDRA-4142 at 4/12/12 4:08 PM:
---------------------------------------------------------------------

bq. The comments in CRAR say that it can't use super.read, so is the RAR buffer wasted?

Buffer in CRAR used to read compressed data from disk (instead of creating separate buffer each time) and it uses RAR.buffer for decompression, so none of the buffers are wasted.
                
      was (Author: xedin):
    bq. The comments in CRAR say that it can't use super.read, so is the RAR buffer wasted?

Buffer in CRAR used to read compressed data from disk (instead of creating separate buffer each time) and it uses RAR.buffer for decompression, so non of the buffers is wasted.
                  
> OOM Exception during repair session with LeveledCompactionStrategy
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4142
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4142
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.6
>         Environment: OS: Linux CentOs 6 
> JDK: Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> Node configuration:
> Quad-core
> 10 GB RAM
> Xmx set to 2,5 GB (as computed by default).
>            Reporter: Romain Hardouin
>
> We encountered an OOM Exception on 2 nodes during repair session.
> Our CF are set up to use LeveledCompactionStrategy and SnappyCompressor.
> These two options used together maybe the key to the problem.
> Despite of setting XX:+HeapDumpOnOutOfMemoryError, no dump have been generated.
> Nonetheless a memory analysis on a live node doing a repair reveals an hotspot: an ArrayList of SSTableBoundedScanner which appears to contain as many objects as there are SSTables on disk. 
> This ArrayList consumes 786 MB of the heap space for 5757 objects. Therefore each object is about 140 KB.
> Eclipse Memory Analyzer's denominator tree shows that 99% of a SSTableBoundedScanner object's memory is consumed by a CompressedRandomAccessReader which contains two big byte arrays.
> Cluster information:
> 9 nodes
> Each node handles 35 GB (RandomPartitioner)
> This JIRA was created following this discussion:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Why-so-many-SSTables-td7453033.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira