Posted to user@cassandra.apache.org by Pardeep <ps...@gmail.com> on 2014/11/16 20:47:30 UTC

Cassandra 2.1.1 Out of Memory Errors

I'm running a 4 node cluster with RF=3, CL of QUORUM for writes and ONE for
reads. Each node has 3.7GB RAM and a 32GB SSD for data; the commitlog is on
another drive. Currently each node has about 12GB of data. The cluster is
always normal unless repair happens; that's when some nodes go to medium
health in OpsCenter.

MAX_HEAP_SIZE="2G"
HEAP_NEWSIZE="400M"

I've looked everywhere to get info on what might be causing these errors but 
no luck. Can anyone please guide me to what I should look at or tweak to get 
around these errors?

All column families are using SizeTieredCompactionStrategy. I've thought
about moving to LeveledCompactionStrategy since Cassandra is running on an
SSD, but I haven't made the move yet.

All writes are write-once, data is rarely updated, and there are no TTL
columns. I do use wide rows that can span from a few thousand to a few million
columns; I'm not sure whether range slices are done in memory.

Let me know if further info is needed. I do have hprof files, but those are
about 3.2 GB in size.
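
I haven't dug into the heap dumps yet; I'm assuming I can load one into Eclipse
MAT or the JDK's jhat to see what is actually filling the heap. The file name
below is just an example of the default java_pid<pid>.hprof naming:

# give jhat a bigger heap than the dump, then browse the object histogram on :7000
jhat -J-Xmx4g java_pid12345.hprof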

java.lang.OutOfMemoryError: Java heap space
org.apache.cassandra.io.util.RandomAccessReader.<init>
org.apache.cassandra.io.util.RandomAccessReader.open
org.apache.cassandra.io.sstable.SSTableReader
org.apache.cassandra.io.sstable.SSTableScanner
org.apache.cassandra.io.sstable.SSTableReader
org.apache.cassandra.db.RowIteratorFactory.getIterator
org.apache.cassandra.db.ColumnFamilyStore.getSequentialIterator
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice
org.apache.cassandra.db.RangeSliceCommand.executeLocally
StorageProxy$LocalRangeSliceRunnable.runMayThrow
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run
java.util.concurrent.Executors$RunnableAdapter.call
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService
org.apache.cassandra.concurrent.SEPWorker.run
java.lang.Thread.run


Re: Cassandra 2.1.1 Out of Memory Errors

Posted by Pardeep <ps...@gmail.com>.
DuyHai Doan,

For wide rows, would it be better to switch to LeveledCompactionStrategy?
The number of SSTables each partition spans should decrease, and it's also
optimized for reads.

I have read in quite a few places that LeveledCompactionStrategy is better for
wide rows. Is that true, and would you recommend it?
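
If I do switch, I assume it's just an ALTER TABLE per table, something like the
statement below (using my tag_timeline table as the example; the
sstable_size_in_mb value is just the documented default, nothing I've tuned):

-- switch an existing table from STCS to LCS; existing SSTables get re-levelled
-- by background compaction after the change
ALTER TABLE tag_timeline
  WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160};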


Re: Cassandra 2.1.1 Out of Memory Errors

Posted by DuyHai Doan <do...@gmail.com>.
If the table is fragmented across many SSTables on disk, you may run into
trouble.

Let me explain the reason. Your query is perfectly fine, but if you're
querying a partition of, let's say, 1 million rows spread across 10
SSTables, Cassandra may need to read that partition's fragments from all of
those SSTables before returning the results.

Indeed, to avoid unnecessary disk seeks we have bloom filters, but they only
tell you whether or not an SSTable contains a partition. In the scenario where
your partition really does exist and spans many SSTables, touching disk is
mandatory.

There was a slight optimization with
https://issues.apache.org/jira/browse/CASSANDRA-5514, but its resolution is
too coarse and it does not help much if you have a lot of fresh SSTables.
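
One quick way to check whether this is what's hitting you: nodetool
cfhistograms reports how many SSTables reads actually touch (the keyspace name
below is just a placeholder for yours):

# the "SSTables" column shows, per percentile, how many SSTables each read touched
nodetool cfhistograms my_keyspace tag_timeline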



On Sun, Nov 16, 2014 at 11:04 PM, Pardeep <ps...@gmail.com> wrote:

> "What are your query patterns ? For a given partition, take a slice of xxx
> columns ? Or give me a range of partitions ?"
>
> All wide row tables are similar to this:
> CREATE TABLE tag_timeline (
>         tag text,
>         pid text,
>         d text,
>         PRIMARY KEY (tag, pid)
> )
> WITH CLUSTERING ORDER BY (pid DESC);
>
> query:
> SELECT pid FROM tag_timeline WHERE tag = 'test' AND pid > 'FA8afA' ORDER BY
> pid DESC LIMIT 20;
>
> I don't think I can optimize the query any further. A tag can have
> millions of
> posts but the database isn't that large yet. Could the query, or the way the
> table is created, cause memory problems?
>
>
>

Re: Cassandra 2.1.1 Out of Memory Errors

Posted by Pardeep <ps...@gmail.com>.
DuyHai Doan <doanduyhai <at> gmail.com> writes:


"What are your query patterns ? For a given partition, take a slice of xxx 
columns ? Or give me a range of partitions ?"

All wide row tables are similar to this:
CREATE TABLE tag_timeline (
	tag text,
	pid text,
	d text,
	PRIMARY KEY (tag, pid)
)
WITH CLUSTERING ORDER BY (pid DESC);

query:
SELECT pid FROM tag_timeline WHERE tag = 'test' AND pid > 'FA8afA' ORDER BY
pid DESC LIMIT 20;

I don't think I can optimize the query any further. A tag can have millions of
posts but the database isn't that large yet. Could the query, or the way the
table is created, cause memory problems?



Re: Cassandra 2.1.1 Out of Memory Errors

Posted by DuyHai Doan <do...@gmail.com>.
"I do use rows that can span few thousands to a few million, I'm not sure
if range
slices happen using the memory."

What are your query patterns ? For a given partition, take a slice of xxx
columns ? Or give me a range of partitions ?

 For the 1st scenario, depending on how many columns you want to retrieve
at a time, there can be pressure on the JVM heap. For the second scenario,
where you perform a query over a range of partition keys, it's worse.

 In any case, 2GB of heap is very, very small and will put the node in
danger whenever it faces a heavy load.

"Cluster is always unless repair happens, that's when some nodes go to
medium health in of OpsCenter" --> repair trigger computation of Merkle
tree and load the sstables in memory. With your limited amount of RAM it
may explain the yellow state in OpsCenter
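
For the second scenario, if you really do need to walk over a range of
partitions, it helps to page through them by token instead of asking for a big
range in one shot. A rough sketch against your tag_timeline table (the starting
tag is just a placeholder):

-- fetch one page of a full scan, resuming from the last partition already seen
SELECT tag, pid, d FROM tag_timeline
WHERE token(tag) > token('last_tag_seen')
LIMIT 100;

Each iteration re-issues the query with the last tag returned, so no single
request has to materialize a huge range slice on the coordinator.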
