Posted to user@cassandra.apache.org by Rajat Chopra <rc...@makara.com> on 2011/01/29 01:48:10 UTC
get_slice OOM on large row
Hi!
Trying to test the 0.7 release with some offbeat settings to check the behavior.
- Single node cluster
- Key_cache_size - default
- Row_cache_size - default
- Min/max compaction threshold - 0 (so this is disabled)
- Disk_access_mode : standard
- Memtable_throughput_in_mb : 2 (yes two mb only)
Basically I wanted to run the process with the least possible memory usage. Without much hassle (probably because compaction was disabled) I put 400k subcolumns into a single supercolumn in one row of a column family. Each column value is about 20 KB, so I have about 8 GB of data.
Now the issue is that I run into OOM on any kind of read that I do, even with -Xmx2G. I am using pycassa as the high-level client.
The trace is pasted below.
Is it because the entire row is loaded even if only some keys are asked for?
Please help for my better understanding of limitations/usage.
Thanks,
Rajat
Heap dump file created [2147332020 bytes in 16.621 secs]
ERROR 16:38:06,420 Fatal exception in thread Thread[ReadStage:5,5,main]
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:277)
at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:364)
at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:313)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readSimpleColumns(SSTableNamesIterator.java:144)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:70)
at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:81)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1215)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1107)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1077)
at org.apache.cassandra.db.Table.getRow(Table.java:384)
at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63)
at org.apache.cassandra.service.StorageProxy$weakReadLocalCallable.call(StorageProxy.java:777)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
ERROR 16:38:06,717 Internal error processing get_slice
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at org.apache.cassandra.service.StorageProxy.weakRead(StorageProxy.java:282)
at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:224)
at org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:98)
at org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:195)
at org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:271)
at org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:233)
at org.apache.cassandra.thrift.Cassandra$Processor$get_slice.process(Cassandra.java:2699)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
at java.util.concurrent.FutureTask.get(FutureTask.java:111)
at org.apache.cassandra.service.StorageProxy.weakRead(StorageProxy.java:278)
... 11 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:277)
at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:364)
at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:313)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readSimpleColumns(SSTableNamesIterator.java:144)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:70)
at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:81)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1215)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1107)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1077)
at org.apache.cassandra.db.Table.getRow(Table.java:384)
at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63)
at org.apache.cassandra.service.StorageProxy$weakReadLocalCallable.call(StorageProxy.java:777)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
... 3 more
Re: get_slice OOM on large row
Posted by Jonathan Ellis <jb...@gmail.com>.
http://wiki.apache.org/cassandra/CassandraLimitations
"Any request for a subcolumn deserializes _all_ the subcolumns in that
supercolumn, so you want to avoid a data model that requires large
numbers of subcolumns."
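The limitation quoted above points at a common workaround: keep each row (and supercolumn) small by bucketing the columns across many ordinary rows, so a read only ever touches one small bucket. The sketch below is illustrative only; the key scheme, bucket size, and in-memory stand-in for the column family are assumptions, not this thread's data model or the pycassa API.

```python
# Sketch: bucket 400k columns across many small rows so no single
# row (or supercolumn) must be deserialized whole on a read.
BUCKET_SIZE = 1000  # columns per physical row; tune to keep rows small

def bucket_key(base_key: str, column_index: int) -> str:
    """Map a logical column index to a physical row key."""
    return f"{base_key}:{column_index // BUCKET_SIZE}"

# An in-memory dict stands in for the column family.
store: dict[str, dict[int, bytes]] = {}

def insert(base_key: str, column_index: int, value: bytes) -> None:
    row = store.setdefault(bucket_key(base_key, column_index), {})
    row[column_index] = value

def get(base_key: str, column_index: int) -> bytes:
    # Only one bounded bucket row is materialized, not all 400k columns.
    return store[bucket_key(base_key, column_index)][column_index]

for i in range(5000):
    insert("bigrow", i, b"x" * 20)

assert get("bigrow", 4321) == b"x" * 20
assert len(store["bigrow:4"]) == BUCKET_SIZE  # each row stays bounded
```

With a real client the same idea applies: compute the bucket row key on the client and read or write against that row instead of one giant row.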
On Fri, Jan 28, 2011 at 7:40 PM, Rajat Chopra <rc...@makara.com> wrote:
> Thanks Jonathan.
> But the read fails in all cases, even when the start_column/end_column span covers only 10 columns, and even when column_count is set appropriately. Or did I miss what you said?
> The trace seems to suggest an entire super_column is being deserialized.
>
> Rajat
>
> -----Original Message-----
> From: Jonathan Ellis [mailto:jbellis@gmail.com]
> Sent: Friday, January 28, 2011 5:32 PM
> To: user@cassandra.apache.org
> Subject: Re: get_slice OOM on large row
>
> Requesting too much data in a single request is user error. That is
> why you have start columns/rows, so you can page through a large set.
>
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
RE: get_slice OOM on large row
Posted by Rajat Chopra <rc...@makara.com>.
Thanks Jonathan.
But the read fails in all cases, even when the start_column/end_column span covers only 10 columns, and even when column_count is set appropriately. Or did I miss what you said?
The trace seems to suggest an entire super_column is being deserialized.
Rajat
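Rajat's observation matches the limitation Jonathan linked: in 0.7, requesting any subcolumn of a supercolumn deserializes the whole supercolumn first and only filters afterwards, so a 10-column slice still materializes all 400k values. A toy model of that read path (purely illustrative, not Cassandra code):

```python
def read_subcolumns(serialized_supercolumn, wanted):
    """Toy model of the 0.7 read path: the entire supercolumn is
    materialized before any name filtering happens."""
    all_columns = dict(serialized_supercolumn)  # full deserialization
    peak_materialized = len(all_columns)        # every subcolumn in memory
    result = {n: all_columns[n] for n in wanted if n in all_columns}
    return result, peak_materialized

# Scaled-down stand-in for the 400k-subcolumn supercolumn.
sc = [(f"c{i}", b"v" * 20) for i in range(4000)]
result, peak = read_subcolumns(sc, wanted={"c1", "c2"})
assert len(result) == 2
assert peak == 4000  # all subcolumns materialized for a 2-column read
```

This is why narrowing the slice does not help here: the peak memory is set by the supercolumn's total size, not by the slice width.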
-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com]
Sent: Friday, January 28, 2011 5:32 PM
To: user@cassandra.apache.org
Subject: Re: get_slice OOM on large row
Requesting too much data in a single request is user error. That is
why you have start columns/rows, so you can page through a large set.
Re: get_slice OOM on large row
Posted by Jonathan Ellis <jb...@gmail.com>.
Requesting too much data in a single request is user error. That is
why you have start columns/rows, so you can page through a large set.
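The paging approach described above can be sketched client-agnostically: fetch a bounded slice, then restart the next slice just past the last column name seen. The `get_slice` stand-in below works on an in-memory dict; the helper names and the exclusive-restart trick are illustrative assumptions, not the pycassa API (note that paging within one supercolumn would still hit the deserialization limit in 0.7).

```python
def get_slice(row, column_start="", column_count=100):
    """Stand-in for a bounded get_slice: up to column_count
    (name, value) pairs with name >= column_start, in name order."""
    names = sorted(n for n in row if n >= column_start)
    return [(n, row[n]) for n in names[:column_count]]

def page_all(row, page_size=100):
    """Page through a wide row without ever asking for it all at once."""
    start = ""
    while True:
        page = get_slice(row, column_start=start, column_count=page_size)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return  # short page means we reached the end
        # Restart just past the last name so it is not returned twice.
        start = page[-1][0] + "\x00"

row = {f"col{i:05d}": i for i in range(1234)}
assert len(list(page_all(row, page_size=100))) == 1234
```

Each request touches at most `page_size` columns, so client and server memory stay bounded regardless of row width.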
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com