Posted to user@cassandra.apache.org by Rajat Chopra <rc...@makara.com> on 2011/01/29 01:48:10 UTC

get_slice OOM on large row

Hi!
I am testing the 0.7 release with some offbeat settings to check its behavior:


- Single-node cluster
- key_cache_size: default
- row_cache_size: default
- min_compaction_threshold / max_compaction_threshold: 0 (so compaction is disabled)
- disk_access_mode: standard
- memtable_throughput_in_mb: 2 (yes, two MB only)

Basically I wanted to run the process with the smallest possible memory footprint. Without much hassle (probably because compaction was disabled), I put 400k subcolumns into a single super column of one row in a column family, each column value being about 20 KB. So that one row holds about 8 GB of data.
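
The load loop looked roughly like the minimal pycassa sketch below (not the exact script; the keyspace, column family, row, and column names are made up for illustration):

    import pycassa

    # illustrative names: keyspace 'TestKS', super column family 'SuperCF',
    # one row 'bigrow' holding a single super column 'sc'
    pool = pycassa.ConnectionPool('TestKS', ['localhost:9160'])
    cf = pycassa.ColumnFamily(pool, 'SuperCF')

    payload = 'x' * (20 * 1024)        # ~20 KB per column value
    batch = cf.batch(queue_size=50)    # send mutations to the server in groups of 50
    for i in xrange(400000):
        batch.insert('bigrow', {'sc': {'col%08d' % i: payload}})
    batch.send()                       # flush anything still queued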

Now the issue is that I run into an OOM on any kind of read I attempt, even with -Xmx2G. I am using pycassa as the high-level client.
The trace is pasted below.

Is it because the entire row is loaded even if only some columns are asked for?
Any pointers that would improve my understanding of the limitations and intended usage would be appreciated.

Thanks,
Rajat



Heap dump file created [2147332020 bytes in 16.621 secs]
ERROR 16:38:06,420 Fatal exception in thread Thread[ReadStage:5,5,main]
java.lang.OutOfMemoryError: Java heap space
      at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
      at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
      at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:277)
      at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
      at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:364)
      at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:313)
      at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readSimpleColumns(SSTableNamesIterator.java:144)
      at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
      at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:70)
      at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
      at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:81)
      at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1215)
      at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1107)
      at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1077)
      at org.apache.cassandra.db.Table.getRow(Table.java:384)
      at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63)
      at org.apache.cassandra.service.StorageProxy$weakReadLocalCallable.call(StorageProxy.java:777)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
      at java.lang.Thread.run(Thread.java:636)
ERROR 16:38:06,717 Internal error processing get_slice
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
      at org.apache.cassandra.service.StorageProxy.weakRead(StorageProxy.java:282)
      at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:224)
      at org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:98)
      at org.apache.cassandra.thrift.CassandraServer.getSlice(CassandraServer.java:195)
      at org.apache.cassandra.thrift.CassandraServer.multigetSliceInternal(CassandraServer.java:271)
      at org.apache.cassandra.thrift.CassandraServer.get_slice(CassandraServer.java:233)
      at org.apache.cassandra.thrift.Cassandra$Processor$get_slice.process(Cassandra.java:2699)
      at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
      at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
      at java.lang.Thread.run(Thread.java:636)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
      at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
      at java.util.concurrent.FutureTask.get(FutureTask.java:111)
      at org.apache.cassandra.service.StorageProxy.weakRead(StorageProxy.java:278)
      ... 11 more
Caused by: java.lang.OutOfMemoryError: Java heap space
      at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
      at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
      at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:277)
      at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
      at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:364)
      at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:313)
      at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readSimpleColumns(SSTableNamesIterator.java:144)
      at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
      at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:70)
      at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
      at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:81)
      at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1215)
      at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1107)
      at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1077)
      at org.apache.cassandra.db.Table.getRow(Table.java:384)
      at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:63)
      at org.apache.cassandra.service.StorageProxy$weakReadLocalCallable.call(StorageProxy.java:777)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      ... 3 more


Re: get_slice OOM on large row

Posted by Jonathan Ellis <jb...@gmail.com>.
http://wiki.apache.org/cassandra/CassandraLimitations

"Any request for a subcolumn deserializes _all_ the subcolumns in that
supercolumn, so you want to avoid a data model that requires large
numbers of subcolumns."
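
A common workaround (a hedged sketch of the general technique, not something prescribed by the wiki page; all names below are illustrative) is to flatten the subcolumns into a standard column family, encoding the would-be super column name into the column name, so a slice deserializes only the columns actually requested:

    import pycassa

    pool = pycassa.ConnectionPool('TestKS', ['localhost:9160'])
    flat = pycassa.ColumnFamily(pool, 'FlatCF')   # a standard (non-super) CF

    def put(row, supercol, subcol, value):
        # 'supercol:subcol' plays the role of the old two-level name
        flat.insert(row, {'%s:%s' % (supercol, subcol): value})

    def get_sub_slice(row, supercol, first, last, count=100):
        # slice a bounded range of "subcolumns" within one logical super column
        return flat.get(row,
                        column_start='%s:%s' % (supercol, first),
                        column_finish='%s:%s' % (supercol, last),
                        column_count=count)

This assumes the CF's comparator sorts the concatenated names usefully (e.g., UTF8Type with fixed-width name segments).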

On Fri, Jan 28, 2011 at 7:40 PM, Rajat Chopra <rc...@makara.com> wrote:
> Thanks Jonathan.
> But the read fails in every case: even when the start_column/end_column span covers only 10 columns, and even when column_count is set appropriately. Or did I miss your point?
> The trace seems to suggest that the entire super column is being deserialized.
>
> Rajat

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

RE: get_slice OOM on large row

Posted by Rajat Chopra <rc...@makara.com>.
Thanks Jonathan.
But the read fails in every case: even when the start_column/end_column span covers only 10 columns, and even when column_count is set appropriately. Or did I miss your point?
The trace seems to suggest that the entire super column is being deserialized.
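
For concreteness, the kind of read that fails looks roughly like this (a minimal pycassa sketch with illustrative names, not the exact call):

    import pycassa

    pool = pycassa.ConnectionPool('TestKS', ['localhost:9160'])
    cf = pycassa.ColumnFamily(pool, 'SuperCF')

    # asks for only 10 subcolumns of one super column, with column_count
    # capped to match, yet the node still OOMs: the trace shows the whole
    # super column being deserialized server-side
    cols = cf.get('bigrow', super_column='sc',
                  column_start='col00000000', column_finish='col00000009',
                  column_count=10)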

Rajat

-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com] 
Sent: Friday, January 28, 2011 5:32 PM
To: user@cassandra.apache.org
Subject: Re: get_slice OOM on large row

Requesting too much data in a single request is user error.  That is
why you have start columns/rows, so you can page through a large set.


Re: get_slice OOM on large row

Posted by Jonathan Ellis <jb...@gmail.com>.
Requesting too much data in a single request is user error.  That is
why you have start columns/rows, so you can page through a large set.
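
A minimal sketch of that kind of paging with pycassa's get() (keyspace, CF, and row names are illustrative). Note that, per the CassandraLimitations page quoted earlier on this page, paging helps across standard columns or whole super columns, but not within one huge super column, whose subcolumns are deserialized wholesale:

    import pycassa

    pool = pycassa.ConnectionPool('TestKS', ['localhost:9160'])
    cf = pycassa.ColumnFamily(pool, 'StandardCF')

    def handle(name, value):
        print name, len(value)            # placeholder for real per-column work

    PAGE = 1000
    start = ''                            # '' means "from the beginning of the row"
    while True:                           # assumes the row exists (get() raises NotFoundException otherwise)
        cols = cf.get('bigrow', column_start=start, column_count=PAGE)
        names = cols.keys()
        if start:
            names = names[1:]             # column_start is inclusive; drop the overlap
        for name in names:
            handle(name, cols[name])
        if len(cols) < PAGE:              # a short page means we reached the end of the row
            break
        start = names[-1]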


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com