Posted to user@cassandra.apache.org by Patrik Modesto <pa...@gmail.com> on 2011/08/15 14:13:00 UTC

Internal error processing get_range_slices

Hi,

On our dev cluster of 4 Cassandra 0.7.8 nodes I'm suddenly getting:

ERROR 13:40:50,848 Internal error processing get_range_slices
java.lang.OutOfMemoryError: Java heap space
        at java.util.ArrayList.<init>(ArrayList.java:112)
        at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:480)
        at org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:486)
        at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:2868)
        at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

I run get_range_slices() on all keys with 3 columns named in the Thrift request.

    columnParent.column_family = CATEGORIES_CATEGORY;

    keyRange.start_key         = "";
    keyRange.end_key           = "";
    keyRange.__isset.start_key = true;
    keyRange.__isset.end_key   = true;
    keyRange.count             = std::numeric_limits<int32_t>::max();

    slicePredicate.column_names.push_back(CATEGORIES_CATEGORY_ID);
    slicePredicate.column_names.push_back(CATEGORIES_CATEGORY_NAME);
    slicePredicate.column_names.push_back(CATEGORIES_CATEGORY_PARENT);
    slicePredicate.__isset.column_names = true;

    std::vector<oacassandra::KeySlice> rangeSlices;
    cassandraWrapper->get_range_slices(rangeSlices, columnParent,
        slicePredicate, keyRange, oacassandra::ConsistencyLevel::QUORUM);

There are just 102 rows, each with 6 columns. The maximum row size is
3 379 391 B and the mean row size is 407 756 B. Suddenly Cassandra needs
9GB of heap space to fulfill this get_range_slices call. No cache is
enabled.

What could be the problem here?

Regards,
Patrik

PS: while rereading this email before sending it, I noticed the
keyRange.count = ... line. Is it possible that Cassandra is preallocating
some internal data according to the KeyRange.count parameter?

Re: Internal error processing get_range_slices

Posted by Jonathan Ellis <jb...@gmail.com>.
The count you specify is the worst case, so if you can't even allocate
a List to handle it, you shouldn't be specifying such a high count.
Better to find that out immediately than when your data set grows in
production.
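
One way to keep the count small and still read every row is to page
through the range: request a bounded number of rows, then start the next
request from the last key received. The sketch below only simulates this
against an in-memory std::map; fetch_page, fetch_all and Row are
hypothetical names, not part of the Thrift or Cassandra API, and the
inclusive start key is an assumption of the simulation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Row = std::pair<std::string, std::string>;

// Hypothetical stand-in for a range query: returns at most `count` rows
// whose key is >= start_key, in key order. An empty start_key means
// "from the beginning".
std::vector<Row> fetch_page(const std::map<std::string, std::string>& store,
                            const std::string& start_key, std::size_t count) {
    std::vector<Row> page;
    for (auto it = store.lower_bound(start_key);
         it != store.end() && page.size() < count; ++it) {
        page.push_back(*it);
    }
    return page;
}

// Reads every row using bounded pages instead of one huge count.
std::vector<Row> fetch_all(const std::map<std::string, std::string>& store,
                           std::size_t page_size) {
    // A page of 1 would only ever return the start row again.
    assert(page_size >= 2);

    std::vector<Row> all;
    std::string start_key;  // empty: start of the range
    bool first_page = true;
    while (true) {
        std::vector<Row> page = fetch_page(store, start_key, page_size);
        // The start key is inclusive, so every page after the first
        // begins with the last row already seen; skip it.
        std::size_t skip = first_page ? 0 : 1;
        for (std::size_t i = skip; i < page.size(); ++i) {
            all.push_back(page[i]);
        }
        if (page.size() < page_size) break;  // short page: range exhausted
        start_key = page.back().first;
        first_page = false;
    }
    return all;
}
```

With a real client the body of fetch_page would be the range query
itself, with the count set to the page size rather than to the largest
int32_t value.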

On Mon, Aug 15, 2011 at 8:15 AM, Patrik Modesto
<pa...@gmail.com> wrote:
> On Mon, Aug 15, 2011 at 15:09, Jonathan Ellis <jb...@gmail.com> wrote:
>> On Mon, Aug 15, 2011 at 7:13 AM, Patrik Modesto
>> <pa...@gmail.com> wrote:
>>> PS: while rereading this email before sending it, I noticed the
>>> keyRange.count = ... line. Is it possible that Cassandra is preallocating
>>> some internal data according to the KeyRange.count parameter?
>>
>> That's exactly what it does.
>
> Ok. But is this pre-allocating really needed? Can't Cassandra deduce
> that it doesn't need that much space?
>
> Regards,
> P.
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Internal error processing get_range_slices

Posted by Patrik Modesto <pa...@gmail.com>.
On Mon, Aug 15, 2011 at 15:09, Jonathan Ellis <jb...@gmail.com> wrote:
> On Mon, Aug 15, 2011 at 7:13 AM, Patrik Modesto
> <pa...@gmail.com> wrote:
>> PS: while rereading this email before sending it, I noticed the
>> keyRange.count = ... line. Is it possible that Cassandra is preallocating
>> some internal data according to the KeyRange.count parameter?
>
> That's exactly what it does.

Ok. But is this pre-allocating really needed? Can't Cassandra deduce
that it doesn't need that much space?

Regards,
P.

Re: Internal error processing get_range_slices

Posted by Jonathan Ellis <jb...@gmail.com>.
On Mon, Aug 15, 2011 at 7:13 AM, Patrik Modesto
<pa...@gmail.com> wrote:
> PS: while rereading this email before sending it, I noticed the
> keyRange.count = ... line. Is it possible that Cassandra is preallocating
> some internal data according to the KeyRange.count parameter?

That's exactly what it does.
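
For a rough sense of scale: the stack trace fails inside the ArrayList
constructor, and a list sized for the largest int32_t count needs one
reference slot per possible row. The arithmetic can be sketched as
follows. The per-reference sizes (4 or 8 bytes) are assumptions about the
JVM in use, and the helper names are made up for illustration.

```cpp
#include <cstdint>
#include <limits>

// Bytes needed for a reference array of `capacity` slots, ignoring the
// array header. `bytes_per_ref` is an assumption about the JVM: 4 with
// compressed object pointers, 8 without.
constexpr std::uint64_t backing_array_bytes(std::int32_t capacity,
                                            std::uint64_t bytes_per_ref) {
    return static_cast<std::uint64_t>(capacity) * bytes_per_ref;
}

// Converts a byte count to GiB (2^30 bytes) for readability.
constexpr double to_gib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}

constexpr std::int32_t kMaxCount = std::numeric_limits<std::int32_t>::max();

// 2147483647 slots * 4 bytes: just under 8 GiB.
static_assert(backing_array_bytes(kMaxCount, 4) == 8589934588ULL,
              "4-byte references");
// 2147483647 slots * 8 bytes: just under 16 GiB.
static_assert(backing_array_bytes(kMaxCount, 8) == 17179869176ULL,
              "8-byte references");
```

Under the 4-byte assumption the empty list alone is on the order of the
9GB of heap reported above, before a single row has been read.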

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com