Posted to user@cassandra.apache.org by Dan Kinder <dk...@turnitin.com> on 2015/02/03 00:25:37 UTC

Re: large range read in Cassandra

For the benefit of others: I ended up finding out that the CQL library I
was using (https://github.com/gocql/gocql) currently defaults to no paging
unless a page size is set, so Cassandra was trying to pull all rows of
the partition into memory at once. Setting the page size to a reasonable
number seems to have done the trick.
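
To illustrate why the page size matters, here is a minimal, self-contained
Go sketch. It is not gocql code and does not talk to Cassandra: fetchPage
and scanPartition are made-up names that simulate one server round trip and
a full scan of one wide partition, and the 100000-row partition is an
arbitrary assumption. In gocql itself the relevant settings are, as far as
I know, the PageSize field on ClusterConfig and the PageSize(n) method on
Query; check the library docs for the version you run.

```go
package main

import "fmt"

// fetchPage is a hypothetical stand-in for one server round trip: it returns
// at most pageSize rows starting at cursor, plus the cursor for the next page.
// A pageSize of 0 or less means "no paging": everything remaining comes back
// in a single response.
func fetchPage(partition []int, cursor, pageSize int) (rows []int, next int) {
	end := len(partition)
	if pageSize > 0 && cursor+pageSize < end {
		end = cursor + pageSize
	}
	return partition[cursor:end], end
}

// scanPartition reads the whole partition page by page and reports how many
// rows were read in total and the largest number held in memory at once.
func scanPartition(partition []int, pageSize int) (total, peak int) {
	cursor := 0
	for cursor < len(partition) {
		rows, next := fetchPage(partition, cursor, pageSize)
		if len(rows) > peak {
			peak = len(rows)
		}
		total += len(rows)
		cursor = next
	}
	return total, peak
}

func main() {
	partition := make([]int, 100000) // pretend this is one wide partition
	for _, size := range []int{0, 1000} {
		total, peak := scanPartition(partition, size)
		fmt.Printf("pageSize=%d rows=%d peak=%d\n", size, total, peak)
	}
}
```

With no paging the peak equals the full partition (100000 rows in one
response); with a page size of 1000 the same rows are read but never more
than 1000 are held at a time, which is the behavior I had expected from
auto-paging in the first place.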

On Tue, Nov 25, 2014 at 2:54 PM, Dan Kinder <dk...@turnitin.com> wrote:

> Thanks, very helpful Rob, I'll watch for that.
>
> On Tue, Nov 25, 2014 at 11:45 AM, Robert Coli <rc...@eventbrite.com>
> wrote:
>
>> On Tue, Nov 25, 2014 at 10:45 AM, Dan Kinder <dk...@turnitin.com>
>> wrote:
>>
>>> To be clear, I expect this range query to take a long time and perform
>>> relatively heavy I/O. What I expected Cassandra to do was use auto-paging (
>>> https://issues.apache.org/jira/browse/CASSANDRA-4415,
>>> http://stackoverflow.com/questions/17664438/iterating-through-cassandra-wide-row-with-cql3)
>>> so that we aren't literally pulling the entire thing in. Am I
>>> misunderstanding this use case? Could you clarify why exactly it would slow
>>> way down? It seems like with each read it should be doing a simple range
>>> read from one or two sstables.
>>>
>>
>> If you're paging through a single partition, that's likely to be fine.
>> When you said "range reads ... over rows" my impression was you were
>> talking about attempting to page through millions of partitions.
>>
>> With that confusion cleared up, the likely explanation for lack of
>> availability in your case is heap pressure/GC time. Look for GCs around
>> that time. Also, if you're using authentication, make sure that your
>> authentication keyspace has a replication factor greater than 1.
>>
>> =Rob
>>



-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkinder@turnitin.com