Posted to user@cassandra.apache.org by Anand Somani <me...@gmail.com> on 2011/07/05 17:23:09 UTC

Problems Iterating over tokens in > 0.7.5

Hi,

I am using Thrift and the get_range_slices call with a token range, with the
RandomPartitioner. I have only tried this on versions > 0.7.5.
This used to work for me in 0.6.4 and earlier, but I notice that it does
not work for me anymore. The need is to iterate over a token range to do
some bookkeeping.
The logic is:

   1. Get the TokenRanges from describe_ring
   2. Then, for each range:
      1. Set the start and end token
      2. Get a batch of rows using get_range_slices
      3. Use the last token from the batch to set the start_token and
      repeat (get the next batch). Iterate until there are no more rows
      to get (or the last token from the new batch is the same as the
      last from the previous batch)
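
The steps above can be sketched as follows. This is a minimal, self-contained
Python illustration, not the actual client code: FakeRing and
fake_get_range_slices are hypothetical in-memory stand-ins for a cluster and
the Thrift call, and token_for assumes that RandomPartitioner derives a token
as the absolute value of the MD5 digest read as a signed integer. It also
assumes that token ranges are start-exclusive and end-inclusive, and that a
range whose start and end tokens are equal covers the whole ring, which is why
the loop stops once the last token returned equals the end token instead of
issuing another call.

```python
import hashlib
from bisect import bisect_right


def token_for(key: bytes) -> int:
    # Assumption: RandomPartitioner-style token, abs(MD5 as a signed big integer).
    return abs(int.from_bytes(hashlib.md5(key).digest(), "big", signed=True))


class FakeRing:
    """Hypothetical in-memory stand-in for a cluster (not the Thrift client)."""

    def __init__(self, keys):
        self.rows = sorted((token_for(k), k) for k in keys)
        self.tokens = [t for t, _ in self.rows]

    def fake_get_range_slices(self, start_token, end_token, count):
        # Assumed semantics: (start_token, end_token], wrapping around the ring
        # when start_token >= end_token; start == end means the full ring.
        lo = bisect_right(self.tokens, start_token)
        hi = bisect_right(self.tokens, end_token)
        if start_token < end_token:
            hits = self.rows[lo:hi]
        else:
            hits = self.rows[lo:] + self.rows[:hi]
        return hits[:count]


def iterate_range(ring, start_token, end_token, batch_size):
    """Page through one token range, returning every key exactly once."""
    keys = []
    start = start_token
    while True:
        batch = ring.fake_get_range_slices(start, end_token, batch_size)
        if not batch:
            break
        keys.extend(k for _, k in batch)
        last_token = batch[-1][0]
        # Stop on a short batch, or when the range end has been reached:
        # asking for (end_token, end_token] would mean the whole ring again.
        if len(batch) < batch_size or last_token == end_token:
            break
        start = last_token  # start is exclusive, so the last row is not repeated
    return keys
```

With n rows spread over several ranges, summing len(iterate_range(...)) over
all ranges should give n for any batch size, including batch sizes smaller
than n.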

This works when, in a test, I insert n records and then iterate with a
batch size m such that m > n. As soon as I use m < n, I get an incorrect
count or an infinite loop where the range seems to repeat.

Has anybody seen this issue, or am I using it incorrectly for newer versions
of Cassandra? I will also look up how this is done in Hector, but in the
meantime, if somebody has seen this behavior, please do respond.

Thanks
Anand

Re: Problems Iterating over tokens in > 0.7.5

Posted by Aaron Morton <aa...@thelastpickle.com>.
If you still have problems, send through some details of where you get incorrect results.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
