Posted to user@cassandra.apache.org by Anand Somani <me...@gmail.com> on 2010/11/10 16:05:48 UTC

Range queries using token instead of key

Hi,

I am trying to iterate over the entire dataset to calculate some
information. My approach is to go directly to the node that owns each
data range. Here is the route I am following:

   - Get the TokenRanges using describe_ring.
   - For each TokenRange, pick a node and get all data from that node
   (so talk directly to that node for local data) using get_range_slices()
   with a KeyRange that has a start and end token. I want to get about N
   tokens at a time.
   - I want to use a paging approach for this, but I cannot find a way
   to get the token for the last KeySlice. The only thing I can find is the
   key. Is there a way to get the token given a key? As per some suggestions,
   I can take the MD5 of the last key and use that as the starting token for
   the next query. Would that work?

Also, is there a better way of doing this? The data per row is very small.
This looks like a Hadoop kind of job, but I am trying to avoid Hadoop since
I have no other use for it and this operation will be infrequent.

I am using 0.6.6, RandomPartitioner.

Thanks
Anand

Re: Range queries using token instead of key

Posted by Edward Capriolo <ed...@gmail.com>.

On Wed, Nov 10, 2010 at 10:05 AM, Anand Somani <me...@gmail.com> wrote:
> Hi,
>
> I am trying to iterate over the entire dataset to calculate some
> information. Now the way I am trying to do this is by going directly to the
> node that has a data range, so here is the route I am following
>
> get TokenRange using - describe_ring
> then for each tokenRange pick a node and get all data from that node (so
> talk directly to that node for local data) - using get_range_slices () and
> using KeyRange with start and end token. I want to get about N tokens at a
> time.
> I want to use paging approach for this, but I cannot seem to find a way to
> get the token for my last keyslice? The only thing I can find is key, now is
> there way to get token given a key? As per some suggestions I can do the md5
> on the last key and use that as the starting token for the next query, would
> that work?
>
> Also is there a better way of doing this? The data per row is very small.
> This looks like a hadoop kind of a job, but am trying to avoid hadoop since
> have no other use for it and this operation will be infrequent.
>
> I am using 0.6.6, RandomPartitioner.
>
> Thanks
> Anand
>

You should take the last key from your keyslice and pass it to
FBUtilities.hash(key) to get its token.
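
For anyone who wants to compute the token on the client without pulling in
the Cassandra jar, here is a minimal sketch of what I understand
FBUtilities.hash to do in 0.6.x under RandomPartitioner: take the MD5 of the
key bytes, read it as a signed BigInteger, and take the absolute value. The
class name TokenForKey and the method tokenFor are made up for illustration,
and the UTF-8 encoding of the key is an assumption, so check both against the
source of the version you are running.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Cassandra.
public class TokenForKey {

    // Assumption: mirrors FBUtilities.hash(key) as used by RandomPartitioner,
    // i.e. abs(signed BigInteger(MD5(key bytes))).
    static BigInteger tokenFor(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            // Assumption: keys are encoded as UTF-8 before hashing.
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            // The digest is interpreted as a signed two's-complement number,
            // then made non-negative.
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // The last key of the previous page; placeholder value for the demo.
        String lastKey = args.length > 0 ? args[0] : "example-row-key";
        // Decimal string form, suitable for the start_token of the next KeyRange.
        System.out.println(tokenFor(lastKey).toString());
    }
}
```

The decimal string printed here would go into start_token for the next page,
with end_token left at the end of the TokenRange. As far as I know, token
ranges are start-exclusive, so the last row of the previous page should not
come back again, but verify that on your version.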

Edward