Posted to user@cassandra.apache.org by Koert Kuipers <Ko...@diamondnotch.com> on 2011/01/12 02:45:22 UTC

how to do a get_range_slices where all keys start with same string

I would like to do a get_range_slices for all keys (which are strings) that start with the same substring x (for example "com.google"). How do I do that?
start_key = x and end_key = x doesn't seem to do the job...
thanks koert

Re: how to do a get_range_slices where all keys start with same string

Posted by Stephen Connolly <st...@gmail.com>.

or set the end key to "com.googlf"
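
The "com.googlf" end key comes from bumping the last character of the prefix by one. A minimal Java sketch of that calculation is below. The class and method names are hypothetical, and it assumes keys compare in plain character order (as String.compareTo does), which may not match the collation of every order-preserving partitioner.

```java
class PrefixRange {
    // Returns the smallest string that sorts after every string starting
    // with `prefix`, assuming char-by-char ordering (String.compareTo).
    // Returns "" when no such bound exists; an empty end key is commonly
    // used to mean "no upper limit".
    static String endKeyFor(String prefix) {
        int i = prefix.length() - 1;
        // Trailing characters that are already at the maximum value cannot
        // be incremented, so drop them and bump the character before them.
        while (i >= 0 && prefix.charAt(i) == Character.MAX_VALUE) {
            i--;
        }
        if (i < 0) {
            return "";
        }
        return prefix.substring(0, i) + (char) (prefix.charAt(i) + 1);
    }

    public static void main(String[] args) {
        System.out.println(endKeyFor("com.google"));
    }
}
```

If the end key of a range is treated as inclusive, a row named exactly "com.googlf" would come back as well, so a client would probably still want a startsWith check on each returned key.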

On 12 January 2011 02:49, Aaron Morton <aa...@thelastpickle.com> wrote:

> If you were using OPP and get_range_slices then set the start_key to be
> "com.google" and the end_key to be "". Get it in slices of say 1,000 (use the
> last key read as the next start_key) and when you see the first key that
> does not start with com.google, stop making calls.
>
> If you move the data from rows to columns, you can use the same approach.
>
> Aaron
>
>
> On 12 Jan, 2011,at 03:25 PM, Roshan Dawrani <ro...@gmail.com>
> wrote:
>
> On Wed, Jan 12, 2011 at 7:41 AM, Koert Kuipers <
> Koert.Kuipers@diamondnotch.com> wrote:
>
>> Ok I see get_range_slice is really only useful for paging with RP..
>>
>> So if I were using OPP (which I am not) and I wanted all keys starting
>> with "com.google", what should my start_key and end_key be?
>>
>
> I think you can't. It's the columns that are sorted, and not the rows (if u
> r not using OPP). With your "com.google....." data arranged in columns
> instead of rows, you should be able to specify start_col, end_col to filter
> it.
>
>
>
>

Re: how to do a get_range_slices where all keys start with same string

Posted by Arijit Mukherjee <ar...@gmail.com>.

I have a follow-up question on this.

I have a super column family like this:

<ColumnFamily Name="EventSpace" CompareWith="TimeUUIDType"
CompareSubcolumnsWith="BytesType" ColumnType="Super"/>

I store some events keyed by a subscriber id, and for each such "row",
I have a number of super columns which are keyed by an event time
stamp. For example:

subscriber1 {
     ts11 { some columns}
     ts12 { some columns}
     ....
     ts1n { some columns}
}
subscriber2 {
     ts21 {...}
     ...
}

and so on.

What I want to do is to find all events within a period (given T, the
period starts from time (T-1 min) to (T+1 min)) for each subscriber,
given the subscriber ID and the starting time T. I used this piece of
code:

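// Slice one subscriber's super columns between (T - 1 min) and (T + 1 min).
SlicePredicate sliceP = new SlicePredicate();
SliceRange range = new SliceRange();
range.setStart(getUUIDForTimeStamp(T-1));   // lower bound of the window
range.setFinish(getUUIDForTimeStamp(T+1));  // upper bound of the window
// count defaults to 100 in the Thrift IDL; raise it if a window can hold more events
sliceP.setSlice_range(range);
ColumnParent parent = new ColumnParent(CF_NAME);
List<ColumnOrSuperColumn> result = client.get_slice(KS, subscriberID,
parent, sliceP, ConsistencyLevel.ALL);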

I have helper routines to add/subtract minutes/hours/seconds to a time
stamp, and to convert that to a UUID and back. But is the approach
correct?
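
For readers without such helpers, here is one way the window bounds could be built as version 1 (time-based) UUIDs with java.util.UUID. It is only a sketch: the class and method names are made up for illustration and are not the poster's getUUIDForTimeStamp. It assumes T is in Unix milliseconds and that the column comparator orders time UUIDs by their embedded timestamp.

```java
import java.util.UUID;

class TimeUuidBounds {
    // Number of 100-ns intervals between the UUID epoch (1582-10-15)
    // and the Unix epoch (1970-01-01).
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;
    // Least significant halves with the RFC 4122 variant bits (10) set:
    // the lowest and highest possible clock sequence / node values.
    static final long MIN_LSB = 0x8000000000000000L;
    static final long MAX_LSB = 0xBFFFFFFFFFFFFFFFL;

    // Builds a version 1 UUID carrying the given Unix time in milliseconds.
    static UUID timeUuid(long unixMillis, long lsb) {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET;
        long msb = (ts & 0xFFFFFFFFL) << 32          // time_low
                | ((ts >>> 32) & 0xFFFFL) << 16      // time_mid
                | 0x1000L                            // version 1
                | ((ts >>> 48) & 0x0FFFL);           // time_hi
        return new UUID(msb, lsb);
    }

    // Lower bound of the window: one minute before T.
    static UUID windowStart(long tMillis) {
        return timeUuid(tMillis - 60_000L, MIN_LSB);
    }

    // Upper bound of the window: one minute after T.
    static UUID windowFinish(long tMillis) {
        return timeUuid(tMillis + 60_000L, MAX_LSB);
    }

    public static void main(String[] args) {
        long t = 1294796722000L; // example instant in Unix milliseconds
        System.out.println(windowStart(t) + " .. " + windowFinish(t));
    }
}
```

The two UUIDs would still need to be serialized to the 16-byte form the client expects before being passed as the slice start and finish.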

Regards
Arijit



On 12 January 2011 08:19, Aaron Morton <aa...@thelastpickle.com> wrote:
>
> If you were using OPP and get_range_slices then set the start_key to be "com.google" and the end_key to be "". Get it in slices of say 1,000 (use the last key read as the next start_key) and when you see the first key that does not start with com.google, stop making calls.
> If you move the data from rows to columns, you can use the same approach.
> Aaron
>
> On 12 Jan, 2011,at 03:25 PM, Roshan Dawrani <ro...@gmail.com> wrote:
>
> On Wed, Jan 12, 2011 at 7:41 AM, Koert Kuipers <Ko...@diamondnotch.com> wrote:
>>
>> Ok I see get_range_slice is really only useful for paging with RP..
>>
>> So if I were using OPP (which I am not) and I wanted all keys starting with "com.google", what should my start_key and end_key be?
>
> I think you can't. It's the columns that are sorted, and not the rows (if u r not using OPP). With your "com.google....." data arranged in columns instead of rows, you should be able to specify start_col, end_col to filter it.
>
>
>
>
>
>


--
"And when the night is cloudy,
There is still a light that shines on me,
Shine on until tomorrow, let it be."

Re: how to do a get_range_slices where all keys start with same string

Posted by Aaron Morton <aa...@thelastpickle.com>.

If you were using OPP and get_range_slices then set the start_key to be "com.google" and the end_key to be "". Get it in slices of say 1,000 (use the last key read as the next start_key) and when you see the first key that does not start with com.google, stop making calls.
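
The paging loop described above can be sketched in plain Java. This is an illustration only: a sorted in-memory set stands in for rows stored under an order-preserving partitioner, and rangeSlice is a hypothetical stand-in for a get_range_slices call with an empty end key, not the Thrift API itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class PrefixScan {
    // Stand-in for a range query: up to `count` keys >= startKey, in key order.
    static List<String> rangeSlice(NavigableSet<String> rows, String startKey, int count) {
        List<String> page = new ArrayList<>();
        for (String key : rows.tailSet(startKey, true)) {
            if (page.size() == count) break;
            page.add(key);
        }
        return page;
    }

    // Collects every key starting with `prefix`, one page at a time.
    static List<String> keysWithPrefix(NavigableSet<String> rows, String prefix, int pageSize) {
        if (pageSize < 2) {
            // The start key is inclusive, so a page of one would never advance.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> out = new ArrayList<>();
        String start = prefix;
        while (true) {
            List<String> page = rangeSlice(rows, start, pageSize);
            for (String key : page) {
                if (!key.startsWith(prefix)) return out; // walked past the prefix
                // Later pages begin with the last key already read; skip the repeat.
                if (out.isEmpty() || !out.get(out.size() - 1).equals(key)) out.add(key);
            }
            if (page.size() < pageSize) return out;      // no more rows at all
            start = page.get(page.size() - 1);           // last key read is the next start
        }
    }

    public static void main(String[] args) {
        NavigableSet<String> rows = new TreeSet<>(Arrays.asList(
                "com.apple", "com.google", "com.google.mail", "com.google.maps", "com.yahoo"));
        System.out.println(keysWithPrefix(rows, "com.google", 2));
    }
}
```

The duplicate check is there because reusing the last key as the next start key returns that key a second time.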

If you move the data from rows to columns, you can use the same approach. 

Aaron


On 12 Jan, 2011,at 03:25 PM, Roshan Dawrani <ro...@gmail.com> wrote:

On Wed, Jan 12, 2011 at 7:41 AM, Koert Kuipers <Ko...@diamondnotch.com> wrote:
Ok I see get_range_slice is really only useful for paging with RP...

So if I were using OPP (which I am not) and I wanted all keys starting with "com.google", what should my start_key and end_key be?

I think you can't. It's the columns that are sorted, and not the rows (if u r not using OPP). With your "com.google....." data arranged in columns instead of rows, you should be able to specify start_col, end_col to filter it.

Re: how to do a get_range_slices where all keys start with same string

Posted by Roshan Dawrani <ro...@gmail.com>.

On Wed, Jan 12, 2011 at 7:41 AM, Koert Kuipers <
Koert.Kuipers@diamondnotch.com> wrote:

> Ok I see get_range_slice is really only useful for paging with RP...
>
> So if I were using OPP (which I am not) and I wanted all keys starting with
> "com.google", what should my start_key and end_key be?
>

I think you can't. It's the columns that are sorted, and not the rows (if u
r not using OPP). With your "com.google....." data arranged in columns
instead of rows, you should be able to specify start_col, end_col to filter
it.

RE: how to do a get_range_slices where all keys start with same string

Posted by Koert Kuipers <Ko...@diamondnotch.com>.

Ok I see get_range_slice is really only useful for paging with RP...

So if I were using OPP (which I am not) and I wanted all keys starting with "com.google", what should my start_key and end_key be?

-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com] 
Sent: Tuesday, January 11, 2011 9:02 PM
To: user
Subject: Re: how to do a get_range_slices where all keys start with same string

http://wiki.apache.org/cassandra/FAQ#range_rp

also, start==end==x means "give me back exactly row x, if it exists."
IF you were using OPP you'd need end=y.

On Tue, Jan 11, 2011 at 7:45 PM, Koert Kuipers
<Ko...@diamondnotch.com> wrote:
> I would like to do a get_range_slices for all keys (which are strings) that
> start with the same substring x (for example "com.google"). How do I do
> that?
>
> start_key = x and end_key = x doesn't seem to do the job...
>
> thanks koert
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: how to do a get_range_slices where all keys start with same string

Posted by Jonathan Ellis <jb...@gmail.com>.

http://wiki.apache.org/cassandra/FAQ#range_rp

also, start==end==x means "give me back exactly row x, if it exists."
IF you were using OPP you'd need end=y.

On Tue, Jan 11, 2011 at 7:45 PM, Koert Kuipers
<Ko...@diamondnotch.com> wrote:
> I would like to do a get_range_slices for all keys (which are strings) that
> start with the same substring x (for example “com.google”). How do I do
> that?
>
> start_key = x and end_key = x doesn’t seem to do the job…
>
> thanks koert
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: how to do a get_range_slices where all keys start with same string

Posted by Tyler Hobbs <ty...@riptano.com>.

That type of operation only works (directly) when using an
OrderPreservingPartitioner.  There are a lot of downsides to OPP:

http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/

You can instead order your keys alphabetically as column names in a row (or
multiple rows, split up by some length of prefix).  You perform a get_slice()
on that row (or rows) and then use the column names as keys for a
multiget().
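
A rough Java model of this index-row idea is shown below. It is a sketch under stated assumptions: in-memory collections stand in for Cassandra, the class and method names are hypothetical, and the prefix is assumed to be non-empty with a last character that can be incremented.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeSet;

class KeyIndexSketch {
    // One wide "index" row: its column names are the data row keys,
    // kept sorted the way a column comparator would sort them.
    final NavigableSet<String> indexRow = new TreeSet<>();
    // Data rows, with no useful key ordering (as under RandomPartitioner).
    final Map<String, String> rows = new HashMap<>();

    void insert(String key, String value) {
        rows.put(key, value);
        indexRow.add(key); // keep the index row in step with the data rows
    }

    // Models a column slice on the index row followed by a multiget on the rows.
    Map<String, String> getByPrefix(String prefix) {
        if (prefix.isEmpty()) {
            throw new IllegalArgumentException("prefix must not be empty");
        }
        int last = prefix.length() - 1;
        // Exclusive upper bound: the prefix with its last character bumped by one.
        String finish = prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
        Map<String, String> result = new LinkedHashMap<>();
        for (String key : indexRow.subSet(prefix, true, finish, false)) {
            result.put(key, rows.get(key));
        }
        return result;
    }

    public static void main(String[] args) {
        KeyIndexSketch store = new KeyIndexSketch();
        store.insert("com.google.maps", "maps data");
        store.insert("com.apple", "apple data");
        store.insert("com.google", "search data");
        System.out.println(store.getByPrefix("com.google"));
    }
}
```

The cost of this layout is that every write touches two places, so the index row and the data rows can drift apart if one of the writes fails.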

It's also possible that you could avoid this step and just store the data
directly in the sorted rows.

- Tyler

On Tue, Jan 11, 2011 at 7:45 PM, Koert Kuipers <
Koert.Kuipers@diamondnotch.com> wrote:

>  I would like to do a get_range_slices for all keys (which are strings)
> that start with the same substring x (for example “com.google”). How do I do
> that?
>
> start_key = x abd end_key = x doesn’t seem to do the job…
>
> thanks koert
>
>
>