Posted to user@cassandra.apache.org by Narendra Sharma <na...@gmail.com> on 2010/12/03 01:00:17 UTC

Fetch a SuperColumn based on value of column

Hi,

My schema has a row that has thousands of Super Columns. The size of each
super column is around 500B (20 columns). I need to query 1 SuperColumn
based on the value of one of its columns. Something like

SELECT SuperColumn FROM Row WHERE SuperColumn.column="value"

Questions:
1. Is this possible with current Cassandra APIs? If yes, could you please
show with a sample.
2. How would such a query perform if the number of SuperColumns is high (>
10K)?

Cassandra version 0.7.

Thanks,
Naren

Re: Fetch a SuperColumn based on value of column

Posted by Aaron Morton <aa...@thelastpickle.com>.

Ah, sounds like you could change your data model.

Perhaps using a Standard CF with 0.7 secondary indexes would suit you.

Or if your code knows the values of both attributes, just use them as the key and get all the data for the row: one simple lookup. It's OK to denormalise your data to support queries.
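
As a rough illustration of the "use both attributes as the key" idea, here is a minimal sketch that models a standard column family as a plain Python dict. It does not use the Cassandra client API; the names objects_by_attrs, make_key, put_object and get_object, the ":" separator, and the sample values are all assumptions made for the example.

```python
# In-memory stand-in for a standard column family:
# row key -> {column name: value}. A model only, not the Cassandra API.
objects_by_attrs = {}


def make_key(attr_a, attr_b):
    # Hypothetical key format: both attribute values joined by a separator
    # that is assumed never to occur inside either value.
    return f"{attr_a}:{attr_b}"


def put_object(attr_a, attr_b, columns):
    # One row per (attr_a, attr_b) pair; the row holds a full copy of the data.
    objects_by_attrs[make_key(attr_a, attr_b)] = dict(columns)


def get_object(attr_a, attr_b):
    # A single key lookup returns every column of the object, or None.
    return objects_by_attrs.get(make_key(attr_a, attr_b))


put_object("alice", "order-17", {"status": "shipped", "total": "42.00"})
found = get_object("alice", "order-17")
missing = get_object("alice", "order-99")
```

This only helps when the caller has both values at query time; searching by one attribute alone still needs an index of some kind.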

If you want more help, please provide some more background on the problem you're solving.

Cheers
Aaron

On 3/12/2010, at 6:58 PM, Narendra Sharma <na...@gmail.com> wrote:

> Thanks Aaron!
> 
> The first request requires you to know the SuperColumn name. In my case I don't know the SuperColumn name; if I did, I could simply read the super column. I need to find the SuperColumn that contains a given column with a given value.
> The use case is that the application allows querying an object by two attributes. I have made one of the attributes the SuperColumn name. I need to keep the second attribute as a subcolumn in the super column. Now I need to search by subcolumn.
> I think the only option is to maintain another CF with the second attribute as the column name and the name of the super column in the current CF as the value. Is there any better way to handle this?
> 
> Thanks,
> Naren

Re: Fetch a SuperColumn based on value of column

Posted by Narendra Sharma <na...@gmail.com>.

Thanks Aaron!

The first request requires you to know the SuperColumn name. In my case I
don't know the SuperColumn name; if I did, I could simply read the super
column. I need to find the SuperColumn that contains a given column with a
given value.
The use case is that the application allows querying an object by two
attributes. I have made one of the attributes the SuperColumn name. I need
to keep the second attribute as a subcolumn in the super column. Now I need
to search by subcolumn.
I think the only option is to maintain another CF with the second attribute
as the column name and the name of the super column in the current CF as the
value. Is there any better way to handle this?

Thanks,
Naren

On Thu, Dec 2, 2010 at 5:48 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> You can use column and super column names with the get_slice() function
> without 0.7 secondary indexes. I'm assuming that the original query was to
> test for the existence of a column by name.
>
> In the case below, retrieving the full super column would require two
> requests.
>
> First, to test the condition, call get_slice with a ColumnParent that
> specifies the CF and the super column, and a SlicePredicate whose
> column_names[] contains the name of the column you want. This query
> returns only that one column.
>
> If you then wanted to get all columns in the super column you would make
> another request.
>
> If making two requests is a pain or too slow, consider changing the data
> model to better support the requests you need to make.
>
> AFAIK a lot of super columns will not impact performance any more than a
> lot of columns. There are, however, limitations on the number of columns in
> a super column: http://wiki.apache.org/cassandra/CassandraLimitations
> Hope that helps.
> Aaron

Re: Fetch a SuperColumn based on value of column

Posted by Aaron Morton <aa...@thelastpickle.com>.

You can use column and super column names with the get_slice() function without 0.7 secondary indexes. I'm assuming that the original query was to test for the existence of a column by name. 

In the case below, retrieving the full super column would require two requests.

First, to test the condition, call get_slice with a ColumnParent that specifies the CF and the super column, and a SlicePredicate whose column_names[] contains the name of the column you want. This query returns only that one column.

If you then wanted to get all columns in the super column you would make another request. 
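
The two-request pattern can be sketched roughly as follows. This is a toy in-memory model, not the Thrift get_slice call itself: super_cf, toy_get_slice and fetch_if_column_exists are hypothetical names, and the nested dict merely stands in for a super column family.

```python
# Toy model of a super column family:
# row key -> super column name -> {column name: value}.
super_cf = {
    "row1": {
        "sc-001": {"owner": "alice", "size": "10"},
        "sc-002": {"owner": "bob", "size": "20"},
    }
}


def toy_get_slice(row_key, super_column, column_names=None):
    """Return the columns of one super column, limited to column_names if given."""
    columns = super_cf.get(row_key, {}).get(super_column, {})
    if column_names is None:
        return dict(columns)
    return {name: columns[name] for name in column_names if name in columns}


def fetch_if_column_exists(row_key, super_column, column_name):
    # Request 1: ask for just the one column to test the condition.
    probe = toy_get_slice(row_key, super_column, [column_name])
    if not probe:
        return None
    # Request 2: the column exists, so fetch the whole super column.
    return toy_get_slice(row_key, super_column)
```

Note that both calls take the super column name as input, so this pattern applies only when that name is already known.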

If making two requests is a pain or too slow, consider changing the data model to better support the requests you need to make. 

AFAIK a lot of super columns will not impact performance any more than a lot of columns. There are, however, limitations on the number of columns in a super column: http://wiki.apache.org/cassandra/CassandraLimitations
 
Hope that helps. 
Aaron


On 03 Dec, 2010, at 01:10 PM, Nick Santini <ni...@kaseya.com> wrote:

actually, the solution would be something like my last mail, but pointing to the name of the super column and the row key


Nicolas Santini
Director of Cloud Computing
Auckland - New Zealand
(64) 09 914 9426 ext 2629
(64) 021 201 3672




Re: Fetch a SuperColumn based on value of column

Posted by Nick Santini <ni...@kaseya.com>.

actually, the solution would be something like my last mail, but pointing to
the name of the super column and the row key
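
A minimal sketch of such a hand-maintained inverted index, using plain Python dicts in place of the two column families. Nothing here is Cassandra client code: main_cf, index_cf, INDEXED_COLUMN, put_super_column and find_super_columns are hypothetical names, and "email" is an arbitrary choice of indexed subcolumn.

```python
# Toy models, not the Cassandra client API.
# Main super column family: row key -> super column name -> {column: value}.
main_cf = {}
# Index column family: indexed value -> set of (row key, super column name).
index_cf = {}

INDEXED_COLUMN = "email"  # hypothetical name of the subcolumn being searched


def put_super_column(row_key, sc_name, columns):
    row = main_cf.setdefault(row_key, {})
    old = row.get(sc_name, {}).get(INDEXED_COLUMN)
    new = columns.get(INDEXED_COLUMN)
    # Drop the stale index entry if the indexed value changed.
    if old is not None and old != new:
        index_cf.get(old, set()).discard((row_key, sc_name))
    row[sc_name] = dict(columns)
    # Point the index at both the row key and the super column name.
    if new is not None:
        index_cf.setdefault(new, set()).add((row_key, sc_name))


def find_super_columns(value):
    # Lookup 1 reads the index row; lookup 2 reads each referenced super column.
    return [main_cf[r][s] for (r, s) in sorted(index_cf.get(value, set()))]
```

In this sketch the application itself keeps the index in step with the data on every write, which is the cost of implementing the index by hand.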


Nicolas Santini
Director of Cloud Computing
Auckland - New Zealand
(64) 09 914 9426 ext 2629
(64) 021 201 3672



On Fri, Dec 3, 2010 at 1:08 PM, Nick Santini <ni...@kaseya.com> wrote:

> Hi,
> as I was told in reply to my mail, secondary indexes for super column
> families are not supported yet, so you have to implement your own
>
> easy way: keep another column family where the row key is the value of your
> field and the columns are the row keys of your super column family
>
> (inverted index)
>
>
> Nicolas Santini
> Director of Cloud Computing
> Auckland - New Zealand
> (64) 09 914 9426 ext 2629
> (64) 021 201 3672

Re: Fetch a SuperColumn based on value of column

Posted by Nick Santini <ni...@kaseya.com>.

Hi,
as I was told in reply to my mail, secondary indexes for super column
families are not supported yet, so you have to implement your own

easy way: keep another column family where the row key is the value of your
field and the columns are the row keys of your super column family

(inverted index)


Nicolas Santini
Director of Cloud Computing
Auckland - New Zealand
(64) 09 914 9426 ext 2629
(64) 021 201 3672



On Fri, Dec 3, 2010 at 1:00 PM, Narendra Sharma <na...@gmail.com> wrote:

> Hi,
>
> My schema has a row that has thousands of Super Columns. The size of each
> super column is around 500B (20 columns). I need to query 1 SuperColumn
> based on the value of one of its columns. Something like
>
> SELECT SuperColumn FROM Row WHERE SuperColumn.column="value"
>
> Questions:
> 1. Is this possible with current Cassandra APIs? If yes, could you please
> show with a sample.
> 2. How would such a query perform if the number of SuperColumns is high (>
> 10K)?
>
> Cassandra version 0.7.
>
> Thanks,
> Naren
>
>