Posted to user@cassandra.apache.org by Yonder <zy...@yahoo.com.cn> on 2011/06/02 11:39:29 UTC

how to know there are some columns in a row

Dear all,

Is there any method to list the column names in a row?

Thanks,
Yonder

Re: how to know there are some columns in a row

Posted by Dan Kuebrich <da...@gmail.com>.
There might not be a built-in way to do this, but if you make two rows for
each author, e.g.:

nabokov_fulltext [ 'lolita' : 'Lolita, light of my life ...' , ...]
nabokov_bookindex [ 'lolita' : None , ... ]

you could query the bookindex row for each author without Cassandra having to
load the full texts.  This would make your Cassandra row cache much more
effective for this type of query as well, and you might even consider
putting the index in a separate column family (CF).
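A minimal sketch of the two-row layout described above. It uses a plain Python dict as a stand-in for a column family and does not talk to Cassandra; the helper names (add_book, list_titles, read_book) and the second book title are made up for illustration.

```python
from collections import defaultdict

# In-memory stand-in for a column family: row key -> {column name: value}.
column_family = defaultdict(dict)

def add_book(author, title, text):
    """Write the book text plus an index column with an empty value."""
    column_family[author + "_fulltext"][title] = text
    column_family[author + "_bookindex"][title] = b""  # name only, no payload

def list_titles(author):
    """Read only the small index row; the full texts are never touched."""
    return sorted(column_family[author + "_bookindex"])

def read_book(author, title):
    """Fetch a single column from the large row once a title is chosen."""
    return column_family[author + "_fulltext"][title]

add_book("nabokov", "lolita", "Lolita, light of my life ...")
add_book("nabokov", "pnin", "(full text of the book)")
print(list_titles("nabokov"))
```

Both writes have to happen together, which is where the risk of the index drifting from the data comes from.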

I'd also recommend compressing the data for the full text column values.
You can't query very well against them anyway, and it will make everything
(inserts, reads, compaction) so much better.
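As a rough illustration of client-side compression, here is a sketch using Python's standard zlib module. The pack/unpack names are made up, and the sample text is just a repeated phrase, so the ratio it prints says nothing about real book texts.

```python
import zlib

def pack(text):
    """Compress a book's text before storing it as a column value."""
    return zlib.compress(text.encode("utf-8"))

def unpack(blob):
    """Reverse of pack(): turn a stored column value back into text."""
    return zlib.decompress(blob).decode("utf-8")

text = "Lolita, light of my life ... " * 1000
blob = pack(text)

# Compare the raw size with the compressed size that would be stored.
print(len(text.encode("utf-8")), "->", len(blob))
```

The client has to call unpack() on every read, so this only makes sense for values that are never filtered or compared on the server side.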

dan

On Tue, Jun 7, 2011 at 6:30 PM, Patrick de Torcy <pd...@gmail.com> wrote:

> But I want values in my columns... Imagine a CF with authors as keys. Each
> author has written several books. So each row has columns with the title as
> column name and the text of the book as value (i.e. a lot of data). If a user
> wants to know the different books for an author, I'd like to be able to get
> the column names without the values, so that the user can pick a book name. In
> this case I can retrieve the value from this column (and only this one).
> Of course, I could have an additional column which would manage the column
> names (= titles), but it's not very efficient and could be a source of
> errors...
> If you have a method to retrieve the number of columns of a row (without
> their values), I can't see why you couldn't retrieve the column names
> (without their values). It's perhaps harder than I think... But it would be
> rather useful!
>
> Thanks !
>
> On Mon, Jun 6, 2011 at 2:08 AM, aaron morton <aa...@thelastpickle.com> wrote:
>
>> You can create columns without values.
>>
>> Are you talking about reading them back through the API ?
>>
>> I would suggest looking at your data model to see if there is a better way
>> to support your read patterns.
>>
>> Cheers
>>
>>  -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 6 Jun 2011, at 10:18, Patrick de Torcy wrote:
>>
>> It would definitely be useful to be able to get column (or super
>> column) names WITHOUT their values. If the values are pretty big or if
>> there are a lot of columns, fetching them generates traffic that is not
>> necessarily needed (if in the end you are interested in just some of the columns).
>> Moreover, it doesn't seem to be too difficult a feature to implement (well,
>> I think...)
>>
>> Patrick
>>
>>
>>
>

Re: how to know there are some columns in a row

Posted by aaron morton <aa...@thelastpickle.com>.
> Forgive me if I am a little insistent, but it's important for us and I'm sure we are not the only ones interested in this feature...

Not an issue, it's how things get done :)

Create a JIRA ticket at https://issues.apache.org/jira/browse/CASSANDRA with your ideas to start the process, and ask others to vote if they would also like to see it. If you have time to donate for the feature, mention that on the ticket.

Thanks 
 
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 9 Jun 2011, at 06:55, Jeremiah Jordan wrote:

> I am pretty sure this would cut down on network traffic, but not on Disk IO or CPU use.  I think Cassandra would still have to deserialize the whole column to get to the name.  So if you really have a use case where you just want the name, it would be better to store a separate "name with no data" column.
> 
> From: Patrick de Torcy [mailto:pdetorcy@gmail.com] 
> Sent: Wednesday, June 08, 2011 4:00 AM
> To: user@cassandra.apache.org
> Subject: Re: how to know there are some columns in a row
> 
> There is no reason for ambiguities...
> We could add in the api another method call (similar to get_count) :
> 
> get_columnNames
> 
> list<string> get_columnNames(key, column_parent, predicate, consistency_level)
> 
> Get the column names present in column_parent within the predicate.
> 
> The method is not O(1). It reads all the columns from disk to calculate the answer. The only benefit of the method is that you do not need to pull all their values over the Thrift interface to get their names.
> 
> 
> (just to get the idea...)
> 
> In fact column names can really be data in themselves, so there should be a way to retrieve them (without their values). When you have big values, it's a real show stopper to use get_slice, since a lot of unnecessary traffic would be generated...
> 
> Forgive me if I am a little insistent, but it's important for us and I'm sure we are not the only ones interested in this feature...
> 
> cheers


Re: how to know there are some columns in a row

Posted by Patrick de Torcy <pd...@gmail.com>.
> I am pretty sure this would cut down on network traffic, but not on Disk
> IO or CPU use.

Well, that's the same for the get_count method !

I think that would be OK, since the network traffic is the real problem (big
values...). Storing the column names in a separate column could be a
solution of course, but it generates duplicate data, with a risk of
inconsistencies (and more work).

RE: how to know there are some columns in a row

Posted by Jeremiah Jordan <JE...@morningstar.com>.
I am pretty sure this would cut down on network traffic, but not on Disk
IO or CPU use.  I think Cassandra would still have to deserialize the
whole column to get to the name.  So if you really have a use case where
you just want the name, it would be better to store a separate "name
with no data" column.

________________________________

From: Patrick de Torcy [mailto:pdetorcy@gmail.com] 
Sent: Wednesday, June 08, 2011 4:00 AM
To: user@cassandra.apache.org
Subject: Re: how to know there are some columns in a row


There is no reason for ambiguities...
We could add in the api another method call (similar to get_count) :



get_columnNames


* list<string> get_columnNames(key, column_parent, predicate, consistency_level)

Get the column names present in column_parent within the predicate.

The method is not O(1). It reads all the columns from disk to calculate
the answer. The only benefit of the method is that you do not need to
pull all their values over the Thrift interface to get their names.



(just to get the idea...)

In fact column names can really be data in themselves, so there should
be a way to retrieve them (without their values). When you have big
values, it's a real show stopper to use get_slice, since a lot of
unnecessary traffic would be generated...

Forgive me if I am a little insistent, but it's important for us and I'm
sure we are not the only ones interested in this feature...

cheers


Re: how to know there are some columns in a row

Posted by Patrick de Torcy <pd...@gmail.com>.
There is no reason for ambiguities...
We could add in the api another method call (similar to get_count) :

get_columnNames

   - list<string> get_columnNames(key, column_parent, predicate, consistency_level)

Get the column names present in column_parent within the predicate.

The method is not O(1). It reads all the columns from disk to calculate the
answer. The only benefit of the method is that you do not need to pull all
their values over the Thrift interface to get their names.

(just to get the idea...)
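To make the intended semantics concrete, here is a small in-memory sketch. Neither function is a real Cassandra call: get_slice below is a simplified stand-in for a slice read, get_column_names models the proposed (hypothetical) method, and the row contents are dummy data.

```python
def get_slice(row, start="", finish="", count=100):
    """Stand-in for a slice read: (name, value) pairs in name order."""
    names = sorted(n for n in row if n >= start and (not finish or n <= finish))
    return [(n, row[n]) for n in names[:count]]

def get_column_names(row, start="", finish="", count=100):
    """Hypothetical names-only call: same read, values dropped before returning."""
    return [name for name, _value in get_slice(row, start, finish, count)]

# Dummy row: two columns with large values.
row = {"lolita": "x" * 500_000, "pnin": "y" * 300_000}

# Rough payload sizes: names plus values versus names alone.
full_bytes = sum(len(n) + len(v) for n, v in get_slice(row))
names_bytes = sum(len(n) for n in get_column_names(row))
print(full_bytes, names_bytes)
```

The server-side work is identical in both calls; only the size of the response differs, which is the one benefit claimed above.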

In fact column names can really be data in themselves, so there should be a
way to retrieve them (without their values). When you have big values, it's
a real show stopper to use get_slice, since a lot of unnecessary traffic
would be generated...

Forgive me if I am a little insistent, but it's important for us and I'm
sure we are not the only ones interested in this feature...

cheers

Re: how to know there are some columns in a row

Posted by aaron morton <aa...@thelastpickle.com>.
> If you have a method to retrieve the number of columns of a row (without their values), I can't see why you couldn't retrieve the column names (without their values). It's perhaps harder than I think... But it would be rather useful!

Internally, get_count just reads the full columns and counts them.

The main reason I was dismissive was the complication it brings when dealing with a Column. If a Column has no value would it be because there is no value associated with it or because only the column name was requested? For now when you have a Column you have all the information about the column. 
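The ambiguity can be shown with a toy model. This is only a sketch: the Column class below is a simplified stand-in rather than the Thrift struct, and names_only_as_columns is a hypothetical names-only read that reuses the same type.

```python
from dataclasses import dataclass

@dataclass
class Column:
    name: str
    value: bytes

# One column with real data, one that was deliberately stored with no value.
stored = [
    Column("lolita", b"Lolita, light of my life ..."),
    Column("draft", b""),
]

def get_count(columns):
    """Mirrors the description above: read the full columns, then count them."""
    return len(columns)

def names_only_as_columns(columns):
    """Hypothetical names-only read that returns Column objects with blanked values."""
    return [Column(c.name, b"") for c in columns]

result = names_only_as_columns(stored)
# "draft" really is empty, "lolita" merely had its value stripped,
# yet the caller cannot tell the two apart from the result.
print(get_count(stored), [c.value for c in result])
```

Returning a different type, such as a plain list of names, would sidestep the problem because no Column object is handed back at all.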

There may also be some modelling arguments to be made. 

It's not been a show stopper for people in the past, but that does not mean it's a bad idea.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 8 Jun 2011, at 10:30, Patrick de Torcy wrote:

> But I want values in my columns... Imagine a CF with authors as keys. Each author has written several books. So each row has columns with the title as column name and the text of the book as value (i.e. a lot of data). If a user wants to know the different books for an author, I'd like to be able to get the column names without the values, so that the user can pick a book name. In this case I can retrieve the value from this column (and only this one).
> Of course, I could have an additional column which would manage the column names (= titles), but it's not very efficient and could be a source of errors...
> If you have a method to retrieve the number of columns of a row (without their values), I can't see why you couldn't retrieve the column names (without their values). It's perhaps harder than I think... But it would be rather useful!
> 
> Thanks !
> 
> On Mon, Jun 6, 2011 at 2:08 AM, aaron morton <aa...@thelastpickle.com> wrote:
> You can create columns without values. 
> 
> Are you talking about reading them back through the API ? 
> 
> I would suggest looking at your data model to see if there is a better way to support your read patterns. 
> 
> Cheers
> 
> -----------------
> Aaron Morton 
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 6 Jun 2011, at 10:18, Patrick de Torcy wrote:
> 
>> It would definitely be useful to be able to get column (or super column) names WITHOUT their values. If the values are pretty big or if there are a lot of columns, fetching them generates traffic that is not necessarily needed (if in the end you are interested in just some of the columns).
>> Moreover, it doesn't seem to be too difficult a feature to implement (well, I think...)
>> 
>> Patrick
> 
> 


Re: how to know there are some columns in a row

Posted by Patrick de Torcy <pd...@gmail.com>.
But I want values in my columns... Imagine a CF with authors as keys. Each
author has written several books. So each row has columns with the title as
column name and the text of the book as value (i.e. a lot of data). If a user
wants to know the different books for an author, I'd like to be able to get
the column names without the values, so that the user can pick a book name. In
this case I can retrieve the value from this column (and only this one).
Of course, I could have an additional column which would manage the column
names (= titles), but it's not very efficient and could be a source of
errors...
If you have a method to retrieve the number of columns of a row (without
their values), I can't see why you couldn't retrieve the column names
(without their values). It's perhaps harder than I think... But it would be
rather useful!

Thanks !

On Mon, Jun 6, 2011 at 2:08 AM, aaron morton <aa...@thelastpickle.com> wrote:

> You can create columns without values.
>
> Are you talking about reading them back through the API ?
>
> I would suggest looking at your data model to see if there is a better way
> to support your read patterns.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 6 Jun 2011, at 10:18, Patrick de Torcy wrote:
>
> It would definitely be useful to be able to get column (or super column)
> names WITHOUT their values. If the values are pretty big or if there are a
> lot of columns, fetching them generates traffic that is not necessarily needed
> (if in the end you are interested in just some of the columns).
> Moreover, it doesn't seem to be too difficult a feature to implement (well,
> I think...)
>
> Patrick
>
>
>

Re: how to know there are some columns in a row

Posted by aaron morton <aa...@thelastpickle.com>.
You can create columns without values. 

Are you talking about reading them back through the API ? 

I would suggest looking at your data model to see if there is a better way to support your read patterns. 

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 6 Jun 2011, at 10:18, Patrick de Torcy wrote:

> It would definitely be useful to be able to get column (or super column) names WITHOUT their values. If the values are pretty big or if there are a lot of columns, fetching them generates traffic that is not necessarily needed (if in the end you are interested in just some of the columns).
> Moreover, it doesn't seem to be too difficult a feature to implement (well, I think...)
> 
> Patrick


Re: how to know there are some columns in a row

Posted by Patrick de Torcy <pd...@gmail.com>.
It would definitely be useful to be able to get column (or super column)
names WITHOUT their values. If the values are pretty big or if there are a
lot of columns, fetching them generates traffic that is not necessarily needed
(if in the end you are interested in just some of the columns).
Moreover, it doesn't seem to be too difficult a feature to implement (well, I
think...)

Patrick

Re: Re: how to know there are some columns in a row

Posted by aaron morton <aa...@thelastpickle.com>.
You can also slice a range of columns for a row, e.g. the first 100 columns after column "aaaa".
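Paging through a very wide row in fixed-size slices might look roughly like the sketch below. It works on an in-memory sorted list rather than a real client, slice_columns and iter_column_names are made-up names, and it assumes the start column of a slice is inclusive, so each follow-up page asks for one extra column and drops the repeat.

```python
import bisect

def slice_columns(sorted_names, start, count):
    """Return up to `count` column names >= start, in sorted order."""
    i = bisect.bisect_left(sorted_names, start)
    return sorted_names[i:i + count]

def iter_column_names(sorted_names, page_size=100):
    """Walk a wide row page by page instead of reading it in one call."""
    last = None
    while True:
        if last is None:
            page = slice_columns(sorted_names, "", page_size)
        else:
            # The start column is inclusive: fetch one extra, drop the repeat.
            page = slice_columns(sorted_names, last, page_size + 1)[1:]
        if not page:
            return
        yield from page
        last = page[-1]

names = sorted("col%04d" % i for i in range(250))
print(sum(1 for _ in iter_column_names(names, page_size=100)))
```

Each page still carries the column values in a real slice read, so this bounds the size of a single response rather than the total traffic.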

What client are you using ? 

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 5 Jun 2011, at 04:19, Yonder wrote:

> 
> Thank you very much.
> But I'm afraid that is not a graceful approach if there are a billion columns in a row.
> 
> From: Michal Augustýn <au...@gmail.com>
> To: user@cassandra.apache.org; Yonder <zy...@yahoo.com.cn>
> Sent: Thursday, June 2, 2011, 9:25 PM
> Subject: Re: how to know there are some columns in a row
> 
> Hi,
> 
> Just use the "get" Thrift method with the "super_column" and "column"
> attributes in the ColumnPath structure left empty. Yes, it returns both
> column names and values, but I'm afraid there is no Thrift way to
> get column names only.
> 
> Augi
> 
> 2011/6/2 Yonder <zy...@yahoo.com.cn>:
> > Dear all,
> >
> > Is there any method to list the column names in a row?
> >
> > Thanks,
> > Yonder
> >
> 
> 


Re: how to know there are some columns in a row

Posted by Yonder <zy...@yahoo.com.cn>.

Thank you very much.
But I'm afraid that is not a graceful approach if there are a billion columns in a row.
  


>________________________________
>From: Michal Augustýn <au...@gmail.com>
>To: user@cassandra.apache.org; Yonder <zy...@yahoo.com.cn>
>Sent: Thursday, June 2, 2011, 9:25 PM
>Subject: Re: how to know there are some columns in a row
>
>Hi,
>
>Just use the "get" Thrift method with the "super_column" and "column"
>attributes in the ColumnPath structure left empty. Yes, it returns both
>column names and values, but I'm afraid there is no Thrift way to
>get column names only.
>
>Augi
>
>2011/6/2 Yonder <zy...@yahoo.com.cn>:
>> Dear all,
>>
>> Is there any method to list the column names in a row?
>>
>> Thanks,
>> Yonder
>>
>
>
>

Re: how to know there are some columns in a row

Posted by Michal Augustýn <au...@gmail.com>.
Hi,

Just use the "get" Thrift method with the "super_column" and "column"
attributes in the ColumnPath structure left empty. Yes, it returns both
column names and values, but I'm afraid there is no Thrift way to
get column names only.

Augi

2011/6/2 Yonder <zy...@yahoo.com.cn>:
> Dear all,
>
> Is there any method to list the column names in a row?
>
> Thanks,
> Yonder
>