Posted to user@cassandra.apache.org by Aklin_81 <as...@gmail.com> on 2012/01/05 06:37:31 UTC

What is the future of supercolumns ?

I have seen supercolumn usage discouraged most of the time.
However, supercolumns sometimes seem to fit the scenario best,
not only in terms of how the data is stored but also in
terms of how it is retrieved. Some of the queries supported by SCs
can do things that no alternative schema can. (For example, I recently
asked about getting the equivalent of retrieving a list of (full)
supercolumns by name through the use of composite columns;
unfortunately, there was no way to do this without
reading lots of extra columns.)

So I am really confused whether:

1. Should I really avoid supercolumns in every case, however
appropriate, or do I just need to be careful and confirm that
supercolumns genuinely fit my use case?

2. Are there any performance concerns with supercolumns even in the
cases where they are most appropriate, such as when you need to
retrieve entire supercolumns every time and the number of subcolumns
varies between 0 and 10?
(I don't write all the subcolumns inside a supercolumn at once, though.
Does this also matter?)

3. What is their future? Are they going to be deprecated, or might they
be enhanced later?

Re: What is the future of supercolumns ?

Posted by Aklin_81 <as...@gmail.com>.
Hmm... it would be great if the supercolumn API remains. Also, I
believe we can replace the full functionality of supercolumns with
composite column names once this issue (related to reading multiple
column ranges) is resolved:
https://issues.apache.org/jira/browse/CASSANDRA-2710
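
To make the point about multiple column ranges concrete, here is a minimal sketch in plain Python. It is only an in-memory model, not Cassandra code: the row contents, the (group, subcolumn) naming, and the helper names `slice_prefix`, `multi_range`, and `covering_range` are all hypothetical. It assumes composite names sort by their first component, so each "supercolumn" is one contiguous range of columns.

```python
from bisect import bisect_left

# Hypothetical in-memory model of one row: composite column names are
# (group, subcolumn) tuples kept in sorted order.
ROW = sorted([
    (("a", "x"), 1), (("a", "y"), 2),
    (("b", "x"), 3), (("b", "y"), 4), (("b", "z"), 5),
    (("c", "x"), 6),
])
NAMES = [name for name, _ in ROW]


def slice_prefix(prefix):
    """Return all columns whose composite name starts with `prefix`."""
    # (prefix,) sorts before every (prefix, subcolumn) tuple.
    start = bisect_left(NAMES, (prefix,))
    out = []
    for name, value in ROW[start:]:
        if name[0] != prefix:
            break
        out.append((name, value))
    return out


def multi_range(prefixes):
    """One slice per requested group: touches only the wanted columns."""
    return [col for p in prefixes for col in slice_prefix(p)]


def covering_range(prefixes):
    """One slice from the first to the last requested group, then filter.

    Returns the wanted columns and the number of columns scanned.
    """
    lo, hi = min(prefixes), max(prefixes)
    scanned = [col for col in ROW if lo <= col[0][0] <= hi]
    wanted = [col for col in scanned if col[0][0] in prefixes]
    return wanted, len(scanned)
```

With this toy row, `multi_range(["a", "c"])` returns three columns, while `covering_range(["a", "c"])` returns the same three but has to scan all six, because group "b" sits between the two requested groups. That gap is the "lots of extra columns" problem when only a single range can be read per request.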



Re: What is the future of supercolumns ?

Posted by Brandon Williams <dr...@gmail.com>.
On Sat, Jan 7, 2012 at 5:42 PM, Rustam Aliyev <ru...@code.az> wrote:
>> My suggestion is simple: don't use any deprecated stuff out there. In
>> practically any case there is a good reason why it's deprecated.
>
>
> SuperColumns are not deprecated.

The supercolumn API will remain:
https://issues.apache.org/jira/browse/CASSANDRA-3237

-Brandon

Re: What is the future of supercolumns ?

Posted by Rustam Aliyev <ru...@code.az>.
> My suggestion is simple: don't use any deprecated stuff out there. In 
> practically any case there is a good reason why it's deprecated.

SuperColumns are not deprecated.


Re: What is the future of supercolumns ?

Posted by "R. Verlangen" <ro...@us2.nl>.
My suggestion is simple: don't use any deprecated stuff out there. In
practically every case there is a good reason why it's deprecated.

I've seen a couple of composite-column vs. supercolumn discussions here in
the past few weeks; I think a little searching will turn them up.

Cheers


Re: What is the future of supercolumns ?

Posted by Aklin_81 <as...@gmail.com>.
I always read all the columns inside a supercolumn together, but I
write them at different times. I never need to update them; they
simply expire after their TTL of 60 days.
But since supercolumns are going to be deprecated, I don't know if it
would be advisable to use them right now.

I believe that if wildcard querying over a list of column names were
possible, the supercolumn use cases could easily be replaced by
normal columns. Could this be practically possible in the future?


Re: What is the future of supercolumns ?

Posted by Terje Marthinussen <tm...@gmail.com>.
Please realize that I do not make any decisions here and I am not part of the core Cassandra developer team.

What has been said before is that they will most likely go away and at least under the hood be replaced by composite columns.

Jonathan has, however, stated that he would like the supercolumn API/abstraction to remain, at least for backwards compatibility.

Please understand that under the hood, supercolumns are merely groups of columns serialized as a single block of data. 
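
As a rough illustration of what "serialized as a single block" implies, here is a toy sketch in Python. The byte layout below is made up for the example and is not Cassandra's actual on-disk format; `pack_supercolumn`, `unpack_supercolumn`, and `read_subcolumn` are hypothetical names. The assumption it illustrates is that the block carries no per-subcolumn index, so getting at one subcolumn means decoding the whole group.

```python
import struct


def pack_supercolumn(subcolumns):
    """Serialize a dict of subcolumn name -> value (both bytes) as one blob."""
    parts = [struct.pack(">I", len(subcolumns))]  # subcolumn count
    for name, value in sorted(subcolumns.items()):
        parts.append(struct.pack(">H", len(name)) + name)    # 2-byte name length
        parts.append(struct.pack(">I", len(value)) + value)  # 4-byte value length
    return b"".join(parts)


def unpack_supercolumn(blob):
    """Deserialize the whole blob; there is no index to jump to one subcolumn."""
    (count,) = struct.unpack_from(">I", blob, 0)
    offset = 4
    out = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from(">H", blob, offset)
        offset += 2
        name = blob[offset:offset + name_len]
        offset += name_len
        (value_len,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        out[name] = blob[offset:offset + value_len]
        offset += value_len
    return out


def read_subcolumn(blob, wanted):
    """Reading one subcolumn still decodes every subcolumn in the group."""
    return unpack_supercolumn(blob)[wanted]
```

In this model the cost of `read_subcolumn` grows with the size of the whole group, which is cheap for a handful of small subcolumns that are always read together and wasteful for large groups where only one value is needed.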


The fact that there is a specialized, hardcoded way to serialize these column groups into supercolumns is a problem, however, and they should probably go away to make room for a more generic implementation that allows more flexible data structures and needs less code specific to one special data structure.

Today there is a lot of extra code to deal with the slight differences in serialization and features between supercolumns and columns, and hopefully most of that could go away if things were structured a bit differently.

I also hope that we keep APIs that allow simple access to groups of key/value pairs, since working with just columns can add a lot of application code that should not be needed.

If you almost always need all or most of the columns in a supercolumn, and you normally update all of them at the same time, they will most likely be faster than normal columns.

Processing-wise, you will actually do a bit more work on serialization/deserialization of SCs, but the I/O will usually be better grouped and require fewer operations.

I think we ran some benchmarks on heavy use cases with ~30 small columns per SC some time back, and I think we ended up with SCs being 10-20% faster.


Terje



Re: What is the future of supercolumns ?

Posted by Aklin_81 <as...@gmail.com>.
Any comments, please?
