Posted to user@cassandra.apache.org by Boris Iordanov <bi...@sabaltech.com> on 2017/07/25 20:23:21 UTC

performance penalty of add column in CQL3

Hi,

Is "alter table t add column..." an expensive operation? For example, if it's something to be triggered at an admin level of an application (i.e. not frequently), is it ok? It won’t trigger rewriting all the data, right? 

My goal is not to have super wide tables (I know that’s not the practice with CQL3), but I still want to be able to add a column occasionally to a potentially big data set without a huge penalty. What are the gotchas?

Thanks!
Boris
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org


Re: performance penalty of add column in CQL3

Posted by Boris Iordanov <bi...@sabaltech.com>.
Thank you!
Boris

> On Jul 25, 2017, at 7:44 PM, kurt greaves <ku...@instaclustr.com> wrote:
> 
> If by "offline" you mean with no reads going to the nodes, then yes that would be a potentially safe time to do it, but it's still not advised. You should avoid doing any ALTERs on 3.x versions earlier than 3.0.14 or 3.11 if possible.
> 
> Adding/dropping a column does not require a re-write of the data and it is relatively efficient (it should take seconds, not hours). It's just a schema change, so it only requires gossip to propagate the schema between the nodes. Note that if you drop a column, the data in that column is not actually removed from the SSTables until they are compacted. I believe the values are effectively treated as tombstones if you hit the dropped column (not sure if that's how the metrics record them though).


Re: performance penalty of add column in CQL3

Posted by kurt greaves <ku...@instaclustr.com>.
If by "offline" you mean with no reads going to the nodes, then yes that
would be a *potentially* safe time to do it, but it's still not advised.
You should avoid doing any ALTERs on 3.x versions earlier than 3.0.14 or
3.11 if possible.
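
The version guidance above can be turned into a small pre-flight check. This is only a sketch of the rule as stated in this thread, not an official compatibility matrix: the function name `alter_is_risky` is made up for illustration, and it assumes the version string looks like the value of `release_version` in `system.local` (for example "3.0.13"). Versions outside the 3.x line are not covered by the advice, so the sketch simply returns False for them.

```python
import re


def alter_is_risky(release_version: str) -> bool:
    """Return True if the version falls in the range this thread says to avoid.

    Rule encoded (from the thread only): 3.x versions earlier than
    3.0.14 or 3.11. Hypothetical helper, not part of any Cassandra tool.
    """
    # Accept "3.0.13", "3.9", or "3.0.14-SNAPSHOT"; ignore any suffix.
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release_version.strip())
    if m is None:
        raise ValueError(f"unrecognised version string: {release_version!r}")
    major, minor = int(m.group(1)), int(m.group(2))
    patch = int(m.group(3) or 0)

    if major != 3:
        # The thread says nothing about other major versions.
        return False
    if minor == 0:
        # 3.0.x line: fixed from 3.0.14 onwards, per the thread.
        return patch < 14
    # 3.1 through 3.10: all earlier than 3.11.
    return minor < 11


if __name__ == "__main__":
    for v in ("3.0.13", "3.0.14", "3.9", "3.11.0"):
        print(v, "risky" if alter_is_risky(v) else "ok per this thread")
```

A check like this could run in an admin tool before it issues the ALTER, refusing (or asking for confirmation) when it returns True.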

Adding/dropping a column does not require a re-write of the data and it is
relatively efficient (it should take seconds, not hours). It's just a
schema change, so just requires gossip to propagate the schema between the
nodes. Note that if you drop a column, the data in that column is not
actually removed from the SSTables until they are compacted. I believe the
values are effectively treated as tombstones if you hit the dropped column
(not sure if that's how the metrics record them though).
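
To make the point above concrete, here is a toy model in Python. It is emphatically not Cassandra's implementation: `ToyTable`, its methods, and the list-of-dicts "SSTables" are all invented for illustration. It only shows the shape of the behaviour described: adding or dropping a column touches schema metadata alone, already-written data stays as it is, and a dropped column's values disappear from storage only when a compaction rewrites the data.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy model: schema metadata plus immutable on-disk segments."""
    columns: set
    dropped: set = field(default_factory=set)
    sstables: list = field(default_factory=list)  # each segment: list of row dicts

    def flush(self, rows):
        # Writing data creates a new segment; old segments are never edited.
        self.sstables.append([dict(r) for r in rows])

    def add_column(self, name):
        # Metadata-only change: no segment is rewritten.
        self.columns.add(name)

    def drop_column(self, name):
        # Also metadata-only: the values stay in the segments for now.
        self.columns.discard(name)
        self.dropped.add(name)

    def read(self):
        # Reads project stored rows onto the live schema: columns added
        # later come back as None, dropped columns are skipped.
        return [{c: row.get(c) for c in self.columns}
                for seg in self.sstables for row in seg]

    def compact(self):
        # Only here is data rewritten, and dropped columns physically removed.
        merged = [{k: v for k, v in row.items() if k not in self.dropped}
                  for seg in self.sstables for row in seg]
        self.sstables = [merged]


if __name__ == "__main__":
    t = ToyTable(columns={"id", "name"})
    t.flush([{"id": 1, "name": "a"}])
    t.add_column("email")      # instant, nothing rewritten
    t.drop_column("name")      # instant, "name" still on disk
    print(t.sstables)          # stored row still contains "name"
    t.compact()
    print(t.sstables)          # "name" is gone after compaction
```

The cost of the schema change itself is the same whether the table holds ten rows or ten billion; the only work proportional to data size is the compaction that would have happened anyway.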

Re: performance penalty of add column in CQL3

Posted by Boris Iordanov <bi...@sabaltech.com>.
No, I hadn’t, thanks much for pointing it out!

The takeaway for me, then, is that such an update should be done offline. In that case, a schema change would be safe and relatively efficient (wouldn’t take hours). If that assumption is wrong, could anybody let me know?

Thanks much,
Boris

> On Jul 25, 2017, at 5:28 PM, Fay Hou [Storage Service] <fa...@coupang.com> wrote:
> 
> Are you aware of this ticket?
> https://issues.apache.org/jira/browse/CASSANDRA-13004
> 
> On Tue, Jul 25, 2017 at 1:23 PM, Boris Iordanov <biordanov@sabaltech.com> wrote:
> Hi,
> 
> Is "alter table t add column..." an expensive operation? For example, if it's something to be triggered at an admin level of an application (i.e. not frequently), is it ok? It won’t trigger rewriting all the data, right?
> 
> My goal is not to have super wide tables (I know that’s not the practice with CQL3), but I still want to be able to add a column occasionally to a potentially big data set without a huge penalty. What are the gotchas?
> 
> Thanks!
> Boris


Re: performance penalty of add column in CQL3

Posted by Fay Hou [Storage Service] <fa...@coupang.com>.
Are you aware of this ticket?
https://issues.apache.org/jira/browse/CASSANDRA-13004

On Tue, Jul 25, 2017 at 1:23 PM, Boris Iordanov <bi...@sabaltech.com>
wrote:

> Hi,
>
> Is "alter table t add column..." an expensive operation? For example, if
> it's something to be triggered at an admin level of an application (i.e.
> not frequently), is it ok? It won’t trigger rewriting all the data, right?
>
> My goal is not to have super wide tables, I know that’s not the practice
> with CQL3, but I still want to be able to add a column occasionally to a
> potentially big data set without a huge penalty. What are the gotchas?
>
> Thanks!
> Boris