Posted to user@cassandra.apache.org by Edward Sargisson <ed...@globalrelay.net> on 2012/06/05 18:30:31 UTC

Performance impact of static vs dynamic columns and mixing the two in the same CF

Hi all,
A question has come up in our team about the performance impact of 
static vs dynamic columns. We'd like to ask two questions:

Quick background: We are using a custom app to write to Cassandra using 
Hector. Production is Solaris and pre-prod is generally CentOS. We're 
currently on 0.7 but will be moving to 1.1 very shortly.

1. Does specifying the type of a column affect performance other than 
the cost of validating data as it is stored?
e.g. does it help compaction, etc?
From my reading of the docs, the advantage is that the data will be 
validated on write and that the various dev tools can deserialize into a 
human readable form easily.

2. Is there any impact to mixing static and dynamic columns in the same 
column family? (Follow-up question: is this far outside of the 
designers' intentions and thus unsafe?)
The docs seem to indicate that the designers think of static column 
families and dynamic column families and *not* a mixture of the two.

My mental model is that a column is just a column. It's possible to 
specify some metadata about columns for validation and display but 
that's about it. Is there something to change this model?

Thanks in advance for any comments.

Cheers,
Edward


-- 

Edward Sargisson

senior java developer
Global Relay

edward.sargisson@globalrelay.net




Re: Performance impact of static vs dynamic columns and mixing the two in the same CF

Posted by aaron morton <aa...@thelastpickle.com>.
(I'm assuming you are talking about column values here)

> 1. Does specifying the type of a column affect performance other than the cost of validating data as it is stored?
> e.g. does it help compaction, etc?
No. 
Validation is normally pretty lightweight. 


> From my reading of the docs the advantage is that the data will be validated on write and that the various dev tools can deserialize into a human readable form easily.
That's about it. 

> 2. Is there any impact to mixing static and dynamic columns in the same column family? (Follow-up question: is this far outside of the designers' intentions and thus unsafe?)
I'm not sure what you mean by static and dynamic. 

> My mental model is that a column is just a column. It's possible to specify some metadata about columns for validation and display but that's about it. Is there something to change this model?
That's my model too. Exceptions are secondary indexes and any data type requirements that CQL and associated db drivers need.

I think some people take the approach of specifying the columns in the CF definition; this is just my personal approach. 
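
To make the "a column is just a column" model concrete, here is a minimal sketch in plain Java. It is a conceptual illustration only, not Cassandra's or Hector's actual code: the class ColumnFamilySketch, the Validator interface, and the BYTES and LONG validators are all hypothetical names invented for this example. It assumes that declared ("static") and undeclared ("dynamic") columns are stored the same way, and that per-column metadata only decides which validator runs on write.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Conceptual model only: these are hypothetical classes, not Cassandra or Hector APIs. */
class ColumnFamilySketch {
    /** Hypothetical validator: decides whether raw bytes are acceptable for a column. */
    interface Validator {
        boolean isValid(byte[] value);
    }

    // Accepts any bytes, like an untyped column.
    static final Validator BYTES = value -> true;
    // Accepts only 8-byte values, like a 64-bit integer column.
    static final Validator LONG = value -> value.length == 8;

    // Validator used for columns that have no metadata of their own.
    private final Validator defaultValidator;
    // Optional per-column metadata, keyed by column name.
    private final Map<String, Validator> columnMetadata = new HashMap<>();
    // Row key -> (column name -> raw value), columns sorted by name within a row.
    private final Map<String, TreeMap<String, byte[]>> rows = new HashMap<>();

    ColumnFamilySketch(Validator defaultValidator) {
        this.defaultValidator = defaultValidator;
    }

    /** Attach metadata to one column name; this is all that makes a column "static" here. */
    void declareColumn(String name, Validator validator) {
        columnMetadata.put(name, validator);
    }

    /** Validate on write, then store. Storage is identical for declared and undeclared columns. */
    void insert(String rowKey, String columnName, byte[] value) {
        Validator validator = columnMetadata.getOrDefault(columnName, defaultValidator);
        if (!validator.isValid(value)) {
            throw new IllegalArgumentException("Invalid value for column " + columnName);
        }
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnName, value);
    }

    /** Returns the raw bytes, or null if the row or column does not exist. */
    byte[] get(String rowKey, String columnName) {
        TreeMap<String, byte[]> row = rows.get(rowKey);
        return row == null ? null : row.get(columnName);
    }

    public static void main(String[] args) {
        ColumnFamilySketch cf = new ColumnFamilySketch(BYTES);
        cf.declareColumn("created_at", LONG);

        // A declared column and an undeclared one living in the same row.
        cf.insert("user1", "created_at", ByteBuffer.allocate(8).putLong(1338921031L).array());
        cf.insert("user1", "event:2012-06-05", "login".getBytes(StandardCharsets.UTF_8));

        try {
            // One byte is not a valid 8-byte long, so the declared column rejects it.
            cf.insert("user1", "created_at", new byte[] {1});
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

In this sketch, mixing the two kinds of column in one row costs nothing beyond the validator lookup on write, which matches the model described above. Secondary indexes and the type requirements of CQL and its drivers are the exceptions already noted, and the sketch does not model them.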

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
