Posted to user@cassandra.apache.org by Siddharth Verma <ve...@snapdeal.com> on 2016/07/01 10:45:20 UTC

Re: Performance impact on schema

Could anyone who has done a POC for the same share their views?
Any help would be appreciated.

Thanks
Siddharth Verma

Re: Performance impact on schema

Posted by Eric Stevens <mi...@gmail.com>.
If a given partition only ever contains one set of those columns, it
probably makes no practical difference, though it suggests an unintuitive
data model, so you might break it up just because it no longer seems to
make sense to keep them together.

If you really don't ever overlap your columns during reads or writes, and
assuming that there exist partitions where more than one of those sets of
columns is present, then yes, you'll benefit from splitting the table up.

Depending on how far apart in time your writes are, writing to the same
partition key multiple times will cause your records to span sstables,
slowing down reads because you'll have to lift sstables that won't contain
any relevant columns.
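
A rough way to picture this is the toy model below. It is only a sketch and
not Cassandra's storage engine: each "sstable" is a plain dict, the flush
rule (a new sstable every `flush_every` writes) is an assumption made for
illustration, and the keys and column names are hypothetical.

```python
# Toy model only: each "sstable" is a dict of partition key -> {column: value}.
# The flush rule and all names below are assumptions for illustration.

def flush_writes(writes, flush_every):
    """Group time-ordered (partition_key, columns) writes into sstables,
    starting a new sstable after every `flush_every` writes."""
    sstables = []
    for i in range(0, len(writes), flush_every):
        table = {}
        for key, columns in writes[i:i + flush_every]:
            table.setdefault(key, {}).update(columns)
        sstables.append(table)
    return sstables


def read_cost(sstables, key, wanted):
    """Return (sstables touched, sstables holding a wanted column) for a read."""
    touched = [t for t in sstables if key in t]
    useful = [t for t in touched if any(c in t[key] for c in wanted)]
    return len(touched), len(useful)


# Two unrelated column sets ("a_*" and "b_*") written to the same
# partitions at different times.
writes = [
    ("user1", {"a_name": "x"}),
    ("user2", {"a_name": "y"}),
    ("user1", {"b_score": 1}),
    ("user2", {"b_score": 2}),
    ("user1", {"b_rank": 7}),
    ("user3", {"a_name": "z"}),
]

sstables = flush_writes(writes, flush_every=2)
touched, useful = read_cost(sstables, "user1", {"a_name"})
print(touched, useful)
```

In this model the read for `a_name` touches every sstable that holds the
partition, although only one of them contains a column from the set it wants.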

Also, within an sstable, data for a given partition sits adjacent on disk,
so if you're on magnetic media, your read head will be skipping over the
unwanted columns of the wider partition (unless all the column names of a
given set sort together, such as if they contain a common prefix).
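
The point about prefixes can be sketched as follows. This assumes plain
string ordering of column names as a stand-in for how the columns are laid
out, and the column names are hypothetical.

```python
# Sketch: how far apart the wanted columns sit once column names are sorted.
# Plain string sorting is an assumption standing in for the on-disk order.

def span_of(columns, wanted):
    """Count sorted positions from the first to the last wanted column."""
    ordered = sorted(columns)
    positions = [ordered.index(c) for c in wanted]
    return max(positions) - min(positions) + 1


# Names from two sets interleave when sorted, so a read of one set
# skips over columns from the other.
mixed = ["address", "balance", "city", "currency", "zip"]
span_mixed = span_of(mixed, ["address", "city", "zip"])

# With a common prefix per set, each set sorts into one contiguous run.
prefixed = ["addr_city", "addr_street", "addr_zip", "pay_balance", "pay_currency"]
span_prefixed = span_of(prefixed, ["addr_city", "addr_street", "addr_zip"])

print(span_mixed, span_prefixed)
```

A span equal to the number of wanted columns means they are adjacent; a
larger span means unwanted columns sit in between.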

In short, there are probably several cases where you could be hurting
performance by commingling them - and with no use case where you read
columns from more than one of those sets at once, there's no advantage to
keeping them in the same table.

On Fri, Jul 1, 2016, 4:45 AM Siddharth Verma <ve...@snapdeal.com>
wrote:

> Could anyone who has done a POC for the same share their views?
> Any help would be appreciated.
>
> Thanks
> Siddharth Verma
>