Posted to user@cassandra.apache.org by Andrew Tolbert <an...@datastax.com> on 2015/08/24 20:48:45 UTC
Re: Maximum Row Limit
Hello Mangeet,
According to CassandraLimitations
<http://wiki.apache.org/cassandra/CassandraLimitations>, the maximum number
of cells per partition is 2 billion:
>
> - The maximum number of cells (rows x columns) in a single partition
> is 2 billion.
>
>
Therefore the maximum number of rows per partition is 2 billion /
columns_per_row. For example, if you have 5 columns per row, you are
limited to 400 million rows per partition.
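To make the arithmetic concrete, here is a minimal Python sketch. The function name is my own for illustration, and the only input is the 2 billion cell limit quoted above:

```python
# Cell limit per partition, as quoted from the CassandraLimitations wiki page.
MAX_CELLS_PER_PARTITION = 2_000_000_000


def max_rows_per_partition(columns_per_row: int) -> int:
    """Upper bound on rows in one partition for a given column count.

    Hypothetical helper: it only divides the cell limit by the number of
    columns, rounding down.
    """
    if columns_per_row < 1:
        raise ValueError("columns_per_row must be at least 1")
    return MAX_CELLS_PER_PARTITION // columns_per_row


if __name__ == "__main__":
    print(max_rows_per_partition(5))  # 400000000
```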
However, it is generally good practice to keep your partitions below 100 MB
in size to prevent hotspots and reduce the compaction burden. While it is
difficult to be exact, you can come up with a ballpark row size by looking
at the sizes of the types as documented in section 6 of the native protocol
spec
<https://github.com/apache/cassandra/blob/cassandra-2.2.0/doc/native_protocol_v4.spec#L812>.
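As a rough sketch of that kind of ballpark estimate in Python: the fixed sizes below are the serialized value sizes of a few common types, the average size for variable-length columns is something you have to guess for your own data, and the helper names are made up for this example. It ignores on-disk overhead (per-cell metadata, clustering columns, compression), so treat the result as an order-of-magnitude figure only.

```python
# Serialized sizes in bytes for some fixed-width CQL types.
FIXED_SIZES = {
    "boolean": 1,
    "int": 4,
    "float": 4,
    "bigint": 8,
    "double": 8,
    "timestamp": 8,
    "uuid": 16,
    "timeuuid": 16,
}


def estimate_row_bytes(column_types, avg_variable_bytes=0):
    """Sum the value sizes for one row.

    Types not in FIXED_SIZES (text, blob, ...) are counted as
    avg_variable_bytes each, which is an assumption you supply.
    """
    return sum(FIXED_SIZES.get(t, avg_variable_bytes) for t in column_types)


def rows_within_budget(row_bytes, budget_bytes=100 * 1024 * 1024):
    """How many rows of row_bytes fit in the budget (default 100 MB)."""
    if row_bytes < 1:
        raise ValueError("row_bytes must be at least 1")
    return budget_bytes // row_bytes


if __name__ == "__main__":
    # Hypothetical row: a timeuuid, an int, a double and a ~50 byte text value.
    size = estimate_row_bytes(["timeuuid", "int", "double", "text"], 50)
    print(size, rows_within_budget(size))
```

Whichever limit you hit first (the 100 MB guideline or the cell limit) is the one that should drive your partition sizing.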
As far as how to manage partition data goes, this might be a better
question for the cassandra users list, but it will really depend on the
kind of data you are modeling and your queries. A few example cases:
1. Time series data: adjust your partition size depending on time
intervals. For example, include the 'month' as part of the partition key
so that data is partitioned by month; if you have too many records for a
month, partition by 'day' instead.
2. Bucket partitioning: have an int column as part of your partition
key that represents a 'bucket'. When you write data, write to one of the
buckets (maybe randomly, or by using a mod function to identify a bucket).
When reading, read from all buckets and merge the results. This is a bit
hard to work with. Another alternative, if you are only writing from one
place, is to create a new bucket once you know you've written a certain
number of records; that way the data is at least kept together around
where it is written.
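Both cases come down to computing an extra partition key component on the client before you write. A minimal Python sketch of what that could look like follows. All function names are hypothetical, and CRC32 is just one convenient stable hash for the mod approach, not something Cassandra requires:

```python
import zlib
from datetime import datetime


def time_bucket(ts: datetime, granularity: str = "month") -> str:
    """Case 1: derive the time component of the partition key."""
    if granularity == "month":
        return ts.strftime("%Y-%m")
    if granularity == "day":
        return ts.strftime("%Y-%m-%d")
    raise ValueError("granularity must be 'month' or 'day'")


def hash_bucket(record_id: str, num_buckets: int) -> int:
    """Case 2: pick a bucket with a stable hash and a mod."""
    return zlib.crc32(record_id.encode("utf-8")) % num_buckets


def buckets_to_read(num_buckets: int) -> list:
    """Case 2, read side: every bucket has to be queried and merged."""
    return list(range(num_buckets))


def sequential_bucket(records_written: int, max_per_bucket: int) -> int:
    """Case 2, single-writer variant: start a new bucket every
    max_per_bucket records, so consecutive writes stay together."""
    return records_written // max_per_bucket


if __name__ == "__main__":
    now = datetime(2015, 8, 24, 20, 48, 45)
    print(time_bucket(now), time_bucket(now, "day"))
    print(hash_bucket("sensor-42", 8), buckets_to_read(4))
    print(sequential_bucket(250_000, 100_000))
```

With the hash approach the number of buckets effectively becomes part of your data model, since readers need to know it in order to query every bucket.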
Thanks,
Andy
On Wednesday, August 19, 2015 at 5:08:01 PM UTC-5, Mangeet Singh Eden wrote:
>
> Hello all,
>
> Just curious to know, is there any Row limit per partition key ?
> If so, then how to manage partition data if limit increases.
>
> -Mangeet
>