Posted to user@cassandra.apache.org by Benjamin Christenson <be...@kineticdata.com> on 2020/06/09 13:51:35 UTC
Partition size, limits, recommendations for tables where all columns
are part of the primary key
Hello all, I am doing some data modeling and want to make sure that I
understand some nuances of cell counts, partition sizes, and related
recommendations. Am I correct in my understanding that tables in which
every column is part of the primary key will always have 0 cells?
For example, using https://cql-calculator.herokuapp.com/, I tested the
following table definition with 1000000 (1 million) rows per partition and
an average value size of 255 bytes, and it returned that there were 0 cells
and the partition took up 32 bytes total:
CREATE TABLE IF NOT EXISTS widgets (
    id timeuuid,
    key_id timeuuid,
    parent_id timeuuid,
    value text,
    PRIMARY KEY ((parent_id, key_id), value, id)
);
Obviously the total amount of disk space for this table must be more than
32 bytes. In this situation, how should I be reasoning about partition
sizes (in terms of the 2B cell limit, and 100MB-400MB partition size
limit)? Additionally, are there other limits / potential performance
issues I should be concerned about?
Ben Christenson
Developer
Kinetic Data, Inc.
Your business. Your process.
651-556-0937 | ben.christenson@kineticdata.com
www.kineticdata.com | community.kineticdata.com
Re: Partition size, limits, recommendations for tables where all
columns are part of the primary key
Posted by Alex Ott <al...@gmail.com>.
Hi
Yes, basically rows have no cells as everything is in the partition
key/clustering columns.
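Since the data then lives in the keys rather than in cells, one rough way to
reason about partition size is to sum the key bytes per row. Below is a minimal
Python sketch of that arithmetic for the widgets table, assuming a timeuuid
serializes to 16 bytes and using the 255-byte average for value from the
question. The function name and parameters are hypothetical, and the result is
only a lower bound on raw data: it ignores per-row metadata (timestamps,
flags), index entries, and compression.

```python
TIMEUUID_BYTES = 16  # a timeuuid is a 128-bit UUID


def estimate_partition_bytes(rows, avg_value_bytes,
                             partition_key_bytes=2 * TIMEUUID_BYTES,
                             clustering_fixed_bytes=TIMEUUID_BYTES):
    """Lower-bound estimate of the raw key data in one partition.

    Assumption: each row contributes only its clustering values
    (the text `value` plus the `id` timeuuid); the partition key
    (parent_id, key_id) is counted once per partition.
    """
    per_row = avg_value_bytes + clustering_fixed_bytes
    return partition_key_bytes + rows * per_row


estimate = estimate_partition_bytes(rows=1_000_000, avg_value_bytes=255)
print(f"{estimate} bytes, about {estimate / 1024**2:.1f} MiB")
```

With the numbers from the question this comes to roughly 258 MiB of raw key
data, so even with 0 cells the partition would fall inside the 100MB-400MB
range mentioned above.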
You can always look into the data using sstabledump (this output is from the
DSE 6.7 instance that I have running):
sstabledump ac-1-bti-Data.db
[
  {
    "partition" : {
      "key" : [ "977eb1f1-aa5b-11ea-b91a-db426f6f892c",
                "977ed900-aa5b-11ea-b91a-db426f6f892c" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 78,
        "clustering" : [ "test", "977ed901-aa5b-11ea-b91a-db426f6f892c" ],
        "liveness_info" : { "tstamp" : "2020-06-09T14:14:54.863249Z" },
        "cells" : [ ]
      }
    ]
  }
]
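
To check this across a larger dump, the JSON can be summarized with a few lines
of Python. This is only a sketch: the summarize function is a hypothetical
helper, and it assumes the output has the same shape as the dump above (a list
of partitions, each with a "rows" list whose entries carry a "cells" list).

```python
import json


def summarize(dump_text):
    """Return (partition key, row count, cell count) for each partition."""
    summary = []
    for entry in json.loads(dump_text):
        # Count only regular rows; other marker types are skipped.
        rows = [r for r in entry.get("rows", []) if r.get("type") == "row"]
        cells = sum(len(r.get("cells", [])) for r in rows)
        summary.append((entry["partition"]["key"], len(rows), cells))
    return summary


# The dump shown above, pasted in as sample input.
sample = """
[
  {
    "partition" : {
      "key" : [ "977eb1f1-aa5b-11ea-b91a-db426f6f892c",
                "977ed900-aa5b-11ea-b91a-db426f6f892c" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 78,
        "clustering" : [ "test", "977ed901-aa5b-11ea-b91a-db426f6f892c" ],
        "liveness_info" : { "tstamp" : "2020-06-09T14:14:54.863249Z" },
        "cells" : [ ]
      }
    ]
  }
]
"""

for key, row_count, cell_count in summarize(sample):
    print(key, "rows:", row_count, "cells:", cell_count)
```

For the sample this reports one row and zero cells for the single partition.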
P.S. You can experiment with your schema and run some performance tests using
NoSQLBench: https://github.com/nosqlbench/
--
With best wishes, Alex Ott
http://alexott.net/
Twitter: alexott_en (English), alexott (Russian)