Posted to user@cassandra.apache.org by S Ahmed <sa...@gmail.com> on 2016/09/21 21:40:19 UTC

understanding partitions and # of nodes

Hello,

If you have a 10-node cluster, how does having 10 partitions or 100
partitions change how Cassandra will perform?

With 10 partitions you will have 1 partition per node.
With 100 partitions you will have 10 partitions per node.

With 100 partitions I guess it helps because when you add more nodes to
your cluster, the data can be redistributed since you have more nodes.

What else are things to consider?

Thanks.

Re: understanding partitions and # of nodes

Posted by Jens Rantil <je...@tink.se>.
By "partitions" I assume you refer to "partition keys".

Generally, the more partition keys, the better. Having more partition keys
means your data is generally spread more evenly across the cluster,
makes repairs run faster (or so I've heard), makes adding new nodes
smoother, and makes it less likely that you hit tombstone limits.

Also, 100 partition keys in a Cassandra table is nothing. If you don't have
more partition keys than that, Cassandra might not be the right fit.

Cheers,
Jens

Re: understanding partitions and # of nodes

Posted by Jeff Jirsa <je...@crowdstrike.com>.
If you only have 100 partitions, then having more than (100 * RF) nodes doesn’t help you much: each partition is stored on RF replicas, so at most 100 * RF nodes can hold any data at all.
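
As a rough illustration of that bound, here is a minimal Python sketch. The function names and the RF value of 3 are assumptions made for the example, and it considers the best case only, where every replica lands on a different node.

```python
def max_nodes_holding_data(partitions: int, replication_factor: int) -> int:
    """Upper bound on nodes that can own data: one node per replica."""
    return partitions * replication_factor


def idle_nodes(cluster_size: int, partitions: int, replication_factor: int) -> int:
    """Minimum number of nodes that hold nothing, even in the best case."""
    return max(0, cluster_size - max_nodes_holding_data(partitions, replication_factor))


if __name__ == "__main__":
    # 100 partitions with RF = 3 (an assumed value) means at most 300 replicas.
    print(max_nodes_holding_data(100, 3))
    # In a hypothetical 400-node cluster, at least 100 nodes would sit empty.
    print(idle_nodes(400, 100, 3))
```

In practice the number of nodes doing useful work is usually lower than this bound, because hashing does not spread a small number of partitions evenly.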

 

However, unless you’re using very specific partitioners, there’s no guarantee that you’ll have 1 partition per node (with 10 nodes / 10 partitions).

Cassandra hashes the partition key (with murmur3 by default, md5 in old versions) to decide which node the data is placed on. You have very little control over the distribution. Both murmur3 and md5 spread keys well enough that you’re likely to get a good distribution with a sufficiently high number of partitions, but with 100 partitions you’re going to have a miserable time.
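
To see why a small number of partitions spreads unevenly, here is a small Python simulation. It is only a sketch under several assumptions: md5 from the standard library stands in for murmur3, each node owns one equal slice of the hash range (no vnodes, no replication), and the key names are made up. It does not reproduce Cassandra's actual token computation.

```python
import hashlib
from collections import Counter

RING_SIZE = 2 ** 128  # md5 digests are 128 bits wide


def owner(partition_key: str, node_count: int) -> int:
    """Map a key to a node index by hashing it onto an evenly split ring."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest, "big")
    # Node i owns the i-th equal slice of the range [0, RING_SIZE).
    return token * node_count // RING_SIZE


def partitions_per_node(keys, node_count: int) -> list:
    """Count how many of the given keys land on each node."""
    counts = Counter(owner(key, node_count) for key in keys)
    return [counts.get(node, 0) for node in range(node_count)]


if __name__ == "__main__":
    nodes = 10
    for total in (10, 100, 100_000):
        keys = [f"key-{i}" for i in range(total)]  # hypothetical partition keys
        counts = partitions_per_node(keys, nodes)
        print(f"{total} partitions: min {min(counts)}, max {max(counts)} per node")
```

Comparing the minimum and maximum per-node counts for each run shows how the relative imbalance changes as the number of partition keys grows.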

 

If your data model is such that you’re only ever going to have 100 partitions, your data model is broken, or you should use some other database. 

 

 

 
