Posted to user@cassandra.apache.org by jerome <je...@hotmail.com> on 2016/09/14 18:49:05 UTC

How Fast Does Information Spread With Gossip?

Hi,


I was curious if anyone had any kind of statistics or ballpark figures on how long it takes information to propagate through a cluster with Gossip? I'm particularly interested in how fast information about the liveness of a node spreads. For example, in an n-node cluster the median amount of time it takes for all nodes to learn that a node went down is f(n) seconds. Is a minute a reasonable upper bound for most clusters? Too high, too low?


Thanks,

Jerome

Re: How Fast Does Information Spread With Gossip?

Posted by Ben Bromhead <be...@instaclustr.com>.
Gossip propagation is generally best modelled by epidemic algorithms.

Luckily for us, Cassandra's gossip protocol is fairly simple.

Cassandra performs one gossip task every second. Within each gossip
task it gossips with one randomly chosen live node in the cluster. It
may also attempt to gossip with a down node (with a random chance that
increases as the number of down nodes increases), and if it hasn't
gossiped with a seed in that round it may also attempt to gossip with a
configured seed. So Cassandra can do up to 3 rounds per second; however,
these extra rounds are meant as optimizations that improve average-case
convergence and help recover from split-brain scenarios more quickly
than would otherwise occur.
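
To make the three possible exchanges concrete, here is a minimal Python
sketch of how one node might pick its targets in a single gossip task.
It is not Cassandra's actual code: the function name, the
len(down) / (len(live) + 1) probability for probing a down node, and the
seed_chance parameter are all illustrative assumptions that only mirror
the behaviour described above.

```python
import random

def pick_gossip_targets(live, down, seeds, rng, seed_chance=0.5):
    """Return the peers one node would contact in a single gossip task.

    live, down and seeds are lists of peer names (excluding this node).
    The probabilities used here are assumptions for illustration only.
    """
    targets = []
    # 1. Always gossip with one randomly chosen live peer, if there is one.
    if live:
        targets.append(rng.choice(live))
    # 2. Possibly probe a down node; the chance grows with the number of
    #    down nodes (assumed formula, not taken from the Cassandra source).
    if down and rng.random() < len(down) / (len(live) + 1):
        targets.append(rng.choice(down))
    # 3. If no seed was contacted above, possibly gossip with a seed.
    contacted_seed = any(t in seeds for t in targets)
    if seeds and not contacted_seed and rng.random() < seed_chance:
        targets.append(rng.choice(seeds))
    return targets

if __name__ == "__main__":
    rng = random.Random(1)
    print(pick_gossip_targets(["n1", "n2", "n3"], ["n4"], ["s1"], rng))
```

Each call returns between one and three targets when at least one live
peer exists, which matches the "up to 3 rounds per second" figure.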

Assuming just one gossip round per second, for a new piece of information
to spread to all members of the cluster via gossip, you would expect it to
take O(log n) gossip rounds (with high probability), where n is the number
of nodes in the cluster. This is because each Cassandra node can gossip
with any other node irrespective of topology (a fully connected mesh), so
the number of nodes that know the update can grow roughly exponentially
in the early rounds.
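
A rough simulation can give a feel for how the round count grows with n
in the one-round-per-second case. The Python sketch below is a
simplification under stated assumptions: every informed node pushes the
update to one uniformly random other node per round, and the
down-node and seed exchanges are ignored. Real Cassandra gossip also
pulls state from the peer it contacts, so this push-only model should be
somewhat pessimistic. The function names are hypothetical.

```python
import random

def rounds_to_spread(n, rng):
    """Count push-gossip rounds until all n nodes know an update from node 0."""
    informed = {0}
    rounds = 0
    while len(informed) < n:
        newly_informed = set()
        for node in informed:
            # Pick a uniformly random peer other than the node itself.
            peer = rng.randrange(n - 1)
            if peer >= node:
                peer += 1
            newly_informed.add(peer)
        informed |= newly_informed
        rounds += 1
    return rounds

def median_rounds(n, trials=200, seed=42):
    """Median number of rounds over several independent simulated runs."""
    rng = random.Random(seed)
    results = sorted(rounds_to_spread(n, rng) for _ in range(trials))
    return results[len(results) // 2]

if __name__ == "__main__":
    for n in (12, 100, 1000):
        print(n, median_rounds(n))
```

In this model the informed set can at most double per round, so at
least ceil(log2(n)) rounds are needed, and in practice the median stays
within a small multiple of that rather than growing linearly with n.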

There is some ongoing discussion about expanding gossip to utilise partial
views of the cluster and exchanging those, or using spanning/broadcast
trees to speed up convergence and reduce workload in large clusters (1000+
nodes); see https://issues.apache.org/jira/browse/CASSANDRA-12345 for
details.



On Fri, 16 Sep 2016 at 01:01 Jens Rantil <je...@tink.se> wrote:

> > Is a minute a reasonable upper bound for most clusters?
>
> I have no numbers and I'm sure this differs depending on how large your
> cluster is. We have a small cluster of around 12 nodes, and statuses
> generally propagate in under 5 seconds. So it will definitely be
> less than 1 minute.
>
> Cheers,
> Jens
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer

Re: How Fast Does Information Spread With Gossip?

Posted by Jens Rantil <je...@tink.se>.
> Is a minute a reasonable upper bound for most clusters?

I have no numbers and I'm sure this differs depending on how large your
cluster is. We have a small cluster of around 12 nodes, and statuses
generally propagate in under 5 seconds. So it will definitely be
less than 1 minute.

Cheers,
Jens

-- 

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.

Re: How Fast Does Information Spread With Gossip?

Posted by Eric Evans <jo...@gmail.com>.
On Wed, Sep 14, 2016 at 1:49 PM, jerome <je...@hotmail.com> wrote:
> I was curious if anyone had any kind of statistics or ballpark figures on
> how long it takes information to propagate through a cluster with Gossip?
> I'm particularly interested in how fast information about the liveness of a
> node spreads. For example, in an n-node cluster the median amount of time it
> takes for all nodes to learn that a node went down is f(n) seconds. Is a
> minute a reasonable upper bound for most clusters? Too high, too low?

Dahlia Malkhi gave a talk on gossip protocols at the Papers We Love
conference last Thursday (http://pwlconf.org/dahlia-malkhi/), and she
answered this better than I ever could.  The video of her presentation
hasn't been posted yet, but I'm told it should be up as early as later
today.  You can look for it on the Papers We Love YouTube channel
(https://www.youtube.com/channel/UCoj4eQh_dZR37lL78ymC6XA), and it'll
be announced on the website (http://paperswelove.org/).

-- 
Eric Evans
john.eric.evans@gmail.com