Posted to dev@cassandra.apache.org by Carl Mueller <ca...@smartthings.com.INVALID> on 2019/09/26 21:30:56 UTC

gossip tuning

We have three datacenters (EU, AP, US) in AWS and have problems
bootstrapping new nodes in the AP datacenter:

java.lang.RuntimeException: A node required to move the data
consistently is down (/SOME_NODE). If you wish to move the data from a
potentially inconsistent replica, restart the node with
-Dcassandra.consistent.rangemovement=false
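For reference, both knobs that come up in this thread (the rangemovement flag from the error above, and ring_delay) are plain JVM system properties passed as -D flags at startup. A minimal sketch of how they are read; the property names are from the error message and the code, and the 30-second ring_delay default is my reading of StorageService, so verify against your version:

```java
public class GossipFlags {
    // cassandra.ring_delay_ms: how long a joining node waits for gossip to
    // settle before streaming (default appears to be 30s; an assumption to
    // check against your version's StorageService).
    static long ringDelayMs() {
        return Long.getLong("cassandra.ring_delay_ms", 30_000L);
    }

    // cassandra.consistent.rangemovement: the flag named in the bootstrap
    // error above; defaults to true.
    static boolean consistentRangeMovement() {
        return Boolean.parseBoolean(
                System.getProperty("cassandra.consistent.rangemovement", "true"));
    }
}
```

So raising ring_delay is just `-Dcassandra.ring_delay_ms=60000` (or similar) in the JVM options.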

Per the code, increasing ring_delay fixed the bootstrap sleep race
condition. But my research brought up some other questions. We are
generally frustrated with AWS networking, but it is what it is, and I was
wondering if we could tune gossip further. We have been getting this
intermittently:

Gossip stage has {} pending tasks; skipping status check (no nodes will be
marked down)

That is the GossipStage queue backing up, and another page indicated
that gossip has a single thread in its pool.

For large clusters (>30 nodes, from what we've seen in crappy AWS
networking), wouldn't it be good to increase the thread count so gossip
communicates more frequently and a slow cross-region gossip connection
doesn't hold things up? Can GossipStage be increased from one thread to
two? Or is the Gossiper single-threaded / not reentrant?
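I can't answer the reentrancy question myself, but the head-of-line blocking I'm worried about is easy to show with a toy model (invented names, plain executors, not Cassandra code): with one thread, a fast local task queued behind a slow cross-region one inherits its full latency; with two threads it doesn't.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class GossipStageSketch {
    // Toy model: a gossip "stage" is just an executor draining a queue.
    // One slow cross-region task is queued ahead of a fast local one;
    // returns how long (ms) the fast task waited before it ran.
    static long lastFastTaskLatencyMs(int threads) {
        ExecutorService stage = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            // Slow "trans-Pacific" exchange: ~200 ms.
            stage.submit(() -> {
                try { Thread.sleep(200); } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Fast local exchange queued right behind it.
            Callable<Long> fast = System::nanoTime;
            long ranAt = stage.submit(fast).get();
            return (ranAt - start) / 1_000_000;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            stage.shutdownNow();
        }
    }
}
```

With `threads = 1` the fast task waits the full ~200 ms; with `threads = 2` it runs almost immediately. Whether the real Gossiper's internal synchronization would allow that second thread is exactly my question.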

When gossip randomly selects a node to check, is that a shuffle of the
known IPs to contact, or a totally random pick-one every second? The
one-second interval is based on a comment in the code; is that the gossip
interval, and can it be changed?
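For what it's worth, my reading of Gossiper.java (please correct me) is the latter: GossipTask fires once per second and each round picks one live endpoint uniformly at random, rather than cycling through a shuffled list. The difference matters, because with independent random picks some nodes can go unvisited for many rounds. A toy model (hypothetical names):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class GossipTargetPick {
    // One gossip round: pick a random live endpoint (the behavior I believe
    // the code has), as opposed to shuffling once and cycling.
    static String pickRandom(List<String> live, Random rng) {
        return live.get(rng.nextInt(live.size()));
    }

    // How many one-second rounds until every endpoint has been picked at
    // least once. With a shuffle-and-cycle scheme this would be exactly
    // live.size(); with random picks it is always at least that, and the
    // coupon-collector expectation is roughly n * ln(n).
    static int roundsToCoverAll(List<String> live, long seed) {
        Random rng = new Random(seed);
        Set<String> seen = new HashSet<>();
        int rounds = 0;
        while (seen.size() < live.size()) {
            seen.add(pickRandom(live, rng));
            rounds++;
        }
        return rounds;
    }
}
```

For a 30-node cluster that back-of-envelope says ~100 seconds on average before every node has been directly gossiped to, which is why the interval and selection policy seem worth tuning.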

Would it be a bad idea to create a prioritized gossip request, so nodes
that have been up for a while respond first to bootstrapping nodes or
nodes that are known to be restarting? Is that even possible? That might
also get hinted handoff processing moving more quickly.
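To make that concrete, here's the kind of thing I mean, purely hypothetical (the class, fields, and queue are all invented for illustration; nothing like this exists in Cassandra today): tag queued gossip work with whether the peer is bootstrapping, and drain that traffic first.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

public class PrioritizedGossip {
    // Hypothetical message wrapper: which peer this reply is for, and
    // whether that peer is currently bootstrapping/restarting.
    static class GossipMsg {
        final String peer;
        final boolean peerBootstrapping;
        GossipMsg(String peer, boolean peerBootstrapping) {
            this.peer = peer;
            this.peerBootstrapping = peerBootstrapping;
        }
    }

    // Bootstrapping peers sort first: the negated flag makes their key
    // false, and false orders before true.
    static PriorityBlockingQueue<GossipMsg> newStageQueue() {
        return new PriorityBlockingQueue<>(16,
                Comparator.comparing((GossipMsg m) -> !m.peerBootstrapping));
    }
}
```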

For multi-datacenter clusters, would it also make sense to have gossip
process local-datacenter gossip requests more quickly (or have a thread
dedicated to fast-tracking them) and handle the "far" nodes on a different
thread? Again, to prevent cross-Pacific gossip requests from holding up
other requests that could be handled more quickly.
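Sketching that idea too (again hypothetical, invented names, not Cassandra code): one single-threaded stage per "distance", with a routing decision based on the peer's datacenter, so a slow trans-Pacific exchange can never queue ahead of local-DC traffic.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DcAwareGossipStage {
    // One stage per distance class, each keeping gossip's single-threaded
    // ordering within its own class.
    private final ExecutorService localStage  = Executors.newSingleThreadExecutor();
    private final ExecutorService remoteStage = Executors.newSingleThreadExecutor();
    private final String localDc;

    DcAwareGossipStage(String localDc) { this.localDc = localDc; }

    // Routing decision, split out so it is easy to test.
    static String route(String localDc, String peerDc) {
        return peerDc.equals(localDc) ? "local" : "remote";
    }

    void submit(String peerDc, Runnable task) {
        ExecutorService stage =
                route(localDc, peerDc).equals("local") ? localStage : remoteStage;
        stage.submit(task);
    }

    void shutdown() { localStage.shutdown(); remoteStage.shutdown(); }
}
```

Whether splitting the stage like this would break gossip's ordering assumptions is exactly the kind of thing I'm hoping someone who knows the Gossiper internals can comment on.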