Posted to dev@cassandra.apache.org by Michael Fong <mi...@ruckuswireless.com> on 2016/06/07 02:57:51 UTC

Gossip Behavioral Difference between C* 2.0 and C* 2.1

Hi,

We recently discovered some differences in gossip behavior between C* 2.0 and C* 2.1. After a period of network instability or a node reboot, the differences are visible in Cassandra/system.log.

2.0.17
We can observe log entries repeating in a pattern like this:
DEBUG [RequestResponseStage:3] 2016-04-19 11:18:18,332 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.34
INFO [RequestResponseStage:3] 2016-04-19 11:18:18,333 Gossiper.java (line 978) InetAddress /192.168.88.34 is now UP
DEBUG [RequestResponseStage:4] 2016-04-19 11:18:18,335 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.34
INFO [RequestResponseStage:4] 2016-04-19 11:18:18,335 Gossiper.java (line 978) InetAddress /192.168.88.34 is now UP
DEBUG [RequestResponseStage:3] 2016-04-19 11:18:18,335 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.34
INFO [RequestResponseStage:3] 2016-04-19 11:18:18,335 Gossiper.java (line 978) InetAddress /192.168.88.34 is now UP
....

It seems that the longer a node takes to regain its connection (or to reboot), the more gossip messages accumulate, and the more of these messages appear in the log afterwards.
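
One way to quantify this is to count the "is now UP" entries per endpoint and per second. Below is a minimal Python sketch, assuming the 2.0-style line format quoted above. The names UP_LINE, count_up_events and SAMPLE are hypothetical, and the regular expression is inferred only from those sample lines, so it may need adjusting for other log layouts.

```python
import re
from collections import Counter

# Assumption: pattern inferred only from the sample 2.0.17 lines quoted above.
UP_LINE = re.compile(
    r"^INFO\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+.*"
    r"InetAddress (?P<endpoint>/\S+) is now UP"
)


def count_up_events(lines):
    """Count 'is now UP' log lines per (endpoint, timestamp-to-the-second)."""
    counts = Counter()
    for line in lines:
        m = UP_LINE.match(line.strip())
        if m:
            counts[(m.group("endpoint"), m.group("ts"))] += 1
    return counts


# The six sample lines from the log excerpt above.
SAMPLE = [
    "DEBUG [RequestResponseStage:3] 2016-04-19 11:18:18,332 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.34",
    "INFO [RequestResponseStage:3] 2016-04-19 11:18:18,333 Gossiper.java (line 978) InetAddress /192.168.88.34 is now UP",
    "DEBUG [RequestResponseStage:4] 2016-04-19 11:18:18,335 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.34",
    "INFO [RequestResponseStage:4] 2016-04-19 11:18:18,335 Gossiper.java (line 978) InetAddress /192.168.88.34 is now UP",
    "DEBUG [RequestResponseStage:3] 2016-04-19 11:18:18,335 Gossiper.java (line 977) removing expire time for endpoint : /192.168.88.34",
    "INFO [RequestResponseStage:3] 2016-04-19 11:18:18,335 Gossiper.java (line 978) InetAddress /192.168.88.34 is now UP",
]

if __name__ == "__main__":
    # Print one line per (endpoint, second) with the number of UP events seen.
    for (endpoint, ts), n in sorted(count_up_events(SAMPLE).items()):
        print(endpoint, ts, n)
```

To run it against a real log, pass an open file instead of SAMPLE, for example count_up_events(open(path)) with path pointing at the system.log in question. A large count for one endpoint within a single second would match the burst described above.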

However, in 2.1 we no longer observe this behavior. There seem to be some fundamental changes to the gossip protocol. Has anyone else observed a similar pattern, or could anyone kindly point out which changes (JIRA #) brought about this improvement?

Thanks in advance!

Sincerely,

Michael Fong