Posted to commits@cassandra.apache.org by "Jeremiah Jordan (JIRA)" <ji...@apache.org> on 2016/07/05 15:33:11 UTC

[jira] [Comment Edited] (CASSANDRA-11740) Nodes have wrong membership view of the cluster

    [ https://issues.apache.org/jira/browse/CASSANDRA-11740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362641#comment-15362641 ] 

Jeremiah Jordan edited comment on CASSANDRA-11740 at 7/5/16 3:32 PM:
---------------------------------------------------------------------

[~dikanggu] Yes, the peers table should be the truth.

[~jkni] We should probably switch the order of those checks: I think we should check the peers table before checking cassandra-topology.properties. The peers table contains the information from the last time we got gossip status from those nodes, so if it has values we should probably use them before seeing whether cassandra-topology.properties has something (especially since cassandra-topology.properties very often contains a catch-all entry).
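
A minimal sketch of the proposed lookup order, for illustration only. This is not the actual GossipingPropertyFileSnitch code: the class name, the method name, and the two maps standing in for the peers table and cassandra-topology.properties are all hypothetical, and the "default" key is an assumed representation of the catch-all entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the proposed check order; not Cassandra source code.
class TopologyLookupSketch {

    // Assumed key under which the catch-all entry of the properties file is stored.
    static final String CATCH_ALL_KEY = "default";

    /**
     * Resolves the data center for an endpoint when no live gossip state is available.
     * savedPeers stands in for the peers table; topologyFile stands in for the parsed
     * cassandra-topology.properties. Both are plain maps of endpoint -> data center.
     */
    static String resolveDatacenter(String endpoint,
                                    Map<String, String> savedPeers,
                                    Map<String, String> topologyFile) {
        // 1. Prefer what the node itself last reported via gossip (peers table).
        String saved = savedPeers.get(endpoint);
        if (saved != null) {
            return saved;
        }
        // 2. Otherwise use an explicit entry from the properties file, if any.
        String explicit = topologyFile.get(endpoint);
        if (explicit != null) {
            return explicit;
        }
        // 3. Only as a last resort fall back to the catch-all entry.
        return topologyFile.get(CATCH_ALL_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> savedPeers = new HashMap<>();
        savedPeers.put("10.0.0.1", "A"); // made-up endpoint with saved gossip info

        Map<String, String> topologyFile = new HashMap<>();
        topologyFile.put(CATCH_ALL_KEY, "DC1"); // catch-all entry

        System.out.println(resolveDatacenter("10.0.0.1", savedPeers, topologyFile));
        System.out.println(resolveDatacenter("10.0.0.2", savedPeers, topologyFile));
    }
}
```

With this order, an endpoint that has a saved entry is never shadowed by the catch-all; the catch-all applies only to endpoints that neither the peers table nor the file knows explicitly.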


was (Author: jjordan):
[~dikanggu] Yes, the peers table should be the truth.

[~jkni] We should probably switch the order of those checks: I think we should check the peers table before checking cassandra-topology.properties. The peers table contains the information from the last time we got gossip status from those nodes, so if it has values we should probably use them before seeing whether cassandra-topology.properties has something.

> Nodes have wrong membership view of the cluster
> -----------------------------------------------
>
>                 Key: CASSANDRA-11740
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11740
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Dikang Gu
>            Assignee: Joel Knighton
>             Fix For: 2.2.x, 3.x
>
>
> We have a few hundred nodes across 3 data centers, and we are doing a few million writes per second into the cluster.
> The problem we found is that some nodes (>10) have a very wrong view of the cluster.
> For example, we have 3 data centers: A, B and C. On the problem nodes, the output of 'nodetool status' shows ~100 nodes that are not in data center A, B, or C. Instead, it shows them in DC1, rack r1, which is very wrong. As a result, those nodes return wrong results to client requests.
> {code}
> Datacenter: DC1
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> -- Address Load Tokens Owns Host ID Rack
> UN 2401:db00:11:6134:face:0:1:0 509.52 GB 256 ? e24656ac-c3b2-4117-b933-a5b06852c993 r1
> UN 2401:db00:11:b218:face:0:5:0 510.01 GB 256 ? 53da2104-b1b5-4fa5-a3dd-52c7557149f9 r1
> UN 2401:db00:2130:5133:face:0:4d:0 459.75 GB 256 ? ef8311f0-f6b8-491c-904d-baa925cdd7c2 r1
> {code}
> We are using GossipingPropertyFileSnitch.
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)