Posted to commits@cassandra.apache.org by "Stefania (JIRA)" <ji...@apache.org> on 2015/07/23 16:41:04 UTC
[jira] [Comment Edited] (CASSANDRA-9871) Cannot replace token does not exist - DN node removed as Fat Client
[ https://issues.apache.org/jira/browse/CASSANDRA-9871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638902#comment-14638902 ]
Stefania edited comment on CASSANDRA-9871 at 7/23/15 2:40 PM:
--------------------------------------------------------------
bq. can you provide a dump of both nodetool gossipinfo and nodetool status?
{code}
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns  Host ID                               Rack
UN  127.0.0.1  82.71 KB  256     ?     af23fcbb-fce4-495c-b5b5-b0b90ccc71c1  rack1
UN  127.0.0.2  51.57 KB  256     ?     11814d51-5120-4f9f-b5fc-d0ffa534f964  rack1
DN  127.0.0.3  51.59 KB  256     ?     0101e850-7f3a-499c-a80c-092ecf4e27e3  rack1
Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
/127.0.0.1
generation:1437661129
heartbeat:164
RELEASE_VERSION:2.1.8-SNAPSHOT
SEVERITY:0.0
STATUS:NORMAL,-107708216716906722
DC:datacenter1
NET_VERSION:8
RACK:rack1
HOST_ID:af23fcbb-fce4-495c-b5b5-b0b90ccc71c1
SCHEMA:fa2a3033-51b7-30c0-8926-a2b71bf0fd8a
RPC_ADDRESS:127.0.0.1
LOAD:52781.0
/127.0.0.2
generation:1437661129
heartbeat:166
SEVERITY:0.0
RELEASE_VERSION:2.1.8-SNAPSHOT
STATUS:NORMAL,-1054644930469012369
DC:datacenter1
NET_VERSION:8
RACK:rack1
HOST_ID:11814d51-5120-4f9f-b5fc-d0ffa534f964
SCHEMA:fa2a3033-51b7-30c0-8926-a2b71bf0fd8a
RPC_ADDRESS:127.0.0.2
LOAD:52807.0
/127.0.0.3
generation:1437661129
heartbeat:2147483647
RELEASE_VERSION:2.1.8-SNAPSHOT
SEVERITY:0.0
STATUS:shutdown,true
DC:datacenter1
NET_VERSION:8
RACK:rack1
HOST_ID:0101e850-7f3a-499c-a80c-092ecf4e27e3
SCHEMA:fa2a3033-51b7-30c0-8926-a2b71bf0fd8a
RPC_ADDRESS:127.0.0.3
LOAD:52826.0
{code}
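A dump like the gossipinfo section above can be scanned mechanically for endpoints whose STATUS is not NORMAL. The following is only a minimal sketch: the function names (parse_gossipinfo, non_normal_endpoints) and the trimmed SAMPLE text are illustrative assumptions, not part of nodetool or Cassandra.

```python
def parse_gossipinfo(text):
    """Parse gossipinfo-style text into {endpoint: {key: value}}.

    Assumes each endpoint section starts with a line like "/127.0.0.1"
    and is followed by "KEY:value" lines, as in the dump above.
    """
    endpoints = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/"):
            # A new endpoint section begins.
            current = line[1:]
            endpoints[current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only; values may contain commas.
            key, value = line.split(":", 1)
            endpoints[current][key] = value
    return endpoints


def non_normal_endpoints(endpoints):
    """Return the sorted endpoints whose STATUS does not start with NORMAL."""
    return sorted(
        ep for ep, state in endpoints.items()
        if not state.get("STATUS", "").startswith("NORMAL")
    )


# Trimmed excerpt of the dump above, kept short for illustration.
SAMPLE = """\
/127.0.0.1
generation:1437661129
heartbeat:164
STATUS:NORMAL,-107708216716906722
/127.0.0.3
generation:1437661129
heartbeat:2147483647
STATUS:shutdown,true
"""

if __name__ == "__main__":
    print(non_normal_endpoints(parse_gossipinfo(SAMPLE)))
```

On the sample this flags only 127.0.0.3. Its heartbeat value of 2147483647 is 2^31 - 1, the largest 32-bit signed integer, so it does not look like an ordinary counter value.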
bq. isFatClient returns true as the endpoint is not a member in TokenMetadata and that's why we fail in SS.joinTokenRing (we check to see if the token is associated with a TokenMetadata member).
Yes, this is the root cause, but why would the node not be a member? My guess is that handleStateNormal() is never called, so once again isFatClient() is at fault, just as in CASSANDRA-9765?
Anyway, I plan to add more debug information tomorrow to find out when the TokenMetadata (TM) is modified.
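The interaction described above can be illustrated with a toy model. This is a hedged sketch, not Cassandra's source: the class and function names, the DEAD_STATES set, and the dictionary-based gossip state are all assumptions made for illustration. It only shows the reasoning in the comment: an endpoint that is known to gossip but never registered in TokenMetadata is classified as a fat client, and a replacement of its token then fails because the token has no owner.

```python
class CannotReplaceToken(Exception):
    """Stand-in for the startup error quoted in the ticket."""


class TokenMetadata:
    """Toy token-to-endpoint map (hypothetical, heavily simplified)."""

    def __init__(self):
        self._token_to_endpoint = {}

    def update_normal_token(self, token, endpoint):
        # Stand-in for the bookkeeping a NORMAL state change would do.
        self._token_to_endpoint[token] = endpoint

    def is_member(self, endpoint):
        return endpoint in self._token_to_endpoint.values()

    def get_endpoint(self, token):
        return self._token_to_endpoint.get(token)


# Assumed set of "dead" statuses for this sketch only.
DEAD_STATES = {"removing", "removed", "LEFT", "hibernate"}


def is_fat_client(endpoint, gossip_states, token_metadata):
    """True if gossip knows the endpoint, it is not dead, and it owns no tokens."""
    state = gossip_states.get(endpoint)
    if state is None:
        return False
    return state["status"] not in DEAD_STATES and not token_metadata.is_member(endpoint)


def check_replace(tokens, token_metadata):
    """Fail if any token to be replaced has no current owner."""
    for token in tokens:
        if token_metadata.get_endpoint(token) is None:
            raise CannotReplaceToken(
                "Cannot replace token %d which does not exist!" % token
            )


if __name__ == "__main__":
    tm = TokenMetadata()
    gossip = {"10.171.115.233": {"status": "NORMAL"}}
    token = -1013652079972151677

    # The endpoint was never added to TokenMetadata, so it looks like a fat client.
    print(is_fat_client("10.171.115.233", gossip, tm))
    try:
        check_replace([token], tm)
    except CannotReplaceToken as exc:
        print(exc)
```

In this model, calling tm.update_normal_token(token, "10.171.115.233") before the checks makes is_fat_client return False and lets check_replace pass, which matches the suspicion that the membership update never happened for the dead node.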
was (Author: stefania):
bq. isFatClient returns true as the endpoint is not a member in TokenMetadata and that's why we fail in SS.joinTokenRing (we check to see if the token is associated with a TokenMetadata member).
Yes, this is the root cause, but why would the node not be a member?
Anyway, I plan to add more debug information tomorrow to find out when the TokenMetadata (TM) is modified.
> Cannot replace token does not exist - DN node removed as Fat Client
> -------------------------------------------------------------------
>
> Key: CASSANDRA-9871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9871
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sebastian Estevez
> Assignee: Stefania
> Fix For: 2.1.x
>
>
> We lost a node due to disk failure and tried to replace it via -Dcassandra.replace_address, following http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsReplaceNode.html
> The node would not come up, and system.log showed these errors:
> {code}
> INFO [main] 2015-07-22 03:20:06,722 StorageService.java:500 - Gathering node replacement information for /10.171.115.233
> ...
> INFO [SharedPool-Worker-1] 2015-07-22 03:22:34,281 Gossiper.java:954 - InetAddress /10.111.183.101 is now UP
> INFO [GossipTasks:1] 2015-07-22 03:22:59,300 Gossiper.java:735 - FatClient /10.171.115.233 has been silent for 30000ms, removing from gossip
> ERROR [main] 2015-07-22 03:23:28,485 CassandraDaemon.java:541 - Exception encountered during startup
> java.lang.UnsupportedOperationException: Cannot replace token -1013652079972151677 which does not exist!
> {code}
> It is not clear why Gossiper removed the node as a FatClient, given that it was a full node before it died and it had tokens assigned to it (including -1013652079972151677) in system.peers and nodetool ring.