Posted to commits@cassandra.apache.org by "Ariel Weisberg (JIRA)" <ji...@apache.org> on 2019/01/02 18:30:00 UTC

[jira] [Resolved] (CASSANDRA-14932) removenode coordinator, and its hints data will be lost

     [ https://issues.apache.org/jira/browse/CASSANDRA-14932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ariel Weisberg resolved CASSANDRA-14932.
----------------------------------------
    Resolution: Not A Problem

This is the expected behavior for consistency level ANY.

You don't want to use removenode if the node is still available and you can decommission it instead; decommission carries less risk of data loss.
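A minimal sketch of the two operations (exact output and flags may vary by Cassandra version; the host ID placeholder is illustrative):

```shell
# Preferred: run ON the node being retired. It streams its data
# (and keeps delivering its hints) before leaving the ring.
nodetool decommission

# Last resort: run from any LIVE node when the target is already dead.
# Surviving replicas re-stream data, but everything stored only on the
# dead node -- including its undelivered hints -- is lost.
nodetool removenode <host-id>    # host ID comes from `nodetool status`
```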

If you want to avoid losing hints, you need to make sure the node you are about to remove has no pending hints. However, if you care about your data you shouldn't be relying on hints anyway. Relying on them goes against the grain reliability-wise, because hints are not replicated to multiple nodes, so the guarantees are already weak.

If hint dispatch is enabled and running, disable the native/RPC transports so no new hints are generated, and then wait for the existing hints to be delivered. Once hint delivery has started (HintsServiceMXBean), wait until the hint-delivery thread pool has had no pending or active tasks for a while. It is probably always attempting to deliver them anyway, unless you disabled it.
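The drain procedure above might look roughly like this on the node to be retired (a sketch only; verify against your version's nodetool, and note that the Thrift command applies only to versions that still ship the RPC server):

```shell
# 1. Stop accepting new client writes, so no new hints are generated
nodetool disablebinary
nodetool disablethrift      # only on versions with the legacy Thrift/RPC server

# 2. Confirm hinted handoff is still enabled, so pending hints keep dispatching
nodetool statushandoff

# 3. Watch the hints dispatch thread pool; proceed only once its
#    Active and Pending counts have stayed at 0 for a while
nodetool tpstats | grep -i hint
```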

I would do a bit more research on how to properly drain hints. Maybe ask on the user list. 

> removenode coordinator, and its hints data will be lost
> -------------------------------------------------------
>
>                 Key: CASSANDRA-14932
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14932
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: cassandra version 3.11.3
>            Reporter: sunhaihong
>            Assignee: sunhaihong
>            Priority: Major
>
> There are four nodes in the cluster; assume they are nodes A, B, C, and D, with hinted handoff enabled.
> 1) Create a keyspace with RF=2, and create a table.
> 2) Bring nodes B and C down (nodetool stopdaemon).
> 3) Log in to node A with cqlsh, set CONSISTENCY ANY, and insert a row (assume the row would be stored on nodes B and C). The row is successfully inserted even though nodes B and C are down, because the consistency level is ANY; the coordinator (node A) writes hints.
> 4) Bring node A down (nodetool stopdaemon), then remove node A (nodetool removenode ${nodeA_hostId}).
> 5) Bring nodes B and C back up (restart the Cassandra process).
> 6) Log in to any of nodes B, C, or D and execute a SELECT with the partition key of the inserted row. There is no trace of the row inserted in step 3.
>  
> These steps lead to the loss of the data inserted in step 3.
> Is there any problem with the steps I performed above?
> If yes, how should this situation be handled?
> Looking forward to your reply, thanks.
>  
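The reported reproduction can be condensed as follows (the keyspace and table names `ks`/`t` and the row values are hypothetical; host IDs and addresses must be substituted for a real cluster):

```shell
# On node A (the coordinator), with nodes B and C already stopped:
cqlsh nodeA -e "
  CREATE KEYSPACE ks WITH replication =
    {'class': 'SimpleStrategy', 'replication_factor': 2};
  CREATE TABLE ks.t (id int PRIMARY KEY, val text);
  CONSISTENCY ANY;
  INSERT INTO ks.t (id, val) VALUES (1, 'only-in-hints');"

# Stop node A; then, from node D, remove it. A's locally stored
# hints (the only copy of the write) are discarded with it.
nodetool removenode ${nodeA_hostId}

# After restarting B and C, the row cannot be found on any node.
cqlsh nodeB -e "SELECT * FROM ks.t WHERE id = 1;"
```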



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org