Posted to user@cassandra.apache.org by Joe Obernberger <jo...@gmail.com> on 2023/01/23 17:40:37 UTC

removenode stuck - cassandra 4.1.0

I had a drive fail (the first drive in the list) on a Cassandra cluster.
I've stopped the node (it no longer starts) and am trying to remove
it from the cluster, but the removenode command appears to be hung (it has
been running for 3 hours so far). nodetool removenode status always reports
the same token as being removed. Help?

nodetool removenode status
RemovalStatus: Removing token (-9196617215347134065). Waiting for 
replication confirmation from 
[/172.16.100.248,/172.16.100.249,/172.16.100.251,/172.16.100.252,/172.16.100.34,/172.16.100.35,/172.16.100.36,/172.16.100.37,/172.16.100.38,/172.16.100.42,/172.16.100.44,/172.16.100.45].

Thanks.

-Joe



Re: removenode stuck - cassandra 4.1.0

Posted by Joe Obernberger <jo...@gmail.com>.
Thank you - I was just impatient.  :)

-Joe

On 1/23/2023 12:56 PM, Jeff Jirsa wrote:
> Those hosts are likely sending streams.
>
> If you do `nodetool netstats` on the replicas of the node you're
> removing, you should see byte counters and file counters - they should
> all be incrementing. If one of them isn't incrementing, that one is
> probably stuck.
>
> There's at least one bug in 4.1 where (I think) the rate limiters
> interact in a way that can cause this.
> https://issues.apache.org/jira/browse/CASSANDRA-18110 describes it and
> has a workaround.

Re: removenode stuck - cassandra 4.1.0

Posted by Jeff Jirsa <jj...@gmail.com>.
Those hosts are likely sending streams.

If you do `nodetool netstats` on the replicas of the node you're removing,
you should see byte counters and file counters - they should all be
incrementing. If one of them isn't incrementing, that one is probably stuck.

There's at least one bug in 4.1 where (I think) the rate limiters interact
in a way that can cause this.
https://issues.apache.org/jira/browse/CASSANDRA-18110 describes it and has
a workaround.
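
The counter check described above can be roughed out in a few lines of Python. This is only a sketch: the helper names (`parse_progress`, `stalled_peers`) are made up for illustration, and the regular expressions assume that `nodetool netstats` prints an indented `/<peer address>` line followed by a progress line containing "Already sent N files (...), M bytes total". Check that against the output of your own version and adjust the patterns if it differs.

```python
import re

# ASSUMPTION: a peer is introduced by an indented line holding only "/<address>".
PEER_RE = re.compile(r"^\s+/(\S+)\s*$")
# ASSUMPTION: progress lines contain
#   "Already sent 3 files (25.00%), 789 bytes total (22.83%)"
# (or "Already received ..."). Adjust if your netstats output differs.
PROGRESS_RE = re.compile(
    r"Already (?:sent|received) (\d+) files \([^)]*\), (\d+) bytes total"
)


def parse_progress(netstats_text):
    """Map each peer address to its summed (files, bytes) progress counters."""
    progress = {}
    peer = None
    for line in netstats_text.splitlines():
        peer_match = PEER_RE.match(line)
        if peer_match:
            peer = peer_match.group(1)
            continue
        progress_match = PROGRESS_RE.search(line)
        if progress_match and peer is not None:
            files = int(progress_match.group(1))
            nbytes = int(progress_match.group(2))
            old_files, old_bytes = progress.get(peer, (0, 0))
            progress[peer] = (old_files + files, old_bytes + nbytes)
    return progress


def stalled_peers(before_text, after_text):
    """Return peers whose counters are identical in both snapshots.

    Peers that no longer appear in the second snapshot are treated as
    finished rather than stalled.
    """
    before = parse_progress(before_text)
    after = parse_progress(after_text)
    return sorted(p for p in before if after.get(p) == before[p])
```

To use it, capture the output of `nodetool netstats` on a replica twice, a minute or so apart, and pass both captures to `stalled_peers`. Any peer it returns made no progress in between and is worth a closer look.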


