Posted to user@cassandra.apache.org by Marc Canaleta <mc...@gmail.com> on 2012/06/06 14:32:24 UTC

Node decommission failed

Hi,

We are testing Cassandra and tried to remove a node from the cluster using
nodetool decommission. The node transferred its data, then "died" for about
20 minutes without responding, then came back to life with a load of
50-100, stayed under heavy load for about an hour and then returned to a
normal load. It seems to have stopped receiving new data, but it is still
in the cluster.
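
For reference, a minimal sketch of the procedure (the netstats call is just
one way to watch the streaming progress, not output we captured):

# run on the node being removed
nodetool decommission

# optionally, from another shell on the same node, watch streaming progress
nodetool netstats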

The node we tried to remove is the third one:

root@dc-cassandra-03:~# nodetool ring
Note: Ownership information does not include topology, please specify a keyspace.
Address         DC          Rack        Status State   Load            Owns                Token
                                                                                           113427455640312821154458202477256070484
10.70.147.62    datacenter1 rack1       Up     Normal  7.14 GB         33.33%              0
10.208.51.64    datacenter1 rack1       Up     Normal  3.68 GB         33.33%              56713727820156410577229101238628035242
10.190.207.185  datacenter1 rack1       Up     Normal  3.54 GB         33.33%              113427455640312821154458202477256070484


It seems it is still part of the cluster. What should we do? Decommission
again?

How can we know the current state of the cluster?

Thanks!

Re: Node decommission failed

Posted by aaron morton <aa...@thelastpickle.com>.
Take a look in the logs for .185 and check for errors. 

Run nodetool ring from node .62 to see if it thinks .185 is still in the ring.

If all looks good, try to decommission again.
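
Something like this (the log path assumes the default Debian packaging;
adjust hosts and paths for your setup):

# on .185, check the system log for errors
grep -iE 'error|exception' /var/log/cassandra/system.log

# from .62, see whether it still lists .185 in the ring
nodetool -h 10.70.147.62 ring

# if the logs are clean and .185 is still listed, retry the decommission on .185
nodetool -h 10.190.207.185 decommission

# and watch the streaming progress while it runs
nodetool -h 10.190.207.185 netstats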

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
