Posted to user@cassandra.apache.org by Janne Jalkanen <Ja...@ecyrd.com> on 2013/08/25 10:06:31 UTC
Failed decommission
This is on Cassandra 1.2.8.
Ring state before decommission
--  Address   Load      Owns   Host ID                               Token                                    Rack
UN  10.0.0.1  38.82 GB  33.3%  21a98502-dc74-4ad0-9689-0880aa110409  1                                        1a
UN  10.0.0.2  33.5 GB   33.3%  cba6b27a-4982-4f04-854d-cc73155d5f69  56713727820156407428984779325531226110   1b
UN  10.0.0.3  37.41 GB  0.0%   6ba2c7d4-713e-4c14-8df8-f861fb211b0d  56713727820156407428984779325531226111   1b
UN  10.0.0.4  35.7 GB   33.3%  bf3d4792-f3e0-4062-afe3-be292bc85ed7  113427455640312814857969558651062452222  1c
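
For readers wondering why 10.0.0.3 shows 0.0% in the Owns column: its token is exactly one above 10.0.0.2's, so it covers a single token. Here is a rough Python sketch of that arithmetic. It assumes the RandomPartitioner token space of 0 to 2**127 and that each node owns the range from the previous token up to its own; the function name `ownership_percent` is made up for illustration and is not part of nodetool.

```python
# Assumed size of the RandomPartitioner token space (0 .. 2**127).
RING_SIZE = 2 ** 127

# (address, token) pairs copied from the ring listing in the post.
RING = [
    ("10.0.0.1", 1),
    ("10.0.0.2", 56713727820156407428984779325531226110),
    ("10.0.0.3", 56713727820156407428984779325531226111),
    ("10.0.0.4", 113427455640312814857969558651062452222),
]


def ownership_percent(ring):
    """Return {address: percent of the token space owned}, to one decimal.

    Hypothetical helper; assumes at least two nodes with distinct tokens.
    """
    ordered = sorted(ring, key=lambda entry: entry[1])
    result = {}
    for i, (address, token) in enumerate(ordered):
        # Index -1 wraps around to the node with the highest token.
        previous = ordered[i - 1][1]
        span = (token - previous) % RING_SIZE
        result[address] = round(100 * span / RING_SIZE, 1)
    return result


if __name__ == "__main__":
    for address, percent in ownership_percent(RING).items():
        print(address, percent)
```

Under those assumptions the result agrees with the Owns column: three nodes at roughly a third each, and 10.0.0.3 at effectively nothing.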
Trying to decommission the node
ubuntu@10.0.0.3:~$ nodetool decommission
Exception in thread "main" java.lang.NumberFormatException: For input string: "56713727820156407428984779325531226111"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:444)
at java.lang.Long.parseLong(Long.java:483)
at org.apache.cassandra.service.StorageService.extractExpireTime(StorageService.java:1660)
at org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1515)
at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1234)
at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:949)
at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1116)
at org.apache.cassandra.service.StorageService.leaveRing(StorageService.java:2817)
at org.apache.cassandra.service.StorageService.unbootstrap(StorageService.java:2861)
at org.apache.cassandra.service.StorageService.decommission(StorageService.java:2808)
Now I'm in a state where the machine is still "up" but "leaving", and I can't seem to get it out of the ring. For example:
% nodetool removenode 6ba2c7d4-713e-4c14-8df8-f861fb211b0d
Exception in thread "main" java.lang.UnsupportedOperationException: Node /10.0.0.3 is alive and owns this ID. Use decommission command to remove it from the ring
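
As a side note on the "up" but "leaving" state: nodetool status encodes it in the two-letter code at the start of each row (first letter Up/Down, second letter Normal/Leaving/Joining/Moving). A small Python sketch for decoding such a row follows. The helper name `describe` and the sample "UL" row are assumptions for illustration; the post only shows the ring before the failed decommission, when every node was "UN".

```python
# Meaning of the two-letter code at the start of a `nodetool status` row.
STATUS = {"U": "up", "D": "down"}
STATE = {"N": "normal", "L": "leaving", "J": "joining", "M": "moving"}


def describe(line):
    """Return (address, status, state) for one status row.

    Hypothetical helper: only the first two whitespace-separated
    fields (code and address) are inspected.
    """
    code, address = line.split()[:2]
    return address, STATUS[code[0]], STATE[code[1]]


# Assumed example row: the 10.0.0.3 entry from the post with the code
# changed from UN to UL, as it would presumably appear afterwards.
SAMPLE = (
    "UL 10.0.0.3 37.41 GB 0.0% 6ba2c7d4-713e-4c14-8df8-f861fb211b0d "
    "56713727820156407428984779325531226111 1b"
)

if __name__ == "__main__":
    print(describe(SAMPLE))
```

A node reported as up is what removenode objects to in the error above, since it is alive and still owns the host ID.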
Any ideas?
/Janne
Re: Failed decommission
Posted by Janne Jalkanen <Ja...@ecyrd.com>.
Thanks; this worked for me too.
/Janne
On Aug 25, 2013, at 18:47 , Mike Heffner <mi...@librato.com> wrote:
> Janne,
>
> We ran into this too. Appears it's a bug in 1.2.8 that is fixed in the upcoming 1.2.9. I added the steps I took to finally remove the node here: https://issues.apache.org/jira/browse/CASSANDRA-5857?focusedCommentId=13748998&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13748998
>
>
> Cheers,
>
> Mike
Re: Failed decommission
Posted by Jon Haddad <jo...@jonhaddad.com>.
We ran into a similar issue as well. I believe we removed the node via cqlsh from the system keyspace, restarted the cluster, then ran a repair. I'm not sure how safe this really is though.
On Aug 25, 2013, at 8:47 AM, Mike Heffner <mi...@librato.com> wrote:
> Janne,
>
> We ran into this too. Appears it's a bug in 1.2.8 that is fixed in the upcoming 1.2.9. I added the steps I took to finally remove the node here: https://issues.apache.org/jira/browse/CASSANDRA-5857?focusedCommentId=13748998&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13748998
>
>
> Cheers,
>
> Mike
Re: Failed decommission
Posted by Mike Heffner <mi...@librato.com>.
Janne,
We ran into this too. Appears it's a bug in 1.2.8 that is fixed in the
upcoming 1.2.9. I added the steps I took to finally remove the node here:
https://issues.apache.org/jira/browse/CASSANDRA-5857?focusedCommentId=13748998&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13748998
Cheers,
Mike
--
Mike Heffner <mi...@librato.com>
Librato, Inc.
Re: Failed decommission
Posted by Nate McCall <na...@thelastpickle.com>.
This is what I was seeing code-wise as well, but Mike's answer was spot
on. Glad you got this straightened out. (And huge thanks to Mike for coming
back to post a workaround here and on the ticket.)
On Sun, Aug 25, 2013 at 11:42 AM, Janne Jalkanen <Ja...@ecyrd.com> wrote:
>
> This would be RP (cluster upgraded from 0.6->0.8->1.0->1.1 ;-). Looks to
> me like decommission assumes Murmur and 64-bit tokens.
>
> /Janne
>
> On Aug 25, 2013, at 17:25 , Nate McCall <na...@thelastpickle.com> wrote:
>
> Are you using Murmur3 or the older Random partitioner on this cluster?
Re: Failed decommission
Posted by Janne Jalkanen <Ja...@ecyrd.com>.
This would be RP (cluster upgraded from 0.6->0.8->1.0->1.1 ;-). Looks to me like decommission assumes Murmur and 64-bit tokens.
/Janne
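
To make the 64-bit point concrete: Murmur3Partitioner tokens are signed 64-bit values, while RandomPartitioner tokens can be as large as 2**127, and the trace shows Long.parseLong being handed the 38-digit token string. Below is a minimal Python sketch of the range check that Java's long imposes. The name `fits_in_java_long` is hypothetical, and the sketch only shows why the parse fails; it makes no claim about what the Cassandra code meant to parse at that point.

```python
# Range of a Java `long` (signed 64-bit).
JAVA_LONG_MIN = -(2 ** 63)
JAVA_LONG_MAX = 2 ** 63 - 1

# The string from the NumberFormatException in the original post.
FAILED_INPUT = "56713727820156407428984779325531226111"


def fits_in_java_long(text):
    """Return True if the decimal string is within the range of a Java long.

    Hypothetical helper; Python integers are unbounded, so the range
    check that Long.parseLong enforces has to be done by hand.
    """
    value = int(text)
    return JAVA_LONG_MIN <= value <= JAVA_LONG_MAX


if __name__ == "__main__":
    print(fits_in_java_long(FAILED_INPUT))
    print(fits_in_java_long("1"))
```

The token from the ring is far outside that range, whereas any Murmur3 token would pass the same check, which would explain why the problem only shows up on RandomPartitioner clusters.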
On Aug 25, 2013, at 17:25 , Nate McCall <na...@thelastpickle.com> wrote:
> Are you using Murmur3 or the older Random partitioner on this cluster?
Re: Failed decommission
Posted by Nate McCall <na...@thelastpickle.com>.
Are you using Murmur3 or the older Random partitioner on this cluster?