Posted to user@cassandra.apache.org by Allan Carroll <al...@gmail.com> on 2010/10/07 23:22:55 UTC

Retrieving dead node's token from system keyspace

Hey all, 

One of my nodes went down, and I can't get its token from nodetool ring.

The wiki says:

"You can obtain the dead node's token by running nodetool ring on any live node, unless there was some kind of outage, and the others came up but not the down one -- in that case, you can retrieve the token from the live nodes' system tables."

But I can't for the life of me figure out how to get the system keyspace to give up the secret. All attempts end in:

ERROR [pool-1-thread-2] 2010-10-07 21:20:44,865 Cassandra.java (line 1280) Internal error processing get_slice
java.lang.RuntimeException: No replica strategy configured for system


Can someone point me at a good way to get the token?

Thanks
-Allan

Re: Retrieving dead node's token from system keyspace

Posted by Allan Carroll <al...@gmail.com>.
I was running a cluster of three nodes with RF=3. Then my demand dropped off quite a bit, so I tried to bring the cluster down to just one node for a while to lower my server costs while working on other things.

Dropping the first node off the cluster worked fine using nodetool decommission. On the second node, I forgot to decommission the node before terminating the server instance. For some reason, this caused the remaining node to stop working. So, now I have one broken node and a backup of the data from the second node.

I'd like to just bring up the one node and get it working again. It should have a copy of all the data since I never ran the cluster with more nodes than the RF.
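
The reasoning that the surviving node should hold everything can be illustrated with a minimal Python sketch. This is not Cassandra code: it assumes simple rack-unaware placement, where the first replica is the node owning the key's token range and the remaining replicas are the next distinct nodes clockwise on the ring. The function name `replicas_for` and the tokens and node names are hypothetical.

```python
from bisect import bisect_left


def replicas_for(key_token, ring, rf):
    """Return the nodes that hold key_token under ring-order placement.

    ring is a list of (token, node) pairs and rf is the replication factor.
    Assumption: a node owns the range (previous token, its own token], and
    replicas are placed on the next distinct nodes clockwise.
    """
    ordered = sorted(ring)
    tokens = [token for token, _ in ordered]
    # First node whose token is >= key_token, wrapping around the ring.
    start = bisect_left(tokens, key_token) % len(ordered)
    # There can never be more replicas than nodes.
    count = min(rf, len(ordered))
    return [ordered[(start + i) % len(ordered)][1] for i in range(count)]


# Hypothetical three-node ring; tokens and names are made up for illustration.
ring = [(0, "node-a"), (100, "node-b"), (200, "node-c")]

# With rf equal to the number of nodes, every key lands on every node.
for key_token in (5, 150, 250):
    print(key_token, sorted(replicas_for(key_token, ring, rf=3)))
```

Under these assumptions, whenever the replication factor is at least the node count, each node is a replica for every key, which is why a single survivor would be expected to have a full copy (subject to any writes that had not yet been replicated).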

Here's some more info on where I'm at that might help. 

All the servers were running 0.6.5.

This is the output I get from nodetool ring

	Address       Status     Load          Range                                      Ring
	10.202.65.143 Up         27.13 GB      165675654950889355108929973590945588660    |<--|

I dumped the LocationInfo table and ran nodetool removetoken on anything that looked remotely like a token. Every time, nodetool produced no output, except when I tried to remove the token given in the ring output. Then it, of course, told me I couldn't remove the token from the local node.
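
One way to narrow down which dumped values could be tokens at all is a quick plausibility check. The Python sketch below assumes the cluster uses RandomPartitioner, whose tokens are integers between 0 and 2**127 (the token in the ring output above fits that range). The helper name `looks_like_token` is hypothetical, and the check only applies to values already available as decimal strings; it says nothing about how LocationInfo encodes them on disk.

```python
MAX_TOKEN = 2 ** 127  # upper bound of the RandomPartitioner token space


def looks_like_token(value):
    """Return True if value is a decimal integer string in [0, 2**127].

    Assumption: RandomPartitioner is in use. Other partitioners use
    different token formats and would need a different check.
    """
    text = value.strip()
    if not (text.isascii() and text.isdigit()):
        return False
    return 0 <= int(text) <= MAX_TOKEN


candidates = [
    "165675654950889355108929973590945588660",  # token from the ring output
    "10.202.65.143",                            # an address, not a token
]
for candidate in candidates:
    print(candidate, looks_like_token(candidate))
```

A value that passes this check is only a candidate; it still has to belong to a node that the live node remembers for removetoken to have any effect.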

I tried rebuilding the node from scratch yesterday but got the same results. The token shown in the ring was different, but otherwise all the output was the same.

The more extreme option I considered today is creating a whole new node on a new server, exporting all the db files to JSON and then importing them into the new node. I'm not sure that'll be any different from what I've tried, but it feels like it would be as clean as I could get.

Thanks for the followups,
Allan


Re: Retrieving dead node's token from system keyspace

Posted by Jonathan Ellis <jb...@gmail.com>.
You only need to removetoken if you want to re-replicate data to other
nodes.  If each node has a full copy of the data, and the other nodes
have forgotten about the dead node anyway, there is no need.  (If they
have not forgotten about the dead node, then the token will be in the
ring information.)


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Retrieving dead node's token from system keyspace

Posted by Matthew Dennis <md...@riptano.com>.
Allan,

I'm confused about why removetoken doesn't do anything and would be interested
in finding out why, but to answer your question:

You can shut down your last node, nuke the system directory (make a
backup just in case), restart the node, load the schema (export it first if
need be) and be on your way.  You should end up with a node that is the
only one in the ring.  Again, make a backup of the system directory
(actually, you might as well just back up the entire data and commitlog
directories) before you start nuking stuff.
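
The backup-then-reset part of the steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name `backup_and_reset_system` and all paths are hypothetical, the system keyspace is assumed to live in a `system` subdirectory of the data directory, and the node must already be stopped. The sketch moves the system directory aside rather than deleting it, so it can be put back if needed.

```python
import shutil
from pathlib import Path


def backup_and_reset_system(data_dir, commitlog_dir, backup_root):
    """Back up data and commitlog, then move the system directory aside.

    Assumptions: the node is stopped, and the system keyspace is stored in
    <data_dir>/system. backup_root must not exist yet, so an earlier backup
    is never overwritten.
    """
    data_dir = Path(data_dir)
    commitlog_dir = Path(commitlog_dir)
    backup_root = Path(backup_root)

    # Refuse to reuse an existing backup location.
    backup_root.mkdir(parents=True, exist_ok=False)

    # Full copies of both directories before anything is changed.
    shutil.copytree(data_dir, backup_root / "data")
    shutil.copytree(commitlog_dir, backup_root / "commitlog")

    # Move the system keyspace out of the data directory instead of deleting it.
    moved_to = backup_root / "system-moved-aside"
    shutil.move(str(data_dir / "system"), str(moved_to))
    return moved_to
```

It would be called with the node's real data and commitlog locations and a fresh backup path on a disk with enough free space. Restarting the node and reloading the schema are left out, since those steps depend on the installation.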


-- 
Riptano
Software and Support for Apache Cassandra
http://www.riptano.com/
mdennis@riptano.com
m: 512.587.0900 f: 866.583.2068

Re: Retrieving dead node's token from system keyspace

Posted by Aaron Morton <aa...@thelastpickle.com>.
Allan, 
I'm a bit confused about what you are trying to do here. You have 2 nodes with RF = ?, you lost one node completely, and now you want to...

Just get a cluster running again, without worrying about the data,
OR
Restore the data from the dead node,
OR
Create a cluster with the data from the remaining node and a new node?

Aaron



Re: Retrieving dead node's token from system keyspace

Posted by Allan Carroll <al...@gmail.com>.
I was able to figure out how to use the sstable2json tool to get the values out of the system keyspace.

Unfortunately, the node that went down took all of its data with it, and I only have access to the system keyspace of the remaining live node. There were only two nodes, and the one left should have a whole DB copy.

Running removetoken on any of the values that appeared to be tokens in the LocationInfo cf hasn't done any good. Perhaps I'm missing which value is the token of the dead node? Or, is there a way to take down the last node and bring back up a new cluster using the sstables that I have on the remaining node?

-Allan
