Posted to user@cassandra.apache.org by Jean Tremblay <je...@zen-innovations.com> on 2015/06/25 17:07:22 UTC

Restore Snapshots

Hi,

I am testing snapshot restore procedures in case of a major catastrophe on our cluster. I’m using Cassandra 2.1.7 with RF:3.

The scenario that I am trying to solve is how to quickly get one node back to work after its disk has failed and lost all its data, assuming that the only thing I have is its snapshots.

The procedure that I’m following is the one explained here: http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html

I can take a snapshot; that is straightforward.
My problem is with restoring the snapshot.

If I restart Cassandra with an empty data directory, the node will bootstrap.
Bootstrap is very nice, since it recreates the schema and reloads the data from the node's neighbours.
But this generates quite heavy traffic and is quite a slow process.

My questions are:

- how can I restore the data directory structure in order to copy my snapshots to the right position?
- is it possible to recreate the schema on one node?
- how can I prevent the node from streaming from the other nodes?
- must I also have the snapshot of the system tables in order to restore a node from only the snapshot of my tables?

Thanks for your comments.

Jean





Re: Restore Snapshots

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi Jean,

Glad to hear it worked this way.

Other people provided (and continue to provide) similar help to me; I am
just trying to give back to the community as much as I received from it.

See you around.

Alain


Re: Restore Snapshots

Posted by Jean Tremblay <je...@zen-innovations.com>.
Good morning,
Alain, thank you so much. This is exactly what I needed.

In my test I had a node whose data directory had, for whatever reason, become corrupted. I keep my snapshots in a separate folder.

Here are the steps I took to recover my sick node:

0) Cassandra is stopped on my sick node.
1) I wiped out my data directory. My snapshots were kept outside this directory.
2) I modified my cassandra.yaml and added auto_bootstrap: false. This is to make sure that my node does not sync with the others.
3) I restarted Cassandra. This step created a basic structure for my new data directory.
4) I ran the command nodetool resetlocalschema. This recreated all the folders for my column families.
5) I stopped Cassandra on my node.
6) I copied my snapshots to the right location. I actually hard linked them, which is very fast.
7) I restarted Cassandra.

That's it.

Thank you SO MUCH ALAIN for your support. You really helped me a lot.
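
Step 6 above (hard linking the snapshot files into place) could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name link_snapshot and the paths are hypothetical, the thread does not show the actual commands used, and hard links only work when the snapshot folder and the data directory are on the same filesystem.

```python
import os
from pathlib import Path


def link_snapshot(snapshot_dir, table_dir):
    """Hard-link every file in snapshot_dir into table_dir.

    Returns the names of the files that were linked. Files that already
    exist in table_dir are left untouched.
    """
    snapshot_dir, table_dir = Path(snapshot_dir), Path(table_dir)
    table_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for src in sorted(snapshot_dir.iterdir()):
        if not src.is_file():
            continue  # skip any sub-directories
        dst = table_dir / src.name
        if dst.exists():
            continue  # never overwrite a file that is already in place
        os.link(src, dst)  # fails if the two paths are on different filesystems
        linked.append(src.name)
    return linked
```

As in steps 5 to 7, this would be run while Cassandra is stopped on the node, once per table, before restarting it.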

Re: Restore Snapshots

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi Jean,

Answers inline, to be sure to be exhaustive:

- how can I restore the data directory structure in order to copy my
snapshots to the right position?
--> By writing a script to do it and testing it, I would say. Basically,
under each table directory you have a "snapshots/snapshot_name" directory
(snapshot_name is a timestamp if not specified, off the top of my head)
and then your SSTables.

- is it possible to recreate the schema on one node?
--> The easiest way that comes to my mind is to set "auto_bootstrap: false"
on a node not already in the ring. If you have trouble with the schema of a
node in the ring, run "nodetool resetlocalschema".

- how can I prevent the node from streaming from the other nodes?
--> See above (auto_bootstrap: false). BTW, the option might not be present
in cassandra.yaml at all; just add it.

- must I also have the snapshot of the system tables in order to restore a
node from only the snapshot of my tables?
--> Just your user tables. Yet remember that a snapshot is per node, and as
such you will only have the part of the data this node used to hold. This
means that if the new node has different tokens, there will be unused data
and missing data for sure.
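
To illustrate the first answer above, here is a minimal sketch of a script that locates the snapshot directories. It assumes a layout of <data_dir>/<keyspace>/<table directory>/snapshots/<snapshot_name>, which matches the description above; the function name find_snapshots and the example names are hypothetical, so check the layout against your own data directory first.

```python
from pathlib import Path


def find_snapshots(data_dir, snapshot_name):
    """Map each table directory under data_dir to its snapshot directory.

    Only tables that actually contain a snapshot called snapshot_name
    appear in the result.
    """
    found = {}
    # Assumed layout: <data_dir>/<keyspace>/<table dir>/snapshots/<snapshot_name>
    for snap in Path(data_dir).glob(f"*/*/snapshots/{snapshot_name}"):
        if snap.is_dir():
            table_dir = snap.parent.parent  # strip "snapshots/<snapshot_name>"
            found[table_dir] = snap
    return found
```

Each (table directory, snapshot directory) pair in the result tells you where the SSTables from that snapshot belong when restoring.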

Basically, when a node is down I usually remove it, repair the cluster, and
bootstrap it (auto_bootstrap: true). Streams are part of Cassandra. I accept
that. Another solution would be to "replace" the node -->
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_replace_node_t.html


C*heers,

Alain
