Posted to user@cassandra.apache.org by Petr Malik <pm...@tesora.com> on 2016/11/29 17:38:19 UTC

Single cluster node restore

Hi.

I have a question about Cassandra backup-restore strategies.


As far as I understand, Cassandra has been designed to survive hardware failures by relying on data replication.


It seems like people still want backup/restore for the case when somebody accidentally deletes data or the data gets otherwise corrupted.

In that case restoring all keyspace/table snapshots on all nodes should bring it back.


I am asking because I often read directions on restoring a single node in a cluster. I am just wondering under what circumstances this could be done safely.


Please correct me if I am wrong, but restoring just a single node does not really roll back the data, as the newer (corrupt) data will be served by other replicas and eventually propagated to the restored node. Right?

In fact, by doing so one may end up reintroducing deleted data...
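
To make those two concerns concrete, here is a toy model of timestamp-based (last-write-wins) reconciliation between replicas. It is only a sketch under stated assumptions: `Cell` and `reconcile` are hypothetical names, not Cassandra APIs, and the model ignores tie-breaking on equal timestamps, TTLs and consistency levels. It also assumes that a tombstone can be purged by compaction once gc_grace_seconds has passed, which is what makes the second case possible.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """Hypothetical, simplified view of one cell held by one replica."""
    value: Optional[str]  # None marks a tombstone (a recorded delete)
    timestamp: int        # write time; the higher timestamp wins


def reconcile(a: Optional[Cell], b: Optional[Cell]) -> Optional[Cell]:
    """Last-write-wins merge of two replicas' views of the same cell.

    A replica that holds nothing for the cell is represented by None.
    Simplification: ties go to the first argument.
    """
    if a is None:
        return b
    if b is None:
        return a
    return a if a.timestamp >= b.timestamp else b


# Case 1: bad data was written at t=20; one node is restored from a
# snapshot taken at t=10. The other replica still holds the newer write,
# so the corrupt value wins as soon as the replicas are reconciled.
after_repair = reconcile(Cell("good", 10), Cell("corrupt", 20))

# Case 2: the row was deleted at t=20 and the tombstone is still present
# on the other replica. The delete survives the restore.
still_deleted = reconcile(Cell("old", 10), Cell(None, 20))

# Case 3: same delete, but the tombstone has already been purged
# (assumed: gc_grace_seconds elapsed and compaction ran). Nothing on the
# other replica shadows the restored value, so the deleted data returns.
resurrected = reconcile(Cell("old", 10), None)

print(after_repair, still_deleted, resurrected)
```

In this model a single-node restore never rolls anything back (case 1), and it can bring deleted data back only when the snapshot is older than the tombstones that have since been purged (case 3).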


Also, since Cassandra distributes the data throughout the cluster, it is not clear on which node any particular (corrupt) data resides and hence which node to restore.


I guess this is a long way of asking whether there is any advantage in trying to restore just a single node in a Cassandra cluster, as opposed to, say, replacing the dead node and letting Cassandra handle the replication.


Thanks.

Re: Single cluster node restore

Posted by Ben Slater <be...@instaclustr.com>.
You can have situations where rebuilding a node via streaming is painful
and slow (generally because there is something bad about the data model
like misused secondary indexes or massive partitions). Also, overstreaming
can mean you need more disk space to bootstrap a node than you’ll require
once it’s fully streamed and compacted - this can be hard to work around in
some environments. In this case you might want to restore a single node
from backup.

However, in general you’re right - it’s not something that tends to be done
very often.

Cheers
Ben
