Posted to user@cassandra.apache.org by manish khandelwal <ma...@gmail.com> on 2019/03/28 11:24:06 UTC

Best practices while designing backup storage system for big Cassandra cluster

Hi



I would like to know whether there are any guidelines for selecting a
storage device (disk type) for Cassandra backups.



From my observations so far, NearLine (NL) disks on a SAN slow down
significantly when backup files are copied (while taking a full backup)
from all nodes simultaneously. Would using SSD disks on the SAN help in
this regard?

Apart from using SSD disks, what alternative approaches are there to
speed up the backup process?

What are the best practices when designing a backup storage system for
a big Cassandra cluster?


Regards

Manish

Re: Best practices while designing backup storage system for big Cassandra cluster

Posted by Carl Mueller <ca...@smartthings.com.INVALID>.
Another approach to avoiding the full-backup I/O hit would be to rotate
which node or small subset of nodes does a full backup on any given day,
so that over the course of a month or two every node gets a full backup.
Of course this assumes you have incremental capability for the other
backup days/dates.
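
For illustration only (the node names here are made up and this is not
code from any existing tool), the rotation can be as simple as hashing
each node into a slot of the rotation period, so a different small
subset does its FULL each day:

# Hypothetical sketch: spread FULL backups over a rotation period so only a
# small subset of nodes does a FULL on any given day; the rest run incrementals.
import datetime
import hashlib

ROTATION_DAYS = 30  # assumption: roughly one FULL per node per month

def backup_kind(node_name, today=None):
    """Return 'FULL' if this node's rotation slot falls on today, else 'INCREMENTAL'."""
    today = today or datetime.date.today()
    # Stable per-node slot derived from a hash of the node's name.
    slot = int(hashlib.sha256(node_name.encode()).hexdigest(), 16) % ROTATION_DAYS
    return "FULL" if today.toordinal() % ROTATION_DAYS == slot else "INCREMENTAL"

for node in ["cass-node-01", "cass-node-02", "cass-node-03"]:
    print(node, backup_kind(node))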


Re: Best practices while designing backup storage system for big Cassandra cluster

Posted by Carl Mueller <ca...@smartthings.com.INVALID>.
At my current job I had to roll my own backup system. Hopefully I can get
it OSS'd at some point. Here is a (now slightly outdated) presentation:

https://docs.google.com/presentation/d/13Aps-IlQPYAa_V34ocR0E8Q4C8W2YZ6Jn5_BYGrjqFk/edit#slide=id.p

If you are struggling with the disk I/O cost of the sstable
backups/copies, note that since sstables are append-only, if you adopt
an incremental approach to your backups you only need to track a list
of the current files and upload the ones that are new compared to a
previous successful backup. Your "manifest" of files for a node will
need to reference the previous backup, and you'll want to "reset" with
a full backup each month.
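
As a rough illustration only (this is not our actual code; the paths
and manifest layout are made up), the incremental logic looks roughly
like this in Python:

# Hypothetical sketch of incrementals on top of append-only sstables:
# copy only files missing from the parent backup, record the rest by reference.
import json
import os
import shutil
import time

DATA_DIR = "/var/lib/cassandra/data"   # assumption: default data directory
BACKUP_ROOT = "/mnt/backups/node1"     # assumption: per-node backup destination

def current_sstables(data_dir=DATA_DIR):
    """Relative paths of all sstable component files currently on this node."""
    files = []
    for root, _dirs, names in os.walk(data_dir):
        for name in names:
            if name.endswith(".db"):
                files.append(os.path.relpath(os.path.join(root, name), data_dir))
    return files

def take_incremental(parent_manifest_path):
    """One incremental backup: copy only files the parent backup doesn't list."""
    with open(parent_manifest_path) as f:
        parent = json.load(f)
    already_backed_up = set(parent["files"])

    manifest = {"parent": parent_manifest_path, "taken_at": time.time(), "files": []}
    for rel in current_sstables():
        manifest["files"].append(rel)
        if rel not in already_backed_up:
            dst = os.path.join(BACKUP_ROOT, "data", rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(os.path.join(DATA_DIR, rel), dst)  # only new files move

    out = os.path.join(BACKUP_ROOT, "manifest-%d.json" % int(time.time()))
    with open(out, "w") as f:
        json.dump(manifest, f, indent=2)
    return out

A monthly FULL is then just the same run with an empty parent manifest,
so every file gets copied again.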

I stole that idea from https://github.com/tbarbugli/cassandra_snapshotter.
I would have used that tool, but we had more complex node access modes
(Kubernetes, SSH through jump hosts, etc.) and needed lots of other
features that weren't supported.

In AWS I use AWS profiles to throttle the transfers, and I parallelize
across nodes. The basic unit of a successful backup is a single node,
but you'll obviously want to track success across all nodes.
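
For illustration only (this is not our tool, and the profile, bucket
and paths are made up), a boto3-based sketch of keeping each node's
upload modest while running several nodes in parallel:

# Illustration: a dedicated AWS profile plus a conservative TransferConfig
# per upload, with a small thread pool to parallelize across nodes.
import boto3
from boto3.s3.transfer import TransferConfig
from concurrent.futures import ThreadPoolExecutor

session = boto3.session.Session(profile_name="cassandra-backup")  # assumed profile
s3 = session.client("s3")

# Keep each upload modest so a single node can't saturate its uplink.
transfer_cfg = TransferConfig(multipart_chunksize=16 * 1024 * 1024, max_concurrency=2)

def upload_node_file(node, local_path, bucket="my-backup-bucket"):
    key = "backups/%s/%s" % (node, local_path.lstrip("/"))
    s3.upload_file(local_path, bucket, key, Config=transfer_cfg)
    return node, key

# Parallelize across nodes, not within a node.
jobs = [("node1", "/tmp/backup/node1/md-1-big-Data.db"),
        ("node2", "/tmp/backup/node2/md-1-big-Data.db")]
with ThreadPoolExecutor(max_workers=4) as pool:
    for node, key in pool.map(lambda args: upload_node_file(*args), jobs):
        print("uploaded", node, key)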

Note that in rack-based topologies you really only need one whole
successful rack if your RF is >= the number of racks, and you only need
one DC.

Beware of doing simultaneous flushes/snapshots across the whole cluster
at once; that can be the equivalent of a DDoS. You might want to do a
"jittered", randomized pre-flush of the cluster before doing the
snapshotting.
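
A rough sketch of such a jittered pre-flush (the host list is made up;
it assumes SSH access and nodetool on each node):

# Hypothetical sketch: flush each node with a random delay so the flushes
# don't all hit the cluster at the same instant, then snapshot afterwards.
import random
import subprocess
import time

NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # assumption: node addresses
MAX_JITTER_SECONDS = 300                      # spread the flushes over ~5 minutes

def run(host, *cmd):
    subprocess.run(["ssh", host] + list(cmd), check=True)

random.shuffle(NODES)
for host in NODES:
    time.sleep(random.uniform(0, MAX_JITTER_SECONDS / len(NODES)))
    run(host, "nodetool", "flush")

# Once everything is flushed, the snapshots themselves are cheap (hard links).
for host in NODES:
    run(host, "nodetool", "snapshot", "-t", "daily-backup")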

Unfortunately, the nature of a distributed system is that snapshotting all
the nodes at the precise same time is a hard problem.

I also have not used the built-in incremental backup feature of
Cassandra, which can enable more precise point-in-time backups (aside
from the unflushed data in the commitlogs).

A note on incrementals with occasional FULLs: monthly FULL backups
might take more than a day or two, especially when throttled. My
incrementals originally looked up the previous manifest using only
"most recent", but then the long-running FULL backups were excluded
from the "chain" of incremental backups. So I now implement a fuzzy
lookup for the incrementals that prioritizes any FULL in the last 5
days over any more recent incremental. That way you can more safely
purge old backups you don't need, using the monthly full backups as a
reset point.
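
Roughly, the fuzzy lookup does something like this (the manifest fields
are made up for the example):

# Hypothetical sketch: choose the parent backup for the next incremental,
# preferring a FULL finished in the last 5 days over any newer incremental.
from datetime import datetime, timedelta

def pick_parent(manifests, now=None, full_grace_days=5):
    """manifests: list of {"kind": "FULL" | "INCREMENTAL", "finished_at": datetime}."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=full_grace_days)
    recent_fulls = [m for m in manifests
                    if m["kind"] == "FULL" and m["finished_at"] >= cutoff]
    if recent_fulls:
        return max(recent_fulls, key=lambda m: m["finished_at"])
    # Otherwise fall back to the most recent backup of any kind.
    return max(manifests, key=lambda m: m["finished_at"])

manifests = [
    {"kind": "FULL", "finished_at": datetime(2019, 4, 1, 3, 0)},
    {"kind": "INCREMENTAL", "finished_at": datetime(2019, 4, 2, 1, 0)},
]
print(pick_parent(manifests, now=datetime(2019, 4, 2, 12, 0))["kind"])  # -> FULL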


Re: Best practices while designing backup storage system for big Cassandra cluster

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hello Manish,

I think any disk works, as long as it is big enough. It's also better
if it's a reliable system (some kind of redundant RAID, a NAS, or
object storage like GCS or S3...). During a backup we are not really
looking for speed, but for resiliency and for not harming the source
cluster. How fast you can write to the backup storage system will more
often be limited by what you can read from the source cluster.
The backups have to be taken from running nodes, so it's easy to
overload the disks (reads), the network (exporting backup data to its
final destination), and even the CPU (if the node itself handles the
transfer).

> What are the best practices when designing a backup storage system
> for a big Cassandra cluster?


What is nice to have (not to say mandatory) is a system of incremental
backups. You should not pull all the data from the nodes every time, or
you'll either harm the cluster regularly OR spend days transferring the
data (once the amount of data grows big enough).
I'm not speaking about Cassandra's incremental snapshots here, but
about using something like AWS snapshots, or reproducing this behaviour
programmatically by taking (copying, or linking) old SSTables from
previous backups when they already exist there. This greatly reduces
the work on the cluster and the resources needed, since soon enough a
substantial amount of the data can come from the backup storage itself.
The problem with incremental snapshots is that when restoring, you have
to restore multiple pieces, which makes it harder and involves a lot of
compaction work.
The "caching" technic mentioned above gives the best of the 2 worlds:
- You will always backup from the nodes only the sstables you don’t have
already in your backup storage system,
- You will always restore easily as each backup is a full backup.
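
A rough sketch of that idea (the names and layout are made up, this is
not taken from an existing tool): every backup is written as a complete
manifest, but SSTables already present in the backup store are never
re-read from the node:

# Hypothetical sketch of the "caching" approach: each backup is logically FULL,
# but only sstables missing from the backup store are actually copied off the node.
import json
import os
import shutil

STORE = "/mnt/backup-store"             # assumption: shared backup destination
POOL = os.path.join(STORE, "sstables")  # content pool shared by all backups

def pool_key(path):
    # sstables are immutable, so name + size is usually enough to identify one;
    # a checksum would be stricter.
    return "%s-%d" % (os.path.basename(path), os.path.getsize(path))

def backup(node_data_dir, backup_name):
    os.makedirs(POOL, exist_ok=True)
    manifest = {"name": backup_name, "files": {}}
    for root, _dirs, names in os.walk(node_data_dir):
        for name in names:
            if not name.endswith(".db"):
                continue
            src = os.path.join(root, name)
            key = pool_key(src)
            pooled = os.path.join(POOL, key)
            if not os.path.exists(pooled):      # only copy what the store lacks
                shutil.copy2(src, pooled)
            manifest["files"][os.path.relpath(src, node_data_dir)] = key
    with open(os.path.join(STORE, backup_name + ".json"), "w") as f:
        json.dump(manifest, f, indent=2)

Restoring is then just copying back every pooled file listed in a
single manifest; there is no chain of incrementals to replay or compact
away.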

It's not really a "hands-on" writing, but this should let you know about
existing ways to do backups and the tradeoffs, I wrote this a year ago:
http://thelastpickle.com/blog/2018/04/03/cassandra-backup-and-restore-aws-ebs.html
.

It's a complex topic, I hope some of this is helpful to you.

C*heers,
-----------------------
Alain Rodriguez - alain@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

