Posted to user@cassandra.apache.org by Jens Rantil <je...@tink.se> on 2014/08/23 17:06:05 UTC

Question about incremental backup

Hi,


I am setting up backup and restoration tooling for a Cassandra cluster and have a specific question regarding incremental backup.


Let’s say I’m running incremental backups and take a snapshot. At the exact(ish) same time as my snapshot is taken, another incremental *.db file is hard-linked into the backups directory. My question is: how do I know which snapshot my incremental file belongs to?


If it was made half a second _before_ my snapshot, it belongs to the previous snapshot. If it was made half a second after my snapshot, I guess it belongs to my latest snapshot. Or, is this not an issue since I can always include the uncertain incremental file when restoring (since timestamps are always included with every column value)?


Thanks,
Jens

———
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se


Re: Question about incremental backup

Posted by Andrey Ilinykh <ai...@gmail.com>.
Keep in mind that backing up SSTables is not enough. To have a truly incremental
backup you have to store the commit logs as well.

Thank you,
  Andrey


On Sat, Aug 23, 2014 at 11:30 AM, Robert Coli <rc...@eventbrite.com> wrote:

> On Sat, Aug 23, 2014 at 8:06 AM, Jens Rantil <je...@tink.se> wrote:
>
>>  I am setting up backup and restoration tooling for a Cassandra cluster and
>> have a specific question regarding incremental backup.
>>
>> Let’s say I’m running incremental backups and take a snapshot. At the
>> exact(ish) same time as my snapshot is taken, another incremental *.db file
>> is hard-linked into the backups directory. My question is: how do I know
>> which snapshot my incremental file belongs to?
>>
>
> Tablesnap avoids this race by snapshotting files directly from the data
> directory, and backing it up with a meta-information file that contains a
> list of all SSTables in the data directory at the time it notices a new
> one. You can probably do something similar with the incremental snapshot
> system, but you might want to consider if you need to. :D
>
> https://github.com/JeremyGrosser/tablesnap
>
> =Rob

Re: Question about incremental backup

Posted by Robert Coli <rc...@eventbrite.com>.
On Sat, Aug 23, 2014 at 8:06 AM, Jens Rantil <je...@tink.se> wrote:

>  I am setting up backup and restoration tooling for a Cassandra cluster and
> have a specific question regarding incremental backup.
>
> Let’s say I’m running incremental backups and take a snapshot. At the
> exact(ish) same time as my snapshot is taken, another incremental *.db file
> is hard-linked into the backups directory. My question is: how do I know
> which snapshot my incremental file belongs to?
>

Tablesnap avoids this race by snapshotting files directly from the data
directory, and backing it up with a meta-information file that contains a
list of all SSTables in the data directory at the time it notices a new
one. You can probably do something similar with the incremental snapshot
system, but you might want to consider if you need to. :D

https://github.com/JeremyGrosser/tablesnap
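
To make the manifest idea concrete, here is a rough Python sketch. It is not Tablesnap's actual code: the function names, the JSON layout and the polling approach are all assumptions made for illustration. Each time a new file shows up in a data directory, it writes a small JSON file listing everything in that directory at that moment, so a restore can pick exactly the files named in one manifest instead of guessing which snapshot an SSTable belongs to.

```python
import json
import os
import time


def list_data_files(data_dir):
    """Return the sorted names of regular files directly inside data_dir."""
    return sorted(
        name for name in os.listdir(data_dir)
        if os.path.isfile(os.path.join(data_dir, name))
    )


def write_manifest(data_dir, new_file, manifest_dir):
    """Record which files were in data_dir when new_file was noticed.

    The manifest layout (trigger / noticed_at / files) is hypothetical.
    """
    manifest = {
        "trigger": new_file,
        "noticed_at": time.time(),
        "files": list_data_files(data_dir),
    }
    os.makedirs(manifest_dir, exist_ok=True)
    path = os.path.join(manifest_dir, new_file + "-manifest.json")
    with open(path, "w") as fh:
        json.dump(manifest, fh, indent=2)
    return path


def poll_once(data_dir, seen, manifest_dir):
    """Write one manifest per file that appeared since the previous poll.

    manifest_dir must live outside data_dir, otherwise the manifests
    themselves would be picked up as new files on the next poll.
    Returns the updated set of seen names and the manifest paths written.
    """
    current = set(list_data_files(data_dir))
    written = [
        write_manifest(data_dir, name, manifest_dir)
        for name in sorted(current - seen)
    ]
    return current, written
```

You would call poll_once in a loop (or from a file-system notification hook), carrying the returned set forward, and ship each new file together with its manifest to wherever the backups live.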

=Rob