Posted to user@cassandra.apache.org by Reynald Bourtembourg <re...@esrf.fr> on 2016/05/31 09:17:15 UTC

Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

Hi Paul,

I guess this might come from the incremental repairs...
The repair time is stored in the sstable (RepairedAt timestamp metadata).

Cheers,
Reynald

On 31/05/2016 11:03, Paul Dunkler wrote:
> Hi there,
>
> I sometimes run into very strange errors while backing up
> snapshots from a Cassandra cluster.
>
> Cassandra version:
> 2.1.11
>
> What I basically do:
> 1. nodetool snapshot
> 2. tar all snapshot folders into one file
> 3. transfer them to another server
>
> What happens is that tar sometimes gives the error message "file
> changed as we read it" while it is adding a .db file from the folder of
> the previously created snapshot.
> If I understand everything correctly, this should never happen.
> Snapshots should be totally immutable, right?
>
> Am I hitting a bug, or is there some rare case, such as a running
> repair operation, that can change snapshotted data?
> I already searched the Cassandra JIRA but couldn't find a bug
> that looks related to this behaviour.
>
> Would love to get some help on this.
>
> —
> Paul Dunkler


Re: [Marketing Mail] [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

Posted by Paul Dunkler <pa...@uplex.de>.
Hi there,

Just for your information: I found out what's actually happening.
It's not the file content that is changing. It's the number of hard links to the files, which changes because incremental backups are created while tar is running.
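
As an illustration of that observation, the sketch below (plain POSIX shell; the file names data.db and backup.db are made up for the demo and are not real SSTable names) creates a file, adds a second hard link to it, and compares the link count and checksum before and after. The content stays identical while the inode metadata changes. Creating a link also updates the inode's change time, which is presumably what tar notices, though that depends on the tar implementation.

```shell
#!/bin/sh
# Demo: adding a hard link changes inode metadata, not file content.

# Print the hard-link count of a file (second column of `ls -ld`).
link_count() {
    ls -ld "$1" | awk '{ print $2 }'
}

demo_dir=$(mktemp -d)
printf 'pretend sstable bytes\n' > "$demo_dir/data.db"

before_links=$(link_count "$demo_dir/data.db")
before_sum=$(cksum < "$demo_dir/data.db")

# Per the thread, an incremental backup adds another hard link to the same file.
ln "$demo_dir/data.db" "$demo_dir/backup.db"

after_links=$(link_count "$demo_dir/data.db")
after_sum=$(cksum < "$demo_dir/data.db")

echo "hard links: $before_links -> $after_links"
if [ "$before_sum" = "$after_sum" ]; then
    echo "content unchanged"
fi
```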

When that happens, tar warns that the file changed while it was being read and exits with code 1, which needs to be ignored in the script.
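
One way to ignore exit code 1 without also hiding real failures is sketched below. It assumes GNU tar, whose manual documents exit status 1 as "some files differ" (which for archive creation covers files that changed while being read) and 2 as a fatal error; other tar implementations may use different codes. The function names tar_status_ok and run_tar_tolerant and the paths in the usage comment are hypothetical, not taken from the actual backup script.

```shell
#!/bin/sh
# Assumption: GNU tar exit codes (0 = ok, 1 = some files changed, 2 = fatal).

# Succeed for exit codes the backup should accept, fail for everything else.
tar_status_ok() {
    case "$1" in
        0|1) return 0 ;;
        *)   return 1 ;;
    esac
}

# Run tar with the given arguments and tolerate the "file changed" status.
run_tar_tolerant() {
    status=0
    tar "$@" || status=$?
    if [ "$status" -eq 1 ]; then
        echo "tar: some files changed while being read (exit 1), continuing" >&2
    fi
    tar_status_ok "$status"
}

# Usage (hypothetical paths):
#   run_tar_tolerant -cf /backup/snapshot.tar -C /var/lib/cassandra/data .
```

Accepting status 1 seems reasonable here only because the snapshot files themselves do not change in content; for directories that are actively written to, the same status could mask a genuinely inconsistent archive.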


—
Paul Dunkler

** * * UPLEX - Nils Goroll Systemoptimierung

Scheffelstraße 32
22301 Hamburg

tel +49 40 288 057 31
mob +49 151 252 228 42
fax +49 40 429 497 53

xmpp://pauldunkler@jabber.ccc <xm...@jabber.ccc>.de

http://uplex.de/ <http://uplex.de/>

Re: [Marketing Mail] [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

Posted by Paul Dunkler <pa...@uplex.de>.
Hi Reynald,

> If I understand correctly, you are making a tar file with all the folders named "snapshots" (i.e. the folder under which all the snapshots are created. So you have one snapshots folder per table).

No, that's not the case. We are doing a nightly snapshot of the whole database (named with a date).

> If this is the case, when you are executing "nodetool repair", Cassandra will create a snapshot at the beginning of the repair, creating a new directory under each snapshots directories. If this happens while you are creating your tar file, you will get the error you saw.

Sure, I am aware of that. No, we only tar the snapshot we created a few seconds earlier.

> If you are not yet doing it, I advise you to use the -t option of the "nodetool snapshot" command to specify a specific name to your snapshot.
> Then you should copy only the directories named snapshots/<your_specific_snapshot_name> in your tar file.

We're doing exactly that.

> Can you confirm that you are creating your tar file from the snapshots directories directly (resulting in taking all the currently generated snapshots)?

As I wrote above, we take a snapshot and tar ONLY that exact snapshot.

Short (high-level) overview of our backup script:

1. Check if a repair is running; if yes, exit
2. Dump the current DB schema
3. nodetool snapshot -t $(date)
4. Wait 60 seconds
5. Create a tar of all snapshot folders with the date we just created
6. Copy it to a remote server
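
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual script: the function names snapshot_dirs and backup_snapshot, the date-based tag, the data directory, and the remote target are all hypothetical, and steps 1 and 2 are left as comments because the thread does not say how they are implemented. It relies on `nodetool snapshot -t`, `find -path`, `tar -T` (as in GNU tar) and `scp`.

```shell
#!/bin/sh
# Sketch of the backup flow; names, paths and the tag format are assumptions.

# List the directories of one named snapshot, i.e. every
# <data_dir>/<keyspace>/<table>/snapshots/<tag> directory.
snapshot_dirs() {
    find "$1" -type d -path "*/snapshots/$2"
}

# backup_snapshot DATA_DIR TAG ARCHIVE REMOTE
backup_snapshot() {
    data_dir=$1
    tag=$2
    archive=$3
    remote=$4

    # 1. Check whether a repair is running and exit if so (site-specific, omitted).
    # 2. Dump the current schema (omitted).

    # 3. Take a named snapshot so that only these directories are archived.
    nodetool snapshot -t "$tag" || return 1

    # 4. Wait 60 seconds, as in the original overview.
    sleep 60

    # 5. Archive only the directories that belong to this snapshot.
    snapshot_dirs "$data_dir" "$tag" > "$archive.list"
    status=0
    tar -cf "$archive" -T "$archive.list" || status=$?
    # With GNU tar, status 1 means a file changed while being read; accept it.
    [ "$status" -le 1 ] || return "$status"

    # 6. Copy the archive to the remote server.
    scp "$archive" "$remote"
}

# Usage (hypothetical values):
#   backup_snapshot /var/lib/cassandra/data "$(date +%Y-%m-%d)" \
#       /tmp/snapshot.tar backup-host:/backups/
```

Selecting directories by the snapshot tag keeps snapshots created by other processes out of the archive, which is the point of the `-t` advice earlier in the thread.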


—
Paul Dunkler


Re: [Marketing Mail] Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

Posted by Reynald Bourtembourg <re...@esrf.fr>.
Hi Paul,

If I understand correctly, you are making a tar file with all the
folders named "snapshots" (i.e. the folder under which all the snapshots
are created, so you have one snapshots folder per table).
If this is the case, when you are executing "nodetool repair", Cassandra
will create a snapshot at the beginning of the repair, creating a new
directory under each snapshots directory. If this happens while you
are creating your tar file, you will get the error you saw.

If you are not yet doing it, I advise you to use the -t option of the 
"nodetool snapshot" command to specify a specific name to your snapshot.
Then you should copy only the directories named 
snapshots/<your_specific_snapshot_name> in your tar file.

Can you confirm that you are creating your tar file from the snapshots 
directories directly (resulting in taking all the currently generated 
snapshots)?

Kind regards

Reynald

On 01/06/2016 13:27, Paul Dunkler wrote:
>> I guess this might come from the incremental repairs...
>> The repair time is stored in the sstable (RepairedAt timestamp metadata).
>
> By the way: We are not using incremental repairs at all. So can't be 
> the case here.
>
> It really seems like there is somewhat that can still change data in 
> snapshot directories. I feel like it's something to do with flushing / 
> compaction. But no clue, what... :(


Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

Posted by Paul Dunkler <pa...@uplex.de>.
> I guess this might come from the incremental repairs...
> The repair time is stored in the sstable (RepairedAt timestamp metadata).

By the way: we are not using incremental repairs at all, so that can't be the cause here.

It really seems like something can still change data in the snapshot directories. I feel like it has something to do with flushing / compaction, but I have no clue what... :(


—
Paul Dunkler

Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

Posted by Paul Dunkler <pa...@uplex.de>.
Hi Mike,

> Hi Paul, what is the value of the snapshot_before_compaction property in your cassandra.yaml?

snapshot_before_compaction: false

> Say if another snapshot is being taken (because compaction kicked in and snapshot_before_compaction property is set to TRUE) and at this moment you're tarring the snapshot folders......

Okay, totally understand. But this feature is currently disabled on our side.
> Maybe can take a look at the records in system.compaction:
> 
> select * from system.compaction_history;
> 
I did so and found a compaction starting at exactly 01:30 (roughly the same time the snapshot starts).
The table name matches the .db file that tar is complaining about.

What happened here is that we save incremental backups every 10 minutes, and as part of that process we run a manual nodetool flush.
This nodetool flush seems to trigger compactions. I just checked the Cassandra log, and compactions take place every 10 minutes.

We are using the SizeTieredCompactionStrategy for all tables.
Is it true that, with that strategy, compaction is triggered exactly when a flush (automatic or manual) happens?

It would probably be better not to do manual flushes when saving the incremental backups (because then compactions won't coincide with the snapshot), right?

—
Paul Dunkler

Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

Posted by Mike Yeap <wk...@gmail.com>.
Hi Paul, what is the value of the snapshot_before_compaction property in
your cassandra.yaml?

Say another snapshot is being taken (because compaction kicked in and the
snapshot_before_compaction property is set to true) at the moment you're
tarring the snapshot folders...

Maybe you can take a look at the records in system.compaction_history:

select * from system.compaction_history;


Regards,
Mike Yeap



On Tue, May 31, 2016 at 5:21 PM, Paul Dunkler <pa...@uplex.de> wrote:

> And - as an addition:
>
> Shouldn't it be documented that even snapshot files can change?

Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

Posted by Paul Dunkler <pa...@uplex.de>.
And - as an addition:

Shouldn't it be documented that even snapshot files can change?


—
Paul Dunkler


Re: [Marketing Mail] Cassandra 2.1: Snapshot data changing while transferring

Posted by Paul Dunkler <pa...@uplex.de>.
Hi there,

> I guess this might come from the incremental repairs...
> The repair time is stored in the sstable (RepairedAt timestamp metadata).

OK, that sounds interesting.
Could that happen to incremental backup files as well? I had another case where incremental backup files were deleted entirely, automagically.

And what is the suggested way to solve that problem? Should I retry tar-ing the snapshot until nothing changes in between?
Or is there a way to "pause" the incremental repairs?



—
Paul Dunkler
