Posted to commits@cassandra.apache.org by "Stefan Miklosovic (Jira)" <ji...@apache.org> on 2022/12/15 22:39:00 UTC
[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

    [ https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17648276#comment-17648276 ] 

Stefan Miklosovic edited comment on CASSANDRA-14013 at 12/15/22 10:38 PM:
--------------------------------------------------------------------------

_We did not have a test on DescriptorTest#testKeyspaceTableParsing to pick up the scenario of a legacy "backups" table directory and keyspace, when the table id suffix is missing so I added it here._

So we _do_ still support that, right? In that case, could you add a test to SSTableLoaderTest, as it was before, verifying that loading also works fine without the uuid?

When it comes to branches, the more the better :D I have made my peace with having it in 4.0+; you get bonus points for anything older. However, losing data over this kind of thing is rather embarrassing in 2022 (almost 2023!)
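
To make the ambiguity under discussion concrete, here is a minimal sketch in Python. It is not Cassandra's Descriptor code and not the regex from the patch. It only assumes the usual on-disk layout (`<data>/<keyspace>/<table>-<id>/`, with `snapshots/<tag>/` and `backups/` subdirectories, and the legacy `<data>/<keyspace>/<table>/` form without the id suffix). The function names, the `/data` root, and the `1a2b` id are hypothetical.

```python
from pathlib import PurePosixPath

SNAPSHOTS = "snapshots"
BACKUPS = "backups"


def naive_keyspace_table(sstable_path):
    """Guess (keyspace, table) by walking up from the sstable file.

    Anything that merely *looks* like a snapshot or backup directory is
    skipped, which is exactly what makes this approach ambiguous.
    """
    directory = PurePosixPath(sstable_path).parent
    if directory.name == BACKUPS:
        # Assumed to be <table dir>/backups/
        directory = directory.parent
    elif directory.parent.name == SNAPSHOTS:
        # Assumed to be <table dir>/snapshots/<tag>/
        directory = directory.parent.parent
    table = directory.name.split("-")[0]  # drop the "-<id>" suffix if present
    return directory.parent.name, table


def anchored_keyspace_table(data_dir, sstable_path):
    """Alternative for contrast: resolve names relative to a known data root.

    This is a different technique from the regex-based parsing discussed
    in this ticket; it only works when the data directory is known.
    """
    rel = PurePosixPath(sstable_path).relative_to(data_dir)
    keyspace, table_dir = rel.parts[0], rel.parts[1]
    return keyspace, table_dir.split("-")[0]
```

With this sketch, the naive version handles `/data/ks/tbl-1a2b/snapshots/tag1/nb-1-big-Data.db` correctly, but it misreads both `/data/snapshots/test_idx-1a2b/nb-1-big-Data.db` (keyspace named "snapshots") and `/data/ks/backups/nb-1-big-Data.db` (legacy table named "backups"). The anchored version resolves both, at the cost of needing the data root, which a loader pointed at an arbitrary directory may not have.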



was (Author: smiklosovic):
_On this commit, I added the ./* prefix to the regex which made it pick up the case of a "backups" table in the legacy directory format without the table uuid. I also updated SSTableLoaderTest to use the new table directory format._

Just to be sure: if you changed that test to use the new table directory format, does that mean a user on 4.2 / 5.0 will not be able to import sstables from a table directory called "backups"? That would basically be a regression with respect to CASSANDRA-16235.

But I think I am wrong, because here you write:

_We did not have a test on DescriptorTest#testKeyspaceTableParsing to pick up the scenario of a legacy "backups" table directory and keyspace, when the table id suffix is missing so I added it here._

So we _do_ still support that, right? In that case, could you add a test to SSTableLoaderTest, as it was before, verifying that loading also works fine without the uuid?

When it comes to branches, the more the better :D I have made my peace with having it in 4.0+; you get bonus points for anything older. However, losing data over this kind of thing is rather embarrassing in 2022 (almost 2023!)


> Data loss in snapshots keyspace after service restart
> -----------------------------------------------------
>
>                 Key: CASSANDRA-14013
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14013
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core, Local/Snapshots
>            Reporter: Gregor Uhlenheuer
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 4.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I am posting this bug in the hope of discovering the stupid mistake I am making, because I can't imagine a reasonable explanation for the behavior I see right now :-)
> In short, I observe data loss in a keyspace called *snapshots* after restarting the Cassandra service. Say I have 1000 records in a table called *snapshots.test_idx*; after a restart the table has fewer entries or is even empty.
> The "mysterious" part is that it happens only in a keyspace called *snapshots*...
> h3. Steps to reproduce
> These steps reproduce the described behavior in "most" attempts (though not every single time).
> {code}
> # create keyspace
> CREATE KEYSPACE snapshots WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
> # create table
> CREATE TABLE snapshots.test_idx (key text, seqno bigint, primary key(key));
> # insert some test data
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1', 1);
> ...
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1000', 1000);
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 1000
> # restart service
> kill <cassandra-pid>
> cassandra -f
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 0
> {code}
> I hope someone can point me to the obvious mistake I am making :-)
> This happened to me using both Cassandra 3.9 and 3.11.0.
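
The reproduction steps above elide the bulk of the INSERT statements. A small Python sketch for generating them might look like the following. It assumes the pattern suggested by the first and last statements shown (`keyN` paired with seqno `N`); the function name and the default count are illustrative only.

```python
def insert_statements(count=1000):
    """Build INSERTs for snapshots.test_idx, assuming 'keyN' maps to seqno N."""
    return [
        f"INSERT INTO snapshots.test_idx (key, seqno) VALUES ('key{i}', {i});"
        for i in range(1, count + 1)
    ]


if __name__ == "__main__":
    # Print one statement per line so the output can be saved to a file.
    print("\n".join(insert_statements()))
```

Redirecting the output to a file such as `inserts.cql` and running it with `cqlsh -f inserts.cql` after creating the keyspace and table should populate the table before the restart step.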


