Posted to commits@cassandra.apache.org by "Paulo Motta (Jira)" <ji...@apache.org> on 2022/12/14 17:20:00 UTC

[jira] [Commented] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

    [ https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17647625#comment-17647625 ] 

Paulo Motta commented on CASSANDRA-14013:
-----------------------------------------

I think the proposed fix looks reasonable, with the caveat that it will remove support for pre-2.1 backups/snapshots, since the "-" separator is not present in pre-2.1 table directories. I don't think this is a big issue since we no longer support pre-2.1 format and there is an easy workaround.

I agree with the sentiment expressed in this ticket that the current approach to extract keyspace/index/table from sstable directories is fragile and needs improvement.

In order to make the extraction of keyspace/table/index more robust, I propose encoding the expected sstable directory structure into a regex matching the following layout:
{noformat}
{keyspace}/{tableName}-{tableId}[/backups|/snapshots/{tag}][/.{indexName}]/{component}.db
{noformat}
This simplifies the extraction of information from directories to:
{noformat}
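    // find() rather than matches(): the path carries a data-directory prefix before the keyspace
    Matcher matcher = SSTABLE_DIR_PATTERN.matcher(fullSstablePath);
    if (matcher.find())
    {
        keyspaceName = matcher.group("keyspace");
        tableName = matcher.group("tableName");
        String indexName = matcher.group("indexName");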
        if (indexName != null)
        {
            tableName = String.format("%s.%s", tableName, indexName);
        }
    }
    else if (validateDirs)
    {
        throw invalidSSTable(name, "cannot extract keyspace and table name; make sure the sstable is in the proper sub-directories");
    }
{noformat}
The deterministic regex allows us to fail fast when an illegal directory structure is found. Furthermore, it can simplify supporting more complex or multiple directory structures if needed.

For example, in order to let tools read sstables in the pre-2.1 directory format, we can update the code above to:
{noformat}
    Matcher matcher = SSTABLE_DIR_PATTERN.matcher(fullSstablePath);
    boolean matched = matcher.find();
    if (!matched)
    {
        logger.info("Regex failed, falling back to legacy sstable format");
        matcher = LEGACY_SSTABLE_DIR_PATTERN.matcher(fullSstablePath);
        matched = matcher.find();
    }
    if (matched)
    {
        keyspaceName = matcher.group("keyspace");
        tableName = matcher.group("tableName");
        String indexName = matcher.group("indexName");
        if (indexName != null)
        {
            tableName = String.format("%s.%s", tableName, indexName);
        }
    }
    else if (validateDirs)
    {
        throw invalidSSTable(name, "cannot extract keyspace and table name; make sure the sstable is in the proper sub-directories");
    }
{noformat}
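To make the fallback idea concrete, here is a minimal, self-contained Java sketch. Please treat it as an illustration under assumptions rather than the actual patch: {{LEGACY_SSTABLE_DIR_PATTERN}} is not defined anywhere in this comment, so the one below is hypothetical and simply assumes a pre-2.1 layout of {keyspace}/{tableName}/ with no "-{tableId}" suffix and no index sub-directory. The class and method names are made up for the example, and the primary pattern is the tentative regex given further down in this comment with an end anchor. The result of find() is stored once per pattern, because calling find() a second time on the same Matcher searches for a further match instead of repeating the first check.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LegacyFallbackSketch
{
    // Tentative regex from this comment ("\/" written as "/", which Java treats the same).
    static final Pattern SSTABLE_DIR_PATTERN = Pattern.compile(
        "(?<keyspace>\\w+)/(?<tableName>\\w+)-(?<tableId>[0-9a-f]{32})/"
        + "(backups/|(snapshots/(?<tag>[\\w-]+)/))?"
        + "(\\.(?<indexName>[\\w-]+)/)?"
        + "(?<component>[\\w-]+)\\.db$");

    // HYPOTHETICAL legacy pattern: assumes {keyspace}/{tableName}/ without a table id.
    // The leading (?:^|/) forces the keyspace to start at a directory boundary.
    static final Pattern LEGACY_SSTABLE_DIR_PATTERN = Pattern.compile(
        "(?:^|/)(?<keyspace>\\w+)/(?<tableName>\\w+)/"
        + "(backups/|(snapshots/(?<tag>[\\w-]+)/))?"
        + "(?<component>[\\w-]+)\\.db$");

    /** Returns "keyspace/table", or null when neither pattern matches. */
    static String keyspaceAndTable(String fullSstablePath)
    {
        Matcher matcher = SSTABLE_DIR_PATTERN.matcher(fullSstablePath);
        boolean matched = matcher.find();
        if (!matched)
        {
            // Fall back to the (assumed) legacy layout.
            matcher = LEGACY_SSTABLE_DIR_PATTERN.matcher(fullSstablePath);
            matched = matcher.find();
        }
        if (!matched)
            return null;
        return matcher.group("keyspace") + "/" + matcher.group("tableName");
    }

    public static void main(String[] args)
    {
        System.out.println(keyspaceAndTable(
            "/data/ks1/tab1-34234234234234234234234234234234/na-1-big-Index.db"));
        System.out.println(keyspaceAndTable("/data/ks1/tab1/ks1-tab1-jb-1-Data.db"));
    }
}
```

Index handling is left out here on purpose: the hypothetical legacy pattern has no "indexName" group, and asking a Matcher for a group name that the pattern does not define throws IllegalArgumentException.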
I came up with the following tentative regex for sstable directory structure:
{noformat}
(?<keyspace>\w+)\/(?<tableName>\w+)-(?<tableId>[0-9a-f]{32})\/(backups\/|(snapshots\/(?<tag>[\w-]+)\/))?(\.(?<indexName>[\w-]+)\/)?(?<component>[\w-]+)\.db
{noformat}
Breaking it down into components:
{noformat}
* keyspace: any word "\w+"
* tableName: any word "\w+"
* tableId: 32 hex characters "[0-9a-f]{32}"
* optional "/backups" or "/snapshots/{tag}" pre-directories: "(backups\/|(snapshots\/(?<tag>[\w-]+)\/))?"
* optional ".{indexName}" pre-directory: any word with dashes "(\.(?<indexName>[\w-]+)\/)?"
* sstable component {component}.db: any word with dashes "(?<component>[\w-]+)\.db$"
{noformat}
The above probably needs validation to check whether anything is wrong or missing. We can further refine the component regex if needed (e.g. to extract the individual component name).

You can try the regex above on [https://regexr.com/] with the examples below:
{noformat}
/path/to/cassandra/data/dir2/dir5/dir6/ks1/tab1-34234234234234234234234234234234/na-1-big-Index.db
/tmp/some/path/tests/keyspace/table-34234234234234234234234234234234/snapshots/snapshots/.index/nb-3g1m_0nuf_3vj5m2k1125165rxa7-big-Index.db
{noformat}
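The same check can be done locally. Below is a small Java sketch that compiles the tentative regex and runs it against the two example paths. The class and method names are hypothetical, the "\/" escapes are written as plain "/" (Java does not need them), and find() is used because the examples carry a data-directory prefix in front of the keyspace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SSTableDirRegexSketch
{
    // The tentative regex from this comment, split per component for readability.
    static final Pattern SSTABLE_DIR_PATTERN = Pattern.compile(
        "(?<keyspace>\\w+)/(?<tableName>\\w+)-(?<tableId>[0-9a-f]{32})/"
        + "(backups/|(snapshots/(?<tag>[\\w-]+)/))?"
        + "(\\.(?<indexName>[\\w-]+)/)?"
        + "(?<component>[\\w-]+)\\.db");

    /** Returns "keyspace/table" or "keyspace/table.index", or null when the path does not match. */
    static String describe(String fullSstablePath)
    {
        Matcher matcher = SSTABLE_DIR_PATTERN.matcher(fullSstablePath);
        if (!matcher.find())
            return null;
        String tableName = matcher.group("tableName");
        String indexName = matcher.group("indexName"); // null when the optional group did not take part
        if (indexName != null)
            tableName = String.format("%s.%s", tableName, indexName);
        return matcher.group("keyspace") + "/" + tableName;
    }

    public static void main(String[] args)
    {
        // prints "ks1/tab1"
        System.out.println(describe(
            "/path/to/cassandra/data/dir2/dir5/dir6/ks1/tab1-34234234234234234234234234234234/na-1-big-Index.db"));
        // prints "keyspace/table.index" (the snapshot tag here is literally "snapshots")
        System.out.println(describe(
            "/tmp/some/path/tests/keyspace/table-34234234234234234234234234234234/snapshots/snapshots/.index/nb-3g1m_0nuf_3vj5m2k1125165rxa7-big-Index.db"));
    }
}
```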
I implemented the above approach in this commit: [https://github.com/pauloricardomg/cassandra/commit/402e7e3b521d92ed12592daf10a9cdbf47846a60]

When fixing the tests, I noticed that some examples in the tests were not conforming to the expected directory structure, for example:
{noformat}
/path/to/cassandra/data/dir2/dir5/dir6/ks1/tab1-3424234234324/backups/nb-3g1m_0nuf_3vj5m2k1125165rxa7-big-Index.db
{noformat}
As can be seen, the table id did not have the 32 characters expected for the table UUID, so I fixed these examples to use the UUID "34234234234234234234234234234234" instead.

I noticed that {{DescriptorTest.validateNames()}} expects to read Descriptors from sstables that do not conform to the expected directory structure, in which case it is not possible to extract keyspace/table information. I believe this is used by external tools. In order to allow these "unsafe" usages by offline tools, I have added a [validateDirs|https://github.com/apache/cassandra/commit/402e7e3b521d92ed12592daf10a9cdbf47846a60#diff-69de3dfecd03ec3ea98d88bef04bf3a2bca2a02b488fd20e677c04ffad322bbdR282] parameter to {{Descriptor.fromFilenameWithComponent}} that throws an error when the directory does not match the expected structure. Tools that do not require fail-fast behavior can set validateDirs=false, but online sstable reading will throw an error when an illegal directory structure is found.
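
A simplified sketch of the two modes, not the code from the commit: the class and method names are hypothetical, and IllegalArgumentException stands in for whatever {{invalidSSTable}} throws in the real code. With validateDirs=true a non-conforming path fails fast; with validateDirs=false the caller simply gets no keyspace/table information back.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ValidateDirsSketch
{
    // The tentative regex from this comment ("\/" written as "/").
    static final Pattern SSTABLE_DIR_PATTERN = Pattern.compile(
        "(?<keyspace>\\w+)/(?<tableName>\\w+)-(?<tableId>[0-9a-f]{32})/"
        + "(backups/|(snapshots/(?<tag>[\\w-]+)/))?"
        + "(\\.(?<indexName>[\\w-]+)/)?"
        + "(?<component>[\\w-]+)\\.db");

    /**
     * Returns "keyspace/table" for a conforming path. For a non-conforming path it
     * throws when validateDirs is true and returns null when validateDirs is false.
     */
    static String keyspaceAndTable(String fullSstablePath, boolean validateDirs)
    {
        Matcher matcher = SSTABLE_DIR_PATTERN.matcher(fullSstablePath);
        if (matcher.find())
            return matcher.group("keyspace") + "/" + matcher.group("tableName");
        if (validateDirs)
            throw new IllegalArgumentException(
                "cannot extract keyspace and table name; make sure the sstable is in the proper sub-directories");
        return null; // lenient mode for offline tools
    }

    public static void main(String[] args)
    {
        String nonConforming = "/some/tool/dir/nb-1-big-Data.db";
        System.out.println(keyspaceAndTable(nonConforming, false));
    }
}
```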

I agree with [~e.dimitrova]'s suggestion that we should include the keyspace/table/index name in an sstable component, so that we no longer need to parse the directory structure to find this information.

I propose applying the simpler fix on earlier versions (4.x), and the improved regex-based fix on trunk. What do you think?

> Data loss in snapshots keyspace after service restart
> -----------------------------------------------------
>
>                 Key: CASSANDRA-14013
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14013
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Core, Local/Snapshots
>            Reporter: Gregor Uhlenheuer
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 4.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I am posting this bug in hope to discover the stupid mistake I am doing because I can't imagine a reasonable answer for the behavior I see right now :-)
> In short words, I do observe data loss in a keyspace called *snapshots* after restarting the Cassandra service. Say I do have 1000 records in a table called *snapshots.test_idx* then after restart the table has less entries or is even empty.
> My kind of "mysterious" observation is that it happens only in a keyspace called *snapshots*...
> h3. Steps to reproduce
> These steps to reproduce show the described behavior in "most" attempts (not every single time though).
> {code}
> # create keyspace
> CREATE KEYSPACE snapshots WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
> # create table
> CREATE TABLE snapshots.test_idx (key text, seqno bigint, primary key(key));
> # insert some test data
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1', 1);
> ...
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1000', 1000);
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 1000
> # restart service
> kill <cassandra-pid>
> cassandra -f
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 0
> {code}
> I hope someone can point me to the obvious mistake I am doing :-)
> This happened to me using both Cassandra 3.9 and 3.11.0


