Posted to commits@cassandra.apache.org by "Benedict Elliott Smith (Jira)" <ji...@apache.org> on 2019/11/12 11:05:00 UTC

[jira] [Updated] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss

     [ https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict Elliott Smith updated CASSANDRA-15368:
-----------------------------------------------
    Resolution: Invalid
        Status: Resolved  (was: Open)

> Failing to flush Memtable without terminating process results in permanent data loss
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15368
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15368
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log, Local/Memtable
>            Reporter: Benedict Elliott Smith
>            Priority: Normal
>             Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
>
> A {{Memtable}} does not contain records that cover a precise contiguous range of {{ReplayPosition}}, because there are only weak ordering constraints when rolling over to a new {{Memtable}}: the last operations for the old {{Memtable}} may obtain their {{ReplayPosition}} after the first operations for the new {{Memtable}}.
> Unfortunately, we treat the {{Memtable}} range as contiguous, and invalidate the entire range on flush. Ordinarily we only invalidate records when all prior {{Memtable}} have also successfully flushed. However, in the event of a flush failure that does not terminate the process (either because of the disk failure policy, or because it is a software error), a later flush is able to invalidate the region of the commit log that includes records that should have been flushed in the prior {{Memtable}}.
> More problematically, this can also occur on restart without any associated flush failure. On recovery we filter {{ReplayPosition}} using the commit log boundaries written to our flushed sstables, which is meant to replicate the {{Memtable}} flush behaviour above. However, we do not know that earlier flushes have completed, and they may complete successfully out of order. So any flush that completes before the process terminates, but began after another flush that _doesn’t_ complete before the process terminates, has the potential to cause permanent data loss.
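
The scenario described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Cassandra's implementation: ToyMemtable, flushed_range and positions_to_replay are hypothetical names, and a plain integer stands in for a ReplayPosition. The old memtable's last write obtains position 4 after the new memtable's first write obtained position 3, and only the new memtable's flush completes before the process stops.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemtable:
    """Hypothetical stand-in for a Memtable: keeps only the log positions of its writes."""
    name: str
    positions: set = field(default_factory=set)

    def flushed_range(self):
        # Assumption taken from the description: the range is treated as
        # contiguous, from the lowest to the highest position written.
        return (min(self.positions), max(self.positions))


def positions_to_replay(commit_log, flushed_ranges):
    """Positions recovery would replay: those outside every flushed range."""
    return [
        pos for pos in commit_log
        if not any(lo <= pos <= hi for lo, hi in flushed_ranges)
    ]


commit_log = [1, 2, 3, 4, 5, 6]

# Weak ordering at rollover: the old memtable's last write lands at
# position 4, after the new memtable's first write at position 3.
old = ToyMemtable("old", {1, 2, 4})
new = ToyMemtable("new", {3, 5, 6})

# Only the newer memtable's flush completes before the process stops.
replayed = positions_to_replay(commit_log, [new.flushed_range()])
lost = sorted(old.positions - set(replayed))

print("replayed:", replayed)  # replayed: [1, 2]
print("lost:", lost)          # lost: [4]
```

In this model position 4 is never flushed (it belongs to the old memtable) and never replayed (it falls inside the new memtable's range), which is the kind of loss the description refers to.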



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
