Posted to commits@cassandra.apache.org by "Dimitar Dimitrov (Jira)" <ji...@apache.org> on 2019/11/06 11:05:00 UTC

[jira] [Commented] (CASSANDRA-15368) Failing to flush Memtable without terminating process results in permanent data loss

    [ https://issues.apache.org/jira/browse/CASSANDRA-15368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968261#comment-16968261 ] 

Dimitar Dimitrov commented on CASSANDRA-15368:
----------------------------------------------

[~benedict], I assume this is something that you're planning to take up yourself, but let me know if you can use a volunteer in any way. 

Also, can you please help me understand some of the details around the pre-conditions for this problem?

I'm probably missing something, but I still can't understand:
 * how _*the last operations for the old Memtable may obtain their ReplayPosition after the first operations for the new Memtable*_ can hold true after CASSANDRA-8383.
 * how _*Unfortunately, we treat the Memtable range as contiguous, and invalidate the entire range on flush*_ can hold true after CASSANDRA-11828 (with some interaction with CASSANDRA-9669).

I'm also wondering, is _*More problematically, this can also occur on restart without any associated flush failure, as we use commit log boundaries written to our flushed sstables to filter ReplayPosition on recovery*_ related to {{CommitLogReplayer#firstNotCovered(Collection<IntervalSet<CommitLogPosition>>)}} and its caveats?

P.S. Specifically for the upper bound of the old memtable being above the lower bound of the new memtable, I've tried to explicitly write down the possible orderings, and I can't see how that could happen - I'll format and post my notes in a separate comment a bit later.

> Failing to flush Memtable without terminating process results in permanent data loss
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15368
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15368
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log, Local/Memtable
>            Reporter: Benedict Elliott Smith
>            Priority: Normal
>             Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
>
> {{Memtable}}s do not contain records that cover a precise contiguous range of {{ReplayPosition}}, since there are only weak ordering constraints when rolling over to a new {{Memtable}} - the last operations for the old {{Memtable}} may obtain their {{ReplayPosition}} after the first operations for the new {{Memtable}}.
> Unfortunately, we treat the {{Memtable}} range as contiguous, and invalidate the entire range on flush.  Ordinarily we only invalidate records when all prior {{Memtable}}s have also successfully flushed.  However, in the event of a flush failure that does not terminate the process (either because of the disk failure policy, or because it is a software error), the later flush is able to invalidate the region of the commit log that includes records that should have been flushed in the prior {{Memtable}}.
> More problematically, this can also occur on restart without any associated flush failure, as we use commit log boundaries written to our flushed sstables to filter {{ReplayPosition}} on recovery, which is meant to replicate our {{Memtable}} flush behaviour above.  However, we do not know that earlier flushes have completed, and they may complete successfully out-of-order.  So any flush that completes before the process terminates, but began after another flush that _doesn’t_ complete before the process terminates, has the potential to cause permanent data loss.
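
To make the scenario in the description above concrete, here is a minimal toy model in Java. It is only a sketch of the interleaving as the description states it; it does not use Cassandra's real classes and does not show that current code can reach this state (which is exactly what the comment above questions). The class {{FlushRangeSketch}}, its methods, and the integer positions are all hypothetical stand-ins for the commit log and {{ReplayPosition}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: none of these names exist in Cassandra.
public class FlushRangeSketch {
    // Hypothetical commit log: position -> owning memtable ("old" or "new").
    private final TreeMap<Integer, String> log = new TreeMap<>();

    void append(int position, String memtable) {
        log.put(position, memtable);
    }

    // Discards every record in [lower, upper], i.e. treats the range as contiguous.
    void invalidateRange(int lower, int upper) {
        log.subMap(lower, true, upper, true).clear();
    }

    boolean contains(int position) {
        return log.containsKey(position);
    }

    // Replays the scenario and returns the old-memtable positions that are
    // no longer in the log even though the old memtable was never flushed.
    public static List<Integer> lostPositions() {
        FlushRangeSketch commitLog = new FlushRangeSketch();

        // Assumed interleaving: the old memtable's last write obtains
        // position 5 after the new memtable's first write obtained position 4.
        int[] oldPositions = {1, 2, 3, 5};
        int[] newPositions = {4, 6, 7};
        for (int p : oldPositions) commitLog.append(p, "old");
        for (int p : newPositions) commitLog.append(p, "new");

        // The old memtable's flush fails but the process keeps running,
        // so nothing is invalidated for it.
        // The new memtable's flush succeeds and invalidates its whole range.
        commitLog.invalidateRange(4, 7);

        List<Integer> lost = new ArrayList<>();
        for (int p : oldPositions) {
            if (!commitLog.contains(p)) {
                lost.add(p);
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        System.out.println("Unflushed old-memtable positions missing from the log: " + lostPositions());
    }
}
```

In this model position 5 belongs to the unflushed old memtable, yet it falls inside the new memtable's range [4, 7] and is discarded along with it. Tracking each memtable's positions as a set of intervals rather than one contiguous range would avoid this in the toy model; whether that matches the real code paths is the open question raised in the comment.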


