Posted to commits@cassandra.apache.org by "Robert Coli (JIRA)" <ji...@apache.org> on 2012/09/04 23:16:09 UTC

[jira] [Commented] (CASSANDRA-3564) flush before shutdown so restart is faster

    [ https://issues.apache.org/jira/browse/CASSANDRA-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448066#comment-13448066 ] 

Robert Coli commented on CASSANDRA-3564:
----------------------------------------

> That sounds reasonable but I think we should have some kind of ceiling (10 minutes or something) where we kill -9 it, just in case we ever have a bug that causes us not to exit (we've had them before), so we don't hang the shutdown of the entire machine forever.
> ...
> Unless I'm missing something, you can't do anything about a kill -9, you're cooked.

The trivial case of this is a node where the data directory has been marked read-only due to errors but the commitlog is on a different device which is still writable.

In the status quo, stopping such a node will not result in the sstable flush blocking forever. The node just stops. On restart it replays the commitlog and (per CASSANDRA-1967) the replayed memtables are then flushed. That flush blocks forever in the same way, but the node otherwise serves reads and can take writes until it OOMs. It also doesn't need to be sent a SIGKILL at any time.

If the node flushes on shutdown in such a way that it is effectively "drain"ing the node, then in order to avoid data loss you merely need to wait for the commitlog sync. The relevant thing seems to be that the *commitlog* is synced before you SIGKILL the node, not whether the *flush* succeeds or not. In practice, this window seems likely to be sized in a small number of seconds even with the most lenient commitlog flushing, and is therefore likely irrelevant.

However, with flush-on-shutdown you *have* to send the process SIGKILL in this case, because the flush can hang indefinitely. I get worried any time I *have* to send SIGKILL to a database, even if I understand logically that it is safe. Adding flush to the shutdown path seems to create a new case in which I *have* to do this uncomfortable thing.
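
For concreteness, the "ceiling, then kill -9" behaviour quoted above could be sketched roughly like this. It is only an illustration in Python, not anything taken from the attached patches: `stop_with_ceiling` and `grace_seconds` are made-up names, and `proc` is assumed to be a `subprocess.Popen` handle for the process being stopped.

```python
import signal
import subprocess


def stop_with_ceiling(proc, grace_seconds):
    """Ask a process to stop with SIGINT, escalating to SIGKILL after a ceiling.

    Hypothetical helper for illustration only. Returns "exited" if the
    process stopped on its own within grace_seconds, "killed" otherwise.
    """
    # Polite request first: the equivalent of `kill -INT <pid>`.
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=grace_seconds)
        return "exited"
    except subprocess.TimeoutExpired:
        # The ceiling was reached (for example, a flush that never returns),
        # so fall back to the equivalent of `kill -9 <pid>`.
        proc.kill()
        proc.wait()
        return "killed"
```

The "killed" branch is exactly the case I am uncomfortable with: it is only reachable when the shutdown path can block, which is what flush-on-shutdown introduces for the read-only data directory scenario.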
 
> That is to say, with the status quo, if you want to flush before shutdown, you call nodetool flush. Not a big deal. But if we made it flush-everything-by-default then to make it NOT flush our options include.

I don't understand why calling nodetool flush/drain is not a big deal here, but is a big enough deal to special-case shutdown when durable_writes is off in CASSANDRA-2958.

In my opinion, the sane default here is the pre-2958 status quo: no flushing on shutdown ever, including when durable_writes is off. Operators who want to drain nodes before stopping them can do so via nodetool.
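
As a rough sketch of what "drain via nodetool before stopping" might look like from an operator's wrapper script (Python, purely illustrative): `drain_before_stop` is a hypothetical helper, the 60-second timeout is an arbitrary assumption, and the command is a parameter so that anything other than `nodetool drain` can be substituted.

```python
import subprocess


def drain_before_stop(drain_cmd=("nodetool", "drain"), timeout_seconds=60.0):
    """Run a drain command with a time limit before stopping the node.

    Hypothetical helper for illustration only. Returns True only if the
    command finished with exit status 0 within timeout_seconds.
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # if the command runs longer than timeout_seconds.
        result = subprocess.run(list(drain_cmd), timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return False
    except FileNotFoundError:
        # The drain command is not installed or not on PATH.
        return False
    return result.returncode == 0
```

If this returns False, the operator simply proceeds with a normal stop, which is the status quo behaviour; the drain stays an opt-in step rather than something baked into the shutdown hook.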
                
> flush before shutdown so restart is faster
> ------------------------------------------
>
>                 Key: CASSANDRA-3564
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3564
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Packaging
>            Reporter: Jonathan Ellis
>            Assignee: Brandon Williams
>            Priority: Minor
>             Fix For: 1.2.0
>
>         Attachments: 3564.patch, 3564.patch
>
>
> Cassandra handles flush in its shutdown hook for durable_writes=false CFs (otherwise we're *guaranteed* to lose data) but leaves it up to the operator otherwise.  I'd rather leave it that way to offer these semantics:
> - cassandra stop = shutdown nicely [explicit flush, then kill -int]
> - kill -INT = shutdown faster but don't lose any updates [current behavior]
> - kill -KILL = lose most recent writes unless durable_writes=true and batch commits are on [also current behavior]
> But if it's not reasonable to use nodetool from the init script then I guess we can just make the shutdown hook flush everything.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira