Posted to commits@cassandra.apache.org by "Joshua McKenzie (JIRA)" <ji...@apache.org> on 2015/06/21 18:43:00 UTC

[jira] [Commented] (CASSANDRA-9627) fsync should not be "best effort" (and silently fail on e.g. windows)

    [ https://issues.apache.org/jira/browse/CASSANDRA-9627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595117#comment-14595117 ] 

Joshua McKenzie commented on CASSANDRA-9627:
--------------------------------------------

I got up close and personal with NTFS on this topic a while back (see [this comment|https://issues.apache.org/jira/browse/CASSANDRA-7772?focusedCommentId=14098916&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14098916] on CASSANDRA-7772). The tl;dr of the tech is that NTFS records are doubly-linked between directory handles and their children, so after a power failure a chkdsk run on the next boot will automatically repair those links. We don't need to fsync directories on Windows, and there really isn't any method or API available to make it happen anyway.

Now - as to whether this (and other CLibrary calls) should be best effort and fail silently - I agree that we should be more thorough about checking on the more critical operations in there. Not a Windows-specific problem though - un-assigning from myself for now.
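
For illustration only, here is a rough sketch of what "fail loudly instead of best effort" could look like for the directory sync, using the FileChannel approach mentioned in the issue description. The names DirectorySync and syncDirectory are hypothetical and this is not the existing CLibrary code. It assumes that opening a read-only FileChannel on a directory works on the target JVM and platform, which the issue description notes may not hold in Java 9.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Locale;

// Hypothetical helper, not part of Cassandra: syncs a directory entry and
// reports failure to the caller instead of swallowing it.
final class DirectorySync {
    private static final boolean IS_WINDOWS =
        System.getProperty("os.name", "").toLowerCase(Locale.ROOT).startsWith("windows");

    private DirectorySync() {}

    /**
     * Returns true if the directory was synced, false if the sync was skipped
     * because we are on Windows (no directory fsync is available there).
     * Any other failure is propagated as an IOException.
     */
    static boolean syncDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir))
            throw new IOException("Not a directory: " + dir);

        // Deliberate, explicit skip rather than a silent failure.
        if (IS_WINDOWS)
            return false;

        // Assumption: the platform allows a read-only channel on a directory.
        try (FileChannel channel = FileChannel.open(dir, StandardOpenOption.READ)) {
            channel.force(true); // flush directory metadata to the storage device
        }
        return true;
    }
}
```

A caller could run this once at startup against each data directory as a pre-flight check, and treat an IOException as a reason to refuse to start.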

> fsync should not be "best effort" (and silently fail on e.g. windows)
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-9627
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9627
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Joshua McKenzie
>            Priority: Blocker
>             Fix For: 2.2.0 rc2
>
>
> Currently we make an effort to synchronize both the file contents and the directory contents. Both are essential to ensure no data loss. However, we just try to do this, and ignore the problem if we can't. Presumably this behaviour was to "sort of" support Windows (i.e. not crash). Now that we officially support Windows, we need to behave better, and really IMO we should _never_ ignore a failure here on any platform. It should be part of our pre-flight checks: if we cannot do it, we cannot run safely.
> It looks like this may be supported trivially through FileChannel, by opening one on the directory itself (and calling force()), although it's not clear if this will still be supported in Java 9 [see discussion here|http://mail.openjdk.java.net/pipermail/nio-dev/2015-May/003140.html].
> [~JoshuaMcKenzie]: assigning to you for now, just so it's tracked by the Windows overlord.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)