Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2013/05/03 18:02:16 UTC

[jira] [Updated] (CASSANDRA-5535) Manifest file not fsynced

     [ https://issues.apache.org/jira/browse/CASSANDRA-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5535:
--------------------------------------

             Priority: Minor  (was: Critical)
    Affects Version/s:     (was: 1.2.3)
                       1.0.0

We've already fixed this in 2.0 by moving level information into sstable metadata (CASSANDRA-4872), but it may be reasonable to add an fsync to 1.2.  What do you think [~krummas]?
                
> Manifest file not fsynced
> -------------------------
>
>                 Key: CASSANDRA-5535
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5535
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>         Environment: RHEL 6.4
> java -version
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
>            Reporter: Terry Cumaranatunge
>            Priority: Minor
>
> We had several cases where the manifest file would get corrupted during power reset tests or iLO resets used to mimic power failures, ungraceful resets, kernel panics, etc. It wasn't clear at the time where the problem was, but I think the data below shows that Cassandra is missing an fsync call on the manifest file prior to closing it. The trace below is from Cassandra 1.2.4.
> The trace was captured using strace options:
> strace -f -p 2200 -e trace=open,close,write,fsync,fdatasync,rename
> [pid 9710] open("/opt/mp/storage/persistent/cassandra/cassandra-lib/data/MSA/subinfo/subinfo-tmp.json", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 238
> [pid 9710] write(238, "{\n \"generations\" : [ {\n \"gen"..., 3996) = 3996
> [pid 9710] write(238, "14, 263161, 263484, 270816, 2593"..., 3996) = 3996
> [pid 9710] write(238, "275136, 275137, 275138, 275139, "..., 1173) = 1173
> [pid 9710] close(238) = 0
> [pid 9710] rename("/opt/mp/storage/persistent/cassandra/cassandra-lib/data/MSA/subinfo/subinfo.json", "/opt/mp/storage/persistent/cassandra/cassandra-lib/data/MSA/subinfo/subinfo-old.json") = 0
> [pid 9710] rename("/opt/mp/storage/persistent/cassandra/cassandra-lib/data/MSA/subinfo/subinfo-tmp.json", "/opt/mp/storage/persistent/cassandra/cassandra-lib/data/MSA/subinfo/subinfo.json") = 0
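
The trace above shows open, write, close, and two renames, with no fsync or fdatasync on the temporary file in between. As a rough illustration of what adding the sync could look like, here is a minimal Java sketch. It is not Cassandra's actual manifest code: the class name ManifestWriteSketch, the method writeDurably, and its parameters are hypothetical, and only the -tmp.json / .json / -old.json naming is taken from the trace.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Cassandra's implementation.
class ManifestWriteSketch {

    // Writes json to <baseName>-tmp.json, forces it to disk, then rotates
    // the files in the same order as the renames seen in the strace output.
    static void writeDurably(File dir, String baseName, String json) throws IOException {
        File tmp = new File(dir, baseName + "-tmp.json");
        File current = new File(dir, baseName + ".json");
        File old = new File(dir, baseName + "-old.json");

        try (FileOutputStream out = new FileOutputStream(tmp)) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
            out.flush();
            // The step missing from the trace: ask the OS to push the file's
            // data to the storage device before it is closed and renamed.
            out.getFD().sync();
        }

        // Keep the previous manifest as a fallback, as the trace does.
        if (current.exists() && !current.renameTo(old)) {
            throw new IOException("could not rename " + current + " to " + old);
        }
        if (!tmp.renameTo(current)) {
            throw new IOException("could not rename " + tmp + " to " + current);
        }
    }
}
```

This sketch only syncs the file contents. It does not sync the containing directory, so whether the renames themselves survive a power loss still depends on the filesystem and its mount options.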

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira