Posted to commits@cassandra.apache.org by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org> on 2011/10/13 21:03:11 UTC
[jira] [Commented] (CASSANDRA-3248) CommitLog writer should call fdatasync instead of fsync
[ https://issues.apache.org/jira/browse/CASSANDRA-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126809#comment-13126809 ]
Jonathan Ellis commented on CASSANDRA-3248:
-------------------------------------------
Any news, Zhu?
> CommitLog writer should call fdatasync instead of fsync
> -------------------------------------------------------
>
> Key: CASSANDRA-3248
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3248
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.6.13, 0.7.9, 0.8.6, 1.0.0, 1.1
> Environment: Linux
> Reporter: Zhu Han
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> CommitLogSegment uses SequentialWriter to flush the buffered data to the log device. It depends on FileDescriptor#sync(), which invokes fsync() and therefore also forces the file attributes to disk.
> However, at least on Linux, fdatasync() is good enough for commit log flush:
> bq. fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to be correctly handled. For example, changes to st_atime or st_mtime (respectively, time of last access and time of last modification; see stat(2)) do not require flushing because they are not necessary for a subsequent data read to be handled correctly. On the other hand, a change to the file size (st_size, as made by say ftruncate(2)), would require a metadata flush.
> File size is also synced to disk by fdatasync(). Although the commit log recovery logic sorts the commit log segments by their modification timestamp, that sorting can be removed safely, IMHO.
> I checked the native code of JRE 6. On Linux and Solaris, FileChannel#force(false) invokes fdatasync(). On Windows, the false flag does not have any impact.
> On my log device (commodity SATA HDD, write cache disabled), there is a large performance gap between fsync() and fdatasync():
> {quote}
> $ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G --file-fsync-all=on --file-fsync-mode={color:red}fdatasync{color} --file-test-mode=seqwr --max-time=600 --file-block-size=2K --max-requests=0 run
> {color:blue}54.90{color} Requests/sec executed
> per-request statistics:
> min: 8.29ms
> avg: 18.18ms
> max: 108.36ms
> approx. 95 percentile: 25.02ms
> $ sysbench --test=fileio --num-threads=1 --file-num=1 --file-total-size=10G --file-fsync-all=on --file-fsync-mode={color:red}fsync{color} --file-test-mode=seqwr --max-time=600 --file-block-size=2K --max-requests=0 run
> {color:blue}28.08{color} Requests/sec executed
> per-request statistics:
> min: 33.28ms
> avg: 35.61ms
> max: 911.87ms
> approx. 95 percentile: 41.69ms
> {quote}
> I do think this is a very critical performance improvement.
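For illustration only, here is a minimal sketch of the two flush modes the description compares. It is not Cassandra's CommitLogSegment or SequentialWriter code: the class name CommitLogSyncSketch and the method appendAndSync are hypothetical, and it uses the java.nio.file API (Java 7+) rather than the JRE 6 classes mentioned above. FileChannel#force(false) asks only for the file content to be flushed, which per the description maps to fdatasync() on Linux and Solaris; force(true) also flushes metadata, comparable to FileDescriptor#sync().

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CommitLogSyncSketch {
    /**
     * Hypothetical helper: appends one record to the log file and flushes it.
     * syncMetadata=false requests a data-only flush (FileChannel#force(false));
     * syncMetadata=true also flushes file attributes (FileChannel#force(true)).
     * Returns the file size after the append.
     */
    public static long appendAndSync(Path log, byte[] record, boolean syncMetadata)
            throws IOException {
        try (FileChannel channel = FileChannel.open(log,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record);
            // A single write() call may be partial, so loop until the buffer is drained.
            while (buf.hasRemaining()) {
                channel.write(buf);
            }
            // The boolean is the only difference between the two flush modes.
            channel.force(syncMetadata);
            return channel.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("commitlog-sketch", ".log");
        byte[] record = "mutation\n".getBytes(StandardCharsets.UTF_8);
        long afterDataOnly = appendAndSync(log, record, false); // data-only flush
        long afterFull = appendAndSync(log, record, true);      // data + metadata flush
        System.out.println(afterDataOnly + " " + afterFull);
        Files.delete(log);
    }
}
```

Both calls leave the same bytes on disk; any latency difference between them depends on the filesystem and device, as in the sysbench numbers quoted above.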