Posted to dev@couchdb.apache.org by "Randall Leeds (JIRA)" <ji...@apache.org> on 2010/06/08 08:58:13 UTC

[jira] Updated: (COUCHDB-767) do a non-blocking file:sync

     [ https://issues.apache.org/jira/browse/COUCHDB-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Leeds updated COUCHDB-767:
----------------------------------

    Attachment: async_fsync.patch

Here's my attempt to combine Adam's patch with my patch from COUCHDB-786. In this approach, couch_file exports a sync_file/1 that takes a path instead of a file descriptor. Unlike in Adam's patch, the From tag from handle_call(sync, From, File) is not passed to sync_file; instead, the spawned fun replies with the result of sync_file/1. The desirable consequence is that sync_file/1 can be called without going through a gen_server handler, so couch_db_updater:commit_data/2 can call couch_file:sync_file/1 directly, bypassing the gen_server:call operations that COUCHDB-786 was trying to avoid.
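
To make the shape of this concrete, here is a minimal, self-contained sketch of the idea. It is not the attached patch: the module name async_sync_sketch, the one-field #file{} record, and the sync/1 wrapper are made up for illustration, and the [read, raw] open mode is simply carried over from the snippet in the issue description.

```erlang
-module(async_sync_sketch).
-behaviour(gen_server).

-export([start_link/1, sync/1, sync_file/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical state record; the real couch_file state has more fields.
-record(file, {name}).

start_link(Path) ->
    gen_server:start_link(?MODULE, Path, []).

%% Sync via the gen_server. The caller blocks until the sync finishes,
%% but the server itself stays free to handle other requests meanwhile.
sync(Pid) ->
    gen_server:call(Pid, sync, infinity).

%% Sync a file given only its path. No gen_server is involved, so a
%% caller that knows the path can use this directly.
sync_file(Path) ->
    %% Open mode is an assumption taken from the issue description.
    case file:open(Path, [read, raw]) of
        {ok, Fd} ->
            try
                file:sync(Fd)
            after
                file:close(Fd)
            end;
        {error, _Reason} = Error ->
            Error
    end.

init(Path) ->
    {ok, #file{name = Path}}.

%% From is captured by the closure rather than passed to sync_file/1;
%% the spawned process replies with whatever sync_file/1 returns.
handle_call(sync, From, #file{name = Name} = File) ->
    spawn_link(fun() -> gen_server:reply(From, sync_file(Name)) end),
    {noreply, File};
handle_call(_Other, _From, File) ->
    {reply, {error, unknown_call}, File}.

handle_cast(_Msg, File) ->
    {noreply, File}.
```

With this layout, a process that already knows the file path can call async_sync_sketch:sync_file(Path) itself, while other clients can still go through async_sync_sketch:sync(Pid) and see the same return values.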

I think this is a win-win, but I agree with Adam about testing. I would welcome any comprehensive performance suite that could be run against these changes to get some detailed statistics.

> do a non-blocking file:sync
> ---------------------------
>
>                 Key: COUCHDB-767
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-767
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>    Affects Versions: 0.11
>            Reporter: Adam Kocoloski
>             Fix For: 1.1
>
>         Attachments: 767-async-fsync.patch, async_fsync.patch
>
>
> I've been taking a close look at couch_file performance in our production systems.  One of the things I've noticed is that reads are occasionally blocked for a long time by a slow call to file:sync.  I think this is unnecessary.  I think we could do something like
> handle_call(sync, From, #file{name=Name}=File) ->
>     spawn_link(fun() -> sync_file(Name, From) end),
>     {noreply, File};
> and then
> sync_file(Name, From) ->
>     {ok, Fd} = file:open(Name, [read, raw]),
>     gen_server:reply(From, file:sync(Fd)),
>     file:close(Fd).
> Does anyone see a downside to this?  Individual clients of couch_file still see exactly the same behavior as before; the only difference is that readers are not blocked by syncs initiated in the db_updater process.  When data needs to be flushed, file:sync is _much_ slower than spawning a local process and opening the file again, in the neighborhood of 1000x slower even on Linux with its less-than-durable use of vanilla fsync.
