You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@couchdb.apache.org by "Jan Lehnardt (Created) (JIRA)" <ji...@apache.org> on 2011/11/16 14:29:51 UTC

[jira] [Created] (COUCHDB-1342) Asynchronous file writes

Asynchronous file writes
------------------------

                 Key: COUCHDB-1342
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
             Project: CouchDB
          Issue Type: Improvement
          Components: Database Core
            Reporter: Jan Lehnardt
             Fix For: 1.3
         Attachments: COUCHDB-1342.patch

This change updates the file module so that it can do
asynchronous writes. Basically it replies immediately
to process asking to write something to the file, with
the position where the chunks will be written to the
file, while a dedicated child process keeps collecting
chunks and write them to the file (and batching them
when possible). After issuing a series of write request
to the file module, the caller can call its 'flush'
function which will block the caller until all the
chunks it requested to write are effectively written
to the file.

This maximizes the IO subsystem, as for example, while
the updater is traversing and modifying the btrees and
doing CPU bound tasks, the writes are happening in
parallel.

Originally described at http://s.apache.org/TVu

Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: [jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by Paul Davis <pa...@gmail.com>.

On Thu, Nov 17, 2011 at 9:43 PM, Jason Smith <jh...@iriscouch.com> wrote:
> On Fri, Nov 18, 2011 at 10:06 AM, Damien Katz (Commented) (JIRA)
> <ji...@apache.org> wrote:
>> Right now we have a culture of everyone's pet concerns must addressed before code gets checked in, which is demoralizing and slows things down, which is a very big problem the project has right now. I want your help in trying to change that.
>
> I'm sorry to agree. I made two (IMO) useful contributions and both
> died in JIRA committee.
>
> My CORS implementation[1] was ignored for three months (15 May - 24
> August), despite production deployment at Iris Couch. Ultimately, a
> competing patch showed up, with no reason given. After some
> discussion, it too languishes.
>
> My inbox database feature[2] is suffering the same fate. Both features
> are highly requested by real-world Couch app developers.
>

I understand the frustration that a patch languishes in JIRA. I too
have had serious amounts of effort rot away never to be heard from
again.

Though I would point out that both of these patches have significant
security concerns attached. CORS opens up some crazy XSS stuff for a
spec that I'm not even sure is out of draft status (which we have a
history of not implementing in core).

As to the inbox db thing, last I looked it was pattern matching on
request properties to implement its security model. That's a model
destined to break and since its security related the possibility that
it breaks in bad ways is (IMO) second only in severity to the
possibility that we lose data.

> I have a new CommonJS feature mostly done but I've postponed it.
>
> (I'm not changing the subject, just sharing that it's not just newbies
> who are frustrated. I'm not super frustrated, just a little
> frustrated.)

Oh, for sure. I get frustrated as well. It takes effort and
determination to get big patches through the JIRA process and into
trunk. As well it should. If we make it too easy to just throw random
patches into trunk and "fix things as they come up" then we end up
losing our credibility as serious database engineers.

>
> My inbox database patches[2] are also
> [1]: https://issues.apache.org/jira/browse/COUCHDB-431
>
> The last two big contributions I made have been ignored in JIRA. My
> CORS implementation[1] was ignored
>>
>>> Asynchronous file writes
>>> ------------------------
>>>
>>>                 Key: COUCHDB-1342
>>>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>>>             Project: CouchDB
>>>          Issue Type: Improvement
>>>          Components: Database Core
>>>            Reporter: Jan Lehnardt
>>>             Fix For: 1.3
>>>
>>>         Attachments: COUCHDB-1342.patch
>>>
>>>
>>> This change updates the file module so that it can do
>>> asynchronous writes. Basically it replies immediately
>>> to process asking to write something to the file, with
>>> the position where the chunks will be written to the
>>> file, while a dedicated child process keeps collecting
>>> chunks and write them to the file (and batching them
>>> when possible). After issuing a series of write request
>>> to the file module, the caller can call its 'flush'
>>> function which will block the caller until all the
>>> chunks it requested to write are effectively written
>>> to the file.
>>> This maximizes the IO subsystem, as for example, while
>>> the updater is traversing and modifying the btrees and
>>> doing CPU bound tasks, the writes are happening in
>>> parallel.
>>> Originally described at http://s.apache.org/TVu
>>> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554
>>
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>
>>
>>
>
>
>
> --
> Iris Couch
>

Re: [jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by Jason Smith <jh...@iriscouch.com>.

On Fri, Nov 18, 2011 at 10:06 AM, Damien Katz (Commented) (JIRA)
<ji...@apache.org> wrote:
> Right now we have a culture of everyone's pet concerns must addressed before code gets checked in, which is demoralizing and slows things down, which is a very big problem the project has right now. I want your help in trying to change that.

I'm sorry to agree. I made two (IMO) useful contributions and both
died in JIRA committee.

My CORS implementation[1] was ignored for three months (15 May - 24
August), despite production deployment at Iris Couch. Ultimately, a
competing patch showed up, with no reason given. After some
discussion, it too languishes.

My inbox database feature[2] is suffering the same fate. Both features
are highly requested by real-world Couch app developers.

I have a new CommonJS feature mostly done but I've postponed it.

(I'm not changing the subject, just sharing that it's not just newbies
who are frustrated. I'm not super frustrated, just a little
frustrated.)

My inbox database patches[2] are also
[1]: https://issues.apache.org/jira/browse/COUCHDB-431

The last two big contributions I made have been ignored in JIRA. My
CORS implementation[1] was ignored
>
>> Asynchronous file writes
>> ------------------------
>>
>>                 Key: COUCHDB-1342
>>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>>             Project: CouchDB
>>          Issue Type: Improvement
>>          Components: Database Core
>>            Reporter: Jan Lehnardt
>>             Fix For: 1.3
>>
>>         Attachments: COUCHDB-1342.patch
>>
>>
>> This change updates the file module so that it can do
>> asynchronous writes. Basically it replies immediately
>> to process asking to write something to the file, with
>> the position where the chunks will be written to the
>> file, while a dedicated child process keeps collecting
>> chunks and write them to the file (and batching them
>> when possible). After issuing a series of write request
>> to the file module, the caller can call its 'flush'
>> function which will block the caller until all the
>> chunks it requested to write are effectively written
>> to the file.
>> This maximizes the IO subsystem, as for example, while
>> the updater is traversing and modifying the btrees and
>> doing CPU bound tasks, the writes are happening in
>> parallel.
>> Originally described at http://s.apache.org/TVu
>> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
>
>



-- 
Iris Couch

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Jan Lehnardt (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151498#comment-13151498 ] 

Jan Lehnardt commented on COUCHDB-1342:
---------------------------------------

> I disagree fairly strongly on this point. couch_file is about as core of an API as it gets. couch_file:flush/1 shouldn't even exist as far as I'm concerned. This isn't a mediocre API that should be improved later, its a bad API that shouldn't go in at all. Its 100% foot-gun. 

The new issue would be definitely a blocker for the next release. I don't see a problem with this, but I'm happy to iterate the patch in JIRA as well.

> I don't need another patch to know that this one is complicated and could be less complicated. 

Earlier versions of this patch did use gen_servers and they were neither less complicated nor did they give the desired performance improvements, thus we went past them. They add a lot of overhead for such a low level module. So yes, I think we need an alternative patch to see if it were simpler and as useful.

> You're pointing at code that's about three abstractions away from couch_file's writer loop. Suffice it to say, erlang:monitor/2 on a random process in the write path does little to assuage my fears that we're entering dangerous territory relying on our own file writing buffers.

Not sure where there this is three abstractions away, but I gotta have to defer to the experts Filipe and Damien here.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Robert Newson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151567#comment-13151567 ] 

Robert Newson commented on COUCHDB-1342:
----------------------------------------

Technical comments notwithstanding, saying "We won't be holding up this patch because it seems more complicated than before. It's about what users want." is a little bold. As I understand the Apache rules, if Paul feels strongly he can vote -1 which is a veto.

>From http://www.apache.org/foundation/voting.html:

"Votes on code modifications follow a different model. In this scenario, a negative vote constitutes a veto , which cannot be overridden."

Any suggestion that one person can force in a controversial change to an Apache project needs to be challenged.

                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Randall Leeds (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152729#comment-13152729 ] 

Randall Leeds commented on COUCHDB-1342:
----------------------------------------

Ok. Wow.

I'm with Paul on these things: proc_lib and an exported loop function are a must.
I'm with Damien/Jason on these things: I fear a culture where things languish too much because we don't have it together, collectively, to iterate on things effectively.
I'm with Bob on this thing: I don't see why this can't sit on a branch where we can actually iterate and commit things _together_.

I stand alone in that: I made you all a branch. Guess what it's called!? COUCHDB-1342. Have at it!
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Paul Joseph Davis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152557#comment-13152557 ] 

Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------

Damn it JIRA just ate a comment.

Short and skinny of it was that I just remembered that there's an (unmeasured) optimization in zip_server that I learned from gen_server2. The basic idea is to pull all messages out of the mailbox and then process them off a queue. The original idea was that it'd prevent pattern matching from running away scanning for messages though I think writer_loop's repeated implementation. Might be worth a check but I only note this cause the thought occurred. Feel free to ignore.

Also, just to be clear I'm not suggesting the use of zip_server or anything nutty like that. It just occurred to me that it might help inform the current discussion.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (COUCHDB-1342) Asynchronous file writes

Posted by "Jan Lehnardt (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Lehnardt updated COUCHDB-1342:
----------------------------------

    Attachment: COUCHDB-1342.patch
    
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: [jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by Robert Dionne <di...@dionne-associates.com>.

On Nov 17, 2011, at 10:06 PM, Damien Katz (Commented) (JIRA) wrote:

> 
>    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152608#comment-13152608 ] 
> 
> Damien Katz commented on COUCHDB-1342:
> --------------------------------------
> 
> Paul, what I mean by "Apache users concerns" is that #3 isn't something that vanilla Apache CouchDB users deal with, but third parties who modify the code or embed in interesting way might (I suppose Cloudant has to deal with this). Perhaps I'm mistaken about that. I do think patches should only be concerned with the vanilla use cases in order to be considered check-in quality.
> 
> #4 is a style issue, not a correctness issue, or at least you haven't made a case that it's a correctness issues. I have no problems with you changing it to a style you prefer, but we should not expect that submitters of patches conform to an undocumented style.
> 
> There is no urgency around this patch, at Couchbase we can keep adding performance enhancements and drift our codebase further and further from Apache. I don't want to see that happen, but it only hurts the Apache project.

Damien,

I agree with both these points, your codebase at Couchbase is drifting but you're not alone in that, we do need a culture where more correct fast code is checked in. I've only had a couple of days to look at this and I've not had the time to read your Couchbase work. As I look at this patch almost every concern Paul is raising is technically valid. We do have to consider more than the vanilla CouchDB as it gets embedded in BigCouch for example, and CouchDB is designed to be distributed, right? I first ran a simple test, adding 10K empty docs, and notice a 40K difference in the db file size. Probably harmless, but I don't know why. There's no real way to independently verify if this patch changes the db layout other than via the semantics of the code.

Databases are hard, as you mention, very hard. Without good performance they are next to useless, but a lack correctness is also problematic, certainly in some domains. I share other's frustration with patches languishing. The patches to date I've submitted have all been small and have often had to be refactored as the code migrated away (I think I have 3 now, 2 of them bugs). COUCHDB-911 for example is a real bug, involving both couch_db and couch_db_updater, and as Adam notes is not just a bulk docs issue. It reports a conflict but adds data to the db anyway. Can you believe that? I tried a couple of fixes to minimize the surface area touched but there was no real way to solve it correctly without adding to the data structures. When I saw this patch my first reaction was wow, but now I'll have to rework 911 again as your patch also touches the same files. It's totally orthogonal so no big deal.

I mention this only to point out that the review process is awesome and when taken seriously makes for a better result. This isn't just people's pet concerns. It takes time to do this. Fortunately it's not rocket science, it's just databases. The solution to the culture problem is "best practices". Best practices have to be practiced, and someone (Jan as the project lead I'm looking at you :) needs to crack the whip and set the tone. Of course I'm assuming that we're talking about a process to produce "production" quality code. I quote production as that phrase has evolved considerably over the years. If master is deemed acceptable for prototypes, proofs of concept, etc. then fine but otherwise I'd suggest we follow Randall's lead and work this patch on a branch first. Anyway, 'm sure you know these things, I don't mean to prattle on. 

Best Regards,

Bob

> And I do see we have some culture problems in the Apache project. We need a culture where useful, correct, fast code is verified and checked in, and then is improved incrementally. Right now we have a culture of everyone's pet concerns must addressed before code gets checked in, which is demoralizing and slows things down, which is a very big problem the project has right now. I want your help in trying to change that.
> 
>> Asynchronous file writes
>> ------------------------
>> 
>>                Key: COUCHDB-1342
>>                URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>>            Project: CouchDB
>>         Issue Type: Improvement
>>         Components: Database Core
>>           Reporter: Jan Lehnardt
>>            Fix For: 1.3
>> 
>>        Attachments: COUCHDB-1342.patch
>> 
>> 
>> This change updates the file module so that it can do
>> asynchronous writes. Basically it replies immediately
>> to process asking to write something to the file, with
>> the position where the chunks will be written to the
>> file, while a dedicated child process keeps collecting
>> chunks and write them to the file (and batching them
>> when possible). After issuing a series of write request
>> to the file module, the caller can call its 'flush'
>> function which will block the caller until all the
>> chunks it requested to write are effectively written
>> to the file.
>> This maximizes the IO subsystem, as for example, while
>> the updater is traversing and modifying the btrees and
>> doing CPU bound tasks, the writes are happening in
>> parallel.
>> Originally described at http://s.apache.org/TVu
>> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554
> 
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
> 
>

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Damien Katz (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152608#comment-13152608 ] 

Damien Katz commented on COUCHDB-1342:
--------------------------------------

Paul, what I mean by "Apache users concerns" is that #3 isn't something that vanilla Apache CouchDB users deal with, but third parties who modify the code or embed in interesting way might (I suppose Cloudant has to deal with this). Perhaps I'm mistaken about that. I do think patches should only be concerned with the vanilla use cases in order to be considered check-in quality.

#4 is a style issue, not a correctness issue, or at least you haven't made a case that it's a correctness issues. I have no problems with you changing it to a style you prefer, but we should not expect that submitters of patches conform to an undocumented style.

There is no urgency around this patch, at Couchbase we can keep adding performance enhancements and drift our codebase further and further from Apache. I don't want to see that happen, but it only hurts the Apache project.

And I do see we have some culture problems in the Apache project. We need a culture where useful, correct, fast code is verified and checked in, and then is improved incrementally. Right now we have a culture of everyone's pet concerns must addressed before code gets checked in, which is demoralizing and slows things down, which is a very big problem the project has right now. I want your help in trying to change that.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Bob Dionne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190655#comment-13190655 ] 

Bob Dionne commented on COUCHDB-1342:
-------------------------------------

I revisited this a bit this morning. I tried to rebase master just to see how far it's moved, it's not too bad, couch_db had some conflicts mostly relating to a new field #db.updater_fd

Given Damien's abandonment of the project I'm not sure whether we should push on this or not. I suppose it's worth cleaning up and using if in fact the improvements are substantial
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Randall Leeds (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190584#comment-13190584 ] 

Randall Leeds commented on COUCHDB-1342:
----------------------------------------

Bump! I made a branch on the central git repo so you all could iterate, not so I could kill the topic.
What's left to make everyone feel good about merging this?
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Robert Newson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152240#comment-13152240 ] 

Robert Newson commented on COUCHDB-1342:
----------------------------------------

I saw the (hardcoded) 1 mib limit but my context is hundreds or thousands of such things at once. Making it configurable or even optional would be an improvement.

I appreciate the effort to close some of these gaps but, speaking personally and from (bitter) experience, trying to get a large piece of work into trunk with promises to fix things 'later' really bothers me. I don't see why this work can't sit on a branch while the discussion, and the enhancements continue.

                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Paul Joseph Davis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151734#comment-13151734 ] 

Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------

Also, I'd point out that I've not so much as implied that this patch shouldn't go in. The patch came to JIRA, I reviewed it and I've been trying to discuss the technical aspects of this patch. Somehow that got dragged off into the weeds about voting and other silly things. I'd prefer if we could avoid spending time squabbling like raccoons in a dumpster. There's a good idea here and we should be spending our time working on the engineering to make it happen.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Bob Dionne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190660#comment-13190660 ] 

Bob Dionne commented on COUCHDB-1342:
-------------------------------------

I should add that I'd like to get Filipe's opinion on this as this was a collaborative effort
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Paul Joseph Davis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151314#comment-13151314 ] 

Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------

This patch looks like its a mishmash of a couple patches that are applied slightly differently. Not sure if that's a branch thing or what. But for instance, there's a lot of id_btree(Db) -> Db#db.by_docid_btree type changes. There's also a switch between couch_file:sync(Filepath) and couch_file:sync(Fd). I'm not sure if that's on purpose or not because we switched to Filepath on purpose. I see it as being possible but I don't see a reference to it.

I'm really not a fan of exposing the inner workings of couch_file to every client as can be seen by the massive number of flush calls that this creates. I find it quite likely that we'll eventually end up missing one of these flush calls and causing a nasty bug because the error would look like data hadn't been written. I'd want to see things rewritten so that the usage of couch_file doesn't change. Easiest solution I see would be to automatically flush buffers if a read request comes in for something that's not on disk.

I'm pretty sure we talked about this at one point, but can someone describe the differences between this and Erlang's delayed_write mode for files?

couch_file becomes much more complicated with this patch. I'd prefer to see it simplified if at all possible. I'm not entirely certain how it'd look but I might start by making a couch_file_reader and couch_file_writer gen_servers instead of having bare processes in couch_file.

I have to say that this patch scares me a bit. If we make the switch to something like this then we're opening up a whole new type of failure scenario where file "writes" appear to succeed but then end up failing after the caller has moved on. While there are certainly cases where I can see this being a fine tradeoff (view updaters, compactors, etc) it worries me a bit for the main database file. The fact is that I can't (yet) reasonably consider all of the possible new failure modes and convince myself that things are correct. I've already seen some super weird reactions to things like running out of disk space, it seems like not knowing about such an error until you've "written" a megabyte of data seems awkward.

                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Damien Katz (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152225#comment-13152225 ] 

Damien Katz commented on COUCHDB-1342:
--------------------------------------

Robert, the inflight batching of writes is limited to 1 meg per database. we've not seen it increase memory usage, but the code is written such that it could be made configurable to smaller max batch size. If we are concerned, this is a detail I'd like to see addressed post check-in.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Damien Katz (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152222#comment-13152222 ] 

Damien Katz commented on COUCHDB-1342:
--------------------------------------

I don't mean to imply that Paul, or any committer isn't smart enough to handle a flush call. I _know_ Paul is has the smarts and talent to deal with much more complexity. What I am saying is that if a flush call requirement makes it so that someone can't work on the internals of CouchDB, then they aren't suited for core database development. Database engines are complex beasts.

Paul's point is about that the flush call can maybe be gotten rid of seems right. Originally, we didn't have the code that prevented the write queue getting overwhelmed, because in our product it's not possible. But I added it to make the rest of the enhancements suitable for Apache, and now it seems it could be used to prevent the reads of unflushed data. However, there is another optimization coming where a raw erlang FD is used in a calling process to avoid messaging overhead (another big performance improvement in certain long operations), which will maybe make it necessary again. We can remove it in the meantime, but it may need to be added back in the future.

The concern with doubling the # non-db file descriptors is a real one. How big of a concern of this? Do you have ideas how to fix this? Can we address this post check-in?

Your 3rd and 4th concerns aren't Apache user concerns, but can be easily addressed after check-in. I have no objections, but I would prefer we have a culture of small changes/environment specific changes like that happening after checkin. That will increase the rate the of progress on the project in general. If you agree, would you be willing to add those changes post check-in?

The 5th concern would definitely make code more complicated for callers, and would involved them batching usually a non-optimal amount of data. This code makes the batching automatic and parallelize the writes, retiring batched data as fast as it can, and prevented the batching of too much data.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Damien Katz (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151639#comment-13151639 ] 

Damien Katz commented on COUCHDB-1342:
--------------------------------------

Robert, from the next paragraph:

"To prevent vetos from being used capriciously, they must be accompanied by a technical justification showing why the change is bad (opens a security exposure, negatively affects performance, etc. ). A veto without a justification is invalid and has no weight."

No is suggesting one person wants this change and can force it in. I am saying that we don't refuse progress simply because someone finds it more complicated, especially when it offers what users want.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Filipe Manana (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190665#comment-13190665 ] 

Filipe Manana commented on COUCHDB-1342:
----------------------------------------

I really appreciate Randall's pushing for collaboration rather than expect a single person to do it all or let this fall into oblivion.

I will do some updates to the branch soon as well repost some performance benchmarks, and instructions how to reproduce them as usual, in comparison to latest master (the results posted months ago don't account for many improvements that came after such as COUCHDB-1334).
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Paul Joseph Davis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151729#comment-13151729 ] 

Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------

"has to flush to service a write call" should obviously have been "has to flush to service a read call".
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Jan Lehnardt (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213770#comment-13213770 ] 

Jan Lehnardt commented on COUCHDB-1342:
---------------------------------------

I tried looking into this but make check fails with

./test/etap/242-replication-many-leaves.t .. 9/? Bailout called.  Further testing stopped:  Timeout waiting for replication to finish

and it seems master got ahead again.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Paul Joseph Davis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151430#comment-13151430 ] 

Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------


> These are deliberate. Currently we are maintaining an Fd/couch_file pair for reads and writes in a bit of a clumsy way in the #btree.fd record and the couch_file module and do a switcheroo for writes vs. reads. Now all of that is moved into couch_file and couch_db doesn't have to concern itself with the details of efficient file access.

Oh I hadn't noticed that weirdness before, but I see it now.

> Good point, API-wise this is a bit leaky and we should definitely look into making this better, but I don't think this blocks the bulk of the improvements that this patch introduces. I'm happy to open a new ticket about this. Good starting points are calling flush() inside couch_file 1) after writing a header and 2) if a read fails with {error, eof} (fwiw, Damien tried the latter before but removed it again, the details elude me). 

I disagree fairly strongly on this point. couch_file is about as core of an API as it gets. couch_file:flush/1 shouldn't even exist as far as I'm concerned. This isn't a mediocre API that should be improved later, its a bad API that shouldn't go in at all. Its 100% foot-gun.

> The difference is that delayed_write does buffering whereas we just want to keep writing to the file all the time. delayed_write would not write until the buffer is full or the timeout is hit. In addition, we wouldn't be able to tell for delayed_commits=false writes when data hit the file reliably. 

Oh right, I remember that discussion about clearing the buffer being an issue now. Good call.

> The whole point of this patch is to make couch_file smarter and move the reader/writer abstraction further down the stack and reap the associated performance benefits. Unless we have an alternate patch, we can't really compare this. 

Code Review 2.0:

  Q: "This looks complicated, can we think about trying to simplify some of the organization?"
  A: "No. Write it yourself if you care that much."

I don't need another patch to know that this one is complicated and could be less complicated.

> This doesn't change the error scenarios. We already rely on monitoring to tell the request process a couch_file write didn't work

You're pointing at code that's about three abstractions away from couch_file's writer loop. Suffice it to say, erlang:monitor/2 on a random process in the write path does little to assuage my fears that we're entering dangerous territory relying on our own file writing buffers.

                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Damien Katz (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151536#comment-13151536 ] 

Damien Katz commented on COUCHDB-1342:
--------------------------------------

This is about write performance of database internals, which this improves dramatically. If database devs can't deal with a flush call after some writes, then they probably shouldn't be working on databases. Unless someone has have issues with the correctness (note, writes also get buffered in other places as well, fsync is the only thing that ensures they hit disk), this is what USERs want. Better overall performance. This improves view index creation and write through greatly.

Databases are complicated, making them perform well is hard. We won't be holding up this patch because it seems more complicated than before. It's about what users want.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Jan Lehnardt (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151401#comment-13151401 ] 

Jan Lehnardt commented on COUCHDB-1342:
---------------------------------------

Good review Paul, thanks! :)

Let me quote and reply inline to each of your points:

> This patch looks like its a mishmash of a couple patches that are applied slightly differently. Not sure if that's a branch thing or what. But for instance, there's a lot of id_btree(Db) -> Db#db.couch type changes.

These are deliberate. Currently we are maintaining an Fd/couch_file pair for reads and writes in a bit of a clumsy way in the #btree.fd record and the couch_file module and do a switcheroo for writes vs. reads. Now all of that is moved into couch_file and couch_db doesn't have to concern itself with the details of efficient file access.

> There's also a switch between couch_file:sync(Filepath) and couch_file:sync(Fd). I'm not sure if that's on purpose or not because we switched to Filepath on purpose. I see it as being possible but I don't see a reference to it.

We used to pass in the filepath to allow the fsync() to be async so it wouldn't block any readers or writers. That behaviour is now obsolete as couch_file will now pass the fsync() request to it's writer child process.


> I'm really not a fan of exposing the inner workings of couch_file to every client as can be seen by the massive number of flush calls that this creates. I find it quite likely that we'll eventually end up missing one of these flush calls and causing a nasty bug because the error would look like data hadn't been written. I'd want to see things rewritten so that the usage of couch_file doesn't change. Easiest solution I see would be to automatically flush buffers if a read request comes in for something that's not on disk.

Good point, API-wise this is a bit leaky and we should definitely look into making this better, but I don't think this blocks the bulk of the improvements that this patch introduces. I'm happy to open a new ticket about this. Good starting points are calling flush() inside couch_file 1) after writing a header and 2) if a read fails with {error, eof} (fwiw, Damien tried the latter before but removed it again, the details elude me).

> I'm pretty sure we talked about this at one point, but can someone describe the differences between this and Erlang's delayed_write mode for files?

The difference is that delayed_write does buffering whereas we just want to keep writing to the file all the time. delayed_write would not write until the buffer is full or the timeout is hit. In addition, we wouldn't be able to tell for delayed_commits=false writes when data hit the file reliably.

> couch_file becomes much more complicated with this patch. I'd prefer to see it simplified if at all possible. I'm not entirely certain how it'd look but I might start by making a couch_file_reader and couch_file_writer gen_servers instead of having bare processes in couch_file.

The whole point of this patch is to make couch_file smarter and move the reader/writer abstraction further down the stack and reap the associated performance benefits. Unless we have an alternate patch, we can't really compare this.

> I have to say that this patch scares me a bit. If we make the switch to something like this then we're opening up a whole new type of failure scenario where file "writes" appear to succeed but then end up failing after the caller has moved on. While there are certainly cases where I can see this being a fine tradeoff (view updaters, compactors, etc) it worries me a bit for the main database file. The fact is that I can't (yet) reasonably consider all of the possible new failure modes and convince myself that things are correct. I've already seen some super weird reactions to things like running out of disk space, it seems like not knowing about such an error until you've "written" a megabyte of data seems awkward.

This doesn't change the error scenarios. We already rely on monitoring to tell the request process a couch_file write didn't work: 

  https://github.com/fdmanana/couchdb/blob/master/src/couchdb/couch_db.erl#L819 ff. (825 in particular)
  
   
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Adam Kocoloski (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152565#comment-13152565 ] 

Adam Kocoloski commented on COUCHDB-1342:
-----------------------------------------

The gen_server2 hack is designed to make gen_server:call more efficient inside the server loop.  It's also ~obsolete in R14+.  Not really an issue for writer_loop and reader_loop here because they've got only one receive statement apiece with a clause for every message they expect to receive.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Filipe Manana (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190821#comment-13190821 ] 

Filipe Manana commented on COUCHDB-1342:
----------------------------------------

Made a few more tests comparing latest master against COUCHDB-1342 branch (after merging master into it).

Database writes
=============

* 1Kb documents (https://github.com/fdmanana/basho_bench_couch/blob/master/couch_docs/doc_1kb.json)

http://graphs.mikeal.couchone.com/#/graph/7c13e2bdebfcd17aab424e68f225fe9a

* 2Kb documents (https://github.com/fdmanana/basho_bench_couch/blob/master/couch_docs/doc_2kb.json)

http://graphs.mikeal.couchone.com/#/graph/7c13e2bdebfcd17aab424e68f2261504

* 11Kb documents
(https://github.com/fdmanana/basho_bench_couch/blob/master/couch_docs/doc_11kb.json)

http://graphs.mikeal.couchone.com/#/graph/7c13e2bdebfcd17aab424e68f2262e98


View indexer
==========

Test database:  http://fdmanana.iriscouch.com/_utils/many_docs

* master

$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/many_docs/_design/test/_view/test1
{"rows":[
{"key":null,"value":20000000}
]}

real	29m42.041s
user	0m0.016s
sys	0m0.036s


* master + patch (branch COUCHDB-1342)

$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/many_docs/_design/test/_view/test1
{"rows":[
{"key":null,"value":20000000}
]}

real	26m13.112s
user	0m0.008s
sys	0m0.036s

Before COUCHDB-1334, and possibly the refactored indexer as well, the difference used to be more significant (like in the results presented in the dev mail http://s.apache.org/TVu).
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Paul Joseph Davis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151723#comment-13151723 ] 

Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------

@Damien

That's an awful lot of disappointment packed into a single comment. First, resorting to an ad hominem attack to insinuate that I'm not intelligent enough to work on databases is quite disconcerting. Secondly, its an egregious fallacy to suggest that because a patch appears to be technically correct that it should be committed. Thirdly, declaring what is and isn't a valid reason to hold up a patch is not how the ASF works.

And now back to the regularly scheduled technical discussion.

First, couch_file:flush/1. Unless I'm missing something extremely subtle here, it's existence is so that clients can read their own writes. Yet the couch_file gen_server has all the knowledge it needs to know if it has to flush to service a write call. If the requested read position is between #file.eof and #file.eof + #file.queued_write_bytes, then it can call flush and move on with its life. Not only does this mean that clients don't have remember to call flush, but it removes unnecessary message passing that every unconditional call to flush would generate.

Second, this is doubling the number of file descriptors required for anything that isn't a database. On the first production machine I checked that's an increase of 75% from 40K to 70K file descriptors. That's a fairly serious change that ought to be discussed. At the very least it ought to be mentioned somewhere so ops teams know to expect it.

Third, this is spawning long lived processes that aren't looping on exported functions. After two code upgrades this would crash every couch_file in the VM simultaneously.

Fourth, as I've mentioned numerous times before, the proper way to synchronously start a process that might fail to initialize is to use proc_lib:start_link and proc_lib:init_ack.

Fifth, has anyone considered using a write buffer outside of the couch_file API that would allow clients more precise control. For instance, thinking briefly on the view updater, you could buffer writes for a single add_remove call. This also leads to the possibility that mostly read views aren't needlessly holding open a writer fd  for no reason.


                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Paul Joseph Davis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152525#comment-13152525 ] 

Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------

@Damien

> Robert, the inflight batching of writes is limited to 1 meg per database.

No, its up to 1 meg per file that's being written to. It's also important to note that the buffering isn't actually a passive thing like is generally done. The "buffer" is actually just the mailbox for the writer_loop process. The queued_bytes_len or whatever is just counting how much data has been submitted to to the process that hasn't been acked to prevent blowing the top of that mailbox (which is quite reasonable).

The writer isn't really buffering anything itself, its just leaning on Erlang's message passing internals to be that buffer (which is quite reasonable). Then all the writer_loop does is accept messages and respond to the parent couch_file gen_server. If it happens to find multiple write messages in the mailbox consecutively at the same time, it'll write those in a single call to file:write/2.

I would not be at all surprised if it were shown that the bulk of the improvement from this patch is due to this specific part of the patch. For the curious, the zip_server test at [1] tests something quite similar to this setup.

[1] https://github.com/davisp/zip_server
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: [jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by Jan Lehnardt <ja...@apache.org>.

Moving this out of JIRA.

> Thirdly, every time someone asks, "Can it wait till it's on trunk?", all I hear is, "Can I ignore what you just said and commit this anyway?" If I point at something and say that its broken its because I'm expecting the patch to change or an explanation of why I'm wrong. And I'm fine being wrong. It happens quite often.

Paul, thanks for explaining how this ends up on your end (communication is a two way horse, yay). I can only speak for myself, but I never ever intend to ignore the feedback. Hence my latest reply in this regard to open blocking issues with the comments for the next release off the particular branch a patch goes into. I see where your frustration is coming from now, it wasn't clear before.

> But this pattern of submitting patches and asking for all concerns to be addressed after the patch is in trunk is starting to get a bit annoying.

You are saying "all concerns" and I take it as a figure of speech, but the reverse, addressing all concerns (ever so vague and little) is equally frustrating.

The solution is clear to me, we need to try and strike a balance. I don't think anyone argues that having a patch mature in JRIA for a few days (or weeks) is bad and equally, once it hits a sweet spot, that it can continue to simmer in a release branch or master and have further kinks worked out as they show up in broader use of that branch.

Knowing when to make the cut is the tricky part of course so that we don't have patches rotting on the one side and master erode into a mess on the other.

I don't think any sort of policy or process helps here, but we should collectively be aware of the trade-offs we are making and have a natural urgency to get good patches into master and stemming from that an urgency to help mature said patches in JIRA before they do hit master (all modulo this is a volunteer effort, of course :).

I feel I am starting to state the obvious here, so I'll leave you with this.

Cheers
Jan
--

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Paul Joseph Davis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152507#comment-13152507 ] 

Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------


@Damien

> However, there is another optimization coming where a raw erlang FD is used in a calling process to avoid messaging overhead (another big performance improvement in certain long operations), which will maybe make it necessary again. 

I'm not sure what you mean here. Something along the lines of a file:open call in the couch_db_updater process (and couch_mrview_updater)? If so that's an interesting idea. Seems like we could make couch_file handle that quite easily along the lines of how file handles #prim_file vs #file (if I recall those record names correctly). This could also solve some of the fd duplication if we only need an extra fd for views that are updating.

> The concern with doubling the # non-db file descriptors is a real one. How big of a concern of this?

The thing is, I'm not certain how it'll behave. Hence why it concerns me. Is it a matter of just making sure that ulimit is set sufficiently high? How high is sufficient? If I'm running in production, and I upgrade to a version of CouchDB that has this patch, can I at least guestimate how configs might need to change? Maybe I'm being overly paranoid and its not an issue. I dunno. Hence why it concerns me.

> The 5th concern would definitely make code more complicated for callers

I agree. I should've prefaced that bit with a "I wonder if in the future there's a follow up direction we can go". It only occurred as I was finishing that comment so I figured I'd write it down.

> Your 3rd and 4th concerns aren't Apache user concerns, but can be easily addressed after check-in.

I have no idea what you mean by "Apache user concerns" here. If you're referring to "no one cares how the sausage is made so long as its faster" then I'm going to have to disagree. Strongly. Saying that databases are complicated so we shouldn't concern ourselves with code quality is just going to leave us with a source tree in an even worse state than it already is.

And I'd like to address this argument about progress and the desires of users. This patch was submitted to JIRA yesterday. My initial review was up within 3.5h. This patch changes how the file abstraction works. In a database. As far as I'm concerned development on this started yesterday at 13:28 when Jan uploaded the patch to JIRA. If you wanted things to be moving more quickly at this stage you should have been developing this on a branch in git and asking for input from the community.

Secondly, while I understand that you're highly motivated to help users by improving performance, what does that have to do with the conversation about the technical merits of this patch? This sense of urgency that progress must be made so lets address the issues I brought up after its in trunk is not a convincing argument. You could address my comments by spending thirty minutes in an editor and resubmitting the patch. Instead you're asking me to clean this up for you after its committed.

Thirdly, every time someone asks, "Can it wait till it's on trunk?", all I hear is, "Can I ignore what you just said and commit this anyway?" If I point at something and say that its broken its because I'm expecting the patch to change or an explanation of why I'm wrong. And I'm fine being wrong. It happens quite often. But this pattern of submitting patches and asking for all concerns to be addressed after the patch is in trunk is starting to get a bit annoying. If we want to adjust our policies around CTR vs RTC for larger patches, that's fine. Perhaps adding an edge branch in git that will accept all our bigger somewhat scary commits would be beneficial. If we start doing automated package building then users could even pull bleeding edge code to test. But I digress.

                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Robert Newson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152022#comment-13152022 ] 

Robert Newson commented on COUCHDB-1342:
----------------------------------------

@Damien, thanks for the clarification. It's possible to read what you originally wrote as an intention to dismiss concerns over introducing further complexity as long as it improved performance. Since Paul has now explicitly described several technical problems with the patch I'm sure we can all move on to addressing them. I'll only add that I had read the entire page on voting and didn't feel that the section you highlighted applied in this case, which is why I didn't mention it myself.

I would like to know why couch_file now needs two file descriptors to provide asynchronous writes, though. If it's integral, I'd appreciate knowing why, and whether there are alternatives. If not, a separate ticket seems appropriate. The only thing I can think of is the inability to usefully pass a handle to a file opened with [raw] between processes. In any case, doubling the consumption of precious server resources should be called out explicitly, rather than incidentally, in my opinion.

Am I also right in thinking that the improved write performance entails increased memory usage (and therefore increased GC too)? What's the impact of that on very busy servers?

                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Posted by "Paul Joseph Davis (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152685#comment-13152685 ] 

Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------

@Damien

For #3 the issue is that it's broken. Granted there's lots in CouchDB that breaks Erlang idioms, but this is basic broken window theory. We *know* this is broken and its easily fixable. Why would we check in something that we know is broken?

For #4 the patch is reimplementing parts of the standard library. In Erlang there's a Right Way™ to start a process synchronously that might fail to initialize. It reduces the code that we have to write and maintain and its considerably more tested than the custom implementation we have here. Trying to minimize this by calling it a "style issue" is not conducive to the technical discourse.

You're also forgetting #2 which you agreed can be done. And if done it not only reduces the size of this patch in terms of total lines, but reduces the number of files that are touched, as well as removes the need for people to scour the source for places that we might need to add calls to couch_file:flush/1 and also prevents the need for devs to remember to add it in the future. And it also makes the flushing more optimal in terms of decreased message passing latency.

I'm sorry, but the culture issue here is that you continue to be subtly derogatory by deriding technical comments as a "pet concerns". This is especially disheartening when I've spent so much time and effort on trying to help by reviewing this patch and responding to this conversation. The sad part is that most of the things I've voiced an opinion on are easy to fix or abstract enough that a brief discussion on the issue with a list of things to watch out for would have been enough to assuage my concerns.

This patch is less than 48 hours old and it changes a core aspect of how we write data to disk in a database that prides itself on not losing data. What are you honestly expecting for a time table here? It hasn't been two days and you've already referenced speed of development and project progress multiple times in response to technical points.

You ask for my help yet you remain dismissive of anything I'm offering. I'm sorry you're frustrated but this is how code review works. I just recently finished maintaing a 7K line refactor for over a month while people reviewed it and even made changes to code that I was refactoring which I had to then merge (behavior, not code) into my branch. I understand how painful review can be, but that's how this system works and I, for one, thinks it makes us stronger.
                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira