Posted to dev@couchdb.apache.org by "Filipe Manana (JIRA)" <ji...@apache.org> on 2011/07/12 18:35:00 UTC

[jira] [Created] (COUCHDB-1218) Better logger performance

Better logger performance
-------------------------

                 Key: COUCHDB-1218
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1218
             Project: CouchDB
          Issue Type: Improvement
            Reporter: Filipe Manana
            Assignee: Filipe Manana
         Attachments: 0001-Better-logger-performance.patch

I experimented with using OTP's disk_log module (available since at least 2001) to manage the log file.
It gave better throughput. Basically, it adopts a strategy similar to the asynchronous couch_file that Damien described in this thread:

http://mail-archives.apache.org/mod_mbox/couchdb-dev/201106.mbox/%3C5C39FB5A-0ACA-4FF9-BD90-2EBECF271850@apache.org%3E

Here's a benchmark with relaximation, 50 writers, 100 readers, documents of 1Kb, delayed_commits set to false and 'info' log level (default):

http://graphs.mikeal.couchone.com/#/graph/9e19f6d9eeb318c70cabcf67bc013c7f

Reads got better throughput (see the bottom graph, where it is easier to see).

The patch (also attached here), which has a descriptive comment, is at:

https://github.com/fdmanana/couchdb/compare/logger_perf.patch



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1218) Better logger performance

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064518#comment-13064518 ] 

Filipe Manana commented on COUCHDB-1218:
----------------------------------------

I previously tried something hand-crafted, along the lines of what the async couch_file does, and the results were not as good as with disk_log:
http://friendpaste.com/X9tX1LCb9Mn6SiTueFlXY

Unless there's a risk of high CPU or memory usage (or a high chance of losing messages), I don't see a problem in using disk_log - since it's part of standard OTP, it's available in every installation, plus we get the benefits of wide exposure (testing, bug fixes, enhancements).


        

[jira] [Commented] (COUCHDB-1218) Better logger performance

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064179#comment-13064179 ] 

Paul Joseph Davis commented on COUCHDB-1218:
--------------------------------------------

Hadn't heard of the disk_log module before but it looks interesting. There's a noticeable amount of text in the prefix that references things like fixing up logs that weren't shut down properly. Is this something we have to worry about? I'm fine if a kill -9 ends up garbling some log output, but if it breaks server boot, that'd be bad.


        

[jira] [Commented] (COUCHDB-1218) Better logger performance

Posted by "Robert Newson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064188#comment-13064188 ] 

Robert Newson commented on COUCHDB-1218:
----------------------------------------

disk_log module seems a bit weird, I wonder if it's intended for this? Seems more like a binary logger sort of thing.

The patch still goes through the gen_event server, though only to print to the screen. It seems a shame that we can't remove it entirely.

I'm equivocal. Logging can be improved for sure but I'm not sure this is the way. Something pluggable might be better, for example, I'd rather send the log statements over UDP to a syslog daemon and let it handle the writing and file rotation.
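
For illustration only, the syslog-over-UDP idea might look like the sketch below. It is written in Python rather than Erlang to keep it short, and it is not part of the patch. The facility (local0), the "couchdb" tag, and the host and port defaults are all assumptions, and the line layout only loosely follows the BSD syslog convention, where the priority value is facility * 8 + severity.

```python
import socket
import time

# Assumed values: facility 16 is "local0"; severities are standard syslog codes.
FACILITY_LOCAL0 = 16
SEVERITY = {"error": 3, "info": 6, "debug": 7}


def format_syslog(level, message, tag="couchdb", hostname="localhost", now=None):
    """Build a BSD-syslog-style line: <PRI>TIMESTAMP HOST TAG: MSG."""
    pri = FACILITY_LOCAL0 * 8 + SEVERITY[level]
    stamp = time.strftime("%b %d %H:%M:%S", time.localtime(now))
    return "<%d>%s %s %s: %s" % (pri, stamp, hostname, tag, message)


def send_syslog(level, message, host="127.0.0.1", port=514):
    """Fire-and-forget: one datagram per log line, no file handling in the app."""
    payload = format_syslog(level, message).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

With this shape the application never opens a log file; writing and rotation are left to whatever daemon listens on the other end. UDP delivery is not guaranteed, so messages can be dropped silently.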


        

[jira] [Updated] (COUCHDB-1218) Better logger performance

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Filipe Manana updated COUCHDB-1218:
-----------------------------------

    Attachment: 0001-Better-logger-performance.patch


        

[jira] [Resolved] (COUCHDB-1218) Better logger performance

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Filipe Manana resolved COUCHDB-1218.
------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.2

Applied to trunk


        

[jira] [Commented] (COUCHDB-1218) Better logger performance

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064181#comment-13064181 ] 

Paul Joseph Davis commented on COUCHDB-1218:
--------------------------------------------

Also, browsing those benchmark graphs, I notice that response times aren't changing, but the number of requests per second is increasing. Have you looked at memory usage patterns during such a test? I'd be curious to see if the disk_log process is just being outcompeted by .couch file I/O and then buffers the log messages in RAM. Not that I have any idea how bad that'd be in terms of size.


        

[jira] [Commented] (COUCHDB-1218) Better logger performance

Posted by "Robert Newson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064500#comment-13064500 ] 

Robert Newson commented on COUCHDB-1218:
----------------------------------------

Ok, I get that. I wonder if we can't make a simple module that does those things (the async queue, flush for errors, etc) without pulling in disk_log? disk_log demonstrates that those things improve logging performance, which is a good thing. Doing likewise in couch_log would give us the boost and keep the simplicity.
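
As a rough sketch of the "simple module" idea (in Python for brevity, not Erlang, and not taken from any patch), the batching part could be as small as the class below. It reads "flush for errors" as "write error-level messages out immediately", which is an assumption; the class name, the `max_pending` parameter, and the line format are hypothetical.

```python
import io


class BufferedLog:
    """Batch log lines in memory and write them out in one call."""

    def __init__(self, sink, max_pending=64):
        self.sink = sink                # any object with write() and flush()
        self.max_pending = max_pending  # assumed batch size, tune as needed
        self.pending = []

    def log(self, level, message):
        self.pending.append("[%s] %s\n" % (level, message))
        # Errors are flushed right away so they are not lost in a crash;
        # lower levels wait until the batch is full.
        if level == "error" or len(self.pending) >= self.max_pending:
            self.flush()

    def flush(self):
        if self.pending:
            self.sink.write("".join(self.pending))
            self.sink.flush()
            self.pending = []


# Usage: the info line stays in memory until the error line forces a flush.
sink = io.StringIO()
log = BufferedLog(sink, max_pending=3)
log.log("info", "request served")
buffered_only = sink.getvalue()
log.log("error", "disk full")
```

The trade-off is the usual one: fewer, larger writes in exchange for a window in which buffered non-error lines can be lost.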

I mention syslog in passing because we generally replace/disable couch_log anyway (sending output to /dev/null and using runit to capture console output to a managed log file, or replacing it with a custom logger that sends to syslog). A program that manages its own log files, rather than integrating with the OS's logging system is generally annoying to manage. I'd keep this one simple.



        

[jira] [Reopened] (COUCHDB-1218) Better logger performance

Posted by "Robert Newson (Reopened) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Newson reopened COUCHDB-1218:
------------------------------------


The current fix was reverted due to the inability to truncate the file during external log rotation. The performance improvements are still valuable, so we should revisit this for 1.3 (and possibly backport to 1.2.1 if there is one).
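
The comment above does not spell out the mechanism. One common way external truncation goes wrong, shown here as a Python sketch on a POSIX system and not as a statement about disk_log internals, is a writer that keeps its own file offset: after an external tool truncates the file, the next write lands at the old offset and leaves a hole. A writer in append mode does not have this problem. The function name and the sample log lines are made up for the example.

```python
import os
import tempfile


def write_after_truncate(mode):
    """Write, truncate the file from outside, write again; return the file bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, mode) as f:      # "wb" tracks an offset, "ab" always appends
        f.write(b"old entry\n")
        f.flush()
        os.truncate(path, 0)         # stands in for copytruncate-style rotation
        f.write(b"new entry\n")
        f.flush()
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data


without_append = write_after_truncate("wb")
with_append = write_after_truncate("ab")
```

On Linux, `with_append` holds only the new entry, while `without_append` starts with a run of NUL bytes as long as the truncated data.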
                

        

[jira] [Updated] (COUCHDB-1218) Better logger performance

Posted by "Robert Newson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Newson updated COUCHDB-1218:
-----------------------------------

    Fix Version/s:     (was: 1.2)
                   1.3
                   1.2.1
    

        

[jira] [Commented] (COUCHDB-1218) Better logger performance

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064495#comment-13064495 ] 

Filipe Manana commented on COUCHDB-1218:
----------------------------------------

Yep, disk_log was made for the purpose of logging to files. It has many features, such as log rotation. It can log terms or raw data (I'm using the latter), it can "repair" log files, etc. I'm using it in the simplest way possible, to achieve exactly what couch_log currently does.

I haven't seen an increase in CPU or memory usage (via dstat and htop) compared to current trunk.
The async API basically puts the messages into a queue, and a disk_log worker constantly dequeues from that queue and writes to the file.
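
The queue-plus-worker pattern described above can be sketched as follows. This is a Python illustration of the general idea only, not disk_log's implementation and not the patch; `AsyncLog`, the `None` stop sentinel, and the temporary file path are all hypothetical.

```python
import os
import queue
import tempfile
import threading


class AsyncLog:
    def __init__(self, path):
        self.queue = queue.Queue()
        self.file = open(path, "a", encoding="utf-8")
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def log(self, message):
        # Callers only enqueue; they never wait for the disk.
        self.queue.put(message)

    def _drain(self):
        # The single worker dequeues messages and writes them in order.
        while True:
            message = self.queue.get()
            if message is None:      # sentinel: stop the worker
                break
            self.file.write(message + "\n")
        self.file.flush()

    def close(self):
        self.queue.put(None)
        self.worker.join()
        self.file.close()


# Usage: enqueue a few lines, shut down, and read back what was written.
path = os.path.join(tempfile.mkdtemp(), "couch.log")
log = AsyncLog(path)
for i in range(3):
    log.log("request %d" % i)
log.close()
with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

The queue in this sketch is unbounded, so if the worker falls behind the writers, pending messages accumulate in memory.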

Something pluggable seems like a completely different issue, and it's not what I'm trying to address here. Plus, depending on syslog, or something else external, doesn't seem like a good default to me - how would it work on Windows, or mobile?

That said, I like the simplicity of our logger - nothing fancy, small code and plain text files.
