Posted to dev@couchdb.apache.org by "Filipe Manana (Resolved) (JIRA)" <ji...@apache.org> on 2011/11/16 13:02:55 UTC

[jira] [Resolved] (COUCHDB-1334) Indexer speedup (for non-native view servers)

     [ https://issues.apache.org/jira/browse/COUCHDB-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Filipe Manana resolved COUCHDB-1334.
------------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 1.2)
                   1.3

Latest patch applied against master.
                
> Indexer speedup (for non-native view servers)
> ---------------------------------------------
>
>                 Key: COUCHDB-1334
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1334
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core, JavaScript View Server, View Server Support
>            Reporter: Filipe Manana
>            Assignee: Filipe Manana
>             Fix For: 1.3
>
>         Attachments: 0001-More-efficient-view-updater-writes.patch, 0002-More-efficient-communication-with-the-view-server.patch, master-0002-More-efficient-communication-with-the-view-server.patch, master-2-0002-More-efficient-communication-with-the-view-server.patch, master-3-0002-More-efficient-communication-with-the-view-server.patch, master-4-0002-More-efficient-communication-with-the-view-server.patch
>
>
> The following 2 patches significantly improve view index generation/update time and reduce CPU consumption.
> The first patch makes the view updater's batching more efficient by ensuring that each btree bulk insertion adds/removes a minimum of N (=100) key/value pairs. This also slows the growth of the index file with old data (basically old btree nodes). This behaviour is already implemented in master/trunk in the new indexer (by Paul Davis).
> The second patch maximizes throughput with an external view server (such as couchjs). Basically, it makes the pipe (Erlang port) communication between the Erlang VM (basically couch_os_process) and the view server more efficient, since the 2 sides spend less time blocked on reading from the pipe.
> Here follow some benchmarks.
> test database at  http://fdmanana.iriscouch.com/test_db  (1 million documents)
> branch 1.2.x
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real	2m45.097s
> user	0m0.006s
> sys	0m0.007s
> view file size: 333Mb
> CPU usage:
> $ sar 1 60
> 22:27:20  %usr  %nice   %sys   %idle
> 22:27:21   38      0     12     50
> (....)
> 22:28:21   39      0     13     49
> Average:     39      0     13     47   
> branch 1.2.x + batch patch (first patch)
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real	2m12.736s
> user	0m0.006s
> sys	0m0.005s
> view file size 72Mb
> branch 1.2.x + batch patch + os_process patch
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real	1m9.330s
> user	0m0.006s
> sys	0m0.004s
> view file size:  72Mb
> CPU usage:
> $ sar 1 60
> 22:22:55  %usr  %nice   %sys   %idle
> 22:23:53   22      0      6     72
> (....)
> 22:23:55   22      0      6     72
> Average:     22      0      7     70   
> master/trunk
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real	1m57.296s
> user	0m0.006s
> sys	0m0.005s
> master/trunk + os_process patch
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real	0m53.768s
> user	0m0.006s
> sys	0m0.006s
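
The wall-clock speedups implied by the `real` times above can be worked out with a short script. This is a minimal Python sketch that uses only the figures reported in the benchmarks; the names `to_seconds`, `timings` and `speedup` are illustrative and not part of the original report.

```python
def to_seconds(real: str) -> float:
    """Convert a `time` value such as '2m45.097s' to seconds."""
    minutes, seconds = real.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# `real` times copied from the benchmark runs quoted above.
timings = {
    "1.2.x": "2m45.097s",
    "1.2.x + batch patch": "2m12.736s",
    "1.2.x + batch patch + os_process patch": "1m9.330s",
    "master/trunk": "1m57.296s",
    "master/trunk + os_process patch": "0m53.768s",
}


def speedup(baseline: str, patched: str) -> float:
    """Return how many times faster `patched` is than `baseline`."""
    return to_seconds(timings[baseline]) / to_seconds(timings[patched])


if __name__ == "__main__":
    pairs = [
        ("1.2.x", "1.2.x + batch patch"),
        ("1.2.x", "1.2.x + batch patch + os_process patch"),
        ("master/trunk", "master/trunk + os_process patch"),
    ]
    for baseline, patched in pairs:
        print(f"{patched}: {speedup(baseline, patched):.2f}x vs {baseline}")
```

On these numbers, both patches together make the 1.2.x build roughly 2.4 times faster, and the os_process patch alone makes master/trunk roughly 2.2 times faster.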

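The batching idea behind the first patch (write to the btree only once at least N = 100 key/value pairs are pending) can be illustrated roughly as follows. This is a hypothetical Python sketch, not CouchDB's Erlang code: `BatchingWriter`, `bulk_insert` and `run_demo` are invented names, and the real view updater may count and flush differently.

```python
from typing import Callable, List, Tuple

KV = Tuple[object, object]


class BatchingWriter:
    """Buffer key/value pairs and hand them to `bulk_insert` in batches.

    `bulk_insert` is a hypothetical stand-in for a btree bulk insertion.
    """

    def __init__(self, bulk_insert: Callable[[List[KV]], None],
                 min_batch: int = 100) -> None:
        self.bulk_insert = bulk_insert
        self.min_batch = min_batch
        self.pending: List[KV] = []

    def add(self, kvs: List[KV]) -> None:
        """Queue pairs; flush only once at least `min_batch` are pending."""
        self.pending.extend(kvs)
        if len(self.pending) >= self.min_batch:
            self.flush()

    def flush(self) -> None:
        """Write whatever is pending, e.g. at the end of an index update."""
        if self.pending:
            self.bulk_insert(self.pending)
            self.pending = []


def run_demo(n_docs: int = 250, min_batch: int = 100) -> List[int]:
    """Feed one pair per document and return the size of each bulk write."""
    sizes: List[int] = []
    writer = BatchingWriter(lambda batch: sizes.append(len(batch)), min_batch)
    for i in range(n_docs):
        writer.add([(i, None)])
    writer.flush()  # final partial batch
    return sizes
```

With one pair per document, `run_demo()` performs three bulk writes for 250 documents instead of 250 single-pair writes. Fewer, larger writes mean fewer superseded btree nodes left behind in an append-only file, which is consistent with the smaller view file reported above.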

        

Re: [jira] [Resolved] (COUCHDB-1334) Indexer speedup (for non-native view servers)

Posted by Dave Cottlehuber <da...@muse.net.nz>.
On 16 November 2011 13:02, Filipe Manana (Resolved) (JIRA)
<ji...@apache.org> wrote:
>
>     [ https://issues.apache.org/jira/browse/COUCHDB-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Filipe Manana resolved COUCHDB-1334.
> ------------------------------------
>
>       Resolution: Fixed
>    Fix Version/s:     (was: 1.2)
>                   1.3
>
> Latest patch applied against master.
>
>> Indexer speedup (for non-native view servers)
>> ---------------------------------------------
>>
>>                 Key: COUCHDB-1334
>>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1334
[snip]
>> The following 2 patches significantly improve view index generation/update time and reduce CPU consumption.
>> The first patch makes the view updater's batching more efficient by ensuring that each btree bulk insertion adds/removes a minimum of N (=100) key/value pairs. This also slows the growth of the index file with old data (basically old btree nodes). This behaviour is already implemented in master/trunk in the new indexer (by Paul Davis).
>> The second patch maximizes throughput with an external view server (such as couchjs). Basically, it makes the pipe (Erlang port) communication between the Erlang VM (basically couch_os_process) and the view server more efficient, since the 2 sides spend less time blocked on reading from the pipe.

Hi Filipe,

Just a heads up, but I am consistently having the Erlang VM hang when
doing a master build today. Clearly, my issue may not be related to
this patch but I can't see anything else after a quick look in git
history that stands out.

As this uses the same SpiderMonkey, Erlang R14B03 + patches, etc. as
all my other builds, I am assuming this is a bug triggered by CouchDB
in the VM somewhere.

Hopefully tomorrow I'll both isolate the commit where things go awry,
and also find a way of getting some logging or a test case -- it would
be great to have a fix for this go into OTP/R15.

If you have time, or some advice to share, the binary build is at
https://www.dropbox.com/s/jeifcxpbtpo78ak/Snapshots/20111122

Thanks
Dave