Posted to user@couchdb.apache.org by Adrian Duong <ad...@gmail.com> on 2011/12/09 22:57:09 UTC

Moderate size database(s) with fresh views hang on view query

Hello everyone,

I have a few moderately sized databases, created via bulk loading from raw
source data, which have fresh (up-to-date) views. When I query these views
(via curl), the request hangs for an indeterminate length of time (on the
order of hours), and then returns the requested data. Once the data has
returned, I can query the views without delay. The problem is
re-encountered some time in the future (possibly after a restart of the
CouchDB server).

During the first hour or so of the wait, I can confirm that
/_active_tasks reports no running tasks. CouchDB is reading from disk, but
not writing to it. No view server instances are running. I have tried the
query with stale=ok and stale=update_after, and I encounter the same
problem. MD5 digests of the respective .couch and .view files for a few
of these "problem databases/views" show that the files have (very likely)
not changed between the start and the end of the query.
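
The file comparison described above can be sketched roughly as follows.
This is a minimal Python illustration, not the exact commands that were
run; the function names md5_of_file and unchanged are hypothetical, and
the paths to the .couch and .view files would have to be filled in for the
local installation.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file.

    The file is read in chunks so that multi-gigabyte .couch and .view
    files do not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unchanged(before, after):
    """Return the sorted paths whose digest is identical in both snapshots.

    `before` and `after` are dicts mapping a file path to its digest,
    taken before the query starts and after it returns.
    """
    return sorted(path for path in before if before[path] == after.get(path))
```

Taking one snapshot before the query and one after it returns, then
passing both to unchanged(), lists the files that the query did not modify.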

The logging level is set to debug and the GET request is not logged until
after the view data is returned hours later.
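
Because the server logs the request only once it completes, the delay has
to be measured on the client side. A minimal Python sketch of that
measurement follows. It assumes CouchDB listens on the default
localhost:5984 without authentication; the helper names and the database,
design document and view names passed to them are placeholders, not the
real ones.

```python
import json
import time
import urllib.parse
import urllib.request

COUCH = "http://localhost:5984"  # assumption: default local CouchDB address


def view_url(db, ddoc, view, stale=None, limit=1):
    """Build a view query URL; stale may be None, "ok" or "update_after"."""
    params = {"limit": limit}
    if stale is not None:
        params["stale"] = stale
    path = "/{}/_design/{}/_view/{}".format(
        urllib.parse.quote(db, safe=""),
        urllib.parse.quote(ddoc, safe=""),
        urllib.parse.quote(view, safe=""),
    )
    return COUCH + path + "?" + urllib.parse.urlencode(params)


def timed_get(url, timeout=None):
    """Fetch url and return (seconds_elapsed, decoded_json_body)."""
    started = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return time.monotonic() - started, body


def active_tasks():
    """Return the task list reported by /_active_tasks."""
    with urllib.request.urlopen(COUCH + "/_active_tasks") as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling timed_get(view_url("mydb", "main", "by_id", stale="ok")) from one
process while polling active_tasks() from another would reproduce the
observation above: a long-running request with an empty task list.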

Each database has a single design document containing upwards of 300 views
consisting of only a simple map function. It is possible for me to split
the design document, but, as I understand, this number of views in a single
document should not be the problem. Each document maps to zero or one row
in the view.

The databases have between 30 000 and 1 400 000 documents and range
in size from 100 MB to 6 GB. The .view file for each database ranges in
size from 2 GB to 50 GB. Some (potentially none or all) of these have
undergone compaction.

The views are written in Perl and use a Perl view server (see
CouchDB::View::Server on CPAN). I also believe this should not be any
different from using JS design documents and the included JS view server.

The databases were created through bulkloading the documents. After
loading, the views are indexed by a query to one of the views in the single
design document. Immediately after a database has been loaded and indexed,
querying the view appears to work fine.
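
For context, the loading step can be sketched in Python as below. This
assumes the documents are sent in batches to CouchDB's _bulk_docs
endpoint, which the description above does not confirm; the batch size and
the helper names batches and bulk_payload are illustrative only.

```python
import json


def batches(docs, size):
    """Yield consecutive lists of at most `size` documents."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def bulk_payload(batch):
    """Serialize one batch in the {"docs": [...]} shape used by _bulk_docs."""
    return json.dumps({"docs": batch}).encode("utf-8")
```

Each payload would be sent as a POST to /<database>/_bulk_docs with a
Content-Type of application/json. A single query to any one view then
builds the index for every view in the same design document.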

I would greatly appreciate any insight into this problem. Thank you.

Adrian Duong