Posted to dev@couchdb.apache.org by Nils Breunese <N....@vpro.nl> on 2010/06/01 16:10:40 UTC

Performance issues with external process

Hello all,

Last weekend we ran a mashup site [0] for the Dutch Pinkpop music
festival. A backend harvesting process stored all data in CouchDB and we
used couchdb-lucene for full-text indexing. (There are a lot of other
moving parts; not everything runs on CouchDB.)

All individual parts of our setup performed very well, but we still had
serious performance problems. These seem to trace back to the fact that
CouchDB and external processes communicate via stdin/stdout (a pipe),
which AFAIK is a channel that handles only one request at a time.
Queries to CouchDB were fast and query times on couchdb-lucene were
fast, but the end result was slow, because all queries to couchdb-lucene
had to be serialized and pass through the pipe to couchdb-lucene's
handler script.
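
To illustrate the bottleneck, here is a minimal Python sketch of a
line-oriented handler loop of the kind an external process runs: it
reads one JSON request per line, answers it, and only then reads the
next one. This is not couchdb-lucene's actual handler script. The names
handle, serve, run_demo and BACKEND_DELAY are hypothetical, the sleep
stands in for real search work, and the response shape is only
illustrative.

```python
import io
import json
import time

# Hypothetical per-query cost of the backend, in seconds (an assumption).
BACKEND_DELAY = 0.05


def handle(request):
    """Stand-in for the real search; sleeps to simulate backend work."""
    time.sleep(BACKEND_DELAY)
    # Illustrative response shape, echoing the query parameters back.
    return {"code": 200, "json": {"echo": request.get("query", {})}}


def serve(infile, outfile):
    """Read one JSON request per line, answer it, then read the next."""
    for line in infile:
        line = line.strip()
        if not line:
            continue
        response = handle(json.loads(line))
        outfile.write(json.dumps(response) + "\n")
        outfile.flush()


def run_demo(n_requests=4):
    """Push n_requests through a single loop and time the whole batch."""
    requests = "".join(
        json.dumps({"query": {"q": f"term{i}"}}) + "\n"
        for i in range(n_requests)
    )
    out = io.StringIO()
    start = time.monotonic()
    serve(io.StringIO(requests), out)
    elapsed = time.monotonic() - start
    return out.getvalue().splitlines(), elapsed


if __name__ == "__main__":
    lines, elapsed = run_demo()
    print(f"{len(lines)} responses in {elapsed:.2f}s")
```

Because the loop never starts a request before the previous response
has been written, the total time grows linearly with the number of
queued requests, however fast each individual query is.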

We have discussed this with Robert Newson of couchdb-lucene and he
suggested going around CouchDB and talking HTTP to couchdb-lucene
directly. While this may work, I thought I'd join the dev list to bring
up the issue here and ask whether there might be a way to allow
concurrent access to external processes. This was a performance
bottleneck we hadn't accounted for, and I feel others may run into it
as well at some point.

Nils.

[0] http://pinkpop.vpro.nl/
