Posted to user@couchdb.apache.org by Norman Barker <no...@gmail.com> on 2010/07/27 23:30:50 UTC

external handlers are not in sync with commits

Hi,

I have written couchdb-clucene
(http://github.com/normanb/couchdb-clucene) and am testing it with
heavy datasets: bulk doc requests of 10 docs each, a couple of
requests per second, sustained for a couple of minutes.
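
For concreteness, the load pattern described above could be sketched roughly as follows. This is only an illustration: the document shape and the helper name make_batches are made up for the example. The one detail taken from CouchDB itself is that the _bulk_docs endpoint accepts a JSON body of the form {"docs": [...]}.

```python
import json

def make_batches(docs, batch_size=10):
    """Split docs into _bulk_docs request bodies of at most batch_size docs each."""
    bodies = []
    for start in range(0, len(docs), batch_size):
        bodies.append(json.dumps({"docs": docs[start:start + batch_size]}))
    return bodies

# Hypothetical test documents; a real run would POST each body to /<db>/_bulk_docs.
docs = [{"_id": "doc-%d" % i, "text": "sample %d" % i} for i in range(25)]
bodies = make_batches(docs)
```

Sending a couple of these bodies per second for a couple of minutes would reproduce the kind of load in question.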

CouchDB backs up very quickly and hogs the CPU, because the database
commit returns without waiting for the external handler to do its job.
The fire-and-forget model is fine and I like it (it is very similar to
JMS). However, since the external process is a singleton, it has to be
very quick to keep up with the load, or the system slowly backs up.
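
The backlog effect can be illustrated with a back-of-the-envelope simulation. This is a sketch under stated assumptions, not a measurement: the arrival rate (2 requests/sec of 10 docs each) comes from the description above, while the handling rate of 15 docs/sec is an invented figure, and backlog_over_time is a hypothetical helper.

```python
def backlog_over_time(seconds, arrivals_per_sec, handled_per_sec):
    """Return the queue length at the end of each second for a single consumer."""
    queue = 0
    history = []
    for _ in range(seconds):
        # The queue grows by whatever the lone handler could not process.
        queue = max(0, queue + arrivals_per_sec - handled_per_sec)
        history.append(queue)
    return history

# 20 docs/sec arriving; 15 docs/sec handled is an assumed figure for illustration.
history = backlog_over_time(seconds=120, arrivals_per_sec=20, handled_per_sec=15)
print(history[-1])  # 600 docs still waiting after two minutes
```

Any single consumer that is even slightly slower than the write rate accumulates work without bound, which matches the slow back-up described here.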

Is there a way to either define a pool of externals, or to change the
default behaviour from fire and forget?

thanks,

Norman

Re: external handlers are not in sync with commits

Posted by Robert Newson <ro...@gmail.com>.
couchdb-lucene used to be updated via update_notification, but this
required CouchDB to launch Java, which... sucked.

couchdb-lucene 0.5+ runs as a separate daemon and pulls data from
CouchDB via _changes calls, and this has been much better.
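
A pull-based indexer of this kind could look roughly like the sketch below. It is not couchdb-lucene's actual code: the function names and the index_doc/remove_doc callbacks are hypothetical. It relies only on the documented shape of a _changes response (a "results" list plus "last_seq") and the since, limit and include_docs query parameters.

```python
import json
import urllib.parse
import urllib.request

def changes_url(base_url, db, since, limit=100):
    """Build a _changes URL that resumes after sequence `since`."""
    query = urllib.parse.urlencode(
        {"since": since, "limit": limit, "include_docs": "true"})
    return "%s/%s/_changes?%s" % (base_url.rstrip("/"), db, query)

def apply_changes(response, index_doc, remove_doc):
    """Feed one parsed _changes response to the indexer; return the new checkpoint."""
    for row in response["results"]:
        if row.get("deleted"):
            remove_doc(row["id"])
        else:
            index_doc(row["id"], row.get("doc"))
    return response["last_seq"]

def poll_once(base_url, db, since, index_doc, remove_doc):
    """Fetch one batch over HTTP (needs a running CouchDB; not exercised below)."""
    with urllib.request.urlopen(changes_url(base_url, db, since)) as resp:
        return apply_changes(json.load(resp), index_doc, remove_doc)

# Offline demonstration with a hand-written response.
indexed, removed = {}, []
sample = {
    "results": [
        {"seq": 1, "id": "a", "changes": [{"rev": "1-x"}], "doc": {"_id": "a"}},
        {"seq": 2, "id": "b", "changes": [{"rev": "2-y"}], "deleted": True},
    ],
    "last_seq": 2,
}
checkpoint = apply_changes(sample, indexed.__setitem__, removed.append)
```

Because the daemon decides when to ask for the next batch, a slow indexer simply falls behind on the feed instead of backing up CouchDB itself.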

The externals interface still serializes queries, which sucks for other reasons.

B.


Re: external handlers are not in sync with commits

Posted by Norman Barker <no...@gmail.com>.
Bob,

Can you explain more? I should have said update_notification: I have
an external query handler, plus an update notification handler that
updates the index. The problem is that data gets into the database
quickly, so when I perform an external query the update handler
hasn't had time to do its work.

I am using _changes in the update_notification listener to get the
docs, so the problem is amplified: _changes gives me more docs than I
was originally notified about (since the writes and the handlers are
not in sync).

If I understand you correctly, I could set up a process that listens
to _changes. Could I bring this process under the control of Erlang,
as with an update notification handler?
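
One way to cope with the mismatch described above is to treat each notification only as a hint that a database changed, and to let a stored sequence number decide which docs are actually new. The sketch below assumes notification lines arrive as JSON objects such as {"type": "updated", "db": "..."}, which is how the update_notification interface of that era is usually described; check this against your CouchDB version. The helper names and the fetch_changes callback are hypothetical.

```python
import json

def dbs_needing_update(lines):
    """Collapse a burst of notification lines into the set of databases to re-read."""
    dirty = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") == "updated":
            dirty.add(event["db"])
    return dirty

def drain(db, checkpoints, fetch_changes, index_doc):
    """Index everything after the stored checkpoint for db, then advance it."""
    response = fetch_changes(db, checkpoints.get(db, 0))
    for row in response["results"]:
        index_doc(row["id"])
    checkpoints[db] = response["last_seq"]

# Offline demonstration: two notifications for the same db, drained twice.
def fake_fetch(db, since):
    rows = [{"id": "a", "seq": 1}, {"id": "b", "seq": 2}]
    return {"results": [r for r in rows if r["seq"] > since], "last_seq": 2}

lines = ['{"type": "updated", "db": "mail"}\n',
         '{"type": "updated", "db": "mail"}\n']
seen, checkpoints = [], {}
for db in dbs_needing_update(lines):
    drain(db, checkpoints, fake_fetch, seen.append)
drain("mail", checkpoints, fake_fetch, seen.append)  # a repeat drain is a no-op
```

With a checkpoint in place it no longer matters that _changes returns more docs than a given notification was about: each doc is indexed once, by whichever drain reaches it first.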

thanks,

Norman


Re: external handlers are not in sync with commits

Posted by Robert Newson <ro...@gmail.com>.
Ah, I've confused externals with update_notification. You can contact
couchdb-lucene on its own port, which handles concurrent calls, but
yes, calls via an external are serialized (which sucks).
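
As a rough illustration of what serialization costs, the idealized model below compares one-at-a-time handling with a small pool. All numbers are assumptions chosen for the example (20 equal-cost queries of 50 ms each, a pool of 4), and total_time_ms is a hypothetical helper, not part of CouchDB or couchdb-lucene.

```python
import math

def total_time_ms(num_queries, per_query_ms, workers):
    """Time to finish num_queries when at most `workers` run at once."""
    return math.ceil(num_queries / workers) * per_query_ms

# One pipe, one query at a time (calls via an external).
serialized = total_time_ms(20, 50, workers=1)
# Assumed pool of 4 concurrent handlers on the direct port.
concurrent = total_time_ms(20, 50, workers=4)
print(serialized, concurrent)  # 1000 250
```

Individual queries are equally fast in both cases; only the waiting time behind other queries differs.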

B.



RE: external handlers are not in sync with commits

Posted by Nils Breunese <N....@vpro.nl>.
We ran into performance problems when we put a site that was pretty couchdb-lucene heavy into production. The query times in couchdb-lucene were fast, but we believe it was the number of concurrent queries that killed performance, since they all had to go through the single externals pipe to couchdb-lucene.

Using the _changes feed as couchdb-lucene's input sounds like a good idea, but then couchdb-lucene queries also need to go directly to the couchdb-lucene instance, right? I believe it supports that now, but applications may need to be modified for this. Or is there a way that this could still go through the current _fti URLs (without adding some mod_rewrite-like magic)?

Nils.


Re: external handlers are not in sync with commits

Posted by Robert Newson <ro...@gmail.com>.
Would reading _changes instead of using the (deprecated?) externals
feature avoid the problem?

B.


Re: external handlers are not in sync with commits

Posted by J Chris Anderson <jc...@apache.org>.
On Jul 27, 2010, at 2:30 PM, Norman Barker wrote:

> Hi,
> 
> I have written couchdb-clucene
> (http://github.com/normanb/couchdb-clucene) and am doing a lot of
> testing with heavy datasets where I am sending a bulk doc request with
> 10 docs at a time, a couple of these every second for a couple of
> minutes.
> 
> Very quickly couchdb backs up and hogs the cpu since the database
> commit and return doesn't wait for an external handler to do its job.
> The model of fire and forget is fine and I like it, very similar to
> JMS, however since the external process is a singleton it has to be
> very quick to keep up with load or the system slowly backs up.
> 
> Is there a way to either define a pool of externals, or to change the
> default behaviour from fire and forget?
> 

Yes, but it involves feature development for CouchDB. Essentially, we need the externals protocol to be non-blocking. There is a thread on dev@ that touches on this. I'm not sure who wants to own making the patch, but the technical requirements are pretty well known.
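
To show why the current protocol blocks, here is a minimal sketch of an external handler. It assumes the line-oriented exchange used by externals (one JSON request per line on stdin, one JSON response per line on stdout, with "code" and "json" fields in the response); the "echo" body and the function names are placeholders, not part of any real handler.

```python
import io
import json
import sys

def handle_line(line):
    """Turn one request line into one response line."""
    request = json.loads(line)
    query = request.get("query", {})
    body = {"echo": query.get("q", "")}  # placeholder for a real index lookup
    return json.dumps({"code": 200, "json": body})

def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Blocking loop: the next request is not read until this one is answered."""
    for line in stdin:
        stdout.write(handle_line(line) + "\n")
        stdout.flush()

# Offline demonstration with in-memory streams instead of real pipes.
out = io.StringIO()
serve(io.StringIO('{"query": {"q": "foo"}}\n{"query": {}}\n'), out)
responses = [json.loads(l) for l in out.getvalue().splitlines()]
```

A real handler would call serve() with the default streams. Since responses carry no request identifier in this sketch, they must come back in request order, so a slow lookup holds up every request queued behind it.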

Thank you for working on something so awesome!

Chris

> thanks,
> 
> Norman