Posted to user@couchdb.apache.org by "slubowsky@netzero.net" <sl...@netzero.net> on 2009/11/25 20:12:22 UTC

filters for _changes dont seem to work

I am trying to use a simple filter to filter the changes I get when asking for changes using the _changes API. 

I find that the filter is often not called at all (I write a message to the log when the filter is called, when it returns false, and when it returns true), and I don't get changes I should have received. Changes to existing documents are affected most; new documents seem to do better; deletes are never included, not even the first one.

What typically happens is that the first couple of changes get through as expected (and I see the appropriate log messages written by the filter), but after a few updates my filter is no longer called (I see no log messages from it), and couch seems to decide on its own not to include the change; it just updates the last_seq number and sends that.

Below are my filter and a snippet of the log file. The snippet shows one PUT that causes the filter to be called and a change to be sent out, followed by a nearly identical PUT for which the filter is not called and no change is sent.

Any help would be appreciated.
Thanks
Stephen

function(doc, req) { 
    log('filter called'); 
    if(req.query.time >= doc.dateOf_) { 
        log('filter passed'); 
        return true; 
    } else { 
        log('filter failed'); 
        return false; 
    }
}
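
A side note on the "deletes never get included" observation, separate from the filter not being called at all: a deleted document normally reaches a filter as a stub with `_deleted: true` and no application fields, so `doc.dateOf_` would be undefined and the comparison false. The following is only a sketch of a variant that lets deletions through, assuming that stub shape and assuming `dateOf_` is a numeric timestamp. The `log` stub and the name `deltas` are there only so the snippet runs outside CouchDB; inside a design document the function would stay anonymous and use CouchDB's own `log`.

```javascript
// Stand-in for CouchDB's built-in log() so the sketch runs under plain Node.
// Assumption: inside CouchDB this stub is dropped and the real log() is used.
const messages = [];
function log(msg) { messages.push(msg); }

// Variant of the filter above. Assumptions (not from the original post):
// deleted documents arrive with _deleted === true and no dateOf_ field,
// and both req.query.time and doc.dateOf_ are meant to be compared as numbers.
function deltas(doc, req) {
    log('filter called');
    if (doc._deleted === true) {
        log('filter passed (deleted)');
        return true;
    }
    // Query parameters arrive as strings; convert explicitly before comparing.
    if (Number(req.query.time) >= Number(doc.dateOf_)) {
        log('filter passed');
        return true;
    }
    log('filter failed');
    return false;
}
```

This does not address the missing filter calls; it only removes one reason a delete could be rejected when the filter does run.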

[Wed, 25 Nov 2009 13:37:06 GMT] [info] [<0.11350.0>] 10.50.17.52 - - 'GET' /thing0a321134-24df-0585/_changes?feed=longpoll&timeout=10000&since=84&filter=test/deltas&time=1259170547605&_=1259174167457 200

[Wed, 25 Nov 2009 13:37:14 GMT] [info] [<0.11363.0>] 127.0.0.1 - - 'PUT' /thing0a321134-24df-0585/C41924D7-8B00-0001-D841-1E21AD40B120 201

[Wed, 25 Nov 2009 13:37:14 GMT] [info] [<0.7003.0>] OS Process :: filter called

[Wed, 25 Nov 2009 13:37:14 GMT] [info] [<0.7003.0>] OS Process :: filter passed

[Wed, 25 Nov 2009 13:37:15 GMT] [info] [<0.11023.0>] 10.50.17.52 - - 'GET' /thing0a321134-24df-0585/_changes?feed=longpoll&timeout=10000&since=85&filter=test/deltas&time=1259170547605&_=1259174176091 200

[Wed, 25 Nov 2009 13:37:15 GMT] [info] [<0.4049.0>] checkpointing view update at seq 85 for thing0a321134-24df-0585 _design/test

[Wed, 25 Nov 2009 13:37:21 GMT] [info] [<0.11382.0>] 127.0.0.1 - - 'PUT' /thing0a321134-24df-0585/C41924D7-8B00-0001-D841-1E21AD40B120 201

[Wed, 25 Nov 2009 13:37:22 GMT] [info] [<0.11350.0>] 10.50.17.52 - - 'GET' /thing0a321134-24df-0585/_changes?feed=longpoll&timeout=10000&since=86&filter=test/deltas&time=1259170547605&_=1259174182984 200
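
For reference, the polling pattern visible in this log (each request's `since` is the previous response's `last_seq`) can be sketched as below. The query parameters are the ones in the logged URLs; the helper names `changesUrl`, `nextSince` and `advancedWithoutResults` are hypothetical, and the response shape `{"results": [...], "last_seq": N}` with a numeric `last_seq` is an assumption about this CouchDB version.

```javascript
// Build a filtered longpoll _changes URL like the ones in the log above.
// The helper names in this block are hypothetical, for illustration only.
function changesUrl(db, since, time) {
    const params = new URLSearchParams({
        feed: 'longpoll',
        timeout: '10000',
        since: String(since),
        filter: 'test/deltas',
        time: String(time)
    });
    return '/' + encodeURIComponent(db) + '/_changes?' + params.toString();
}

// Pick the since value for the next request from a parsed response body.
// Falls back to the previous value if last_seq is missing.
function nextSince(body, previousSince) {
    return typeof body.last_seq === 'number' ? body.last_seq : previousSince;
}

// True when the sequence number moved forward but no change was returned.
// That is the shape of the symptom described in the post.
function advancedWithoutResults(body, since) {
    return Array.isArray(body.results) &&
        body.results.length === 0 &&
        body.last_seq > since;
}
```

An empty response with an advanced `last_seq` is also what a correctly rejected change looks like, so a client-side check like this only narrows things down; the filter's own log messages are still needed to tell the two cases apart.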
 


Re: filters for _changes dont seem to work

Posted by Sven Helmberger <sv...@gmx.de>.
slubowsky@netzero.net wrote:
> I am trying to use a simple filter to filter the changes I get when asking for changes using the _changes API. 
> 
> I find that often the filter is not even called (I write a message to the log when the filter is called, when it returns false, and when it returns true) and I don't get changes I should have received (especially changes to existing documents, new documents seem to do better, deletes never get included, not even the first one). 
> 
> What typically happens is that the first couple of changes do get through as expected (and I see the appropriate log messages written by the filter) but then after a few updates, my filter doesn't even get called (I see no log messages from my filter) and couch seems to decide on its own not include the change but just updates the last_seq number and sends that.
> 
> My filter and a snippet of the log file showing one PUT causing the filter to be called and a change sent out, followed by a nearly identical PUT that doesn't even cause the filter to be called and the change fails to get sent out follows.
> 
> Any help would be appreciated.

You are not writing back documents with the same content, are you?

Regards,
Sven Helmberger

Re: filters for _changes dont seem to work

Posted by Roger Binns <ro...@rogerbinns.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Anderson wrote:
> I'll let this one sit for a couple of days while people have a chance to try it out.

Hah your reverse psychology does not work on me.  Wait, it did.  I have code
that reproduces it. (I'll update the bug.)

The bug is far more complex and has "random" behaviour.  The prerequisites
are a _changes longpoll request using a filter function.  The symptom is
that, some of the time, the filter function is not called and the change is
treated as though the filter had returned false.  (You can verify that it is
not called by calling log in the filter, and that matching items are not
returned.)

The items being changed also need to be in a view, and the view needs to be
accessed between changes.  (Changes means create/change/delete.)  This is
the crucial part of the reproduction.
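
The prerequisites above can be written down as an ordered list of requests. This is only a sketch of the sequence, not the reproduction code attached to the bug: the design document `test` and filter `deltas` are borrowed from the log in the first post, the view name `by_type` is a placeholder, and the plan assumes each PUT advances the sequence number by exactly one.

```javascript
// Describe the request sequence for a reproduction attempt as plain data.
// Assumptions: "test", "deltas" and "by_type" are placeholder names, and
// every PUT moves the database sequence forward by one.
function reproductionPlan(db, docIds, startSeq) {
    const steps = [];
    let since = startSeq;
    for (const id of docIds) {
        // 1. A longpoll _changes request with a filter is waiting.
        steps.push({
            method: 'GET',
            path: '/' + db + '/_changes?feed=longpoll&filter=test/deltas&since=' + since
        });
        // 2. A document that appears in a view is created, changed or deleted.
        steps.push({ method: 'PUT', path: '/' + db + '/' + id });
        // 3. The view is read between changes (the crucial step per the post).
        steps.push({ method: 'GET', path: '/' + db + '/_design/test/_view/by_type' });
        since += 1;
    }
    return steps;
}
```

Step 3 is the one a plain curl loop tends to leave out, which may be why simpler scripts did not trigger the problem.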

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAksUk7QACgkQmOOfHg372QThLACfaMIIzJluPHVSg2fz6yPGmYlB
LLwAn0d2Uy+7raeGviMyTxPHF9vNMnVs
=zyPB
-----END PGP SIGNATURE-----

Re: filters for _changes dont seem to work

Posted by Chris Anderson <jc...@apache.org>.
On Sun, Nov 29, 2009 at 11:33 PM, Roger Binns <ro...@rogerbinns.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Chris Anderson wrote:
>>> This should only have affected longpoll. I'm still trying to reproduce your bug.
>
> On checking my source I am using longpoll.  I tried a simple reproduction
> using curl in a shell script and the bug doesn't show up.  My real source
> also does queries for each changed item but doing that started getting silly
> in curl.  It also looks like there must be some sort of delay.

Thanks Roger and Stephen. I'll let this one sit for a couple of days
while people have a chance to try it out. If you were using longpoll
this definitely should have fixed it.

If anyone sees similar issues on trunk please add your experience to
the Jira ticket:

https://issues.apache.org/jira/browse/COUCHDB-582

Chris


-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: filters for _changes dont seem to work

Posted by Roger Binns <ro...@rogerbinns.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Anderson wrote:
>> This should only have affected longpoll. I'm still trying to reproduce your bug.

On checking my source I am using longpoll.  I tried a simple reproduction
using curl in a shell script and the bug doesn't show up.  My real source
also does queries for each changed item but doing that started getting silly
in curl.  It also looks like there must be some sort of delay.

I can write reproduction code in Python (which is what my source is).  Would
that work for you?  You'll need the Python couchdb module which depends on a
whole bunch of other things giving you a dependency hell.  Or if you have
Ubuntu 9.10 then apt-get install python-couchdb works.

>> This mailing list is a good place but you might want to start another
>> thread. 

You missed my point completely :-)  I wanted the Ubuntu CouchDB *and* svn on
the same machine at the same time without trampling each other.  I did get
it working (you have to be very careful what order you do things and which
commands are prefixed, cleans etc).  The good news is it is now all documented:

  http://wiki.apache.org/couchdb/Running%20CouchDB%20in%20Dev%20Mode

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAksTdU0ACgkQmOOfHg372QS4RQCgiJ21Bq0o7rU7RyKkaNg5Hw4q
BPkAn0KipC4leqrjEnUXogXwand/iCzQ
=pEAz
-----END PGP SIGNATURE-----

Re: filters for _changes dont seem to work

Posted by Chris Anderson <jc...@apache.org>.
On Sun, Nov 29, 2009 at 9:16 PM, Roger Binns <ro...@rogerbinns.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Chris Anderson wrote:
>> Please try it again and let me know how it works.
>
> Do you expect this to have fixed the main bug (in my case the filter matched
> every document and I am not using longpoll)?

This should only have affected longpoll. I'm still trying to reproduce your bug.

>
> Secondly where is the appropriate place to discuss getting the svn version
> built and running on Ubuntu 9.10 but without touching the Ubuntu installed
> version?

This mailing list is a good place, but you might want to start another
thread. Once you've updated and upgraded, `apt-get install couchdb`
should be enough to have it running on 9.10 on EC2. I have some code
that might be helpful:

http://github.com/couchio/couchdb-ec2/blob/karmic/bin/remote/create-couchdb-image-remote#L37

Chris


>
> Roger
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAksTVSgACgkQmOOfHg372QSagQCfXfYPH0KefYeAjjJYMLOnUII0
> j2AAn3rWccbZbOQLBh+n46J8CAQEDdBi
> =ztFb
> -----END PGP SIGNATURE-----
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: filters for _changes dont seem to work

Posted by Roger Binns <ro...@rogerbinns.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Chris Anderson wrote:
> Please try it again and let me know how it works.

Do you expect this to have fixed the main bug (in my case the filter matched
every document and I am not using longpoll)?

Secondly where is the appropriate place to discuss getting the svn version
built and running on Ubuntu 9.10 but without touching the Ubuntu installed
version?

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAksTVSgACgkQmOOfHg372QSagQCfXfYPH0KefYeAjjJYMLOnUII0
j2AAn3rWccbZbOQLBh+n46J8CAQEDdBi
=ztFb
-----END PGP SIGNATURE-----

Re: filters for _changes dont seem to work

Posted by Chris Anderson <jc...@apache.org>.
On Wed, Nov 25, 2009 at 11:12 AM, slubowsky@netzero.net
<sl...@netzero.net> wrote:
> I am trying to use a simple filter to filter the changes I get when asking for changes using the _changes API.
>
> I find that often the filter is not even called (I write a message to the log when the filter is called, when it returns false, and when it returns true) and I don't get changes I should have received (especially changes to existing documents, new documents seem to do better, deletes never get included, not even the first one).
>

I've committed a test and a fix for the case where a longpoll would
return with an empty results list when the first change does not match
the filter.

This fix should make the behavior more predictable. Please try it
again and let me know how it works.

Thanks for reporting.

Chris



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: filters for _changes dont seem to work

Posted by Chris Anderson <jc...@apache.org>.
On Sat, Nov 28, 2009 at 1:53 PM, Roger Binns <ro...@rogerbinns.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Sven Helmberger wrote:
>> You are not writing back documents with the same content, are you?
>
> In my case, no.  For my testing it is documents being created and destroyed
> which aren't being noticed by the filter after they initially are.
>
> Robert Dionne wrote:
>> I just happened to be looking at the code that handles this and am
>> wondering if it's a timing issue. Could you try increasing the timeout
>> or not specifying it (the default looks to be 60s) and/or setting
>> delayed_commits to false in the config file.
>>
>> It could be that the code that runs the filters is not getting the
>> notifications in time. I'm just speculating but it's an easy test to try.
>
> Changing delayed_commits to false did not make a difference.  The timeout
> just sets the maximum amount of time before the http response completes.  If
> not specified it is 60 seconds, works as expected when set to 10 seconds and
> when set to lots of 9's behaved as though it was set to 60 seconds.  I am
> using heartbeat which works as documented.
>
> While looking at that code something that would be very nice is if the http
> response does not complete until there is at least one change *after* the
> filter has been applied (or a timeout etc).  Currently the response
> completes if there is at least one change before the filter - I have the
> filter because I only care about .1% of my documents changing (filter on
> doc.type).

It sounds like this needs more testing. There is a test for filters in
the test suite, available on your CouchDB server at
/_utils/script/test/changes.js

There is an additional process optimization that needs to be done,
which should simplify the code base a little and perhaps flush these
bugs out as well.

Thanks for bringing this to our attention. Any patches you can make to
the test suite to show the errors would be extremely helpful.

I've also created a Jira ticket for the issue:

http://issues.apache.org/jira/browse/COUCHDB-582

I'll look into it in the next few days.

Thanks,
Chris



>
> Roger
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAksRm9gACgkQmOOfHg372QRJIACaAvEaSHrvbmT6CC1YSn6yT0fC
> yfwAoNZDqGTobyKuxR4iMG9na2/GBzXP
> =okOU
> -----END PGP SIGNATURE-----
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: filters for _changes dont seem to work

Posted by Roger Binns <ro...@rogerbinns.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sven Helmberger wrote:
> You are not writing back documents with the same content, are you?

In my case, no.  For my testing it is documents being created and destroyed
which aren't being noticed by the filter after they initially are.

Robert Dionne wrote:
> I just happened to be looking at the code that handles this and am
> wondering if it's a timing issue. Could you try increasing the timeout
> or not specifying it (the default looks to be 60s) and/or setting
> delayed_commits to false in the config file.
> 
> It could be that the code that runs the filters is not getting the
> notifications in time. I'm just speculating but it's an easy test to try.

Changing delayed_commits to false did not make a difference.  The timeout
just sets the maximum amount of time before the http response completes.  If
not specified it is 60 seconds, works as expected when set to 10 seconds and
when set to lots of 9's behaved as though it was set to 60 seconds.  I am
using heartbeat which works as documented.

While looking at that code something that would be very nice is if the http
response does not complete until there is at least one change *after* the
filter has been applied (or a timeout etc).  Currently the response
completes if there is at least one change before the filter - I have the
filter because I only care about .1% of my documents changing (filter on
doc.type).
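
The difference between the two behaviours described here can be shown with a toy model. This is not CouchDB's implementation; it treats one longpoll response as a function over a list of changes arriving in order, with the end of the list standing in for a timeout. The function names and the `consumed` field are made up for the illustration.

```javascript
// Toy model of a single longpoll response. Not CouchDB's implementation.
// changes: documents arriving in order; filter: function(doc) -> boolean.

// Behaviour described as current: the response completes on the first
// change of any kind, whether or not that change passes the filter.
function respondOnFirstChange(changes, filter) {
    if (changes.length === 0) return { results: [], consumed: 0 };
    const first = changes[0];
    return { results: filter(first) ? [first] : [], consumed: 1 };
}

// Behaviour requested: keep waiting until a change passes the filter,
// or until the input runs out (standing in for a timeout).
function respondOnFirstMatch(changes, filter) {
    for (let i = 0; i < changes.length; i++) {
        if (filter(changes[i])) {
            return { results: [changes[i]], consumed: i + 1 };
        }
    }
    return { results: [], consumed: changes.length };
}
```

With a filter that matches only a small fraction of documents, the first variant produces mostly empty responses and one round trip per change, while the second returns only when there is something to report.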

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAksRm9gACgkQmOOfHg372QRJIACaAvEaSHrvbmT6CC1YSn6yT0fC
yfwAoNZDqGTobyKuxR4iMG9na2/GBzXP
=okOU
-----END PGP SIGNATURE-----

Re: filters for _changes dont seem to work

Posted by Robert Dionne <di...@dionne-associates.com>.
I just happened to be looking at the code that handles this and am  
wondering if it's a timing issue. Could you try increasing the timeout  
or not specifying it (the default looks to be 60s) and/or setting  
delayed_commits to false in the config file.

It could be that the code that runs the filters is not getting the
notifications in time. I'm just speculating but it's an easy test to
try.

Cheers,

Bob




On Nov 25, 2009, at 2:12 PM, slubowsky@netzero.net wrote:

> I am trying to use a simple filter to filter the changes I get when  
> asking for changes using the _changes API.
>
> I find that often the filter is not even called (I write a message  
> to the log when the filter is called, when it returns false, and  
> when it returns true) and I don't get changes I should have received  
> (especially changes to existing documents, new documents seem to do  
> better, deletes never get included, not even the first one).
>
> What typically happens is that the first couple of changes do get  
> through as expected (and I see the appropriate log messages written  
> by the filter) but then after a few updates, my filter doesn't even  
> get called (I see no log messages from my filter) and couch seems to  
> decide on its own not include the change but just updates the  
> last_seq number and sends that.
>
> My filter and a snippet of the log file showing one PUT causing the  
> filter to be called and a change sent out, followed by a nearly  
> identical PUT that doesn't even cause the filter to be called and  
> the change fails to get sent out follows.
>
> Any help would be appreciated.
> Thanks
> Stephen
>
> function(doc, req) {
>    log('filter called');
>    if(req.query.time >= doc.dateOf_) {
>        log('filter passed');
>        return true;
>    } else {
>        log('filter failed');
>        return false;
>    }
> }
>
> [Wed, 25 Nov 2009 13:37:06 GMT] [info] [<0.11350.0>] 10.50.17.52 - -  
> 'GET' /thing0a321134-24df-0585/_changes? 
> feed=longpoll&timeout=10000&since=84&filter=test/ 
> deltas&time=1259170547605&_=1259174167457 200
>
> [Wed, 25 Nov 2009 13:37:14 GMT] [info] [<0.11363.0>] 127.0.0.1 - -  
> 'PUT' /thing0a321134-24df-0585/C41924D7-8B00-0001-D841-1E21AD40B120  
> 201
>
> [Wed, 25 Nov 2009 13:37:14 GMT] [info] [<0.7003.0>] OS Process ::  
> filter called
>
> [Wed, 25 Nov 2009 13:37:14 GMT] [info] [<0.7003.0>] OS Process ::  
> filter passed
>
> [Wed, 25 Nov 2009 13:37:15 GMT] [info] [<0.11023.0>] 10.50.17.52 - -  
> 'GET' /thing0a321134-24df-0585/_changes? 
> feed=longpoll&timeout=10000&since=85&filter=test/ 
> deltas&time=1259170547605&_=1259174176091 200
>
> [Wed, 25 Nov 2009 13:37:15 GMT] [info] [<0.4049.0>] checkpointing  
> view update at seq 85 for thing0a321134-24df-0585 _design/test
>
> [Wed, 25 Nov 2009 13:37:21 GMT] [info] [<0.11382.0>] 127.0.0.1 - -  
> 'PUT' /thing0a321134-24df-0585/C41924D7-8B00-0001-D841-1E21AD40B120  
> 201
>
> [Wed, 25 Nov 2009 13:37:22 GMT] [info] [<0.11350.0>] 10.50.17.52 - -  
> 'GET' /thing0a321134-24df-0585/_changes? 
> feed=longpoll&timeout=10000&since=86&filter=test/ 
> deltas&time=1259170547605&_=1259174182984 200
>
>


Re: filters for _changes dont seem to work

Posted by Roger Binns <ro...@rogerbinns.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

slubowsky@netzero.net wrote:
> I am trying to use a simple filter to filter the changes I get when asking for changes using the _changes API. 
> 
> I find that often the filter is not even called 

I am seeing the same behaviour.  For example if two documents are added in
rapid succession the filter function is only called for the first one.  My
filter function doesn't do "fancy" stuff like look at the request:

function(doc,req) {
  //  log('filter called '+toJSON(doc));
  return doc.type=="collection" || doc._deleted==true;
}
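
One way to check what a filter like this should return, independently of the server, is to run it over the same documents locally and compare the ids with what a filtered _changes response delivered. A minimal sketch, where `collectionsAndDeletes` is just a name given to the function above and `expectedIds` is a hypothetical helper:

```javascript
// The filter from the message above, named so it can be called directly.
function collectionsAndDeletes(doc, req) {
    return doc.type == "collection" || doc._deleted == true;
}

// Hypothetical helper: ids the filter accepts, for comparison with the ids
// that a filtered _changes response actually returned.
function expectedIds(docs, filter) {
    return docs
        .filter(function (doc) { return filter(doc, { query: {} }); })
        .map(function (doc) { return doc._id; });
}
```

Any id in the local list that never shows up in the feed points at the server skipping the filter rather than at the filter logic.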

> What typically happens is that the first couple of changes do get through as expected (and I see
> the appropriate log messages written by the filter) but then after a few updates, my filter doesn't 
> even get called (I see no log messages from my filter) and couch seems to decide on its own not 
> include the change but just updates the last_seq number and sends that.

My swag is that it is some sort of race condition.  In most cases the filter
only runs for the first change.  I do some other queries and then issue the
changes request again.  Things seem to work OK when I use Firefox and
manually alter the URL for each change.

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAksQvSsACgkQmOOfHg372QR6nwCZAZLSdSnJAKnDhNz3UQXdmdwf
qg4AoJJY62Yu/MilUHkbo/K4Z0RIOt1Y
=mQ0u
-----END PGP SIGNATURE-----