Posted to dev@couchdb.apache.org by Matt Goodall <ma...@gmail.com> on 2010/06/10 11:49:57 UTC

Intermittent view failure

Hi,

An application test started "randomly" failing when getting a
relatively simple map view. Here's the error from the couchdb logs:

[info] [<0.459.0>] 127.0.0.1 - - 'PUT' /killer 201
[info] [<0.460.0>] 127.0.0.1 - - 'POST' /killer 201
[info] [<0.468.0>] 127.0.0.1 - - 'POST' /killer 201
[error] [<0.469.0>] Uncaught error in HTTP request: {exit,
                                {noproc,
                                 {gen_server,call,
                                  [<0.449.0>,{request_group,2},infinity]}}}
[info] [<0.469.0>] Stacktrace: [{gen_server,call,3},
            {couch_view_group,request_group,2},
            {couch_view,get_map_view,4},
            {couch_httpd_view,design_doc_view,5},
            {couch_httpd_db,do_db_req,2},
            {couch_httpd,handle_request_int,5},
            {mochiweb_http,headers,5},
            {proc_lib,init_p_do_apply,3}]
[info] [<0.469.0>] 127.0.0.1 - - 'GET' /killer/_design/killer/_view/killer 500

I've managed to whittle the test down to a simple failing script that
only depends on curl:

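set -e
db="http://localhost:5984/killer"
while true; do
   # Start from a clean slate. No -f, so a 404 for a missing database
   # does not stop the script.
   curl -s "$db" -X DELETE > /dev/null
   curl -fs "$db" -X PUT > /dev/null
   # Create the design doc with a single trivial map view.
   curl -fs "$db" -X POST -d @- > /dev/null << EOF
{
   "_id": "_design/killer",
   "views": {
       "killer": {
           "map": "function(doc) {emit(null, null);}"
       }
   }
}
EOF
   # Add one empty document, then query the view.
   curl -fs "$db" -X POST -d '{}' > /dev/null
   curl -fs "$db/_design/killer/_view/killer"
done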

After running for some period of time the server dies with the above
error. Unfortunately, "some period of time" is not very consistent.
I've seen it die after as few as 5 loops and as many as 1500, but it
does eventually die for me.
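
To put a number on "some period of time", the loop can be wrapped so
that it reports how many iterations succeeded before the first failure.
This is only a sketch: run_until_failure is a hypothetical helper name,
not part of the original script, and it never returns if the command
never fails.

```shell
#!/bin/sh
# run_until_failure CMD [ARGS...]
# Runs CMD repeatedly and prints the number of iterations that
# succeeded before the first non-zero exit status.
# (Hypothetical helper, not part of the original script.)
run_until_failure() {
    count=0
    while "$@"; do
        count=$((count + 1))
    done
    echo "$count"
}
```

With the body of the loop moved into a function (say, one_iteration,
also a made-up name), "run_until_failure one_iteration" would print the
loop count at which a curl call first returned an error.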

Some interesting points:

* The view returns the same error from then on.
* The database's _design directory does not get created.
* CouchDB does not log any view build checkpoints.
* Restarting couchdb fixes things. The _design directory gets created
and the view is built.
* Updating the design doc fixes it.
* Adding a new document does not fix it.
* Deliberately killing the view server does not fix it. CouchDB
doesn't even bother starting a new one.
* I've now seen this error on three machines (a netbook, a decent
laptop, and a decent server).
* The same error occurred with CouchDB 0.11.0 and latest 0.11.x when
running the tests that first demonstrated the problem. However, I
haven't run the above script against 0.11.0 yet.
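
Several of the points above come down to "try something, then see
whether the view still answers with a 500". A minimal sketch of that
check follows, assuming curl's -w '%{http_code}' output; the function
names view_http_status and view_is_broken are hypothetical and only
used for illustration.

```shell
#!/bin/sh
# view_http_status URL
# Prints only the HTTP status code of a GET request to URL
# (curl prints 000 if no HTTP response was received).
view_http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# view_is_broken URL
# Succeeds (exit status 0) when the view answers with HTTP 500,
# which is the status shown in the log above.
view_is_broken() {
    [ "$(view_http_status "$1")" = "500" ]
}
```

For example, after adding a document or updating the design doc,
view_is_broken "$db/_design/killer/_view/killer" shows whether that
step changed anything.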

It's as if CouchDB thinks the view is up to date when, in fact, it
hasn't even been created. I'm guessing there's a race condition in
creating the _design directory or after updating a design doc, but
that's pure speculation.

Anyway, I'm going to do a bit more playing here but I thought I'd get
my initial findings "out there" in case anyone recognised where the
issue may be.

- Matt

Re: Intermittent view failure

Posted by Matt Goodall <ma...@gmail.com>.
I've been running my script against trunk for a while and it hasn't
failed yet. I guess that means that this can be ignored unless you
guys think it might be a problem that still exists.

- Matt


On 10 June 2010 12:19, Matt Goodall <ma...@gmail.com> wrote:
> So far, I've not managed to make it fail if the database name or the
> view name is changed each iteration. However, it definitely does fail
> when the design doc name is changed. I know there's nothing conclusive
> about that given that it's intermittent anyway, but it might
> suggest something.
>
> - Matt
>

Re: Intermittent view failure

Posted by Matt Goodall <ma...@gmail.com>.
On 10 June 2010 10:49, Matt Goodall <ma...@gmail.com> wrote:
> Some interesting points:
>
> * The view returns the same error from then on.
> * The database's _design directory does not get created.
> * CouchDB does not log any view build checkpoints.
> * Restarting couchdb fixes things. The _design directory gets created
> and the view is built.
> * Updating the design doc fixes it.
> * Adding a new document does not fix it.
> * Deliberately killing the view server does not fix it. CouchDB
> doesn't even bother starting a new one.
> * I've now seen this error on three machines (a netbook, a decent
> laptop, and a decent server).
> * The same error occurred with CouchDB 0.11.0 and latest 0.11.x when
> running the tests that first demonstrated the problem. However, I
> haven't run the above script against 0.11.0 yet.

Another one for this list:

* Deleting the database does not help, i.e. once the error occurs the
script fails with the same error in its first iteration of the loop on
every run.


Also, I just modified the script so that it can use a unique database
name, design doc name or view name per iteration, depending on which
variable $num is appended to:

set -e
DB="http://localhost:5984/killer"
num=1
while true; do
   echo "$num"
   # Append $num to whichever name should be unique per iteration
   # (here: the database name).
   db="$DB$num"
   design="killer"
   view="killer"
   # No -f on DELETE, so a missing database does not stop the script.
   curl -s "$db" -X DELETE > /dev/null
   curl -fs "$db" -X PUT > /dev/null
   curl -fs "$db" -X POST -d @- > /dev/null << EOF
{
   "_id": "_design/$design",
   "views": {
       "$view": {
           "map": "function(doc) {emit(null, null);}"
       }
   }
}
EOF
   curl -fs "$db" -X POST -d '{}' > /dev/null
   curl -fs "$db/_design/$design/_view/$view" > /dev/null
   num=$((num + 1))
done


So far, I've not managed to make it fail if the database name or the
view name is changed each iteration. However, it definitely does fail
when the design doc name is changed. I know there's nothing conclusive
about that given that it's intermittent anyway, but it might
suggest something.
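
Choosing which name gets the $num suffix can be made explicit rather
than done by editing the script between runs. The following is a sketch
only: names_for and its "db", "design" and "view" selector values are
hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# names_for VARY NUM
# Prints "db design view" for iteration NUM, appending NUM to the one
# name selected by VARY (db, design or view). Hypothetical helper.
names_for() {
    vary=$1
    num=$2
    db="killer"
    design="killer"
    view="killer"
    case "$vary" in
        db)     db="$db$num" ;;
        design) design="$design$num" ;;
        view)   view="$view$num" ;;
        *)      echo "unknown name to vary: $vary" >&2; return 1 ;;
    esac
    echo "$db $design $view"
}
```

Inside the loop, the output can be split back into three variables, for
example with: set -- $(names_for design "$num"), after which $1, $2 and
$3 hold the database, design doc and view names for that iteration.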

- Matt