You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@couchdb.apache.org by Benoit Chesneau <bc...@gmail.com> on 2010/08/16 13:00:33 UTC

160-* etap test failure from time to time

So I've found why 160- test fails from time to time:

Vhosts are loaded at startup when creating the mochiweb loop. So it
only get depending how fast is your machine 1 result and maybe 0 for
really fast machines since the ini isn't defined. Which also make it
impossible to change vhosts on the fly apparently (via POST on
/_config).

To solve that and allows us to change/add vhosts on the fly I would
like to change the code and put all vhosts handling in a gen_server
keeping states etc so it could be possible to changes vhosts in memory
depending on ini change. Also it would make vhost handling pluggable
so someone could imagine to create them via doc posted in a db.

What do you think ?

- benoit

Re: 160-* etap test failure from time to time

Posted by Jan Lehnardt <ja...@apache.org>.
Robert, thanks for digging all this out.
I'll have a closer look in the morning!

Cheers
Jan
--
On 18 Aug 2010, at 21:48, Robert Dionne wrote:

> actually as things currently stand, it no longer involves 160-* at all.
> 
> the issue as I see it is here:
> 
> http://github.com/bdionne/couchdb/blob/master/src/couchdb/couch_config.erl#L124
> 
> every call to couch_config:set  will cause registered notify_funs to be invoked, *but*, spawned in their own pid. This implies that when they execute the original couch_config:set may have already returned and another set processed. So there's no guarantee that the registered notify_funs see the config state that triggered their invocation.
> 
> In this case it meant that couch_httpd was not being restarted for each config change to vhosts. The first would trigger a restart but by the time it's gets to registering itself again the state was already changed. The mochiweb upgrade likely slowed things enough to expose it.
> 
> Pulling the vhost stuff into it's own gen_server made the issue go away.
> 
> A similar issue happened once before with the hash_password function in couch_server and that's the only other place where this problem exists that I can see.
> 
> Perhaps a distinction needs to be made between config changes that require a restart and those that don't
> 
> 
> 
> On Aug 18, 2010, at 3:35 PM, Benoit Chesneau wrote:
> 
>> On Wed, Aug 18, 2010 at 9:29 PM, Jan Lehnardt <ja...@apache.org> wrote:
>>> 
>>> On 18 Aug 2010, at 20:17, Robert Dionne wrote:
>>> 
>>>> The vhosts refactoring made this issue go away. The underlying problem still exists in couch_config. It's a race condition
>>> 
>>> The refactoring also added a whole lot of things that are separate from this issue.
>>> 
>>> I recon the test could start couch_http and only issue requests once it is fully up*.
>>> 
>>> *I haven't looked at any code here, just handwaving.
>>> 
>>> Cheers
>>> Jan
>>> --
>>> 
>> 
>> On *slow* machines there are other tests failing for the same reason.
>> 140- for example. 160- .
>> 
>> Even f ini was just set after the couch_server_sup start the problem
>> happened. More likely due to the fact that couch_httpd is loaded in
>> the same time and the Loop created at this time with value currently
>> in ini.
>> 
>> imo, the configuration need some love love or, maybe just the way we
>> reload couch_httpd after a change in the conf.
>> 
>> - benoit
> 


Re: 160-* etap test failure from time to time

Posted by Robert Dionne <di...@dionne-associates.com>.
actually as things currently stand, it no longer involves 160-* at all.

the issue as I see it is here:

http://github.com/bdionne/couchdb/blob/master/src/couchdb/couch_config.erl#L124

every call to couch_config:set  will cause registered notify_funs to be invoked, *but*, spawned in their own pid. This implies that when they execute the original couch_config:set may have already returned and another set processed. So there's no guarantee that the registered notify_funs see the config state that triggered their invocation.

In this case it meant that couch_httpd was not being restarted for each config change to vhosts. The first would trigger a restart but by the time it's gets to registering itself again the state was already changed. The mochiweb upgrade likely slowed things enough to expose it.

Pulling the vhost stuff into it's own gen_server made the issue go away.

A similar issue happened once before with the hash_password function in couch_server and that's the only other place where this problem exists that I can see.

Perhaps a distinction needs to be made between config changes that require a restart and those that don't



On Aug 18, 2010, at 3:35 PM, Benoit Chesneau wrote:

> On Wed, Aug 18, 2010 at 9:29 PM, Jan Lehnardt <ja...@apache.org> wrote:
>> 
>> On 18 Aug 2010, at 20:17, Robert Dionne wrote:
>> 
>>> The vhosts refactoring made this issue go away. The underlying problem still exists in couch_config. It's a race condition
>> 
>> The refactoring also added a whole lot of things that are separate from this issue.
>> 
>> I recon the test could start couch_http and only issue requests once it is fully up*.
>> 
>> *I haven't looked at any code here, just handwaving.
>> 
>> Cheers
>> Jan
>> --
>> 
> 
> On *slow* machines there are other tests failing for the same reason.
> 140- for example. 160- .
> 
> Even f ini was just set after the couch_server_sup start the problem
> happened. More likely due to the fact that couch_httpd is loaded in
> the same time and the Loop created at this time with value currently
> in ini.
> 
> imo, the configuration need some love love or, maybe just the way we
> reload couch_httpd after a change in the conf.
> 
> - benoit


Re: 160-* etap test failure from time to time

Posted by Benoit Chesneau <bc...@gmail.com>.
On Wed, Aug 18, 2010 at 9:29 PM, Jan Lehnardt <ja...@apache.org> wrote:
>
> On 18 Aug 2010, at 20:17, Robert Dionne wrote:
>
>> The vhosts refactoring made this issue go away. The underlying problem still exists in couch_config. It's a race condition
>
> The refactoring also added a whole lot of things that are separate from this issue.
>
> I recon the test could start couch_http and only issue requests once it is fully up*.
>
> *I haven't looked at any code here, just handwaving.
>
> Cheers
> Jan
> --
>

On *slow* machines there are other tests failing for the same reason.
140- for example. 160- .

Even f ini was just set after the couch_server_sup start the problem
happened. More likely due to the fact that couch_httpd is loaded in
the same time and the Loop created at this time with value currently
in ini.

imo, the configuration need some love love or, maybe just the way we
reload couch_httpd after a change in the conf.

- benoit

Re: 160-* etap test failure from time to time

Posted by Jan Lehnardt <ja...@apache.org>.
On 18 Aug 2010, at 20:17, Robert Dionne wrote:

> The vhosts refactoring made this issue go away. The underlying problem still exists in couch_config. It's a race condition

The refactoring also added a whole lot of things that are separate from this issue.

I recon the test could start couch_http and only issue requests once it is fully up*.

*I haven't looked at any code here, just handwaving.

Cheers
Jan
-- 


> 
> 
> 
> 
> On Aug 18, 2010, at 2:01 PM, Jan Lehnardt wrote:
> 
>> 
>> On 16 Aug 2010, at 13:00, Benoit Chesneau wrote:
>> 
>>> So I've found why 160- test fails from time to time:
>>> 
>>> Vhosts are loaded at startup when creating the mochiweb loop. So it
>>> only get depending how fast is your machine 1 result and maybe 0 for
>>> really fast machines since the ini isn't defined.
>> 
>> this sounds like a race condition that should be fixed in the test.
>> 
>>> Which also make it
>>> impossible to change vhosts on the fly apparently (via POST on
>>> /_config).
>> 
>> I do that all the time and it works. I'm not sure how you're drawing
>> your conclusion here :)
>> 
>> Cheers
>> Jan
>> -- 
>> 
>> 
>>> To solve that and allows us to change/add vhosts on the fly I would
>>> like to change the code and put all vhosts handling in a gen_server
>>> keeping states etc so it could be possible to changes vhosts in memory
>>> depending on ini change. Also it would make vhost handling pluggable
>>> so someone could imagine to create them via doc posted in a db.
>>> 
>>> What do you think ?
>>> 
>>> - benoit
>> 
> 


Re: 160-* etap test failure from time to time

Posted by Robert Dionne <di...@dionne-associates.com>.
The vhosts refactoring made this issue go away. The underlying problem still exists in couch_config. It's a race condition




On Aug 18, 2010, at 2:01 PM, Jan Lehnardt wrote:

> 
> On 16 Aug 2010, at 13:00, Benoit Chesneau wrote:
> 
>> So I've found why 160- test fails from time to time:
>> 
>> Vhosts are loaded at startup when creating the mochiweb loop. So it
>> only get depending how fast is your machine 1 result and maybe 0 for
>> really fast machines since the ini isn't defined.
> 
> this sounds like a race condition that should be fixed in the test.
> 
>> Which also make it
>> impossible to change vhosts on the fly apparently (via POST on
>> /_config).
> 
> I do that all the time and it works. I'm not sure how you're drawing
> your conclusion here :)
> 
> Cheers
> Jan
> -- 
> 
> 
>> To solve that and allows us to change/add vhosts on the fly I would
>> like to change the code and put all vhosts handling in a gen_server
>> keeping states etc so it could be possible to changes vhosts in memory
>> depending on ini change. Also it would make vhost handling pluggable
>> so someone could imagine to create them via doc posted in a db.
>> 
>> What do you think ?
>> 
>> - benoit
> 


Re: 160-* etap test failure from time to time

Posted by Jan Lehnardt <ja...@apache.org>.
On 16 Aug 2010, at 13:00, Benoit Chesneau wrote:

> So I've found why 160- test fails from time to time:
> 
> Vhosts are loaded at startup when creating the mochiweb loop. So it
> only get depending how fast is your machine 1 result and maybe 0 for
> really fast machines since the ini isn't defined.

this sounds like a race condition that should be fixed in the test.

> Which also make it
> impossible to change vhosts on the fly apparently (via POST on
> /_config).

I do that all the time and it works. I'm not sure how you're drawing
your conclusion here :)

Cheers
Jan
-- 


> To solve that and allows us to change/add vhosts on the fly I would
> like to change the code and put all vhosts handling in a gen_server
> keeping states etc so it could be possible to changes vhosts in memory
> depending on ini change. Also it would make vhost handling pluggable
> so someone could imagine to create them via doc posted in a db.
> 
> What do you think ?
> 
> - benoit