You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Jeff Trawick <tr...@gmail.com> on 2006/03/15 22:34:35 UTC

pool use/mutex initialization in util_ldap not thread safe?

Plz forgive any misunderstanding, as well as my use of 2.0 function
names ;)  Also, for being slow at learning what ldap stands for.  I
know this code has been hashed over many many times over the last few
years.

util_ldap_create_config() creates the per-server config for util_ldap.
 That saves a config-time pool in the per-server config.

util_ldap_connection_find() is called on a request and allocates a
mutex using st->pool without holding a mutex, if the mutex wasn't
created before.  IOW, the first <very-small-number> of threads in that
process that try to find a connection will allocate a mutex. 
Hopefully very-small-number is one

(This is exactly the type of bug I had the displeasure of diagnosing
in a proprietary module some time back; somebody eons ago had deferred
some initialization until the first request, assuming incorrectly that
there was no way that at the time of the first request for a certain
vhost that there would actually be 2+ concurrent requests stomping on
each other.  one of the baby bells proved otherwise continually until
the problem was found and fixed)

The child init hook really needs to talk the server_recs and create
the mutex there, right?

Also, is the pool growth of the config-time pool reasonable?

What happens when some other module cheats and uses a config-time pool
on request processing thread, even if it is "smart" enough to protect
the pool misuse with a mutex?  (kaboom)

If it is reasonable to grow some pool over time during the request
processing, shouldn't it be an ldap-unique pool?

Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Nick Kew <ni...@webthing.com>.
On Wednesday 15 March 2006 21:34, Jeff Trawick wrote:

[ disclaimer: I haven't looked at the code in question, but I've
dealt with the issue I mention in mod_dbd, and gone through
the logic at greater length for the book ]

> util_ldap_create_config() creates the per-server config for util_ldap.
>  That saves a config-time pool in the per-server config.
>
> The child init hook really needs to talk the server_recs and create
> the mutex there, right?

That makes sense.

There's a somewhat similar issue in mod_dbd in 2.2.0, using pchild
where it should be using its own pool and a mutex.  This is fixed in
trunk (thanks mostly to Chris Darroch) and is currently languishing
at the top of PATCHES PROPOSED TO BACKPORT FROM TRUNK.
Perhaps people could look at that whilst on the subject of pools usage?

> Also, is the pool growth of the config-time pool reasonable?

Is it growing per-request?  If so it definitely wants some kind of
workaround, such as a private pool with garbage-cleaning.

> If it is reasonable to grow some pool over time during the request
> processing, shouldn't it be an ldap-unique pool?

Agreed.

-- 
Nick Kew

Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Jeff Trawick <tr...@gmail.com>.
On 3/16/06, Graham Leggett <mi...@sharp.fm> wrote:
> Jeff Trawick wrote:
>
> > if ldap server times out the connection and we have to bring one back
> > up, that is no pool growth, right?  we just get pool growth when we
> > talk to an LDAP server we haven't already talked to yet, or when?
>
> Dead LDAP connections (timed out, server gone away, etc) are repaired
> rather than replaced, in other words the record is reused. In theory,
> the pool should never get bigger than the maximum number of simultaneous
> connections we have ever had open at one time.

Great, so fix should be very straightforward.

Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Graham Leggett <mi...@sharp.fm>.
Jeff Trawick wrote:

> if ldap server times out the connection and we have to bring one back
> up, that is no pool growth, right?  we just get pool growth when we
> talk to an LDAP server we haven't already talked to yet, or when?

Dead LDAP connections (timed out, server gone away, etc) are repaired 
rather than replaced, in other words the record is reused. In theory, 
the pool should never get bigger than the maximum number of simultaneous 
connections we have ever had open at one time.

Regards,
Graham
--

Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Jeff Trawick <tr...@gmail.com>.
On 3/16/06, Brad Nicholes <BN...@novell.com> wrote:
> >>> On 3/16/2006 at 7:12 am, in message
> <cc...@mail.gmail.com>, "Jeff
> Trawick"
> <tr...@gmail.com> wrote:
> > On 3/16/06, Ruediger Pluem <rp...@apache.org> wrote:
> >>
> >>
> >> On 03/16/2006 03:49 AM, Jeff Trawick wrote:
> >> > On 3/15/06, Brad Nicholes  wrote:
> >>
> >> >
> >> > That is really one pool globally but there is a mutex per
> server_rec.
> >> > So a thread handling a request for one vhost grabs the mutex and
> uses
> >> > the pool but that doesn't protect from a thread handling a request
> for
> >> > a different vhost which grabs a different mutex supposedly
> protecting
> >> > the same pool.
> >>
> >> Would it help to create sub-pools per server_rec from the config
> pool?
> >
> > That sounds like a definite improvement, but I'm eager for LDAP
> gurus
> > to explain how pool growth is mitigated in the current design
> > (assuming thread safety) before jumping to a conclusion
>
> The use of the pool is actually fairly limited.  It is used to create
> and initialize the reusable LDAP resources such as ldap connections.
> The pool of ldap connections can grow but that is based on the amount of
> traffic that requires LDAP authentication.  Even under very heavy use,
> the number of connections in the connection pool might be ~20 but even
> that is a generous number.

if ldap server times out the connection and we have to bring one back
up, that is no pool growth, right?  we just get pool growth when we
talk to an LDAP server we haven't already talked to yet, or when?

>                    Any per-request memory allocations are done
> from the request pool.  So the potential for collisions within the
> global memory pool, although not 0, would be very low.

"very low" == still broken but much much harder to debug than "high"

>
> The purpose of the mutex is not necessarily to protect the memory pool
> but to protect the ldap connection pool.  Whenever mod_ldap needs to
> pull a connection from the ldap connection pool, it first grabs the
> mutex so that it is free to search for a connection and modify its
> parameters without having to worry about other threads doing the same
> thing.  Since most of the global memory pool usage, outside of module
> initialization, also occurs after the connection pool mutex has been
> locked, this also lessens the chance of memory pool collisions.
>
> However, given all of there, I still think that there are things that
> need to be cleaned up especially with the mutex allocation and use in a
> worker MPM environment.

I interpret this as meaning that the following would resolve the
problems (race condition for mutex initialization between multiple
threads for the same vhost and race condition for pool use between
multiple threads in different vhosts):

1) this code needs LDAP-specific subpool of pchild* for each
LDAP-enabled vhost, initialized in child-init hook
2) the vhost-specific mutex is initialized in the child-init hook
(walk the server_rec list)

*do these objects need to get cleaned up when the process goes away? 
maybe it shouldn't be a subpool at all but instead a freestanding pool
(I can't remember at the moment if a freestanding pool keeps it from
being cleaned up at child process exit)

or

1) one global subpool of pchild* for each LDAP-enabled vhost,
initialized in child-init hook
2) one global mutex initialized in the child-init hook

It just depends on expected collision rate.

I see some pthread_mutex_trylock usage, which makes me think the
collision rate can be high, but it is unclear that segregating by
vhost makes much of an improvement.

Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Brad Nicholes <BN...@novell.com>.
>>> On 3/16/2006 at 7:12 am, in message
<cc...@mail.gmail.com>, "Jeff
Trawick"
<tr...@gmail.com> wrote:
> On 3/16/06, Ruediger Pluem <rp...@apache.org> wrote:
>>
>>
>> On 03/16/2006 03:49 AM, Jeff Trawick wrote:
>> > On 3/15/06, Brad Nicholes  wrote:
>>
>> >
>> > That is really one pool globally but there is a mutex per
server_rec.
>> > So a thread handling a request for one vhost grabs the mutex and
uses
>> > the pool but that doesn't protect from a thread handling a request
for
>> > a different vhost which grabs a different mutex supposedly
protecting
>> > the same pool.
>>
>> Would it help to create sub-pools per server_rec from the config
pool?
> 
> That sounds like a definite improvement, but I'm eager for LDAP
gurus
> to explain how pool growth is mitigated in the current design
> (assuming thread safety) before jumping to a conclusion

The use of the pool is actually fairly limited.  It is used to create
and initialize the reusable LDAP resources such as ldap connections. 
The pool of ldap connections can grow but that is based on the amount of
traffic that requires LDAP authentication.  Even under very heavy use,
the number of connections in the connection pool might be ~20 but even
that is a generous number.  Any per-request memory allocations are done
from the request pool.  So the potential for collisions within the
global memory pool, although not 0, would be very low.

The purpose of the mutex is not necessarily to protect the memory pool
but to protect the ldap connection pool.  Whenever mod_ldap needs to
pull a connection from the ldap connection pool, it first grabs the
mutex so that it is free to search for a connection and modify its
parameters without having to worry about other threads doing the same
thing.  Since most of the global memory pool usage, outside of module
initialization, also occurs after the connection pool mutex has been
locked, this also lessens the chance of memory pool collisions.

However, given all of there, I still think that there are things that
need to be cleaned up especially with the mutex allocation and use in a
worker MPM environment.

Brad

Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Brad Nicholes <BN...@novell.com>.
>>> On 3/16/2006 at 11:34 am, in message
<cc...@mail.gmail.com>, "Jeff
Trawick"
<tr...@gmail.com> wrote:
> On 3/16/06, Brad Nicholes <BN...@novell.com> wrote:
>> >>> On 3/16/2006 at 7:12 am, in message
>> <cc...@mail.gmail.com>,
"Jeff
>> Trawick"
>> <tr...@gmail.com> wrote:
>> > On 3/16/06, Ruediger Pluem <rp...@apache.org> wrote:
>> >>
>> >>
>> >> On 03/16/2006 03:49 AM, Jeff Trawick wrote:
>> >> > On 3/15/06, Brad Nicholes  wrote:
>> >>
>> >> >
>> >> > That is really one pool globally but there is a mutex per
>> server_rec.
>> >> > So a thread handling a request for one vhost grabs the mutex
and
>> uses
>> >> > the pool but that doesn't protect from a thread handling a
request
>> for
>> >> > a different vhost which grabs a different mutex supposedly
>> protecting
>> >> > the same pool.
>> >>
>> >> Would it help to create sub-pools per server_rec from the config
>> pool?
>> >
>> > That sounds like a definite improvement, but I'm eager for LDAP
>> gurus
>> > to explain how pool growth is mitigated in the current design
>> > (assuming thread safety) before jumping to a conclusion
>>
>> The use of the pool is actually fairly limited.  It is used to
create
>> and initialize the reusable LDAP resources such as ldap
connections.
>> The pool of ldap connections can grow but that is based on the
amount of
>> traffic that requires LDAP authentication.  Even under very heavy
use,
>> the number of connections in the connection pool might be ~20 but
even
>> that is a generous number.
> 
> if ldap server times out the connection and we have to bring one
back
> up, that is no pool growth, right?  we just get pool growth when we
> talk to an LDAP server we haven't already talked to yet, or when?

No, there is no growth there.  The cached or pooled information that
consumes memory, remains the same.  So there is no need to allocate more
memory.  If a connection times out or is broken for whatever reason, it
is just the connection that needs to be re-established given all of the
information that currently exists.  Also the growth does just occur when
we start talking to a new LDAP server.  It occurs when we run out of
connections to an LDAP server and require a new one.  This may involve
more than one LDAP server but in most cases it is just multiple
connections to the same LDAP server.

> 
>>                    Any per-request memory allocations are done
>> from the request pool.  So the potential for collisions within the
>> global memory pool, although not 0, would be very low.
> 
> "very low" == still broken but much much harder to debug than "high"

Right which is why I believe that there is work to be done here.


> 
>>
>> The purpose of the mutex is not necessarily to protect the memory
pool
>> but to protect the ldap connection pool.  Whenever mod_ldap needs
to
>> pull a connection from the ldap connection pool, it first grabs the
>> mutex so that it is free to search for a connection and modify its
>> parameters without having to worry about other threads doing the
same
>> thing.  Since most of the global memory pool usage, outside of
module
>> initialization, also occurs after the connection pool mutex has
been
>> locked, this also lessens the chance of memory pool collisions.
>>
>> However, given all of there, I still think that there are things
that
>> need to be cleaned up especially with the mutex allocation and use
in a
>> worker MPM environment.
> 
> I interpret this as meaning that the following would resolve the
> problems (race condition for mutex initialization between multiple
> threads for the same vhost and race condition for pool use between
> multiple threads in different vhosts):
> 
> 1) this code needs LDAP-specific subpool of pchild* for each
> LDAP-enabled vhost, initialized in child-init hook
> 2) the vhost-specific mutex is initialized in the child-init hook
> (walk the server_rec list)
> 
> *do these objects need to get cleaned up when the process goes away?

> maybe it shouldn't be a subpool at all but instead a freestanding
pool
> (I can't remember at the moment if a freestanding pool keeps it from
> being cleaned up at child process exit)
> 
> or
> 
> 1) one global subpool of pchild* for each LDAP-enabled vhost,
> initialized in child-init hook
> 2) one global mutex initialized in the child-init hook
> 
> It just depends on expected collision rate.
> 
> I see some pthread_mutex_trylock usage, which makes me think the
> collision rate can be high, but it is unclear that segregating by
> vhost makes much of an improvement

I am still looking at the code, but my gut feel is that child-init may
not be the best place to create the pool or the mutex.  All of this is
vhost based and should remained separated in that way.  Another problem
is that memory allocations need to occur during config time which
happens before child-init gets called.

Brad

Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Jeff Trawick <tr...@gmail.com>.
On 3/16/06, Ruediger Pluem <rp...@apache.org> wrote:
>
>
> On 03/16/2006 03:49 AM, Jeff Trawick wrote:
> > On 3/15/06, Brad Nicholes  wrote:
>
> >
> > That is really one pool globally but there is a mutex per server_rec.
> > So a thread handling a request for one vhost grabs the mutex and uses
> > the pool but that doesn't protect from a thread handling a request for
> > a different vhost which grabs a different mutex supposedly protecting
> > the same pool.
>
> Would it help to create sub-pools per server_rec from the config pool?

That sounds like a definite improvement, but I'm eager for LDAP gurus
to explain how pool growth is mitigated in the current design
(assuming thread safety) before jumping to a conclusion.

Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Ruediger Pluem <rp...@apache.org>.

On 03/16/2006 03:49 AM, Jeff Trawick wrote:
> On 3/15/06, Brad Nicholes  wrote:

> 
> That is really one pool globally but there is a mutex per server_rec. 
> So a thread handling a request for one vhost grabs the mutex and uses
> the pool but that doesn't protect from a thread handling a request for
> a different vhost which grabs a different mutex supposedly protecting
> the same pool.

Would it help to create sub-pools per server_rec from the config pool?

Regards

RĂ¼diger


Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Jeff Trawick <tr...@gmail.com>.
On 3/15/06, Brad Nicholes <BN...@novell.com> wrote:

> I think you are right.  I am going to take a closer look at that code
> and see about fixing both the mutex problem and the use of the config
> pool.  This could actually explain some funny things that I have been
> seeing on the NetWare build lately.

Something else to make sure is resolved by any fixes:

Forget the race condition initializing the mutex for a sec...

That is really one pool globally but there is a mutex per server_rec. 
So a thread handling a request for one vhost grabs the mutex and uses
the pool but that doesn't protect from a thread handling a request for
a different vhost which grabs a different mutex supposedly protecting
the same pool.

Re: pool use/mutex initialization in util_ldap not thread safe?

Posted by Brad Nicholes <BN...@novell.com>.
>>> On 3/15/2006 at 2:34 pm, in message
<cc...@mail.gmail.com>, "Jeff
Trawick"
<tr...@gmail.com> wrote:
> Plz forgive any misunderstanding, as well as my use of 2.0 function
> names ;)  Also, for being slow at learning what ldap stands for.  I
> know this code has been hashed over many many times over the last
few
> years.
> 
> util_ldap_create_config() creates the per-server config for
util_ldap.
>  That saves a config-time pool in the per-server config.
> 
> util_ldap_connection_find() is called on a request and allocates a
> mutex using st->pool without holding a mutex, if the mutex wasn't
> created before.  IOW, the first <very-small-number> of threads in
that
> process that try to find a connection will allocate a mutex. 
> Hopefully very-small-number is one
> 
> (This is exactly the type of bug I had the displeasure of diagnosing
> in a proprietary module some time back; somebody eons ago had
deferred
> some initialization until the first request, assuming incorrectly
that
> there was no way that at the time of the first request for a certain
> vhost that there would actually be 2+ concurrent requests stomping
on
> each other.  one of the baby bells proved otherwise continually
until
> the problem was found and fixed)
> 
> The child init hook really needs to talk the server_recs and create
> the mutex there, right?
> 
> Also, is the pool growth of the config-time pool reasonable?
> 
> What happens when some other module cheats and uses a config-time
pool
> on request processing thread, even if it is "smart" enough to
protect
> the pool misuse with a mutex?  (kaboom)
> 
> If it is reasonable to grow some pool over time during the request
> processing, shouldn't it be an ldap-unique pool

I think you are right.  I am going to take a closer look at that code
and see about fixing both the mutex problem and the use of the config
pool.  This could actually explain some funny things that I have been
seeing on the NetWare build lately.

Brad