You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apr.apache.org by Brian Pane <bp...@pacbell.net> on 2001/07/27 07:50:58 UTC

sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Justin Erenkrantz wrote:

>On Wed, Jul 25, 2001 at 10:29:35AM -0700, Brian Pane wrote:
>
>>Lately the httpd won't build when configured with
>>--enable-sms, because of unresolvable references to
>>apr_pool_child_cleanup_set in file_io/unix/open.c.
>>Is there supposed to be an implementation of this
>>function in the sms-pools code, or am I missing
>>something?
>>
>
>I just committed what I think it should be.  Check it out and let 
>me know if it doesn't work.  We *really* need people to look at
>the code and see if they can find segfaults in the SMS code now.
>
Thanks, it compiles now.

But there's a problem with the SMS lock management.
According to gprof, every call to apr_sms_trivial_malloc
acquires and releases a lock.

Ths logic in apr_sms_post_init and apr_sms_thread_register
looks broken.  Every time an SMS is created, it gets registered
with the thread that creates it.  And registration with even
a single thread results in the creation of a lock for the SMS.
Isn't apr_sms_thread_register supposed to create a lock for
the SMS if and only if the count of threads registered with
the SMS is greater than 1?

--Brian




Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Posted by Brian Pane <bp...@pacbell.net>.
Luke Kenneth Casson Leighton wrote:

>On Fri, Jul 27, 2001 at 12:05:02AM -0700, Brian Pane wrote:
>
>>hmmm...looking at the code, it makes sense that SMS is
>>half as fast as the original pools code.  I didn't realize
>>this until just now, but the polymorphism in the SMS framework
>>will probably make it impossible to match the performance of pools:
>>
>
>>* The SMS-based implementation has to do essentially the same
>>  work, but it also does an extra function call (apr_sms_malloc
>>  calls apr_sms_trivial_malloc).
>>
>
>okay: how about this.  in the cases where fast-optimisation
>is really really needed, how about calling the apr_sms_xxx_yyy()
>functions or even just function, direct?
>
In Apache, there can be hundreds of apr_palloc calls per request .
To avoid performance degradation, we'd have to bypass apr_sms_malloc
and call apr_sms_trivial_malloc for almost all of them.  And
then we'd have to change them all if we switched from sms_trivial
to some other type of SMS.

I was thinking about a macro-based solution,

#define apr_palloc(sms, n) (sms)->malloc_fn((sms), (n))

but even that has problems (it requires exposing the struct
definitions in apr_sms_private.h to code outside of APR).

--Brian




Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Posted by Cliff Woolley <cl...@yahoo.com>.
On Fri, 27 Jul 2001, Luke Kenneth Casson Leighton wrote:

> > * The SMS-based implementation has to do essentially the same
> >   work, but it also does an extra function call (apr_sms_malloc
> >   calls apr_sms_trivial_malloc).
>
> okay: how about this.  in the cases where fast-optimisation
> is really really needed, how about calling the apr_sms_xxx_yyy()
> functions or even just function, direct?

Weren't we going to try switching apr_sms_malloc() from a function to a
macro?  You'd still need an extra pointer indirection, but that's better
than a full-blown function call...

--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Posted by Luke Kenneth Casson Leighton <lk...@samba-tng.org>.
On Fri, Jul 27, 2001 at 12:05:02AM -0700, Brian Pane wrote:

> hmmm...looking at the code, it makes sense that SMS is
> half as fast as the original pools code.  I didn't realize
> this until just now, but the polymorphism in the SMS framework
> will probably make it impossible to match the performance of pools:
> 

> * The SMS-based implementation has to do essentially the same
>   work, but it also does an extra function call (apr_sms_malloc
>   calls apr_sms_trivial_malloc).

okay: how about this.  in the cases where fast-optimisation
is really really needed, how about calling the apr_sms_xxx_yyy()
functions or even just function, direct?

luke

Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Posted by Bill Stoddard <bi...@wstoddard.com>.
> On Fri, 27 Jul 2001, David Reid wrote:
>
> > Did we ever get anywhere with changing the buckets code to use sms?
>
> I've still got some half-written patches laying around.  I put them aside
> awhile back because I was busy with other things, because SMS was in a
> state of grand flux, and because Brian's gprof twiddlings showed that the
> use of malloc in the buckets code was not the #1 performance killer I'd
> thought it would be.  Later discussions by Dean et al revealed that recent
> mallocs seem to have made great strides in the performance arena.  <shrug>
> Anyhow, I'll get back to the buckets SMS thing, because I still want to
> try it... after Apache 2 beta 2.  =-)
>

malloc/free is a performance killer on many operating systems.  I would be completely
happy using pools where possible and a malloc/free SMS for the bucket code.

Bill


Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Posted by Cliff Woolley <cl...@yahoo.com>.
On Fri, 27 Jul 2001, David Reid wrote:

> Did we ever get anywhere with changing the buckets code to use sms?

I've still got some half-written patches laying around.  I put them aside
awhile back because I was busy with other things, because SMS was in a
state of grand flux, and because Brian's gprof twiddlings showed that the
use of malloc in the buckets code was not the #1 performance killer I'd
thought it would be.  Later discussions by Dean et al revealed that recent
mallocs seem to have made great strides in the performance arena.  <shrug>
Anyhow, I'll get back to the buckets SMS thing, because I still want to
try it... after Apache 2 beta 2.  =-)

--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Posted by David Reid <dr...@jetnet.co.uk>.
Guys,

If nothing else this discussion and the work done has demonstrated that the
pools code is actually pretty damn good for what we're doing with it in
apache.  The SMS code has uses, but they're probably not going to be in
replacing pools as the basic, underlying memory allocation system in APR :)

I can live with that as at least now I think we're all happier that for
general allocation pools do a good job.

Maybe we should now try to look at where we can improve the pools code...

When looking at the code it did strike me that we now have recursive locks
in APR so maybe some of the locking can be tidied up and streamlined a
little.

Did we ever get anywhere with changing the buckets code to use sms?

Sorry I haven't been following as closely recently but I've been bogged down
in other stuff...

BTW, gprof works just fine for me!

david

> Justin Erenkrantz wrote:
>
> >On Thu, Jul 26, 2001 at 10:50:58PM -0700, Brian Pane wrote:
> >
> >>But there's a problem with the SMS lock management.
> >>According to gprof, every call to apr_sms_trivial_malloc
> >>acquires and releases a lock.
> >>
> >
> >Yup.  I'm working on this right now.  =)  (What I have in my tree
> >right now isn't in commit shape.)  Email me and I can send you
> >my tree diff.
> >
> >I'm debating whether we're getting a win by the SMS stuff or not.
> >In order to do that, we need to kick out every lock we can.
> >
> Yep, I've been wondering the same thing lately.
> The original pool implementation of apr_palloc is
> very close to optimal.  In the last benchmark that
> I did with non-mod_include requests, 99.998% of the
> calls to apr_palloc didn't require locking.  So it's
> going to be tough to improve upon the original design.
>
> The biggest opportunity that I see for optimization
> in the memory-management framework is in sub-pool
> creation.  If an SMS-based implementation can reduce
> the cost of creating sub-pools, it will speed up
> mod_include.
>
> But first, we may have a more fundamental problem:
>
> >Right now, I've got it so that most of the locks are now in libc
> >(aka NIMBY), but the performance still doesn't match pools (by a
> >factor of 2).  I'm scratching my head as to why this is.
> >
> hmmm...looking at the code, it makes sense that SMS is
> half as fast as the original pools code.  I didn't realize
> this until just now, but the polymorphism in the SMS framework
> will probably make it impossible to match the performance of pools:
>
> * apr_palloc (the original pools version) is a very lightweight
>   function in the fast-path case where it doesn't need to
>   acquire a lock.  It consists of a couple of integer/pointer
>   arithmetic operations and two comparisons.
>
> * The SMS-based implementation has to do essentially the same
>   work, but it also does an extra function call (apr_sms_malloc
>   calls apr_sms_trivial_malloc).
>
> * If the cost of a function call is similar to the cost of
>   the two comparisons and half-dozen arithmetic operations
>   in apr_palloc, that would explain why the SMS version is
>   twice as slow.
>
> >-- justin
> >
> >P.S. You are using gprof, how?  I tried -pg and it just doesn't
> >work.  I switched to Forte 6.0U1's collect program now.  It
> >actually writes out info that I can use (er_print is a bit
> >awkward though).
> >
> I'm using gcc on Linux to build profiled code; it's not properly
> including a profiled libc for reasons that I haven't had time to
> debug yet, but it does a decent job of instrumenting the apr and
> httpd code.
>
> --Brian
>
>
>


Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
From: "Brian Pane" <bp...@pacbell.net>
Sent: Friday, July 27, 2001 2:05 AM


> Justin Erenkrantz wrote:
> 
> >Right now, I've got it so that most of the locks are now in libc
> >(aka NIMBY), but the performance still doesn't match pools (by a
> >factor of 2).  I'm scratching my head as to why this is.  
> >
> hmmm...looking at the code, it makes sense that SMS is
> half as fast as the original pools code.  I didn't realize
> this until just now, but the polymorphism in the SMS framework
> will probably make it impossible to match the performance of pools:
> 
> * apr_palloc (the original pools version) is a very lightweight
>   function in the fast-path case where it doesn't need to
>   acquire a lock.  It consists of a couple of integer/pointer
>   arithmetic operations and two comparisons.
> 
> * The SMS-based implementation has to do essentially the same
>   work, but it also does an extra function call (apr_sms_malloc
>   calls apr_sms_trivial_malloc).
> 
> * If the cost of a function call is similar to the cost of
>   the two comparisons and half-dozen arithmetic operations
>   in apr_palloc, that would explain why the SMS version is
>   twice as slow.

You just answered your own question.  Instead of conditionals and stepping
around the locking code, it should be the thread attach/detach that decides
(when it decrements to 1 or increments above 1 thread) to flip the fn
pointers in the SMS to the lock-safe or lock-free functions.  Since the
indirection is _already_ one pain point, make it a productive one :)

Bill


Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Posted by Brian Pane <bp...@pacbell.net>.
Justin Erenkrantz wrote:

>On Thu, Jul 26, 2001 at 10:50:58PM -0700, Brian Pane wrote:
>
>>But there's a problem with the SMS lock management.
>>According to gprof, every call to apr_sms_trivial_malloc
>>acquires and releases a lock.
>>
>
>Yup.  I'm working on this right now.  =)  (What I have in my tree 
>right now isn't in commit shape.)  Email me and I can send you
>my tree diff.
>
>I'm debating whether we're getting a win by the SMS stuff or not.
>In order to do that, we need to kick out every lock we can.
>
Yep, I've been wondering the same thing lately.
The original pool implementation of apr_palloc is
very close to optimal.  In the last benchmark that
I did with non-mod_include requests, 99.998% of the
calls to apr_palloc didn't require locking.  So it's
going to be tough to improve upon the original design.

The biggest opportunity that I see for optimization
in the memory-management framework is in sub-pool
creation.  If an SMS-based implementation can reduce
the cost of creating sub-pools, it will speed up
mod_include.

But first, we may have a more fundamental problem:

>Right now, I've got it so that most of the locks are now in libc
>(aka NIMBY), but the performance still doesn't match pools (by a
>factor of 2).  I'm scratching my head as to why this is.  
>
hmmm...looking at the code, it makes sense that SMS is
half as fast as the original pools code.  I didn't realize
this until just now, but the polymorphism in the SMS framework
will probably make it impossible to match the performance of pools:

* apr_palloc (the original pools version) is a very lightweight
  function in the fast-path case where it doesn't need to
  acquire a lock.  It consists of a couple of integer/pointer
  arithmetic operations and two comparisons.

* The SMS-based implementation has to do essentially the same
  work, but it also does an extra function call (apr_sms_malloc
  calls apr_sms_trivial_malloc).

* If the cost of a function call is similar to the cost of
  the two comparisons and half-dozen arithmetic operations
  in apr_palloc, that would explain why the SMS version is
  twice as slow.

>-- justin
>
>P.S. You are using gprof, how?  I tried -pg and it just doesn't
>work.  I switched to Forte 6.0U1's collect program now.  It
>actually writes out info that I can use (er_print is a bit
>awkward though).
>
I'm using gcc on Linux to build profiled code; it's not properly
including a profiled libc for reasons that I haven't had time to
debug yet, but it does a decent job of instrumenting the apr and
httpd code.

--Brian



Re: sms_trivial locking Re: missing apr_pool_child_cleanup_set when using SMS?

Posted by Justin Erenkrantz <je...@ebuilt.com>.
On Thu, Jul 26, 2001 at 10:50:58PM -0700, Brian Pane wrote:
> But there's a problem with the SMS lock management.
> According to gprof, every call to apr_sms_trivial_malloc
> acquires and releases a lock.

Yup.  I'm working on this right now.  =)  (What I have in my tree 
right now isn't in commit shape.)  Email me and I can send you
my tree diff.

I'm debating whether we're getting a win by the SMS stuff or not.
In order to do that, we need to kick out every lock we can.

Right now, I've got it so that most of the locks are now in libc
(aka NIMBY), but the performance still doesn't match pools (by a
factor of 2).  I'm scratching my head as to why this is.  
-- justin

P.S. You are using gprof, how?  I tried -pg and it just doesn't
work.  I switched to Forte 6.0U1's collect program now.  It
actually writes out info that I can use (er_print is a bit
awkward though).