Posted to dev@httpd.apache.org by Sander Temme <sc...@covalent.net> on 2001/10/10 01:44:07 UTC

Accept locking test code strangeness

Hi all,

I am currently running some tests at the Open Source Development Lab
<http://www.osdl.org/> to try to determine which locking method is the most
efficient on large multi-processor boxes running Linux. I'm using the
time-sem.c program in apache-1.3/src/test, which runs fine except for the
pthread mutex case. What happens is that, at runtime, in the following
snippet (around line 305):

    if (pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED)) {
        perror ("pthread_mutexattr_setpshared");
        exit (1);
    }

the pthread_mutexattr_setpshared call returns 38 (ENOSYS, function not
implemented) but does not set errno... Shouldn't I expect the function call
to return -1 in case of distress and to find something like ENOSYS in errno?
This is on Linux 2.4.8.
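
For reference, POSIX specifies that the pthread_* functions report failure
through their return value (an error number) rather than by returning -1 and
setting errno, which matches the behavior described above. perror() reads
errno, so it is not a reliable way to report such a failure. A minimal sketch,
assuming a hypothetical helper named probe_process_shared() that is not part
of time-sem.c, which reports the error from the return code instead:

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from time-sem.c): returns 0 if the
 * process-shared attribute can be set, otherwise the error number
 * returned by the failing pthread call. */
int probe_process_shared(void)
{
    pthread_mutexattr_t mattr;
    int rc;

    rc = pthread_mutexattr_init(&mattr);
    if (rc != 0) {
        fprintf(stderr, "pthread_mutexattr_init: %s\n", strerror(rc));
        return rc;
    }

    rc = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED);
    if (rc != 0) {
        /* The error number is the return value (e.g. ENOSYS);
         * errno is not consulted here. */
        fprintf(stderr, "pthread_mutexattr_setpshared: %s\n", strerror(rc));
    }

    pthread_mutexattr_destroy(&mattr);
    return rc;
}
```

A caller would treat any nonzero result as "process-shared mutexes are not
usable here" and fall back to another locking method.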

Now I am no pthreads wizard, but shouldn't this just work? Or is the
pthreads implementation in Linux less than complete? I did get Apache itself
to run with pthread mutex accept locking without protest or core files.

Any advice appreciated,

Sander

-- 
Covalent Technologies                             sctemme@covalent.net
Engineering group                                Voice: (415) 536 5214
645 Howard St.                                     Fax: (415) 536 5210
San Francisco CA 94105

   PGP Fingerprint: 1E74 4E58 DFAC 2CF5 6A03  5531 AFB1 96AF B584 0AB1


Re: Accept locking test code strangeness

Posted by Jeff Trawick <tr...@attglobal.net>.
Justin Erenkrantz <je...@ebuilt.com> writes:

> On Wed, Oct 10, 2001 at 07:38:35AM -0400, Jeff Trawick wrote:
> > FYI... He's talking about the test-sem program, not Apache.  With
> > Apache, we don't enable pthread mutexes on Linux (see ap_config.h), so
> > you can't say "acceptmutex pthread".
> 
> To quote Sander (now that my email is caught up):
> 
> > I did get Apache itself to run with pthread mutex accept locking
> > without protest or core files.
> 
> If I'm misunderstanding him, my apologies.  =)  

I doubt that it is a misunderstanding, but either

1) he modified Apache 1.3 sources to allow "AcceptMutex pthread" on
   Linux (a trivial thing to do)

or

2) there is a bug which I can't reproduce and which should be fixed ASAP

-- 
Jeff Trawick | trawick@attglobal.net | PGP public key at web site:
       http://www.geocities.com/SiliconValley/Park/9289/
             Born in Roswell... married an alien...

Re: Accept locking test code strangeness

Posted by Justin Erenkrantz <je...@ebuilt.com>.
On Wed, Oct 10, 2001 at 07:38:35AM -0400, Jeff Trawick wrote:
> FYI... He's talking about the test-sem program, not Apache.  With
> Apache, we don't enable pthread mutexes on Linux (see ap_config.h), so
> you can't say "acceptmutex pthread".

To quote Sander (now that my email is caught up):

> I did get Apache itself to run with pthread mutex accept locking
> without protest or core files.

If I'm misunderstanding him, my apologies.  =)  

Apache 1.3 should have exited with an error just like the test 
program does.  -- justin


Re: Accept locking test code strangeness

Posted by Jeff Trawick <tr...@attglobal.net>.
Justin Erenkrantz <je...@ebuilt.com> writes:

> [ I still haven't received this message, but saw it on apachelabs.org. ]
> 
> [ Blah, blah, blah, Sander says that pthread mutexes don't work on
>   Linux.  ENOSYS is returned when calling setpshared.  ]
> 
> Yup.  PROCESS_SHARED is defined, but not implemented on Linux 2.4/
> glibc 2+.  Therefore, no cross-process locks based on pthreads on 
> Linux will work.  It will be merely PROCESS_PRIVATE which doesn't do 
> you a lot of good in Apache 1.3 which is based on a prefork model.
> 
> Basically, if you say AcceptMutex pthread on 1.3, you will be running 
> without an accept mutex - which I believe on later Linux versions is 
> actually okay.  IIRC, Dean and Linus went back and forth on this I 
> think on lkml - thundering herd isn't much of a problem on recent 
> versions - Dean will have more concrete knowledge of this though.

FYI... He's talking about the test-sem program, not Apache.  With
Apache, we don't enable pthread mutexes on Linux (see ap_config.h), so
you can't say "acceptmutex pthread".

-- 
Jeff Trawick | trawick@attglobal.net | PGP public key at web site:
       http://www.geocities.com/SiliconValley/Park/9289/
             Born in Roswell... married an alien...

Re: Accept locking test code strangeness

Posted by Dirk-Willem van Gulik <di...@covalent.net>.
On Wed, 10 Oct 2001, Justin Erenkrantz wrote:

> > lack thereof) in the multiple listener case.  They are required to avoid
> > race conditions that can deadlock your server with the 1.3 process/accept
> > model.
>
> In fact, I bet that is why Sander thought that pthread mutex worked
> on Linux - he didn't have a multiple listener config, so the mutexes
> never get called.

Actually - just to be clear; what we are doing at the OSDL is
systematically trying ALL accept locking mechanisms on a 1-16 CPU machine
to get a better idea of the cost of the locking methods.
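
The kind of measurement described above can be sketched roughly as follows.
This is not the time-sem.c code; it is a minimal illustration, assuming a
hypothetical function time_fcntl_lock() and a caller-supplied lock file path,
that times a number of forked children contending on one fcntl() lock. The
elapsed time includes the cost of fork() and wait(), so it is only a crude
upper bound on the locking cost.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Take (F_WRLCK) or release (F_UNLCK) a whole-file fcntl lock,
 * blocking until the request is granted. */
static int whole_file_lock(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;          /* l_start = 0, l_len = 0: whole file */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical benchmark: forks nchildren processes that each perform
 * `iterations` lock/unlock cycles on lock_path. Returns elapsed
 * wall-clock seconds, or -1.0 on error. */
double time_fcntl_lock(const char *lock_path, int nchildren, int iterations)
{
    struct timeval start, end;
    int i, j, forked = 0, failed = 0;
    int fd = open(lock_path, O_CREAT | O_RDWR, 0600);

    if (fd < 0)
        return -1.0;

    gettimeofday(&start, NULL);
    for (i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            /* fcntl locks are owned per process, so the inherited
             * descriptor can be used by every child. */
            for (j = 0; j < iterations; j++) {
                if (whole_file_lock(fd, F_WRLCK) < 0 ||
                    whole_file_lock(fd, F_UNLCK) < 0)
                    _exit(1);
            }
            _exit(0);
        }
        forked++;
    }
    for (i = 0; i < forked; i++)
        wait(NULL);                  /* reap every child we started */
    gettimeofday(&end, NULL);
    close(fd);

    if (failed)
        return -1.0;
    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_usec - start.tv_usec) / 1e6;
}
```

Running it with the child count stepped from 1 up to the number of CPUs, and
repeating for each locking method, would give a cost curve of the sort the
tests are after.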

This was prompted by a SunTone certification procedure Covalent's product
(which is directly based on Apache) had to go through - where we
discovered that on 8+ CPU machines an (in our opinion) excessive number of
CPU cycles were spent on locking and inter-CPU communication.

Obviously this is mostly/totally a function of the OS; not of Apache.  We
just work on the assumption that Apache needs some sort of
accept lock.

Dw


Re: Accept locking test code strangeness

Posted by Justin Erenkrantz <je...@ebuilt.com>.
On Wed, Oct 10, 2001 at 08:01:30AM -0700, Marc Slemko wrote:
> On Tue, 9 Oct 2001, Justin Erenkrantz wrote:
> 
> > Basically, if you say AcceptMutex pthread on 1.3, you will be running 
> > without an accept mutex - which I believe on later Linux versions is 
> > actually okay.  IIRC, Dean and Linus went back and forth on this I 
> > think on lkml - thundering herd isn't much of a problem on recent 
> > versions - Dean will have more concrete knowledge of this though.
> 
> Once again, accept mutexes are _NOT_ just a performance optimization (or
> lack thereof) in the multiple listener case.  They are required to avoid
> race conditions that can deadlock your server with the 1.3 process/accept
> model.

In fact, I bet that is why Sander thought that pthread mutex worked
on Linux - he didn't have a multiple listener config, so the mutexes
never get called.

Out of curiosity, where are the race conditions - in the system or in
our code?  I've seen the comments about this, but I never fully
understood the problem or where it originates.  -- justin


Re: Accept locking test code strangeness

Posted by Marc Slemko <ma...@znep.com>.
On Wed, 10 Oct 2001, Ryan Bloom wrote:

> On Wednesday 10 October 2001 08:01 am, Marc Slemko wrote:
> > On Tue, 9 Oct 2001, Justin Erenkrantz wrote:
> > > Basically, if you say AcceptMutex pthread on 1.3, you will be running
> > > without an accept mutex - which I believe on later Linux versions is
> > > actually okay.  IIRC, Dean and Linus went back and forth on this I
> > > think on lkml - thundering herd isn't much of a problem on recent
> > > versions - Dean will have more concrete knowledge of this though.
> >
> > Once again, accept mutexes are _NOT_ just a performance optimization (or
> > lack thereof) in the multiple listener case.  They are required to avoid
> > race conditions that can deadlock your server with the 1.3 process/accept
> > model.
> 
> I agree 100%, but I want to add that if the deadlock can occur in the 1.3 process
> model, it can just as easily occur in the threaded model.  It shouldn't matter
> if it is one thread per process sitting in accept, or 64 threads per process sitting
> in accept.  If a deadlock can occur, it will occur in either case.  Therefore, the
> accept mutex is still required in the 2.0 thread/process model.

Right, but you can imagine (well, do a lot more than imagine...) a MPM
where it isn't required, since the race is between the select() on
multiple listening sockets and the accept() on one.  If your process model
doesn't do that, then it doesn't have the race.
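
One reading of the race described above: with several listening sockets,
more than one child can return from select() for the same socket; one child
wins the accept(), and the others can then sit in a blocking accept() on that
socket while connections on the other listeners go unserved. Serializing the
select()/accept() pair avoids this. The sketch below illustrates that idea
only; it is not the Apache 1.3 code. The names serialized_accept() and
make_loopback_listener() are hypothetical, and an fcntl() file lock is used
as the mutex purely for illustration.

```c
#define _DEFAULT_SOURCE
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Take (F_WRLCK) or release (F_UNLCK) a whole-file fcntl lock. */
static int accept_lock(int lock_fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;          /* l_start = 0, l_len = 0: whole file */
    return fcntl(lock_fd, F_SETLKW, &fl);
}

/* Hypothetical helper: waits for a connection on any listener and accepts
 * it, holding the lock across both select() and accept() so that only one
 * process at a time is in that window. Returns the connected fd or -1. */
int serialized_accept(int lock_fd, const int *listeners, int nlisteners)
{
    fd_set readable;
    int i, maxfd = -1, conn = -1;

    if (accept_lock(lock_fd, F_WRLCK) < 0)
        return -1;

    FD_ZERO(&readable);
    for (i = 0; i < nlisteners; i++) {
        FD_SET(listeners[i], &readable);
        if (listeners[i] > maxfd)
            maxfd = listeners[i];
    }

    if (select(maxfd + 1, &readable, NULL, NULL, NULL) > 0) {
        for (i = 0; i < nlisteners; i++) {
            if (FD_ISSET(listeners[i], &readable)) {
                conn = accept(listeners[i], NULL, NULL);
                break;
            }
        }
    }

    accept_lock(lock_fd, F_UNLCK);
    return conn;
}

/* Hypothetical helper for trying the sketch: creates a listener on an
 * ephemeral loopback port and stores the bound address in *addr. */
int make_loopback_listener(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;              /* let the kernel pick a port */
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
        listen(fd, 16) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A process model with a single listening socket, or one that never does a
select() followed by an accept() across several sockets, does not have this
window, which is the point made above.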


Re: Accept locking test code strangeness

Posted by Ryan Bloom <rb...@covalent.net>.
On Wednesday 10 October 2001 08:01 am, Marc Slemko wrote:
> On Tue, 9 Oct 2001, Justin Erenkrantz wrote:
> > Basically, if you say AcceptMutex pthread on 1.3, you will be running
> > without an accept mutex - which I believe on later Linux versions is
> > actually okay.  IIRC, Dean and Linus went back and forth on this I
> > think on lkml - thundering herd isn't much of a problem on recent
> > versions - Dean will have more concrete knowledge of this though.
>
> Once again, accept mutexes are _NOT_ just a performance optimization (or
> lack thereof) in the multiple listener case.  They are required to avoid
> race conditions that can deadlock your server with the 1.3 process/accept
> model.

I agree 100%, but I want to add that if the deadlock can occur in the 1.3 process
model, it can just as easily occur in the threaded model.  It shouldn't matter
if it is one thread per process sitting in accept, or 64 threads per process sitting
in accept.  If a deadlock can occur, it will occur in either case.  Therefore, the
accept mutex is still required in the 2.0 thread/process model.

Ryan

______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

Re: Accept locking test code strangeness

Posted by Marc Slemko <ma...@znep.com>.
On Tue, 9 Oct 2001, Justin Erenkrantz wrote:

> Basically, if you say AcceptMutex pthread on 1.3, you will be running 
> without an accept mutex - which I believe on later Linux versions is 
> actually okay.  IIRC, Dean and Linus went back and forth on this I 
> think on lkml - thundering herd isn't much of a problem on recent 
> versions - Dean will have more concrete knowledge of this though.

Once again, accept mutexes are _NOT_ just a performance optimization (or
lack thereof) in the multiple listener case.  They are required to avoid
race conditions that can deadlock your server with the 1.3 process/accept
model.


Re: Accept locking test code strangeness

Posted by Justin Erenkrantz <je...@ebuilt.com>.
[ I still haven't received this message, but saw it on apachelabs.org. ]

[ Blah, blah, blah, Sander says that pthread mutexes don't work on
  Linux.  ENOSYS is returned when calling setpshared.  ]

Yup.  PROCESS_SHARED is defined, but not implemented on Linux 2.4/
glibc 2+.  Therefore, no cross-process locks based on pthreads on 
Linux will work.  It will be merely PROCESS_PRIVATE which doesn't do 
you a lot of good in Apache 1.3 which is based on a prefork model.
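
To make the distinction concrete: a mutex is only useful across fork()ed
processes if it lives in memory that all of them share and was initialized
with the PTHREAD_PROCESS_SHARED attribute. The following is a minimal sketch
of that arrangement on a platform where the attribute is implemented. The
struct shared_state layout and the function run_shared_counter() are
hypothetical names used for illustration, and the sketch returns -1 where
process-shared mutexes are unavailable.

```c
#define _DEFAULT_SOURCE
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout: the mutex and the data it protects live together
 * in one anonymous shared mapping inherited by every forked child. */
struct shared_state {
    pthread_mutex_t lock;
    long counter;
};

/* Forks nchildren processes that each add `iterations` to a shared counter
 * under a process-shared mutex. Returns the final counter, or -1 on error
 * (including when PTHREAD_PROCESS_SHARED is not supported). */
long run_shared_counter(int nchildren, int iterations)
{
    pthread_mutexattr_t mattr;
    struct shared_state *st;
    long result;
    int i, j, forked = 0, failed = 0;

    st = mmap(NULL, sizeof(*st), PROT_READ | PROT_WRITE,
              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED)
        return -1;
    st->counter = 0;

    if (pthread_mutexattr_init(&mattr) != 0) {
        munmap(st, sizeof(*st));
        return -1;
    }
    /* pthread calls return an error number; nonzero means failure. */
    if (pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutex_init(&st->lock, &mattr) != 0) {
        pthread_mutexattr_destroy(&mattr);
        munmap(st, sizeof(*st));
        return -1;
    }
    pthread_mutexattr_destroy(&mattr);

    for (i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            for (j = 0; j < iterations; j++) {
                pthread_mutex_lock(&st->lock);
                st->counter++;
                pthread_mutex_unlock(&st->lock);
            }
            _exit(0);
        }
        forked++;
    }
    for (i = 0; i < forked; i++)
        wait(NULL);                  /* reap every child we started */

    result = failed ? -1 : st->counter;
    pthread_mutex_destroy(&st->lock);
    munmap(st, sizeof(*st));
    return result;
}
```

With a process-private mutex the children would not exclude one another at
all, which is why such a mutex is of no use as an accept lock in a prefork
server.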

Basically, if you say AcceptMutex pthread on 1.3, you will be running 
without an accept mutex - which I believe on later Linux versions is 
actually okay.  IIRC, Dean and Linus went back and forth on this I 
think on lkml - thundering herd isn't much of a problem on recent 
versions - Dean will have more concrete knowledge of this though.

FWIW, APR does an explicit check at configure-time to check for this 
bogosity and ends up disabling pthread cross-process locking on Linux.  
So, AcceptMutex pthread is not valid with Apache 2.0 on Linux.  =)  
-- justin