Posted to dev@apr.apache.org by Michael Durket <du...@rlucier-home2.stanford.edu> on 2008/06/12 15:56:21 UTC
Re: Apache, Solaris, AcceptMutex and EDEADLK
(I originally sent this to Joe Orton, who suggested I post it to this
list instead):
I've recently been debugging an issue with Solaris, Apache and EDEADLK.
Turning to Google, I ran across several posts, including this fairly
recent one:
http://www.mail-archive.com/dev@apr.apache.org/msg19804.html
"The default was changed to fcntl because of the potential for deadlocks
in use of cross-process pthread mutexes:
http://marc.info/?l=apr-dev&m=108720968023158&w=2
are those issues not seen any more? Since that decision was due to a
potential OS bug (robust mutexes which aren't robust) has it been
confirmed with Sun that this fcntl/EDEADLK is definitely not an
OS bug?"
I don't know if a reply was ever received (I haven't found one yet in my
Google searching). From extensive DTrace debugging of Apache 2.2.8
locking behavior under Solaris 10, I can confirm that, at least in my
case, no, this is not a Solaris bug. Solaris is properly detecting the
classic deadlock case involving (at least) 2 locks, wherein process 1
holds lock A and wants lock B, and process 2 holds lock B and wants
lock A. I see this case occur in my DTrace output just before the
EDEADLK return.
This always involves the AcceptMutex and one other lock, which is
usually a global mutex. It occurs because the Worker MPM is both
threaded and multi-process, so it's quite possible for 2 threads in one
Worker MPM process to be involved with locks at the same time: one
holding the AcceptMutex, and the other wanting to lock, say, the
mod_rewrite RewriteLock. If another Worker MPM process then has one
thread holding the mod_rewrite RewriteLock and a second thread wanting
the AcceptMutex, EDEADLK will be returned to one of the waiters, because
Solaris tracks lock ownership at the process level, not the thread
level. If the locking were treated as being at the thread level, there
would be no deadlock, since neither of the threads holding a lock is
itself waiting on one.
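To make the process-level versus thread-level distinction concrete,
here is a small sketch in C. It is only a toy model of a wait-for graph,
not Solaris's actual detection code: the names wait_edge, waits_on_self,
deadlock_detected, scenario, by_thread and by_process are all made up
for illustration. It encodes the four threads described above and asks
whether any lock owner ends up waiting on itself, first with each thread
as its own owner and then with threads collapsed into their processes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OWNERS 8   /* owner ids must be in [0, MAX_OWNERS) */
#define MAX_EDGES  8

/* One blocked lock request: `waiter` wants a lock that `holder` holds. */
struct wait_edge {
    int waiter;
    int holder;
};

/* True if `start` transitively waits on itself in the wait-for graph. */
static bool waits_on_self(const struct wait_edge *edges, size_t n, int start)
{
    bool visited[MAX_OWNERS] = { false };
    int stack[MAX_OWNERS];
    size_t top = 0;

    stack[top++] = start;
    while (top > 0) {
        int cur = stack[--top];
        for (size_t i = 0; i < n; i++) {
            if (edges[i].waiter != cur)
                continue;
            int next = edges[i].holder;
            if (next == start)
                return true;            /* found a cycle back to start */
            if (!visited[next]) {
                visited[next] = true;
                stack[top++] = next;
            }
        }
    }
    return false;
}

/* Map each thread id to an owner id via owner_of[], then report whether
 * any owner waits on itself. Caller must pass n <= MAX_EDGES. */
static bool deadlock_detected(const struct wait_edge *thread_edges, size_t n,
                              const int *owner_of)
{
    struct wait_edge edges[MAX_EDGES];

    assert(n <= MAX_EDGES);
    for (size_t i = 0; i < n; i++) {
        edges[i].waiter = owner_of[thread_edges[i].waiter];
        edges[i].holder = owner_of[thread_edges[i].holder];
    }
    for (size_t i = 0; i < n; i++) {
        if (waits_on_self(edges, n, edges[i].waiter))
            return true;
    }
    return false;
}

/* The four threads from the scenario above (hypothetical ids). */
enum { P1_T1, P1_T2, P2_T1, P2_T2 };

static const struct wait_edge scenario[] = {
    { P1_T2, P2_T1 },   /* P1/T2 wants RewriteLock, held by P2/T1 */
    { P2_T2, P1_T1 },   /* P2/T2 wants AcceptMutex, held by P1/T1 */
};

static const int by_thread[]  = { 0, 1, 2, 3 };  /* each thread is an owner */
static const int by_process[] = { 0, 0, 1, 1 };  /* owner is the process    */
```

Calling deadlock_detected(scenario, 2, by_process) reports a cycle
(process 1 waits on process 2 and vice versa), while
deadlock_detected(scenario, 2, by_thread) does not, because the two
threads that hold locks are not waiting on anything.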
I've seen that, for some people, setting AcceptMutex pthread fixed a
similar problem, but I was concerned about your comment quoted above.
Have you heard whether or not the lock robustness problems with
cross-process pthread mutexes have been solved?
Sincerely yours,
Michael Durket