Posted to dev@apr.apache.org by Jim Jagielski <ji...@jaguNET.com> on 2009/03/30 20:07:18 UTC

Posix sems still not recommended?

Anyone know if:

   # POSIX semaphores and cross-process pthread mutexes are not
   # used by default since they have less desirable behaviour when
   # e.g. a process holding the mutex segfaults.

is still applicable, at least for posix sems?
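
For readers unfamiliar with the concern in that comment, here is a minimal sketch (not APR code) of what can happen when a process exits while holding a POSIX semaphore. The function name demo_sem_holder_exit and the semaphore name are made up for illustration, and the behaviour shown assumes a platform where sem_open() works across fork().

```c
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the semaphore is still held after the
 * holder exited, 0 if it was free again, -1 on error. */
int demo_sem_holder_exit(void)
{
    char name[32];
    sem_t *sem;
    pid_t pid;
    int stuck;

    /* Illustrative name only; initial value 1 makes it a binary semaphore. */
    snprintf(name, sizeof(name), "/demo_sem_%ld", (long)getpid());
    sem = sem_open(name, O_CREAT | O_EXCL, 0600, 1);
    if (sem == SEM_FAILED)
        return -1;
    sem_unlink(name); /* the semaphore stays usable until it is closed */

    pid = fork();
    if (pid < 0) {
        sem_close(sem);
        return -1;
    }
    if (pid == 0) {
        sem_wait(sem); /* child takes the semaphore ... */
        _exit(0);      /* ... and exits without calling sem_post() */
    }
    waitpid(pid, NULL, 0);

    /* Nothing posts the semaphore on behalf of the dead holder. */
    if (sem_trywait(sem) == 0) {
        sem_post(sem);
        stuck = 0;
    } else if (errno == EAGAIN) {
        stuck = 1;
    } else {
        stuck = -1;
    }
    sem_close(sem);
    return stuck;
}
```

This only illustrates the mechanism behind the comment; it says nothing about how often the situation arises in a real server.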

Re: Posix sems still not recommended?

Posted by Jeff Trawick <tr...@gmail.com>.
On Fri, Apr 24, 2009 at 2:10 PM, venkatnv <ve...@yahoo.com> wrote:

>
> Well ... I think we found the root cause: in one of the libraries being
> used, the mutex was not being initialized. Thanks!
>
>
> venkatnv wrote:
> >
> > We are observing issues with pthread mutexes on Apache22/Solaris10. Not
> > sure if this is relevant to this thread, but would appreciate any
> > input.
>

Thanks for following up on that ;)  It did seem to be a different issue
altogether.

Good luck!

Re: Posix sems still not recommended?

Posted by venkatnv <ve...@yahoo.com>.
Well ... I think we found the root cause: in one of the libraries being used,
the mutex was not being initialized. Thanks!
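
As an illustration of that class of bug, a minimal sketch of explicit mutex initialization follows. The struct and function names (module_state, module_state_init, module_state_increment) are hypothetical and are not taken from the library in question.

```c
#include <pthread.h>

/* Hypothetical state guarded by a mutex; names are illustrative only. */
struct module_state {
    pthread_mutex_t lock;
    long counter;
};

/* Initialize the mutex explicitly before any thread can use it.
 * Returns 0 on success or a pthread error number. */
int module_state_init(struct module_state *st)
{
    st->counter = 0;
    return pthread_mutex_init(&st->lock, NULL);
}

/* Increment the counter under the lock.
 * Returns 0 on success or a pthread error number. */
int module_state_increment(struct module_state *st)
{
    int rc = pthread_mutex_lock(&st->lock);
    if (rc != 0)
        return rc;
    st->counter++;
    return pthread_mutex_unlock(&st->lock);
}
```

Locking a pthread_mutex_t that was never initialized (via pthread_mutex_init() or PTHREAD_MUTEX_INITIALIZER) is undefined behaviour, so symptoms such as two threads apparently holding the lock at once are possible.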


-- 
View this message in context: http://www.nabble.com/Posix-sems-still-not-recommended--tp22789262p23222108.html
Sent from the APR Dev (Apache Portable Runtime) mailing list archive at Nabble.com.


Re: Posix sems still not recommended?

Posted by venkatnv <ve...@yahoo.com>.
We are observing issues with pthread mutexes on Apache22/Solaris10. Not sure
if this is relevant to this thread, but would appreciate any input.

- We are running Apache22 in worker mode. Apache22 is compiled with gcc346
on Solaris10.
- We have a custom module (DSO) loaded with Apache.

Under stress testing, we see that a mutex (pthread_mutex_lock) is not
working as intended.
To be precise, we are seeing core dumps, and further investigation revealed
that two threads have acquired the same lock using pthread_mutex_lock at
the same time.

Please note that we do not see this behavior on Apache2. It occurs only
with Apache22. Has anyone come across a similar situation? Any help in
narrowing down the cause would be greatly appreciated!

Regards,
Venkat.





RE: Posix sems still not recommended?

Posted by Daniel May <Da...@spryware.com>.
I have spent a considerable amount of time on this in the past few weeks, so let me elaborate on what I have found.

1. A few years ago, we implemented our own inter-process shared mutex.  We did this by storing a pthread_mutex_t in shared memory and setting the PTHREAD_PROCESS_SHARED attribute.  To handle the case where the owner of the mutex dies, we also store the pid of the current owner in the same shared memory segment alongside the pthread_mutex_t, and try to detect dead owners and transfer ownership.

2. We recently discovered that the creation and destruction of the mutex were not atomic: there was a race condition between the creation of the shared memory and the initialization of the pthread_mutex_t.  We solved this by using a binary semaphore (sem_open(), etc.) to protect the creation and deletion.

3. We considered simply using the binary semaphore as a replacement for the inter-process pthread_mutex, but it was more than an order of magnitude slower and did not natively support recursion.

4. The Mac OS X implementation of pthreads does not support PTHREAD_PROCESS_SHARED, so we still do not have a satisfactory solution for OS X.

5. In reviewing the latest pthreads source code for RH Linux, it appears that the "robust" implementation handles the abandoned-mutex problem much more smoothly, but this implementation is still rather new.
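
A minimal sketch of the idea in points 1 and 5, for illustration only: it places a pthread_mutex_t in shared memory with PTHREAD_PROCESS_SHARED, but uses the robust-mutex attribute (the POSIX.1-2008 names pthread_mutexattr_setrobust() and pthread_mutex_consistent()) instead of the pid-tracking scheme described above. The helper names shared_mutex_create, shared_mutex_lock and demo_owner_death are hypothetical, and the sketch assumes a platform that supports both process-shared and robust mutexes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS on glibc */
#endif
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a process-shared, robust mutex in anonymous shared memory.
 * Returns NULL on failure. */
pthread_mutex_t *shared_mutex_create(void)
{
    pthread_mutexattr_t attr;
    int rc;
    pthread_mutex_t *m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return NULL;
    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(m, sizeof(*m));
        return NULL;
    }
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0)
        rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        munmap(m, sizeof(*m));
        return NULL;
    }
    return m;
}

/* Lock the mutex. Returns 0 on a normal acquisition, 1 if the lock was
 * recovered from an owner that died while holding it, -1 on error. */
int shared_mutex_lock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* The protected data may be inconsistent; a real program would
         * repair it here before marking the mutex usable again. */
        return pthread_mutex_consistent(m) == 0 ? 1 : -1;
    }
    return rc == 0 ? 0 : -1;
}

/* Hypothetical demo: a child locks the mutex and exits without unlocking;
 * the parent then tries to take the lock. */
int demo_owner_death(void)
{
    int result;
    pid_t pid;
    pthread_mutex_t *m = shared_mutex_create();
    if (m == NULL)
        return -1;
    pid = fork();
    if (pid < 0) {
        munmap(m, sizeof(*m));
        return -1;
    }
    if (pid == 0) {
        pthread_mutex_lock(m);
        _exit(0); /* dies while holding the lock */
    }
    waitpid(pid, NULL, 0);
    result = shared_mutex_lock(m);
    if (result >= 0)
        pthread_mutex_unlock(m);
    pthread_mutex_destroy(m);
    munmap(m, sizeof(*m));
    return result;
}
```

The sketch does not address the creation race from point 2, because the mutex is initialized before fork(); a design where unrelated processes attach to a named segment would still need something like the semaphore guard described there.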

/Daniel



Re: Posix sems still not recommended?

Posted by Dale Ghent <da...@elemental.org>.
On Mar 31, 2009, at 2:59 AM, Rainer Jung wrote:

> On 30.03.2009 20:58, Jeff Trawick wrote:
>> On Mon, Mar 30, 2009 at 2:33 PM, Jeff Trawick <trawick@gmail.com
>> <ma...@gmail.com>> wrote:
>>
>>
>>
>> On Mon, Mar 30, 2009 at 2:07 PM, Jim Jagielski <jim@jagunet.com
>> <ma...@jagunet.com>> wrote:
>>
>> Anyone know if:
>>
>> # POSIX semaphores and cross-process pthread mutexes are not
>> # used by default since they have less desirable behaviour when
>> # e.g. a process holding the mutex segfaults.
>>
>> is still applicable, at least for posix sems?
>>
>>
>> AFAIK, the Solaris-specific recovery logic for cross-process pthread
>> mutexes has been working reliably for a long time, but with the
>> current wind direction APR is choosing fcntl(), which has sysdef
>> implementations on that
>>
>>
>> ugh; "sysdef implications"
>
> and quite often shows EDEADLOCK, even when you can prove there can't  
> be
> one. Especially when starting to use more than one lock of that type
> (e.g. when SSL comes into the game).
>
>> platform.
>>
>> no clues here about the POSIX semaphores
>
> I would be much interested in an answer as well. Because of the
> EDEADLOCK problems I did suggest using the pthread based mutex on
> Solaris for a while to people and got no problem reports. But what
> experience do others have?
>
> In a related thread on the Tomcat users list about mod_jk I wrote in
> February:
>
>  I now did some searching and it turns out that the implementation of
>  pthread mutexes for Solaris 10 has very recently changed quite a bit.
>  So all speculations about improved pthread mutex behaviour
>  (especially for "robust" mutexes) in the last years might have become
>  obsolete.
>
>  The new implementation is contained in Solaris kernel patch 137137-09
>  and most likely also in Solaris 10 Update 6 (10/08). I didn't check,
>  whether that update simply contains the kernel patch or the fix is
>  included independently.
>
>  Some detail is logged in Sunsolve under the bug IDs
>
>  6296770 2160259 6664275 6697344 6729759 6564706

137137-09 (sparc) and 137138-09 (x86) are the kernel revs that ship  
with s10u6, so they're in there if indeed these items were rolled into  
that patch rev.

/dale

Re: Posix sems still not recommended?

Posted by Rainer Jung <ra...@kippdata.de>.
On 30.03.2009 20:58, Jeff Trawick wrote:
> On Mon, Mar 30, 2009 at 2:33 PM, Jeff Trawick <trawick@gmail.com
> <ma...@gmail.com>> wrote:
>
>
>
> On Mon, Mar 30, 2009 at 2:07 PM, Jim Jagielski <jim@jagunet.com
> <ma...@jagunet.com>> wrote:
>
> Anyone know if:
>
> # POSIX semaphores and cross-process pthread mutexes are not
> # used by default since they have less desirable behaviour when
> # e.g. a process holding the mutex segfaults.
>
> is still applicable, at least for posix sems?
>
>
> AFAIK, the Solaris-specific recovery logic for cross-process pthread
> mutexes has been working reliably for a long time, but with the
> current wind direction APR is choosing fcntl(), which has sysdef
> implementations on that
>
>
> ugh; "sysdef implications"

and quite often shows EDEADLOCK, even when you can prove there can't be
one. Especially when starting to use more than one lock of that type
(e.g. when SSL comes into the game).

> platform.
>
> no clues here about the POSIX semaphores

I would be very interested in an answer as well. Because of the
EDEADLOCK problems, I have been suggesting the pthread-based mutex on
Solaris to people for a while and have received no problem reports. But
what experience do others have?
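
For context, a minimal sketch of an fcntl()-style cross-process lock of the kind discussed above. This is not APR's implementation; the names fcntl_mutex_lock, fcntl_mutex_unlock and demo_fcntl_release_on_exit, and the temporary file path, are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Apply a whole-file lock operation (F_WRLCK or F_UNLCK), retrying on EINTR.
 * Returns 0 on success or an errno value. The kernel may report EDEADLK
 * (called EDEADLOCK in this thread) if it believes waiting would deadlock. */
static int fcntl_mutex_op(int fd, short type)
{
    struct flock fl = {0};
    int rc;

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0; /* 0 means "to end of file", i.e. the whole file */
    do {
        rc = fcntl(fd, F_SETLKW, &fl);
    } while (rc < 0 && errno == EINTR);
    return rc < 0 ? errno : 0;
}

int fcntl_mutex_lock(int fd)   { return fcntl_mutex_op(fd, F_WRLCK); }
int fcntl_mutex_unlock(int fd) { return fcntl_mutex_op(fd, F_UNLCK); }

/* Hypothetical demo: a child takes the lock and exits without unlocking.
 * Returns 0 if the parent can then acquire the lock, an errno value if
 * locking fails, or -1 on setup errors. */
int demo_fcntl_release_on_exit(void)
{
    char path[] = "/tmp/fcntl_lock_demo_XXXXXX";
    int rc;
    pid_t pid;
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the open descriptor keeps the file alive */

    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        fcntl_mutex_lock(fd);
        _exit(0); /* record locks are dropped when the process exits */
    }
    waitpid(pid, NULL, 0);

    rc = fcntl_mutex_lock(fd);
    if (rc == 0)
        fcntl_mutex_unlock(fd);
    close(fd);
    return rc;
}
```

Because the kernel drops a process's record locks when it exits, a crashed holder does not leave the lock stuck, which is the usual argument for this mechanism despite the EDEADLOCK reports mentioned above.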

In a related thread on the Tomcat users list about mod_jk I wrote in
February:

   I now did some searching and it turns out that the implementation of
   pthread mutexes for Solaris 10 has very recently changed quite a bit.
   So all speculations about improved pthread mutex behaviour
   (especially for "robust" mutexes) in the last years might have become
   obsolete.

   The new implementation is contained in Solaris kernel patch 137137-09
   and most likely also in Solaris 10 Update 6 (10/08). I didn't check
   whether that update simply contains the kernel patch or whether the
   fix is included independently.

   Some detail is logged in Sunsolve under the bug IDs

   6296770 2160259 6664275 6697344 6729759 6564706

Regards,

Rainer

Re: Posix sems still not recommended?

Posted by Jeff Trawick <tr...@gmail.com>.
On Mon, Mar 30, 2009 at 2:33 PM, Jeff Trawick <tr...@gmail.com> wrote:

>
>
> On Mon, Mar 30, 2009 at 2:07 PM, Jim Jagielski <ji...@jagunet.com> wrote:
>
>> Anyone know if:
>>
>>  # POSIX semaphores and cross-process pthread mutexes are not
>>  # used by default since they have less desirable behaviour when
>>  # e.g. a process holding the mutex segfaults.
>>
>> is still applicable, at least for posix sems?
>
>
> AFAIK, the Solaris-specific recovery logic for cross-process pthread
> mutexes has been working reliably for a long time, but with the current wind
> direction APR is choosing fcntl(), which has sysdef implementations on that
>

ugh; "sysdef implications"


> platform.
>
> no clues here about the POSIX semaphores
>

Re: Posix sems still not recommended?

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Mar 30, 2009, at 2:33 PM, Jeff Trawick wrote:

>
>
> On Mon, Mar 30, 2009 at 2:07 PM, Jim Jagielski <ji...@jagunet.com>  
> wrote:
> Anyone know if:
>
>  # POSIX semaphores and cross-process pthread mutexes are not
>  # used by default since they have less desirable behaviour when
>  # e.g. a process holding the mutex segfaults.
>
> is still applicable, at least for posix sems?
>
> AFAIK, the Solaris-specific recovery logic for cross-process pthread  
> mutexes has been working reliably for a long time, but with the  
> current wind direction APR is choosing fcntl(), which has sysdef  
> implementations on that platform.
>
> no clues here about the POSIX semaphores

OS X has POSIX semaphores (and that's why I added them), and I can't
recreate any sort of doomsday scenario by causing segfaults
at "inopportune" times.

Anyone opposed if we, at least for 2.0/trunk, allow both to
be defaults?

Re: Posix sems still not recommended?

Posted by Jeff Trawick <tr...@gmail.com>.
On Mon, Mar 30, 2009 at 2:07 PM, Jim Jagielski <ji...@jagunet.com> wrote:

> Anyone know if:
>
>  # POSIX semaphores and cross-process pthread mutexes are not
>  # used by default since they have less desirable behaviour when
>  # e.g. a process holding the mutex segfaults.
>
> is still applicable, at least for posix sems?


AFAIK, the Solaris-specific recovery logic for cross-process pthread mutexes
has been working reliably for a long time, but with the current wind
direction APR is choosing fcntl(), which has sysdef implementations on that
platform.

no clues here about the POSIX semaphores

-- 
Born in Roswell... married an alien...