You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by Manuel Teira <mt...@tid.es> on 2008/06/05 19:31:22 UTC
Problems with Mutex initialization
Hello.
It seems that the dreaded Mutex initialization assert is not completely
gone. After compiling the unit_test executable, it asserts while trying
to run it, in qpid::sys::Mutex::Mutex(), trying to use a non initialized
recursiveMutexAttr.
The abort stack indicates that it is happening in the
qpid::sys::DeletionManager::AllThreadStatuses int constructor, I suppose
that while creating the static allThreadStatuses object.
I suspect that it has something to do with static initialization order.
So, I replaced the QPID_POSIX_ASSERT_THROW_IF wrapping the
pthread_mutex_init call in qpid::sys::Mutex::Mutex(), to allow the
program to continue, with:
if (pthread_mutex_init(&mutex, recursiveMutexattr) != 0) {
std::cout << "Error initializing mutex: " << &mutex
<< ". " << std::strerror(errno) << std::endl;
}
I also added some logs in the initMutexattr function:
std::cout << "initMutexattr[" << pthread_self() << "](once: "
<< &onceControl << ", mutexattr: " << &mutexattr
<< ")" << std::endl;
in the Mutex constructor and in the AllThreadStatuses constructor to
verify that the mutex is being used before the recursiveMutexAttr gets
initialized:
initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
at 0x0000000100400758
Error initializing mutex: 0xffffffff7e14e680. Error 0
AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
[snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
I suspect that the static nature of AllThreadStatuses could be involved.
Any thought?
Best regards.
--
Manuel.
Re: Problems with Mutex initialization
Posted by Manuel Teira <mt...@tid.es>.
Alan Conway escribió:
> Manuel Teira wrote:
>
>> Hello.
>>
>> It seems that the dreaded Mutex initialization assert is not completely
>> gone. After compiling the unit_test executable, it asserts while trying
>> to run it, in qpid::sys::Mutex::Mutex(), trying to use a non initialized
>> recursiveMutexAttr.
>>
>> The abort stack indicates that it is happening in the
>> qpid::sys::DeletionManager::AllThreadStatuses int constructor, I suppose
>> that while creating the static allThreadStatuses object.
>>
>
> Can you post a stack trace and the perror message that went with the failure?
>
>
Yes. Sorry for the delay, but I didn't save the core file previous to
the change in the Mutex.h. And since this file is included from a lot of
places I had to recompile again the whole project, that takes near 90
minutes on the hardware I'm using. This is what I get now:
bash-3.00$ ./unit_test
initMutexattr[1](once: 0xffffffff7e14e100, mutexattr: 0xffffffff7e1533d0)
initMutexattr[1](once: 0xffffffff7e14e420, mutexattr: 0xffffffff7e1533e8)
Invalid argument
Assertion failed: 0, file ../../src/qpid/sys/posix/Mutex.h, line 138
Abort (core dumped)
bash-3.00$ pstack core | c++filt
core 'core' of 25632:
/export/home/devel/ws/DSLAP/qpid/trunk/qpid/cpp/src/tests/.libs/unit_t
ffffffff7cdca124 _lwp_kill (6, 0, ffffffff7cdad9d0, ffffffff7cee0000,
ffffffffffffffff, 1800) + 8
ffffffff7cd4620c abort (1, 1b8, 0, 199f00, 0, 0) + 118
ffffffff7cd464b0 _assert (1002d51b1, 1002d51b3, 8a, 0, 199bb8, 0) + 74
00000001000ee444 qpid::sys::Mutex::Mutex #Nvariant 1()
(ffffffff7e14e480, ffffffff7e110938, 0, ffffffff7defe620,
ffffffff7f604f40, 1) + bc
ffffffff7df01c34
qpid::sys::DeletionManager<qpid::sys::PollerHandle>::AllThreadsStatuses::AllThreadsStatuses
#Nvariant 1(int) (ffffffff7e14e480, 0, ffffffff7d910300,
8000000000000000, 1493e0, 80000000) + 24
ffffffff7defe620 __SLIP.INIT_J (0, 0, 0, 0, 0, 0) + 28
ffffffff7df01394 void __STATIC_CONSTRUCTOR() (0, ffffffff7ceeae40,
199648, ffffffff7d3086b4, ffffffff7bb005c0, ffffffff7bb00600) + 4c
ffffffff7df99958 _init (0, 0, ffffffff7f72cac0, ffffffff7f611144,
11f368, ffffffff7bc02030) + 498
ffffffff7f61114c call_init (ffffffff7f72f620, 1, ffffffff7df994c0,
ffffffff7ef01198, ffdfffff, ffffffff7f72cac0) + 1b0
ffffffff7f610614 setup (ffffffff7f72c2e8, ffffffff7f72fd90,
ffffffff7f72cac0, 0, ffffffff7bd011b8, ffffffff7f72c1c0) + 1358
ffffffff7f61ebb8 _setup (12e4b8, b00, 28e30, 100000040, 0,
ffffffff7ffffbd8) + 3a4
ffffffff7f604f10 _rt_boot (0, 0, 0, 0, 0, 0) + 88
0000000000000000 ???????? (0, 0, 0, 0, 0, 0)
This is everything I was able to get, using the pstack utility on the
generated core file, as dbx refuses to cooperate, just SIGSEGV itself
trying to inspect this core file. I didn't try to load the dbx core
inside dbx itself, fearing some relativity effect.
The error is an EINVAL, and trusting the previous logs, it's caused by
the pthread_mutexattr_t not being initialized previously.
>> I suspect that it has something to do with static initialization order.
>> So, I replaced the QPID_POSIX_ASSERT_THROW_IF wrapping the
>> pthread_mutex_init call in qpid::sys::Mutex::Mutex(), to allow the
>> program to continue, with:
>>
>> if (pthread_mutex_init(&mutex, recursiveMutexattr) != 0) {
>> std::cout << "Error initializing mutex: " << &mutex
>> << ". " << std::strerror(errno) << std::endl;
>> }
>>
>> I also added some logs in the initMutexattr function:
>> std::cout << "initMutexattr[" << pthread_self() << "](once: "
>> << &onceControl << ", mutexattr: " << &mutexattr
>> << ")" << std::endl;
>> in the Mutex constructor and in the AllThreadStatuses constructor to
>> verify that the mutex is being used before the recursiveMutexAttr gets
>> initialized:
>>
>> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
>> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
>> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
>> at 0x0000000100400758
>> Error initializing mutex: 0xffffffff7e14e680. Error 0
>> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
>>
>> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
>> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
>>
>>
>> I suspect that the static nature of AllThreadStatuses could be involved.
>> Any thought?
>>
>>
>
> Order of static initializers would be my favourite suspect also. I'll take a
> look when I have a stack trace.
> .
>
>
Re: Problems with Mutex initialization
Posted by Alan Conway <ac...@redhat.com>.
Manuel Teira wrote:
> Hello.
>
> It seems that the dreaded Mutex initialization assert is not completely
> gone. After compiling the unit_test executable, it asserts while trying
> to run it, in qpid::sys::Mutex::Mutex(), trying to use a non initialized
> recursiveMutexAttr.
>
> The abort stack indicates that it is happening in the
> qpid::sys::DeletionManager::AllThreadStatuses int constructor, I suppose
> that while creating the static allThreadStatuses object.
Can you post a stack trace and the perror message that went with the failure?
> I suspect that it has something to do with static initialization order.
> So, I replaced the QPID_POSIX_ASSERT_THROW_IF wrapping the
> pthread_mutex_init call in qpid::sys::Mutex::Mutex(), to allow the
> program to continue, with:
>
> if (pthread_mutex_init(&mutex, recursiveMutexattr) != 0) {
> std::cout << "Error initializing mutex: " << &mutex
> << ". " << std::strerror(errno) << std::endl;
> }
>
> I also added some logs in the initMutexattr function:
> std::cout << "initMutexattr[" << pthread_self() << "](once: "
> << &onceControl << ", mutexattr: " << &mutexattr
> << ")" << std::endl;
> in the Mutex constructor and in the AllThreadStatuses constructor to
> verify that the mutex is being used before the recursiveMutexAttr gets
> initialized:
>
> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
> at 0x0000000100400758
> Error initializing mutex: 0xffffffff7e14e680. Error 0
> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
>
> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
>
>
> I suspect that the static nature of AllThreadStatuses could be involved.
> Any thought?
>
Order of static initializers would be my favourite suspect also. I'll take a
look when I have a stack trace.
Re: Problems with Mutex initialization
Posted by Manuel Teira <mt...@tid.es>.
Andrew Stitcher escribió:
> On Thu, 2008-06-05 at 19:31 +0200, Manuel Teira wrote:
>
>> ...
>> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
>> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
>> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
>> at 0x0000000100400758
>> Error initializing mutex: 0xffffffff7e14e680. Error 0
>> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
>>
>> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
>> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
>>
>>
>> I suspect that the static nature of AllThreadStatuses could be involved.
>> Any thought?
>>
>
> I'm pretty sure you are correct about the static initialisation order.
>
> One problem that your trace points out though is that we will get a
> _different_ initialisation for each file including Mutex.h due to the
> anonymous namespace used here - this wasn't the intention of this code.
> I guess this needs to be fixed by creating a Mutex.cpp to hold this
> anonymous namespace. However I'm not sure whether or not this is the
> cause of your problem,although it can't help.
>
Yes. My first thought was that the initMutexattr object was intended as
a singleton, hence shared by all the Mutex instances in the process and
saving memory and CPU. But since it goes in the .h file, every
compilation unit is inlining its own initMutexattr.
> Andrew
>
>
>
>
Re: Problems with Mutex initialization
Posted by Manuel Teira <mt...@tid.es>.
Alan Conway escribió:
> Andrew Stitcher wrote:
>
>> On Thu, 2008-06-05 at 19:31 +0200, Manuel Teira wrote:
>>
>>> ...
>>> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
>>> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
>>> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
>>> at 0x0000000100400758
>>> Error initializing mutex: 0xffffffff7e14e680. Error 0
>>> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
>>>
>>> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
>>> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
>>>
>>>
>>> I suspect that the static nature of AllThreadStatuses could be involved.
>>> Any thought?
>>>
>> I'm pretty sure you are correct about the static initialisation order.
>>
>> One problem that your trace points out though is that we will get a
>> _different_ initialisation for each file including Mutex.h due to the
>> anonymous namespace used here - this wasn't the intention of this code.
>> I guess this needs to be fixed by creating a Mutex.cpp to hold this
>> anonymous namespace. However I'm not sure whether or not this is the
>> cause of your problem,although it can't help.
>>
>>
>
> I think I may see the problem. Is all AllThreadsStatuses called (directly or
> indirectly) from static constructors in other compilation units?
>
AllThreadStatuses is initialized in the static zone of the poller
implementations, both EPollPoller and ECFPoller. I think this is the
only place it's used, but I'm not completely sure.
> An object that is called from static constructors in other compilation units has
> to be written in a rather special way - it has to be a POD and be POD
> initialized - i.e. no constructors or destructors and no members/bases with
> ctors/dtors.
>
> We have a PODMutex for exactly that situation but if AllThreadStatuses is indeed
> being called from other compilation units during static init then it needs a
> redesign - both the Mutex and std::vector members have constructors. It may need
> to be rewritten as a singleton.
>
> Andrew do you know the circumstances under which this is called? Manuel - can
> you post a stack trace? That should answer the question.
>
> Cheers,
> Alan.
> .
>
>
Re: Problems with Mutex initialization
Posted by Alan Conway <ac...@redhat.com>.
On Fri, 2008-06-27 at 14:05 +0200, Manuel Teira wrote:
> Alan Conway escribió:
> > On Wed, 2008-06-25 at 11:12 +0200, Manuel Teira wrote:
> >
> >> Hello. After a pair of weeks dealing with random, uninteresting stuff,
> >> I'm back to the solaris port effort. I would like to wake up this issue
> >> again, since it's a stopper for the solaris testing I'm trying to do.
> >>
> >> After all this discussion, what seems to be true, is:
> >> 1.-The static initialization order is not the desired one.
> >> 2.-Inlining the Mutex.h annonymous code in every Mutex usage, produces
> >> multiple instances of onceControl and mutexattr, whereas the intended
> >> behaviour was to have single instances, to be shared across the code.
> >> 3.-About the PODMutex discussion, it is not needed here, since the
> >> pthread_once_t usage is giving us the desired "one-shot" behaviour.
> >>
> >> It was also said, that fixing #2 could lead to get #1 fixed. Anyway,
> >> what is the recommended technique to "force" static initialization if
> >> having #2 fixed is not helping?
> >>
> >>
> >
> > Just committed a fix that removes the need for a global object with a
> > constructor so should fix any init order problems. Also took this code
> > out of line so it's genuinely once, not once-per-compilation-unit. Shout
> > if it doesn't fix the problem or creates any new ones.
> >
> Great, it got fixed!!
>
> Most of the tests are passing now, but running the federation tests the
> old dynamic_cast monster related with the Sun compiler bug arises again.
> The problem was due to AsynchConnector being implemented in terms of
> DispatchHandle, so, I had to change that inheritance to be public also.
>
> I would like to show you how I've handled this workaround, to see if
> this is an acceptable way to do it:
>
> class AsynchConnector :
> //Bug http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6710638
> //Should be private inheritance
> #ifdef __SUNPRO_CC
> public
> #else
> private
> #endif
> DispatchHandle {
> public:
> ...
>
> The same for AsynchIO. This way, we will only use that inheritance when
> using the Sun Studio compiler, at least while that bug doesn't get
> fixed. Perhaps we should vote for it to get fixed. ;-)
>
> Also, I would like to know what is the state of the JIRA issues I
> submitted for the solaris port (QPID-1132 and QPID-1133), I mean, did
> you have time to look at them, are there any problem with them,...?
>
> I have another bunch of changes, most related with inclusion
> differences. Perhaps they could be added without wrapping them in a
> #ifdef clause, but I'm not sure. My plan is to open a new JIRA for those
> changes, and expose them to your consideration. What do you think? (I'm
> talking about this kind of modifications) :
>
> -bash-3.00$ svn diff ../../src/qpid/sys/posix/AsynchIO.cpp
> Index: ../../src/qpid/sys/posix/AsynchIO.cpp
> ===================================================================
> --- ../../src/qpid/sys/posix/AsynchIO.cpp (revision 671815)
> +++ ../../src/qpid/sys/posix/AsynchIO.cpp (working copy)
> @@ -19,6 +19,10 @@
> *
> */
>
> +#ifdef HAVE_CONFIG_H
> +#include "config.h"
> +#endif
> +
> #include "qpid/sys/AsynchIO.h"
> #include "qpid/sys/Time.h"
>
> @@ -31,6 +35,11 @@
> #include <signal.h>
> #include <errno.h>
>
> +#ifdef SUNOS
> +//strerror, memmove prototypes
> +#include <string.h>
> +#endif
> +
> #include <boost/bind.hpp>
>
> using namespace qpid::sys;
>
>
> Do you agree with this way to handle it, or should we just include
> string.h directly? Perhaps somebody should test all these changes on a
> linux box, before deciding to add them with or without the #ifdef wrapping.
I would suggest including them directly. There are many places where we
are picking up system headers we need "by accident" because they are
included in other #include files. Making all the required #includes
explicit is good for portability.
I can commit these changes Monday, if I do stumble across any headers
that are not available on linux that is a good thing as it may highlight
a point were we need some more abstractions in qpid::sys.
Re: Problems with Mutex initialization
Posted by Alan Conway <ac...@redhat.com>.
On Fri, 2008-06-27 at 11:28 -0400, Andrew Stitcher wrote:
> On Fri, 2008-06-27 at 14:05 +0200, Manuel Teira wrote:
> > ...
> > class AsynchConnector :
> > //Bug http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6710638
> > //Should be private inheritance
> > #ifdef __SUNPRO_CC
> > public
> > #else
> > private
> > #endif
> > DispatchHandle {
> > public:
> > ...
> >
>
> I must say I find this terribly ugly, but I suppose it does point out
> the cause of the change.
It is ugly but I can't think of any less-ugly fix that preserves the
private inheritance on non-Sun platforms.
> So take out the #include "config.h" stuff and just include "string.h"
> directly. At very worst it can't cause problems on Linux.
>
> There are very few cases now where you need to do conditional includes
> along these lines.
And in those cases the conditional includes should be isolated in a
qpid/sys header file defining an abstraction that can be implemented
differently on different platforms.
Re: Problems with Mutex initialization
Posted by Andrew Stitcher <as...@redhat.com>.
On Fri, 2008-06-27 at 14:05 +0200, Manuel Teira wrote:
> ...
> class AsynchConnector :
> //Bug http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6710638
> //Should be private inheritance
> #ifdef __SUNPRO_CC
> public
> #else
> private
> #endif
> DispatchHandle {
> public:
> ...
>
I must say I find this terribly ugly, but I suppose it does point out
the cause of the change.
> I have another bunch of changes, most related with inclusion
> differences. Perhaps they could be added without wrapping them in a
> #ifdef clause, but I'm not sure. My plan is to open a new JIRA for those
> changes, and expose them to your consideration. What do you think? (I'm
> talking about this kind of modifications) :
>
> -bash-3.00$ svn diff ../../src/qpid/sys/posix/AsynchIO.cpp
> Index: ../../src/qpid/sys/posix/AsynchIO.cpp
> ===================================================================
> --- ../../src/qpid/sys/posix/AsynchIO.cpp (revision 671815)
> +++ ../../src/qpid/sys/posix/AsynchIO.cpp (working copy)
> @@ -19,6 +19,10 @@
> *
> */
>
> +#ifdef HAVE_CONFIG_H
> +#include "config.h"
> +#endif
> +
> #include "qpid/sys/AsynchIO.h"
> #include "qpid/sys/Time.h"
>
> @@ -31,6 +35,11 @@
> #include <signal.h>
> #include <errno.h>
>
> +#ifdef SUNOS
> +//strerror, memmove prototypes
> +#include <string.h>
> +#endif
> +
> #include <boost/bind.hpp>
I really think this kind of change is unnecessary and misguided. There
is actually no platform portability issue here. Simply, the includes are
incorrect.
It is easy to do if you (as I often do myself) just add #include lines
in response to compile errors. However the correct thing to do (not
often done) is to add include lines when the standard says they are
necessary and not rely on some random include file you included to get
definitions for you.
So take out the #include "config.h" stuff and just include "string.h"
directly. At very worst it can't cause problems on Linux.
There are very few cases now where you need to do conditional includes
along these lines.
Andrew
Re: Problems with Mutex initialization
Posted by Manuel Teira <mt...@tid.es>.
Alan Conway escribió:
> On Wed, 2008-06-25 at 11:12 +0200, Manuel Teira wrote:
>
>> Hello. After a pair of weeks dealing with random, uninteresting stuff,
>> I'm back to the solaris port effort. I would like to wake up this issue
>> again, since it's a stopper for the solaris testing I'm trying to do.
>>
>> After all this discussion, what seems to be true, is:
>> 1.-The static initialization order is not the desired one.
>> 2.-Inlining the Mutex.h annonymous code in every Mutex usage, produces
>> multiple instances of onceControl and mutexattr, whereas the intended
>> behaviour was to have single instances, to be shared across the code.
>> 3.-About the PODMutex discussion, it is not needed here, since the
>> pthread_once_t usage is giving us the desired "one-shot" behaviour.
>>
>> It was also said, that fixing #2 could lead to get #1 fixed. Anyway,
>> what is the recommended technique to "force" static initialization if
>> having #2 fixed is not helping?
>>
>>
>
> Just committed a fix that removes the need for a global object with a
> constructor so should fix any init order problems. Also took this code
> out of line so it's genuinely once, not once-per-compilation-unit. Shout
> if it doesn't fix the problem or creates any new ones.
>
Great, it got fixed!!
Most of the tests are passing now, but running the federation tests the
old dynamic_cast monster related with the Sun compiler bug arises again.
The problem was due to AsynchConnector being implemented in terms of
DispatchHandle, so, I had to change that inheritance to be public also.
I would like to show you how I've handled this workaround, to see if
this is an acceptable way to do it:
class AsynchConnector :
//Bug http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6710638
//Should be private inheritance
#ifdef __SUNPRO_CC
public
#else
private
#endif
DispatchHandle {
public:
...
The same for AsynchIO. This way, we will only use that inheritance when
using the Sun Studio compiler, at least while that bug doesn't get
fixed. Perhaps we should vote for it to get fixed. ;-)
Also, I would like to know what is the state of the JIRA issues I
submitted for the solaris port (QPID-1132 and QPID-1133), I mean, did
you have time to look at them, are there any problem with them,...?
I have another bunch of changes, most related with inclusion
differences. Perhaps they could be added without wrapping them in a
#ifdef clause, but I'm not sure. My plan is to open a new JIRA for those
changes, and expose them to your consideration. What do you think? (I'm
talking about this kind of modifications) :
-bash-3.00$ svn diff ../../src/qpid/sys/posix/AsynchIO.cpp
Index: ../../src/qpid/sys/posix/AsynchIO.cpp
===================================================================
--- ../../src/qpid/sys/posix/AsynchIO.cpp (revision 671815)
+++ ../../src/qpid/sys/posix/AsynchIO.cpp (working copy)
@@ -19,6 +19,10 @@
*
*/
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
#include "qpid/sys/AsynchIO.h"
#include "qpid/sys/Time.h"
@@ -31,6 +35,11 @@
#include <signal.h>
#include <errno.h>
+#ifdef SUNOS
+//strerror, memmove prototypes
+#include <string.h>
+#endif
+
#include <boost/bind.hpp>
using namespace qpid::sys;
Do you agree with this way to handle it, or should we just include
string.h directly? Perhaps somebody should test all these changes on a
linux box, before deciding to add them with or without the #ifdef wrapping.
Best regards.
--
Manuel.
> Cheers,
> Alan.
>
>
>
Re: Problems with Mutex initialization
Posted by Alan Conway <ac...@redhat.com>.
On Wed, 2008-06-25 at 11:12 +0200, Manuel Teira wrote:
> Hello. After a pair of weeks dealing with random, uninteresting stuff,
> I'm back to the solaris port effort. I would like to wake up this issue
> again, since it's a stopper for the solaris testing I'm trying to do.
>
> After all this discussion, what seems to be true, is:
> 1.-The static initialization order is not the desired one.
> 2.-Inlining the Mutex.h annonymous code in every Mutex usage, produces
> multiple instances of onceControl and mutexattr, whereas the intended
> behaviour was to have single instances, to be shared across the code.
> 3.-About the PODMutex discussion, it is not needed here, since the
> pthread_once_t usage is giving us the desired "one-shot" behaviour.
>
> It was also said, that fixing #2 could lead to get #1 fixed. Anyway,
> what is the recommended technique to "force" static initialization if
> having #2 fixed is not helping?
>
Just committed a fix that removes the need for a global object with a
constructor so should fix any init order problems. Also took this code
out of line so it's genuinely once, not once-per-compilation-unit. Shout
if it doesn't fix the problem or creates any new ones.
Cheers,
Alan.
Re: Problems with Mutex initialization
Posted by Manuel Teira <mt...@tid.es>.
Hello. After a pair of weeks dealing with random, uninteresting stuff,
I'm back to the solaris port effort. I would like to wake up this issue
again, since it's a stopper for the solaris testing I'm trying to do.
After all this discussion, what seems to be true, is:
1.-The static initialization order is not the desired one.
2.-Inlining the Mutex.h annonymous code in every Mutex usage, produces
multiple instances of onceControl and mutexattr, whereas the intended
behaviour was to have single instances, to be shared across the code.
3.-About the PODMutex discussion, it is not needed here, since the
pthread_once_t usage is giving us the desired "one-shot" behaviour.
It was also said, that fixing #2 could lead to get #1 fixed. Anyway,
what is the recommended technique to "force" static initialization if
having #2 fixed is not helping?
Best regards.
--
Manuel.
Alan Conway escribió:
> Andrew Stitcher wrote:
>
>> On Thu, 2008-06-05 at 14:39 -0400, Alan Conway wrote:
>>
>>> Andrew Stitcher wrote:
>>>
>>>> On Thu, 2008-06-05 at 19:31 +0200, Manuel Teira wrote:
>>>>
>>>>> ...
>>>>> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
>>>>> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
>>>>> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
>>>>> at 0x0000000100400758
>>>>> Error initializing mutex: 0xffffffff7e14e680. Error 0
>>>>> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
>>>>>
>>>>> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
>>>>> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
>>>>>
>>>>>
>>>>> I suspect that the static nature of AllThreadStatuses could be involved.
>>>>> Any thought?
>>>>>
>>>> I'm pretty sure you are correct about the static initialisation order.
>>>>
>>>> One problem that your trace points out though is that we will get a
>>>> _different_ initialisation for each file including Mutex.h due to the
>>>> anonymous namespace used here - this wasn't the intention of this code.
>>>> I guess this needs to be fixed by creating a Mutex.cpp to hold this
>>>> anonymous namespace. However I'm not sure whether or not this is the
>>>> cause of your problem,although it can't help.
>>>>
>>>>
>>> I think I may see the problem. Is all AllThreadsStatuses called (directly or
>>> indirectly) from static constructors in other compilation units?
>>>
>>> An object that is called from static constructors in other compilation units has
>>> to be written in a rather special way - it has to be a POD and be POD
>>> initialized - i.e. no constructors or destructors and no members/bases with
>>> ctors/dtors.
>>>
>> It's the mutexattr static constructor that is the problem - that needs
>> to be written as a singleton and as I said before moved into an
>> implementation file.
>>
>>
>>> We have a PODMutex for exactly that situation
>>>
>> PODMutex is currently useless (and isn't used anywhere as far as I know)
>> because it can't create a recursive mutex which is what Mutex is.
>>
>>
> Its not useless, and it doesn't need to be recursive. It is useful precisely for
> protecting the critical section of a singleton which may run in a static
> construtor. It should be used only for static mutex memebers or global mutex
> variables - e.g. if we need to put mutexatter in a singleton, we can protect the
> singleton with a PODMutex.
>
>
>> A piece of work we still need to do is to create a RecursiveMutex and
>> revert Mutex to non recursive then only make mutexes recursive that have
>> to be - then PODMutex might be useful again.
>>
>>
> POD mutex is not recursive and never will be. It's intended only for the
> restricted situation of a static initialized mutex.
>
>
>>> Andrew do you know the circumstances under which this is called? Manuel - can
>>> you post a stack trace? That should answer the question.
>>>
>> I think that fixing Mutex so that its contructor can be safely called
>> from anywhere in static initialisation is the way to go here. I think
>> that the other issues are consequences of this lack. The problem is
>> exacerbated because there are multiple initialisations of the
>> mutexInitAttr one for every implementation file which uses Mutex so it
>> almost bound to fail in this way - it's amazing we've not seen it
>> before.
>>
>
> I agree - and PODMutex is the tool we can use to acomplish that. I can give this
> a shot but probably not today, I'll put it on my list.
>
> An even better solution would be to drop use of recursive mutexes, which are
> evil, but I don't know how big a job that is. Most of the code is written with
> the intent that it not require a recursive mutex but there may be some places
> that will deadlock.
>
>
> Cheers,
> Alan.
> .
>
>
Re: Problems with Mutex initialization
Posted by Andrew Stitcher <as...@redhat.com>.
On Fri, 2008-06-06 at 09:12 -0400, Alan Conway wrote:
> ...
> Its not useless, and it doesn't need to be recursive. It is useful precisely for
> protecting the critical section of a singleton which may run in a static
> construtor. It should be used only for static mutex memebers or global mutex
> variables - e.g. if we need to put mutexatter in a singleton, we can protect the
> singleton with a PODMutex.
Ah I understand its intent now - this isn't needed in this case because
we use pthread_once to achieve the same end ie.it gives exclusion and
ensures that the initialisation is only called once.
> I agree - and PODMutex is the tool we can use to acomplish that. I can give this
> a shot but probably not today, I'll put it on my list.
I don't think using any kind of mutex can help change the static
initialisation order! We don't need any further mutual exclusion of the
initialisation as above.
>
> An even better solution would be to drop use of recursive mutexes, which are
> evil, but I don't know how big a job that is. Most of the code is written with
> the intent that it not require a recursive mutex but there may be some places
> that will deadlock.
+1
As an interim measure only using recursive mutexes where necessary will
help to see what we have to rework.
Andrew
Re: Problems with Mutex initialization
Posted by Alan Conway <ac...@redhat.com>.
Andrew Stitcher wrote:
> On Thu, 2008-06-05 at 14:39 -0400, Alan Conway wrote:
>> Andrew Stitcher wrote:
>>> On Thu, 2008-06-05 at 19:31 +0200, Manuel Teira wrote:
>>>> ...
>>>> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
>>>> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
>>>> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
>>>> at 0x0000000100400758
>>>> Error initializing mutex: 0xffffffff7e14e680. Error 0
>>>> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
>>>>
>>>> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
>>>> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
>>>>
>>>>
>>>> I suspect that the static nature of AllThreadStatuses could be involved.
>>>> Any thought?
>>> I'm pretty sure you are correct about the static initialisation order.
>>>
>>> One problem that your trace points out though is that we will get a
>>> _different_ initialisation for each file including Mutex.h due to the
>>> anonymous namespace used here - this wasn't the intention of this code.
>>> I guess this needs to be fixed by creating a Mutex.cpp to hold this
>>> anonymous namespace. However I'm not sure whether or not this is the
>>> cause of your problem,although it can't help.
>>>
>> I think I may see the problem. Is all AllThreadsStatuses called (directly or
>> indirectly) from static constructors in other compilation units?
>>
>> An object that is called from static constructors in other compilation units has
>> to be written in a rather special way - it has to be a POD and be POD
>> initialized - i.e. no constructors or destructors and no members/bases with
>> ctors/dtors.
>
> It's the mutexattr static constructor that is the problem - that needs
> to be written as a singleton and as I said before moved into an
> implementation file.
>
>> We have a PODMutex for exactly that situation
>
> PODMutex is currently useless (and isn't used anywhere as far as I know)
> because it can't create a recursive mutex which is what Mutex is.
>
Its not useless, and it doesn't need to be recursive. It is useful precisely for
protecting the critical section of a singleton which may run in a static
construtor. It should be used only for static mutex memebers or global mutex
variables - e.g. if we need to put mutexatter in a singleton, we can protect the
singleton with a PODMutex.
> A piece of work we still need to do is to create a RecursiveMutex and
> revert Mutex to non recursive then only make mutexes recursive that have
> to be - then PODMutex might be useful again.
>
POD mutex is not recursive and never will be. It's intended only for the
restricted situation of a static initialized mutex.
>> Andrew do you know the circumstances under which this is called? Manuel - can
>> you post a stack trace? That should answer the question.
>
> I think that fixing Mutex so that its contructor can be safely called
> from anywhere in static initialisation is the way to go here. I think
> that the other issues are consequences of this lack. The problem is
> exacerbated because there are multiple initialisations of the
> mutexInitAttr one for every implementation file which uses Mutex so it
> almost bound to fail in this way - it's amazing we've not seen it
> before.
I agree - and PODMutex is the tool we can use to acomplish that. I can give this
a shot but probably not today, I'll put it on my list.
An even better solution would be to drop use of recursive mutexes, which are
evil, but I don't know how big a job that is. Most of the code is written with
the intent that it not require a recursive mutex but there may be some places
that will deadlock.
Cheers,
Alan.
Re: Problems with Mutex initialization
Posted by Manuel Teira <mt...@tid.es>.
Andrew Stitcher escribió:
> On Thu, 2008-06-05 at 14:39 -0400, Alan Conway wrote:
>
>> Andrew Stitcher wrote:
>>
>>> On Thu, 2008-06-05 at 19:31 +0200, Manuel Teira wrote:
>>>
>>>> ...
>>>> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
>>>> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
>>>> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
>>>> at 0x0000000100400758
>>>> Error initializing mutex: 0xffffffff7e14e680. Error 0
>>>> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
>>>>
>>>> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
>>>> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
>>>>
>>>>
>>>> I suspect that the static nature of AllThreadStatuses could be involved.
>>>> Any thought?
>>>>
>>> I'm pretty sure you are correct about the static initialisation order.
>>>
>>> One problem that your trace points out though is that we will get a
>>> _different_ initialisation for each file including Mutex.h due to the
>>> anonymous namespace used here - this wasn't the intention of this code.
>>> I guess this needs to be fixed by creating a Mutex.cpp to hold this
>>> anonymous namespace. However I'm not sure whether or not this is the
>>> cause of your problem,although it can't help.
>>>
>>>
>> I think I may see the problem. Is all AllThreadsStatuses called (directly or
>> indirectly) from static constructors in other compilation units?
>>
>> An object that is called from static constructors in other compilation units has
>> to be written in a rather special way - it has to be a POD and be POD
>> initialized - i.e. no constructors or destructors and no members/bases with
>> ctors/dtors.
>>
>
> It's the mutexattr static constructor that is the problem - that needs
> to be written as a singleton and as I said before moved into an
> implementation file.
>
>
>> We have a PODMutex for exactly that situation
>>
>
> PODMutex is currently useless (and isn't used anywhere as far as I know)
> because it can't create a recursive mutex which is what Mutex is.
>
> A piece of work we still need to do is to create a RecursiveMutex and
> revert Mutex to non recursive then only make mutexes recursive that have
> to be - then PODMutex might be useful again.
>
>
>> Andrew do you know the circumstances under which this is called? Manuel - can
>> you post a stack trace? That should answer the question.
>>
>
> I think that fixing Mutex so that its contructor can be safely called
> from anywhere in static initialisation is the way to go here. I think
> that the other issues are consequences of this lack. The problem is
> exacerbated because there are multiple initialisations of the
> mutexInitAttr one for every implementation file which uses Mutex so it
> almost bound to fail in this way - it's amazing we've not seen it
> before.
>
I think it was for the same reason that Mutex and RWLock were able to
share the pthread_once_t control without error in Linux. Non initialized
pthread_mutexattr_t are probably valid for the pthread_mutex_init call
in the Linux version (perhaps a zero inizialized struct is enough, and
the compiler is supplying that), but I think that the locks must be
failing randomly at least in the initial stage of the process.
Best regards.
--
Manuel.
> Andrew
>
>
>
>> Cheers,
>> Alan.
>>
>
> .
>
>
Re: Problems with Mutex initialization
Posted by Andrew Stitcher <as...@redhat.com>.
On Thu, 2008-06-05 at 14:39 -0400, Alan Conway wrote:
> Andrew Stitcher wrote:
> > On Thu, 2008-06-05 at 19:31 +0200, Manuel Teira wrote:
> >> ...
> >> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
> >> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
> >> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
> >> at 0x0000000100400758
> >> Error initializing mutex: 0xffffffff7e14e680. Error 0
> >> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
> >>
> >> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
> >> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
> >>
> >>
> >> I suspect that the static nature of AllThreadStatuses could be involved.
> >> Any thought?
> >
> > I'm pretty sure you are correct about the static initialisation order.
> >
> > One problem that your trace points out though is that we will get a
> > _different_ initialisation for each file including Mutex.h due to the
> > anonymous namespace used here - this wasn't the intention of this code.
> > I guess this needs to be fixed by creating a Mutex.cpp to hold this
> > anonymous namespace. However I'm not sure whether or not this is the
> > cause of your problem,although it can't help.
> >
>
> I think I may see the problem. Is all AllThreadsStatuses called (directly or
> indirectly) from static constructors in other compilation units?
>
> An object that is called from static constructors in other compilation units has
> to be written in a rather special way - it has to be a POD and be POD
> initialized - i.e. no constructors or destructors and no members/bases with
> ctors/dtors.
It's the mutexattr static constructor that is the problem - that needs
to be written as a singleton and as I said before moved into an
implementation file.
>
> We have a PODMutex for exactly that situation
PODMutex is currently useless (and isn't used anywhere as far as I know)
because it can't create a recursive mutex which is what Mutex is.
A piece of work we still need to do is to create a RecursiveMutex and
revert Mutex to non recursive then only make mutexes recursive that have
to be - then PODMutex might be useful again.
>
> Andrew do you know the circumstances under which this is called? Manuel - can
> you post a stack trace? That should answer the question.
I think that fixing Mutex so that its contructor can be safely called
from anywhere in static initialisation is the way to go here. I think
that the other issues are consequences of this lack. The problem is
exacerbated because there are multiple initialisations of the
mutexInitAttr one for every implementation file which uses Mutex so it
almost bound to fail in this way - it's amazing we've not seen it
before.
Andrew
>
> Cheers,
> Alan.
Re: Problems with Mutex initialization
Posted by Alan Conway <ac...@redhat.com>.
Andrew Stitcher wrote:
> On Thu, 2008-06-05 at 19:31 +0200, Manuel Teira wrote:
>> ...
>> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
>> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
>> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
>> at 0x0000000100400758
>> Error initializing mutex: 0xffffffff7e14e680. Error 0
>> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
>>
>> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
>> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
>>
>>
>> I suspect that the static nature of AllThreadStatuses could be involved.
>> Any thought?
>
> I'm pretty sure you are correct about the static initialisation order.
>
> One problem that your trace points out though is that we will get a
> _different_ initialisation for each file including Mutex.h due to the
> anonymous namespace used here - this wasn't the intention of this code.
> I guess this needs to be fixed by creating a Mutex.cpp to hold this
> anonymous namespace. However I'm not sure whether or not this is the
> cause of your problem,although it can't help.
>
I think I may see the problem. Is all AllThreadsStatuses called (directly or
indirectly) from static constructors in other compilation units?
An object that is called from static constructors in other compilation units has
to be written in a rather special way - it has to be a POD and be POD
initialized - i.e. no constructors or destructors and no members/bases with
ctors/dtors.
We have a PODMutex for exactly that situation but if AllThreadStatuses is indeed
being called from other compilation units during static init then it needs a
redesign - both the Mutex and std::vector members have constructors. It may need
to be rewritten as a singleton.
Andrew do you know the circumstances under which this is called? Manuel - can
you post a stack trace? That should answer the question.
Cheers,
Alan.
Re: Problems with Mutex initialization
Posted by Andrew Stitcher <as...@redhat.com>.
On Thu, 2008-06-05 at 19:31 +0200, Manuel Teira wrote:
> ...
> initMutexattr[1](once: 0xffffffff7e14e300, mutexattr: 0xffffffff7e1535f8)
> initMutexattr[1](once: 0xffffffff7e14e620, mutexattr: 0xffffffff7e1535d8)
> Mutex::Mutex with mutex at 0xffffffff7e14e680 and pthread_mutex_attr_t
> at 0x0000000100400758
> Error initializing mutex: 0xffffffff7e14e680. Error 0
> AllThreadsStatuses ctor. Mutex: 0xffffffff7e14e680
>
> [snip lots of other initMutexattr and Mutex::Mutex nonrelated calls]
> initMutexattr[1](once: 0x00000001003f7730, mutexattr: 0x0000000100400758)
>
>
> I suspect that the static nature of AllThreadStatuses could be involved.
> Any thought?
I'm pretty sure you are correct about the static initialisation order.
One problem that your trace points out though is that we will get a
_different_ initialisation for each file including Mutex.h due to the
anonymous namespace used here - this wasn't the intention of this code.
I guess this needs to be fixed by creating a Mutex.cpp to hold this
anonymous namespace. However I'm not sure whether or not this is the
cause of your problem,although it can't help.
Andrew