You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@stdcxx.apache.org by Martin Sebor <ms...@gmail.com> on 2009/08/14 22:24:40 UTC

Soalris 10 10/2008 SPARC changes (was: Re: 4.2.2 release)

Thanks for the heads up and the patches! I'll review apply them
if possible/necessary before the release.

I took a quick glance at a few of the diff files and I wonder if
you could help me better understand some of the changes in case
they could be applied unconditionally.

For instance, in ctype.cpp.38.diff, it seems as though the second
hunk would be a good change to make regardless (when long long is
available). Ditto for the third hunk in locale_body.37.diff (minus
the pragmas, of course), and similarly in locale_classic.40.diff
and messages.cpp.41.diff.

The approach I'm thinking of using is the one you applied in
use_facet.h, i.e., defining, say, _RWSTD_ALIGN_MAX_T to unsigned
long long, and using it in all the aligned buffers, along with
unsigned char for the data.

Other than these, can you also help me understand the changes in
messages.cpp.41.diff (starting with the third hunk on line 92)?

Thanks again,
Martin

Stefan Teleman wrote:
> On Fri, Aug 14, 2009 at 15:18, Martin Sebor<ms...@gmail.com> wrote:
>> I think we should think about cutting a 4.2.2 release sometime this
>> month. It's been embarrassingly long since 4.2.1. Farid (or anyone
>> else), do you have anything that you'd like included in it?
>>
>> Martin
> 
> Hi.
> 
> Solaris 10 10/2008 SPARC has introduced a binary incompatible change
> in the POSIX and Solaris threads implementation:
> 
> http://docs.sun.com/app/docs/doc/820-5245/chapter2-1000?a=view
> 
> <QUOTE>
> 
> Objects of type mutex_t and pthread_mutex_t must start at 8-byte
> aligned addresses. Applications that do not satisfy this requirement
> fail. The following error message is displayed:
> 
> *** _THREAD_ERROR_DETECTION: lock usage error detected ***
> ...
> "mutex is misaligned"
> OR:
> "condvar is misaligned"
> 
> </QUOTE>
> 
> In reality, the run-time performance is much worse than the errata
> above claims: misaligned mutexes or conditional variables cause the
> program to spuriously SEGV in sometimes hard to reproduce ways [
> Heisenbug ].
> 
> You can view full details of this bug/change here:
> 
> http://bugs.opensolaris.org/view_bug.do?bug_id=6729759
> 
> To make a long story short, Solaris Kernel Update 137111-01 introduced
> an ABI incompatible implementation restriction, requiring that mutexes
> and conditional variables must be 8-byte aligned. This restriction has
> never been documented, nor has it ever been enforced, until Solaris 10
> 10/2008 [ Solaris Kernel Update 137111-01 ].
> 
> The consequence of this KU is that, the multi-threaded 32-bit SPARC
> version of the Apache Standard C++ Library [ 4.2.1 ] will no longer
> work, and will fail at run-time with seemingly unexplainable crashes [
> the exact same build will work on Solaris versions prior to Kernel
> Update 137111-01 ].
> 
> This problem is not specific to the Apache Standard C++ Library: it
> will occur with any 32-bit SPARCV8 binaries which do not align mutexes
> or conditional variables on an 8 byte boundary.
> 
> I have created a set of patches for the Apache Standard C++ Library,
> Version 4.2.1, for this problem:
> 
> http://s247136804.onlinehome.us/stdcxx-upstream/4.2.1/
> 
> You can download the tarball with all the patches from the same URL:
> 
> http://s247136804.onlinehome.us/stdcxx-upstream/4.2.1/stdcxx-upstream-patches.tar.bz2
> 
> These patches force an 8-byte alignment for all objects which contain
> a mutex or a conditional variable, and that only for SPARC. With these
> patches, all the tests perform as expected.
> 
> The patch 22.locale.numpunct.cpp.43.diff is not related to the SPARCV8
> ABI change -- it is simply an avoidance of a SEGV in case the variable
> first_non_c == NULL [ Solaris sprintf(3C) SEGV's on NULL char*
> arguments ].
> 
> --Stefan
> 


Re: Soalris 10 10/2008 SPARC changes

Posted by Martin Sebor <ms...@gmail.com>.
Stefan Teleman wrote:
> On Fri, Aug 14, 2009 at 16:24, Martin Sebor<ms...@gmail.com> wrote:
>> Thanks for the heads up and the patches! I'll review apply them
>> if possible/necessary before the release.
>>
>> I took a quick glance at a few of the diff files and I wonder if
>> you could help me better understand some of the changes in case
>> they could be applied unconditionally.
>>
>> For instance, in ctype.cpp.38.diff, it seems as though the second
>> hunk would be a good change to make regardless (when long long is
>> available). Ditto for the third hunk in locale_body.37.diff (minus
>> the pragmas, of course), and similarly in locale_classic.40.diff
>> and messages.cpp.41.diff.
>>
>> The approach I'm thinking of using is the one you applied in
>> use_facet.h, i.e., defining, say, _RWSTD_ALIGN_MAX_T to unsigned
>> long long, and using it in all the aligned buffers, along with
>> unsigned char for the data.
>>
>> Other than these, can you also help me understand the changes in
>> messages.cpp.41.diff (starting with the third hunk on line 92)?
> 
> That's my bad mistake -- i created a diff for src/messages.cpp from an
> older version of messages.cpp (hacked while i was trying to figure out
> why it was misbehaving so badly on 32-bit SPARCv8).
> 
> I updated the tarball and the messages.cpp.41.diff itself with a valid
> patch now. Sorry for the noise.

No problem. I'll take a look.

> 
> About
> 
> int ret = catclose (pcat_data->catd);
> 
> // ...
> 
> cat_closed = ret == 0;
> 
> My understanding of 22.2.7.1.2, p5 (catalog must be valid) is that  --
> if catclose(3C) fails, it means the catalog was not valid -- therefore
> the C++ library must throw. So, cat_closed is now true if and only if
> catclose(3C) succeeded.  Or is this too restrictive ?

You're right that we should check the value returned from catclose().
I think the facet guarantees that the catalog descriptor is valid so
I think we're safe in that regard. The case where I think there is
a problem is when catclose() fails because of a signal (with EINTR).
That would make __rw_catclose() and the facet dtor leak the catd.
Let me see about fixing that. Thanks for pointing it out!


> 
> About the other patches -- and whether or not they should be applied
> for other platforms besides Solaris/SPARC: i tried not to disturb any
> other ISA's/compiler combinations. If you'd like i can re-write the
> patches for 4.2.2 using the more generic approach you described.

Understood.

Thanks again!
Martin

> 
> --Stefan
> 


Re: Soalris 10 10/2008 SPARC changes (was: Re: 4.2.2 release)

Posted by Stefan Teleman <st...@gmail.com>.
On Fri, Aug 14, 2009 at 16:24, Martin Sebor<ms...@gmail.com> wrote:
> Thanks for the heads up and the patches! I'll review apply them
> if possible/necessary before the release.
>
> I took a quick glance at a few of the diff files and I wonder if
> you could help me better understand some of the changes in case
> they could be applied unconditionally.
>
> For instance, in ctype.cpp.38.diff, it seems as though the second
> hunk would be a good change to make regardless (when long long is
> available). Ditto for the third hunk in locale_body.37.diff (minus
> the pragmas, of course), and similarly in locale_classic.40.diff
> and messages.cpp.41.diff.
>
> The approach I'm thinking of using is the one you applied in
> use_facet.h, i.e., defining, say, _RWSTD_ALIGN_MAX_T to unsigned
> long long, and using it in all the aligned buffers, along with
> unsigned char for the data.
>
> Other than these, can you also help me understand the changes in
> messages.cpp.41.diff (starting with the third hunk on line 92)?

That's my bad mistake -- i created a diff for src/messages.cpp from an
older version of messages.cpp (hacked while i was trying to figure out
why it was misbehaving so badly on 32-bit SPARCv8).

I updated the tarball and the messages.cpp.41.diff itself with a valid
patch now. Sorry for the noise.

About

int ret = catclose (pcat_data->catd);

// ...

cat_closed = ret == 0;

My understanding of 22.2.7.1.2, p5 (catalog must be valid) is that  --
if catclose(3C) fails, it means the catalog was not valid -- therefore
the C++ library must throw. So, cat_closed is now true if and only if
catclose(3C) succeeded.  Or is this too restrictive ?

About the other patches -- and whether or not they should be applied
for other platforms besides Solaris/SPARC: i tried not to disturb any
other ISA's/compiler combinations. If you'd like i can re-write the
patches for 4.2.2 using the more generic approach you described.

--Stefan

-- 
Stefan Teleman
stefan.teleman@gmail.com