You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@stdcxx.apache.org by Liviu Nicoara <ni...@hates.ms> on 2012/09/15 19:02:41 UTC

STDCXX-1066 [was: Re: STDCXX forks]

On 9/1/12 1:52 PM, Stefan Teleman wrote:> On Sat, Sep 1, 2012 at 12:15 PM, >
 > I opened yesterday STDCXX-1066:
 >
 > https://issues.apache.org/jira/browse/STDCXX-1056
 >
 > about the pthread_mutex_t/pthread_cond_t alignment on SPARCV8. I'll
 > have patches done this weekend. Achtung: the patchset is very large
 > and touches a very large number of files. It's strange that I didn't
 > get an email about STDCXX-1066.
 >

Hi Stefan,

I have read through the patches attached to the incident, then I briefly read 
about the SunPro pragma align and pack. Two questions:

1. AFAICT, the use of the packing pragma may interfere with a user's setting of 
the same value. I.e., a user sets the packing in their sources and then, 
directly or not, includes an STDCXX header. It seems to me that in such a 
situation, our setting of the packing value would interfere with the rest of the 
user's translation unit, since there is no way to `restore' the previous packing 
value.

Something along the lines of:

// user source file

#pragma pack (X) // X != 8

#include <iostream>

struct UserDef
{
     // different alignment than X ?
     // ...
};

Is my understanding correct?

2. The patches are against 4.2.1, but the change would be binary incompatible 
with the already released 4.2.1 branch. Do you plan to have this fix in 4.3.x?

Thanks.

Liviu

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 9/23/12 3:48 PM, Stefan Teleman wrote:
> On Sun, Sep 23, 2012 at 3:26 PM, Liviu Nicoara <ni...@hates.ms> wrote:
>
>> To be honest it's quite bizarre that you cannot share that with us. Is it
>> really a trade secret? How can that be the case if Oracle customers are also
>> required to perform the same alignment, perhaps using the same techniques
>> like you showed in the patch?
>
> That's the problem. I don't know what is and what is not a trade
> secret, or copyrighted information, or "unauthorized intellectual
> property disclosure" anymore.  I think it's in the eye of the
> beholder. At Sun it was very clear.
>
> Believe it or not, I had to get written approval from the Legal
> Counsel's Office in order to be able to share these patches. And that
> in spite of the fact that these patches are published, and have
> already been published for *years*.

That clarifies things. Thanks.

Liviu


Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sun, Sep 23, 2012 at 3:26 PM, Liviu Nicoara <ni...@hates.ms> wrote:

> To be honest it's quite bizarre that you cannot share that with us. Is it
> really a trade secret? How can that be the case if Oracle customers are also
> required to perform the same alignment, perhaps using the same techniques
> like you showed in the patch?

That's the problem. I don't know what is and what is not a trade
secret, or copyrighted information, or "unauthorized intellectual
property disclosure" anymore.  I think it's in the eye of the
beholder. At Sun it was very clear.

Believe it or not, I had to get written approval from the Legal
Counsel's Office in order to be able to share these patches. And that
in spite of the fact that these patches are published, and have
already been published for *years*.

IANAL and I don't want to play one on TV. ;-)

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 9/23/12 2:02 PM, Stefan Teleman wrote:
> On Sun, Sep 23, 2012 at 10:35 AM, Liviu Nicoara <ni...@hates.ms> wrote:
>
>
>> 2. The issue only exists in MT builds, should there be a guard in configs?
>
> Yes, good point. The reason they aren't there is because we don't
> actually provide a non-MT stdcxx at all in Solaris. I'll fix this.

I remember you mentioned it.

>
>> 4. I see rw/_mutex.h has alignment pragmas for both __rw_mutex_base class
>> and its mutex member; same for __rw_static_mutex and its static member, etc.
>> How does that work?
>
> It works. ;-) And it actually acts as a space saver. Wink-wink.

Nudge-nudge.

To be honest it's quite bizarre that you cannot share that with us. Is it really 
a trade secret? How can that be the case if Oracle customers are also required 
to perform the same alignment, perhaps using the same techniques like you showed 
in the patch?

>
> But I don't think the _C_mutex member is static. In rw/_mutex.h,
> _RWSTD_MUTEX_T is #defined as:
>
> #include <pthread.h>
> // [ ... snip ... ]
> #  define _RWSTD_MUTEX_T                pthread_mutex_t
>
> (for the definition Solaris cares about, which is POSIX).
>
> So, in
>
> class _RWSTD_EXPORT __rw_mutex_base
> {
> public:
>
> // [ ... snip ... ]
>
>      _RWSTD_MUTEX_T _C_mutex;
> };
>
> it looks like it's not declared static.

I meant the static member of __rw_static_mutex.

>
>> 5. Why is __rw_guard aligned explicitly? I see it only contains a pointer to
>> a mutex object.
>
> So that the __rw_mutex_base pointer ends up 8-byte aligned. There's a
> lot of juju going on here.
>
>> 6. The docs mention that the pragma must use the mangled variables names but I don't see that in the patch.
>
> Yes, a few things are a bit different. ;-) I wish I didn't have to be
> as vague and secretive about these things as I have to be.

Well, that's a pity. Thanks anyway.

Liviu

RE: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Travis Vitek <Tr...@roguewave.com>.
Comments below...

> -----Original Message-----
> From: Stefan Teleman
> Sent: Monday, September 24, 2012 5:46 AM
> Subject: Re: STDCXX-1066 [was: Re: STDCXX forks]
> 
> On Mon, Sep 24, 2012 at 8:21 AM, Liviu Nicoara <ni...@hates.ms>
> wrote:
> 
> > In the light of your inability to answer the simplest questions about
> > the correctness and usefulness of this patch, I propose we strike the
> > patch in its entirety.
> 
> Let me make something very clear to you: what I am doing here is a
> courtesy to the stdcxx project.

Thank you for your contribution. There is a good chance that without your activity on this project over the last few months that it would be in the attic.

> There is no requirement in my job
> description to waste hours arguing with you in pointless squabbles
> over email. Nor is there a requirement in the APL V2.0 which would
> somehow compel us to redistribute our source code changes.

The problem here is that code changes are expected to pass review. They must follow established conventions, solve the problem in a reasonable and portable way, provide a test case (where one doesn't already exist), as well as make sense to everyone who is working with the product.

Several of the changes included don't make sense to the rest of us. Unless we manage to figure them out on our own or you manage to explain them to us, then the changes should probably be excluded from the code.

For example, the changes to src/exception.cpp don't make sense to me. Here is the relevant diff...

  +#if defined(_RWSTD_STRICT_SPARCV8_MUTEX_ALIGNMENT)
  +#  pragma pack(8)
  +#  pragma align 8(__rw_aligned_buffer)
  +#endif
  +
       // facet must never be destroyed
       static __rw_aligned_buffer<_Msgs> msgs;

It seems that we trying to give 8-byte alignment to a class of type __rw_aligned_buffer. The point of __rw_aligned_buffer is to give us a block of properly aligned data, and if it isn't aligned, then it is broken. So, why are we using a pragma for alignment here (and potentially in other places) when it seems that there is a bug in __rw_aligned_buffer that we should be addressing? Of course, if this is a problem with aligned buffer, we need a testcase.

Another example are the changes to src/ctype.cpp

  +#if defined(_RWSTD_STRICT_SPARCV8_MUTEX_ALIGNMENT)
  +#  pragma pack(8)
  +#  pragma align 8(f)
  +#  pragma align 8(pf)
  +#endif
           static union {
  +#if defined(_RWSTD_STRICT_SPARCV8_MUTEX_ALIGNMENT)
  +            unsigned long long align_;
  +            unsigned char data_ [sizeof (_STD::ctype<char>)];
  +#else
               void *align_;
               char  data_ [sizeof (_STD::ctype<char>)];
  +#endif
           } f;
           static __rw_facet* const pf =
               new (&f) _STD::ctype<char>(__rw_classic_tab, false, ref);
           pfacet = pf;
  +#if defined(_RWSTD_STRICT_SPARCV8_MUTEX_ALIGNMENT)
  +#  pragma pack(0)
  +#endif

This change seems to give alignment to `f' in two different ways, and it seems to be applying alignment to the pointer (the issue Liviu has brought up). It seems like the simpler fix would be to unconditionally add union members to `f' so that we get the proper alignment (much like we try to do in __rw_aligned_buffer), or an even better solution would be to use __rw_aligned_buffer.

> 
> > We could and should re-work the instances in the library where
> > we might use mutex and condition objects that are misaligned wrt the
> > mentioned kernel update.
> 
> Yeah, why don't you go ahead and do that. Just like you fixed stdcxx-
> 1056.
> 

There is no need for comments like this.

Stefan, if you are feeling singled out and attacked because your patches aren't accepted without question, it might be a good idea to go back and look through the mailing list archives. We have _all_ been in this position. Nearly every time I posted a patch for the first year working on stdcxx was like this, and I've been doing C++ software for 15 years.

The company where I work currently has a review process that is just as rigorous as what you're seeing with stdcxx now. The process isn't there to attack anyone personally, but to ensure the quality and maintainability of the library for years to come.

Travis



Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Mon, Sep 24, 2012 at 8:21 AM, Liviu Nicoara <ni...@hates.ms> wrote:

> In the light of your inability to answer the simplest questions about the
> correctness and usefulness of this patch, I propose we strike the patch in
> its entirety.

Let me make something very clear to you: what I am doing here is a
courtesy to the stdcxx project. There is no requirement in my job
description to waste hours arguing with you in pointless squabbles
over email. Nor is there a requirement in the APL V2.0 which would
somehow compel us to redistribute our source code changes.

 > We could and should re-work the instances in the library where
> we might use mutex and condition objects that are misaligned wrt the
> mentioned kernel update.

Yeah, why don't you go ahead and do that. Just like you fixed stdcxx-1056.

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 09/24/12 00:09, Stefan Teleman wrote:
> On Sun, Sep 23, 2012 at 7:25 PM, Liviu Nicoara <ni...@hates.ms> wrote:
>
>> I am not asking for any other implementation and I am not looking to change
>> anything. I wish you could explain it to us, but in the absence of trade
>> secret details I will take an explanation for the questions above.
>
> Sorry, no.
>
> This will not be another replay of the stdcxx-1056 email discussion,
> which was a three week waste of my time.
>
> The patch is provided "AS IS". No further testing and no further
> comments. I do have more important things to do.

In the light of your inability to answer the simplest questions about the correctness and usefulness of this patch, I propose we strike the patch in its entirety. We could and should re-work the instances in the library where we might use mutex and condition objects that are misaligned wrt the mentioned kernel update.

Liviu


Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sun, Sep 23, 2012 at 7:25 PM, Liviu Nicoara <ni...@hates.ms> wrote:

> I am not asking for any other implementation and I am not looking to change
> anything. I wish you could explain it to us, but in the absence of trade
> secret details I will take an explanation for the questions above.

Sorry, no.

This will not be another replay of the stdcxx-1056 email discussion,
which was a three week waste of my time.

The patch is provided "AS IS". No further testing and no further
comments. I do have more important things to do.

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 9/23/12 6:47 PM, Stefan Teleman wrote:
> On Sun, Sep 23, 2012 at 6:19 PM, Liviu Nicoara <ni...@hates.ms> wrote:
>> On 9/23/12 5:23 PM, Stefan Teleman wrote:
>>>
>>> On Sun, Sep 23, 2012 at 4:58 PM, Liviu Nicoara <ni...@hates.ms> wrote:
>>>
>>>> Stefan, I stumbled upon this: http://tinyurl.com/ceet6ec and this:
>>>> http://tinyurl.com/c4h9mgl
>>>
>>>
>>> The first URL is Fujitsu. It doesn't mention anything about the
>>> side-effects of KU-137111. It's just a description on how to apply
>>> kernel patches.
>>
>>
>> Hold on. If you go in the bottom half of that page you will see:
>>
>> *2) The following shows an example of programming that causes the
>>            above problem.
>>
>>          <In the case where the problem occurs>
>>            ----------------------------------------------------------------
>>            int       *ip;
>>            mutex_t   *mp;
>>            ip = (int *) malloc(sizeof (int) + sizeof (mutex_t));
>>            mp = (mutex_t *) (ip + 1);
>>                               /* The address is used with a modification */
>>            ----------------------------------------------------------------
>>
>>          <In the case where the problem does not occur -1>
>>            ----------------------------------------------------------------
>>            mutex_t   mp;                          /* Obtained statically */
>>            ----------------------------------------------------------------
>>
>>          <In the case where the problem does not occur -2>
>>            ----------------------------------------------------------------
>>            mutex_t   *mp;
>>            mp = (mutex_t *) malloc(sizeof (mutex_t));
>>                         /* The address is used without any modifications */
>>            ----------------------------------------------------------------
>
> First of all, this is C. And it's using the mutex_t Solaris type directly.
>
> We're in C++, where the pthread_mutex_t type is a data member of a
> class.

You're saying that the alignment of objects of type S:

struct S {
     pthread_mutex_t mutex_;
};

can be less strict than the alignment of the pthread_mutex_t member? Say, for 
this example:

int main () {
     S s;
     printf ("%p\n", &s);
}

I could see an address that is not a multiple of 8? What does it print on your 
SPARC V8 machine?

> In some cases it's a pointer (such as __rw_guard).

The alignment of a pthread_mutex_t* object has no bearing on the alignment of 
the pthread_mutex_t object it points to. Why would you align the pointer to an 
object to the alignment requirements for the actual object?

> In some other cases it's part of a derived type.

Could you please give us a small example where the alignment of a mutex member 
in a sub-object of a derived-type object is not a multiple of 8?

>
>> I am not saying I don't believe you. But you have to give us something more
>> than "trust me, I know what I'm doing". You have to admit that the patch
>> looks funny, e.g., in that it does not follow the published documentation
>> for Solaris Studio 12.3.
>
> It doesn't look funny to me. It looks like it's working.
>
> Exactly what is it that I should give you, other than a tested,
> working, already in production patch? A different implementation which
> doesn't "look funny"?

I am not asking for any other implementation and I am not looking to change 
anything. I wish you could explain it to us, but in the absence of trade secret 
details I will take an explanation for the questions above.

Thanks!

Liviu


Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sun, Sep 23, 2012 at 6:19 PM, Liviu Nicoara <ni...@hates.ms> wrote:
> On 9/23/12 5:23 PM, Stefan Teleman wrote:
>>
>> On Sun, Sep 23, 2012 at 4:58 PM, Liviu Nicoara <ni...@hates.ms> wrote:
>>
>>> Stefan, I stumbled upon this: http://tinyurl.com/ceet6ec and this:
>>> http://tinyurl.com/c4h9mgl
>>
>>
>> The first URL is Fujitsu. It doesn't mention anything about the
>> side-effects of KU-137111. It's just a description on how to apply
>> kernel patches.
>
>
> Hold on. If you go in the bottom half of that page you will see:
>
> *2) The following shows an example of programming that causes the
>           above problem.
>
>         <In the case where the problem occurs>
>           ----------------------------------------------------------------
>           int       *ip;
>           mutex_t   *mp;
>           ip = (int *) malloc(sizeof (int) + sizeof (mutex_t));
>           mp = (mutex_t *) (ip + 1);
>                              /* The address is used with a modification */
>           ----------------------------------------------------------------
>
>         <In the case where the problem does not occur -1>
>           ----------------------------------------------------------------
>           mutex_t   mp;                          /* Obtained statically */
>           ----------------------------------------------------------------
>
>         <In the case where the problem does not occur -2>
>           ----------------------------------------------------------------
>           mutex_t   *mp;
>           mp = (mutex_t *) malloc(sizeof (mutex_t));
>                        /* The address is used without any modifications */
>           ----------------------------------------------------------------

First of all, this is C. And it's using the mutex_t Solaris type directly.

We're in C++, where the pthread_mutex_t type is a data member of a
class. In some cases it's a pointer (such as __rw_guard). In some
other cases it's part of a derived type.

> I am not saying I don't believe you. But you have to give us something more
> than "trust me, I know what I'm doing". You have to admit that the patch
> looks funny, e.g., in that it does not follow the published documentation
> for Solaris Studio 12.3.

It doesn't look funny to me. It looks like it's working.

Exactly what is it that I should give you, other than a tested,
working, already in production patch? A different implementation which
doesn't "look funny"?

Even if such an implementation existed, we're not going to re-test it.
This has been tested once, and validated.

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 9/23/12 5:23 PM, Stefan Teleman wrote:
> On Sun, Sep 23, 2012 at 4:58 PM, Liviu Nicoara <ni...@hates.ms> wrote:
>
>> Stefan, I stumbled upon this: http://tinyurl.com/ceet6ec and this:
>> http://tinyurl.com/c4h9mgl
>
> The first URL is Fujitsu. It doesn't mention anything about the
> side-effects of KU-137111. It's just a description on how to apply
> kernel patches.

Hold on. If you go in the bottom half of that page you will see:

*2) The following shows an example of programming that causes the
           above problem.

         <In the case where the problem occurs>
           ----------------------------------------------------------------
           int       *ip;
           mutex_t   *mp;
           ip = (int *) malloc(sizeof (int) + sizeof (mutex_t));
           mp = (mutex_t *) (ip + 1);
                              /* The address is used with a modification */
           ----------------------------------------------------------------

         <In the case where the problem does not occur -1>
           ----------------------------------------------------------------
           mutex_t   mp;                          /* Obtained statically */
           ----------------------------------------------------------------

         <In the case where the problem does not occur -2>
           ----------------------------------------------------------------
           mutex_t   *mp;
           mp = (mutex_t *) malloc(sizeof (mutex_t));
                        /* The address is used without any modifications */
           ----------------------------------------------------------------

>
> What I can share with you is:
>
> 1. The SPARC alignment patches as I have submitted them here are
> present in the official production stdcxx packages released with
> Solaris 10 and Solaris 11.
>
> 2. These patches did not make into the Solaris source code tree and a
> Solaris product without extensive internal code review. Not to mention
> testing.

I am not saying I don't believe you. But you have to give us something more than 
"trust me, I know what I'm doing". You have to admit that the patch looks funny, 
e.g., in that it does not follow the published documentation for Solaris Studio 
12.3.

Thanks,
Liviu

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 9/23/12 5:50 PM, Stefan Teleman wrote:
> On Sun, Sep 23, 2012 at 5:23 PM, Stefan Teleman
> <st...@gmail.com> wrote:
>
>> The second URL says this:
>>
>> <QUOTE>
>> Due to a change in the implementation of the userland mutexes
>> introduced by CR 6296770 in KU 137111-01, objects of type mutex_t and
>> pthread_mutex_t must start at 8-byte aligned addresses. If this
>> requirement is not satisfied, all non-compliant applications on
>> Solaris/SPARC may fail with the signal SEGV with a callstack similar
>> to the following one or with similar callstacks containing the
>> function mutex_trylock_process.
>>
>>    \*_atomic_cas_64(0x141f2c, 0x0, 0xff000000, 0x1651, 0xff000000, 0x466d90)
>>    set_lock_byte64(0x0, 0x1651, 0xff000000, 0x0, 0xfec82a00, 0x0)
>>    fast_process_lock(0x141f24, 0x0, 0x1, 0x1, 0x0, 0xfeae5780)
>>
>> </QUOTE>
>
> Here's a link to an official datatype alignment table for SPARCV8:
>
> http://docs.oracle.com/cd/E19205-01/819-5267/bkbkl/index.html
>
> The interesting table is:
>
> Table B–2 Storage Sizes and Default Alignments in Bytes

I see nothing really outstanding here. What is it that I should pay attention to?

Thanks,
Liviu

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sun, Sep 23, 2012 at 5:23 PM, Stefan Teleman
<st...@gmail.com> wrote:

> The second URL says this:
>
> <QUOTE>
> Due to a change in the implementation of the userland mutexes
> introduced by CR 6296770 in KU 137111-01, objects of type mutex_t and
> pthread_mutex_t must start at 8-byte aligned addresses. If this
> requirement is not satisfied, all non-compliant applications on
> Solaris/SPARC may fail with the signal SEGV with a callstack similar
> to the following one or with similar callstacks containing the
> function mutex_trylock_process.
>
>   \*_atomic_cas_64(0x141f2c, 0x0, 0xff000000, 0x1651, 0xff000000, 0x466d90)
>   set_lock_byte64(0x0, 0x1651, 0xff000000, 0x0, 0xfec82a00, 0x0)
>   fast_process_lock(0x141f24, 0x0, 0x1, 0x1, 0x0, 0xfeae5780)
>
> </QUOTE>

Here's a link to an official datatype alignment table for SPARCV8:

http://docs.oracle.com/cd/E19205-01/819-5267/bkbkl/index.html

The interesting table is:

Table B–2 Storage Sizes and Default Alignments in Bytes

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sun, Sep 23, 2012 at 4:58 PM, Liviu Nicoara <ni...@hates.ms> wrote:

> Stefan, I stumbled upon this: http://tinyurl.com/ceet6ec and this:
> http://tinyurl.com/c4h9mgl

The first URL is Fujitsu. It doesn't mention anything about the
side-effects of KU-137111. It's just a description on how to apply
kernel patches.

The second URL says this:

<QUOTE>
Due to a change in the implementation of the userland mutexes
introduced by CR 6296770 in KU 137111-01, objects of type mutex_t and
pthread_mutex_t must start at 8-byte aligned addresses. If this
requirement is not satisfied, all non-compliant applications on
Solaris/SPARC may fail with the signal SEGV with a callstack similar
to the following one or with similar callstacks containing the
function mutex_trylock_process.

  \*_atomic_cas_64(0x141f2c, 0x0, 0xff000000, 0x1651, 0xff000000, 0x466d90)
  set_lock_byte64(0x0, 0x1651, 0xff000000, 0x0, 0xfec82a00, 0x0)
  fast_process_lock(0x141f24, 0x0, 0x1, 0x1, 0x0, 0xfeae5780)

</QUOTE>

I think that's pretty clear.

What I can share with you is:

1. The SPARC alignment patches as I have submitted them here are
present in the official production stdcxx packages released with
Solaris 10 and Solaris 11.

2. These patches did not make into the Solaris source code tree and a
Solaris product without extensive internal code review. Not to mention
testing.

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 9/23/12 2:02 PM, Stefan Teleman wrote:
> On Sun, Sep 23, 2012 at 10:35 AM, Liviu Nicoara <ni...@hates.ms> wrote:
>
> [...]
>> 4. I see rw/_mutex.h has alignment pragmas for both __rw_mutex_base class
>> and its mutex member; same for __rw_static_mutex and its static member, etc.
>> How does that work?
>
> It works. ;-) And it actually acts as a space saver. Wink-wink.

Stefan, I stumbled upon this: http://tinyurl.com/ceet6ec and this: 
http://tinyurl.com/c4h9mgl

Both (but esp. the first one) seem to indicate that classes like __rw_mutex_base:

struct __rw_mutex_base
{
     mutex_t _C_mutex;
};

and __rw_static_mutex:

template< typename T >
struct __rw_static_mutex
{
     static mutex_t _C_mutex;
};

do not need special alignment pragmas (btw, I did not find any in the Solaris 
Studio for SPARC headers anywhere). It is puzzling, esp. since you cannot share 
the details with us. How do you suggest we clarify this?

Thanks!

Liviu


Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sun, Sep 23, 2012 at 10:35 AM, Liviu Nicoara <ni...@hates.ms> wrote:

> I can't think of a valid scenario, either. I guess the point is moot.

I really don't think we should worry about this particular scenario.
For example if someone really wants to experiment with saying

#pragma align(2)

before #including any system header files, and then testing if their
program still works, more power to them. My bet is that this will not,
ever, work, under any circumstances.

> A few more questions, if you will, as I am going through the changes:
>
> 1. I see similarities with 1040, should/would you close that one?

Oh I completely forgot about that one. Yes I will close it as
duplicate of this one, because these patches attached here are in
production and over-tested. I don't even remember if the old patches
are identical or not.

> 2. The issue only exists in MT builds, should there be a guard in configs?

Yes, good point. The reason they aren't there is because we don't
actually provide a non-MT stdcxx at all in Solaris. I'll fix this.

> 3. The align reference docs talk only about aligning variables, not types.
> Is that different on SPARC?

It can be. ;-)

> 4. I see rw/_mutex.h has alignment pragmas for both __rw_mutex_base class
> and its mutex member; same for __rw_static_mutex and its static member, etc.
> How does that work?

It works. ;-) And it actually acts as a space saver. Wink-wink.

But I don't think the _C_mutex member is static. In rw/_mutex.h,
_RWSTD_MUTEX_T is #defined as:

#include <pthread.h>
// [ ... snip ... ]
#  define _RWSTD_MUTEX_T                pthread_mutex_t

(for the definition Solaris cares about, which is POSIX).

So, in

class _RWSTD_EXPORT __rw_mutex_base
{
public:

// [ ... snip ... ]

    _RWSTD_MUTEX_T _C_mutex;
};

it looks like it's not declared static.

> 5. Why is __rw_guard aligned explicitly? I see it only contains a pointer to
> a mutex object.

So that the __rw_mutex_base pointer ends up 8-byte aligned. There's a
lot of juju going on here.

> 6. The docs mention that the pragma must use the mangled variables names but I don't see that in the patch.

Yes, a few things are a bit different. ;-) I wish I didn't have to be
as vague and secretive about these things as I have to be.

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 09/16/12 12:03, Stefan Teleman wrote:
> On Sat, Sep 15, 2012 at 5:47 PM, Liviu Nicoara <ni...@hates.ms> wrote:
>
>> I merely wanted to point out that restoring the default packing is not the
>> same as restoring the packing to the previous value in effect.
>>
>> Given this, I thought about an alternative way of forcing this alignment,
>> e.g., via a union, aligned on an appropriate type. I see an advantage here
>> in that most of the changes would occur where we define the 'primitive'
>> mutex and condition wrappers, saving a few pre-processor conditionals and
>> pragmas along the way.
>
> I understood what you were saying. I just don't understand under what
> _sane_ circumstances a program would #include a system library header
> file with a previously set packing to something other than default. Or
> would expect a non-default packing to work during a function call to a
> sytem library. That's an insane configuration on any operating system
> that I know of, not just on SPARCV8.

I can't think of a valid scenario, either. I guess the point is moot.

A few more questions, if you will, as I am going through the changes:

1. I see similarities with 1040, should/would you close that one?
2. The issue only exists in MT builds, should there be a guard in configs?
3. The align reference docs talk only about aligning variables, not types. Is that different on SPARC?
4. I see rw/_mutex.h has alignment pragmas for both __rw_mutex_base class and its mutex member; same for __rw_static_mutex and its static member, etc. How does that work?
5. Why is __rw_guard aligned explicitly? I see it only contains a pointer to a mutex object.
6. The docs mention that the pragma must use the mangled variables names but I don't see that in the patch.

Thanks!

Liviu




Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sat, Sep 15, 2012 at 5:47 PM, Liviu Nicoara <ni...@hates.ms> wrote:

> I merely wanted to point out that restoring the default packing is not the
> same as restoring the packing to the previous value in effect.
>
> Given this, I thought about an alternative way of forcing this alignment,
> e.g., via a union, aligned on an appropriate type. I see an advantage here
> in that most of the changes would occur where we define the 'primitive'
> mutex and condition wrappers, saving a few pre-processor conditionals and
> pragmas along the way.

I understood what you were saying. I just don't understand under what
_sane_ circumstances a program would #include a system library header
file with a previously set packing to something other than default. Or
would expect a non-default packing to work during a function call to a
sytem library. That's an insane configuration on any operating system
that I know of, not just on SPARCV8.

In order to make any change to the packing/alignment patches now I
would have to re-run _all_ the validation tests for stdcxx on SPARC,
on Solaris 10 and Solaris 11. I'm pretty sure that is not workable at
this point.

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 9/15/12 5:19 PM, Stefan Teleman wrote:
> On Sat, Sep 15, 2012 at 4:57 PM, Liviu Nicoara <ni...@hates.ms> wrote:
>
>> Yes, but it restores the default packing, not an arbitrary one, potentially
>> set by the user prior to including our headers. Say, the user sets 2, the
>> default is 4 and we set 8. When we set it to default it goes back to 4,
>> instead of the expected 2. Did I get this right?
>
> This is true, but leaving some arbirary pragma pack(N) (for N != 0) in
> effect for the duration of a program, and expecting it to work sounds
> like a very defective programming approach to me. It will certainly
> not work on SPARC at all.  if the program needs to pack something in a
> certain specific way, it need to do so for that specific something
> only. Otherwise the side-effects of globally setting a non-default
> packing will destroy the program anyway.

I merely wanted to point out that restoring the default packing is not the same 
as restoring the packing to the previous value in effect.

Given this, I thought about an alternative way of forcing this alignment, e.g., 
via a union, aligned on an appropriate type. I see an advantage here in that 
most of the changes would occur where we define the 'primitive' mutex and 
condition wrappers, saving a few pre-processor conditionals and pragmas along 
the way.

What do you think?

Liviu

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sat, Sep 15, 2012 at 4:57 PM, Liviu Nicoara <ni...@hates.ms> wrote:

> Yes, but it restores the default packing, not an arbitrary one, potentially
> set by the user prior to including our headers. Say, the user sets 2, the
> default is 4 and we set 8. When we set it to default it goes back to 4,
> instead of the expected 2. Did I get this right?

This is true, but leaving some arbirary pragma pack(N) (for N != 0) in
effect for the duration of a program, and expecting it to work sounds
like a very defective programming approach to me. It will certainly
not work on SPARC at all. if the program needs to pack something in a
certain specific way, it need to do so for that specific something
only. Otherwise the side-effects of globally setting a non-default
packing will destroy the program anyway.

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Liviu Nicoara <ni...@hates.ms>.
On 9/15/12 2:57 PM, Stefan Teleman wrote:
> On Sat, Sep 15, 2012 at 1:02 PM, Liviu Nicoara <ni...@hates.ms> wrote:
>
>> I have read through the patches attached to the incident, then I briefly
>> read about the SunPro pragma align and pack. Two questions:
>>
>> 1. AFAICT, the use of the packing pragma may interfere with a user's setting
>> of the same value. I.e., a user sets the packing in their sources and then,
>> directly or not, includes an STDCXX header. It seems to me that in such a
>> situation, our setting of the packing value would interfere with the rest of
>> the user's translation unit, since there is no way to `restore' the previous
>> packing value.
>>
>> Something along the lines of:
>>
>> // user source file
>>
>> #pragma pack (X) // X != 8
>>
>> #include <iostream>
>>
>> struct UserDef
>> {
>>      // different alignment than X ?
>>      // ...
>> };
>>
>> Is my understanding correct?
>
> Yes, you are indeed correct, Sir. :-)
>
> However, for every #pragma pack(8) there should be a corresponding
> subsequent #pragma pack(0). If there isn't, that's a patch bug.
> #pragma pack(0) restores the default alignment. So, there should be
> (there *must* be) no silent packing side-effects in user programs.

Yes, but it restores the default packing, not an arbitrary one, potentially set 
by the user prior to including our headers. Say, the user sets 2, the default is 
4 and we set 8. When we set it to default it goes back to 4, instead of the 
expected 2. Did I get this right?

>
>> 2. The patches are against 4.2.1, but the change would be binary
>> incompatible with the already released 4.2.1 branch. Do you plan to have
>> this fix in 4.3.x?
>
> Yes, definitely for 4.2.x and 4.3.x. I sent them for 4.2.1 just
> because that's what I have handy.

I see now. I tried to apply them to 4.2.x and some chunks failed, and I thought 
I was not applying them where they were intended to go.

Thanks!

Liviu

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sat, Sep 15, 2012 at 2:57 PM, Stefan Teleman
<st...@gmail.com> wrote:

> #pragma pack(0) restores the default alignment.

s/alignment/packing/g

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com

Re: STDCXX-1066 [was: Re: STDCXX forks]

Posted by Stefan Teleman <st...@gmail.com>.
On Sat, Sep 15, 2012 at 1:02 PM, Liviu Nicoara <ni...@hates.ms> wrote:

> I have read through the patches attached to the incident, then I briefly
> read about the SunPro pragma align and pack. Two questions:
>
> 1. AFAICT, the use of the packing pragma may interfere with a user's setting
> of the same value. I.e., a user sets the packing in their sources and then,
> directly or not, includes an STDCXX header. It seems to me that in such a
> situation, our setting of the packing value would interfere with the rest of
> the user's translation unit, since there is no way to `restore' the previous
> packing value.
>
> Something along the lines of:
>
> // user source file
>
> #pragma pack (X) // X != 8
>
> #include <iostream>
>
> struct UserDef
> {
>     // different alignment than X ?
>     // ...
> };
>
> Is my understanding correct?

Yes, you are indeed correct, Sir. :-)

However, for every #pragma pack(8) there should be a corresponding
subsequent #pragma pack(0). If there isn't, that's a patch bug.
#pragma pack(0) restores the default alignment. So, there should be
(there *must* be) no silent packing side-effects in user programs.

> 2. The patches are against 4.2.1, but the change would be binary
> incompatible with the already released 4.2.1 branch. Do you plan to have
> this fix in 4.3.x?

Yes, definitely for 4.2.x and 4.3.x. I sent them for 4.2.1 just
because that's what I have handy.

--Stefan

-- 
Stefan Teleman
KDE e.V.
stefan.teleman@gmail.com