You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Alan Modra <am...@gmail.com> on 2014/08/09 15:55:09 UTC

random-test failure on powerpc64le and x86_64

Using current git sources compiled with gcc-4.9.1 or mainline gcc at
-O3 results in failure of random-test.

On powerpc64le we see
PASS:  random-test 1: random delta test
svn_tests: E200006: mismatch at position xxxxx
FAIL:  random-test 2: random combine delta test

x86_64 gives
PASS:  random-test 1: random delta test
svn_tests: E200006: Test crashed (run in debugger with '--allow-segfaults')
FAIL:  random-test 2: random combine delta test

After some digging, I narrowed the failure down to
subversion/libsvn_delta/text_delta.c, and the specific -O3 options
causing the failure to -ftree-loop-vectorize -fvect-cost-model=dynamic. 
At first I thought I'd found a vectorizer bug, but the real problem is
a bug in the source, specifically this line in patterning_copy:

      *(apr_uint32_t *)(target) = *(apr_uint32_t *)(source);

Quoting from ISO/IEC 9899:1999
6.3.2.3 Pointers
...
7 A pointer to an object or incomplete type may be converted to a
pointer to a different object or incomplete type. If the resulting
pointer is not correctly aligned for the pointed-to type, the behavior
is undefined.

So here we have undefined behaviour if "source" and "target" are not
4-byte aligned..  Fixed as follows.

Index: subversion/libsvn_delta/text_delta.c
===================================================================
--- subversion/libsvn_delta/text_delta.c	(revision 1616755)
+++ subversion/libsvn_delta/text_delta.c	(working copy)
@@ -669,14 +669,15 @@
 
 #if SVN_UNALIGNED_ACCESS_IS_OK
 
-  if (source + sizeof(apr_uint32_t) <= target)
+  typedef apr_uint32_t __attribute__ ((aligned (1))) unaligned32;
+  if (source + sizeof(unaligned32) <= target)
     {
       /* Source and target are at least 4 bytes apart, so we can copy in
        * 4-byte chunks.  */
-      for (; source + sizeof(apr_uint32_t) <= end;
-           source += sizeof(apr_uint32_t),
-           target += sizeof(apr_uint32_t))
-      *(apr_uint32_t *)(target) = *(apr_uint32_t *)(source);
+      for (; source + sizeof(unaligned32) <= end;
+           source += sizeof(unaligned32),
+           target += sizeof(unaligned32))
+      *(unaligned32 *)target = *(unaligned32 *)source;
     }
 
 #endif

-- 
Alan Modra
Australia Development Lab, IBM

Re: random-test failure on powerpc64le and x86_64

Posted by Branko Čibej <br...@wandisco.com>.
On 11.08.2014 01:38, Alan Modra wrote:
> On Sat, Aug 09, 2014 at 08:08:21PM +0200, Branko Čibej wrote:
>> The way to fix this is to make sure that the macro
>> SVN_UNALIGNED_ACCESS_IS_OK gives the correct answer; and it's OK if
>> that answer compiler-specific, not just platform-specific. In other
>> words, it's fine if the macro gives a different answer depending on
>> GCC vectorizer options.
> If it were up to me I'd revert r956593 entirely.  Thinking you can
> optimise memcpy is just plain wrong-headed,

I fully agree this is the case today. It probably wasn't when the code
was written. Still, it makes sense to look at all our
unaligned-access-specific code again.

>  and this is the second bug
> found in patterning_copy().  Hmm, actually if you want to optimise
> patterning_copy() then something that might make sense is to detect
> the overlap case and copy the original repeating sequence multiple
> times, rather than copying what has just been written.  That is likely
> to give better cache performance especially on processors that suffer
> from load-hit-store stalls.

You're probably right. Back when I wrote the original code (which, IIRC,
didn't use any magic unaligned accesses ...), I was more concerned about
being able to read what I'd written. :)

-- Brane


-- 
Branko Čibej | Director of Subversion
WANdisco | Realising the impossibilities of Big Data
e. brane@wandisco.com

Re: random-test failure on powerpc64le and x86_64

Posted by Alan Modra <am...@gmail.com>.
On Sat, Aug 09, 2014 at 08:08:21PM +0200, Branko Čibej wrote:
> On 09.08.2014 15:55, Alan Modra wrote:
> > Using current git sources compiled with gcc-4.9.1 or mainline gcc at
> > -O3 results in failure of random-test.
> >
> > On powerpc64le we see
> > PASS:  random-test 1: random delta test
> > svn_tests: E200006: mismatch at position xxxxx
> > FAIL:  random-test 2: random combine delta test
> >
> > x86_64 gives
> > PASS:  random-test 1: random delta test
> > svn_tests: E200006: Test crashed (run in debugger with '--allow-segfaults')
> > FAIL:  random-test 2: random combine delta test
> >
> > After some digging, I narrowed the failure down to
> > subversion/libsvn_delta/text_delta.c, and the specific -O3 options
> > causing the failure to -ftree-loop-vectorize -fvect-cost-model=dynamic. 
> > At first I thought I'd found a vectorizer bug, but the real problem is
> > a bug in the source, specifically this line in patterning_copy:
> >
> >       *(apr_uint32_t *)(target) = *(apr_uint32_t *)(source);
> >
> > Quoting from ISO/IEC 9899:1999
> > 6.3.2.3 Pointers
> > ...
> > 7 A pointer to an object or incomplete type may be converted to a
> > pointer to a different object or incomplete type. If the resulting
> > pointer is not correctly aligned for the pointed-to type, the behavior
> > is undefined.
> >
> > So here we have undefined behaviour if "source" and "target" are not
> > 4-byte aligned..  Fixed as follows.
> 
> Nope. This code won't fly because it's not portable across compilers and
> platforms.

In that case, consider my email a bug report.  I've done the hard work
debugging and analyzing the problem.  Fixing it is trivial, for
instance by using
  typedef struct _unaligned { char u[4]; } unaligned32;
in the patch I posted rather than
  typedef apr_uint32_t __attribute__ ((aligned (1))) unaligned32;

> The way to fix this is to make sure that the macro
> SVN_UNALIGNED_ACCESS_IS_OK gives the correct answer; and it's OK if
> that answer compiler-specific, not just platform-specific. In other
> words, it's fine if the macro gives a different answer depending on
> GCC vectorizer options.

If it were up to me I'd revert r956593 entirely.  Thinking you can
optimise memcpy is just plain wrong-headed, and this is the second bug
found in patterning_copy().  Hmm, actually if you want to optimise
patterning_copy() then something that might make sense is to detect
the overlap case and copy the original repeating sequence multiple
times, rather than copying what has just been written.  That is likely
to give better cache performance especially on processors that suffer
from load-hit-store stalls.

-- 
Alan Modra
Australia Development Lab, IBM

Re: random-test failure on powerpc64le and x86_64

Posted by Branko Čibej <br...@wandisco.com>.
On 09.08.2014 15:55, Alan Modra wrote:
> Using current git sources compiled with gcc-4.9.1 or mainline gcc at
> -O3 results in failure of random-test.
>
> On powerpc64le we see
> PASS:  random-test 1: random delta test
> svn_tests: E200006: mismatch at position xxxxx
> FAIL:  random-test 2: random combine delta test
>
> x86_64 gives
> PASS:  random-test 1: random delta test
> svn_tests: E200006: Test crashed (run in debugger with '--allow-segfaults')
> FAIL:  random-test 2: random combine delta test
>
> After some digging, I narrowed the failure down to
> subversion/libsvn_delta/text_delta.c, and the specific -O3 options
> causing the failure to -ftree-loop-vectorize -fvect-cost-model=dynamic. 
> At first I thought I'd found a vectorizer bug, but the real problem is
> a bug in the source, specifically this line in patterning_copy:
>
>       *(apr_uint32_t *)(target) = *(apr_uint32_t *)(source);
>
> Quoting from ISO/IEC 9899:1999
> 6.3.2.3 Pointers
> ...
> 7 A pointer to an object or incomplete type may be converted to a
> pointer to a different object or incomplete type. If the resulting
> pointer is not correctly aligned for the pointed-to type, the behavior
> is undefined.
>
> So here we have undefined behaviour if "source" and "target" are not
> 4-byte aligned..  Fixed as follows.

Nope. This code won't fly because it's not portable across compilers and
platforms.

The way to fix this is to make sure that the macro
SVN_UNALIGNED_ACCESS_IS_OK gives the correct answer; and it's OK if that
answer compiler-specific, not just platform-specific. In other words,
it's fine if the macro gives a different answer depending on GCC
vectorizer options.

-- Brane


-- 
Branko Čibej | Director of Subversion
WANdisco | Realising the impossibilities of Big Data
e. brane@wandisco.com

Re: random-test failure on powerpc64le and x86_64

Posted by Alan Modra <am...@gmail.com>.
On Sun, Aug 17, 2014 at 12:59:21PM +0100, Branko Čibej wrote:
> On 09.08.2014 14:55, Alan Modra wrote:
> > Using current git sources compiled with gcc-4.9.1 or mainline gcc at
> > -O3 results in failure of random-test.
> >
> > On powerpc64le we see
> > PASS:  random-test 1: random delta test
> > svn_tests: E200006: mismatch at position xxxxx
> > FAIL:  random-test 2: random combine delta test
> >
> > x86_64 gives
> > PASS:  random-test 1: random delta test
> > svn_tests: E200006: Test crashed (run in debugger with '--allow-segfaults')
> > FAIL:  random-test 2: random combine delta test
> 
> Would you care to test this again with the change I made in r1618472?
> Thanks!

Looks good.

-- 
Alan Modra
Australia Development Lab, IBM

Re: random-test failure on powerpc64le and x86_64

Posted by Branko Čibej <br...@wandisco.com>.
On 09.08.2014 14:55, Alan Modra wrote:
> Using current git sources compiled with gcc-4.9.1 or mainline gcc at
> -O3 results in failure of random-test.
>
> On powerpc64le we see
> PASS:  random-test 1: random delta test
> svn_tests: E200006: mismatch at position xxxxx
> FAIL:  random-test 2: random combine delta test
>
> x86_64 gives
> PASS:  random-test 1: random delta test
> svn_tests: E200006: Test crashed (run in debugger with '--allow-segfaults')
> FAIL:  random-test 2: random combine delta test

Would you care to test this again with the change I made in r1618472?
Thanks!

(I did not remove fast_memcpy, on purpose; whilst I agree the code
should just use memcpy, I prefer to take these one at a time.)

-- Brane


-- 
Branko Čibej | Director of Subversion
WANdisco | Realising the impossibilities of Big Data
e. brane@wandisco.com