You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Alan Burlison (JIRA)" <ji...@apache.org> on 2015/06/26 12:48:04 UTC

[jira] [Created] (MAPREDUCE-6417) MapReduceClient's primitives.h is toxic and should be extirpated

Alan Burlison created MAPREDUCE-6417:
----------------------------------------

             Summary: MapReduceClient's primitives.h is toxic and should be extirpated
                 Key: MAPREDUCE-6417
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6417
             Project: Hadoop Map/Reduce
          Issue Type: Sub-task
          Components: client
    Affects Versions: 2.7.0
            Reporter: Alan Burlison
            Priority: Blocker


MapReduceClient's primitives.h attempts to provide optimised versions of standard library memory copy and comparison functions. It has been the subject of several portability-related bugs:

* HADOOP-11505 hadoop-mapreduce-client-nativetask uses bswap where be32toh is needed, doesn't work on non-x86
* HADOOP-11665 Provide and unify cross platform byteorder support in native code
* MAPREDUCE-6397 MAPREDUCE makes many endian-dependent assumptions
* HADOOP-11484 hadoop-mapreduce-client-nativetask fails to build on ARM AARCH64 due to x86 asm statements

At present it only works on x86 and ARM64 as it lacks definitions for bswap and bswap64 for any platforms other than those.

However it has even more serious problems on non-866 architectures, for example on SPARC simple_memcpy simply doesn't work at all:

{code}
$cat bang.cc
#include <string.h>
#define SIMPLE_MEMCPY
#include "primitives.h"
int main(int argc, char **argv)
{
    char b1[9];
    char b2[9];
    simple_memcpy(b2, b1, sizeof(b1));
}
$ gcc -o bang bang.cc && ./bang
Bus Error (core dumped)
{code}
That's because simple_memcpy does pointer fiddling that results in misaligned accesses, which are illegal on SPARC.

fmemcmp is also broken. Even if a definition of bswap is provided,on big-endian architectures the result is simply wrong because of it's unconditional use of bswap:

{code}
$ cat thud.cc
#include <stdio.h>
#include <strings.h>
#include "primitives.h"
int main(int argc, char **argv)
{
    char a[] = { 0,1,2,0 };
    char b[] = { 0,2,1,0 };
    printf("%lld %d\n", fmemcmp(a, b, sizeof(a), memcmp(a, b, sizeof(a))));
}
$ g++ -o thud thud.cc && ./thud
65280 -1
{code}

And in addition fmemcmp suffers from the same misalignment issues as simple_memcpy and coredumps on SPARC when asked to compare odd-sized buffers.

primitives.h should simply be deleted and replaced with the standard memory copy & compare functions, or with thin wrappers around them where the APIs are different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)