Posted to issues@kudu.apache.org by "Dan Burkert (JIRA)" <ji...@apache.org> on 2017/07/18 21:36:00 UTC

[jira] [Updated] (KUDU-2068) Kudu/Centos 7/devtoolset miscompilation

     [ https://issues.apache.org/jira/browse/KUDU-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Burkert updated KUDU-2068:
------------------------------
    Description: 
There are a number of issues related to building Kudu on Centos/RHEL 7 with devtoolset, and on Centos/RHEL 7 machines with devtoolset installed (but not enabled).

A couple of bits of background info:

1) RHEL 7 ships with system GCC/libstdcxx version 4.8.
2) devtoolset-3 ships with GCC/libstdcxx version 4.9.
3) devtoolset-6 ships with GCC/libstdcxx version 6.2.
4) GCC incompatibly changed the {{unordered_set}} and {{unordered_map}} ABIs between libstdcxx version 4.8 and 4.9.  In 4.8, {{sizeof(unordered_set<int>)}} is 48 bytes, whereas in 4.9 and above it is 56 bytes (and similarly for {{unordered_map}}).
5) Clang will automatically search for devtoolset installations, and if found, use those headers as opposed to the system headers.
6) Clang 4.0 (the current version bundled by Kudu) will find devtoolset versions [up to devtoolset-4|https://github.com/llvm-mirror/clang/blob/release_40/lib/Driver/ToolChains.cpp#L1445-L1449]. The next version of Clang (Clang 5) will find devtoolset versions [up to devtoolset-6|https://github.com/llvm-mirror/clang/blob/9a973f3ee99d42a283cafc26407f081daaa8ac21/lib/Driver/ToolChains/Gnu.cpp#L1717-L1720].
7) G++ compilers provided by a devtoolset will use the libstdcxx headers provided by that devtoolset.
8) Among other things, Kudu uses its bundled clang to pre-compile C++ source files into clang bitcode. One of the classes included in this pre-compilation is kudu::Schema, which includes an {{unordered_set}} field.

As a result, with certain configurations, the precompiled codegen will cause crashes at runtime:

Kudu compiled with system gcc on a Centos 7 machine with devtoolset-3 installed: Kudu will be compiled against the system headers, where {{unordered_set}} is 48 bytes. The precompiled code will be compiled by clang against the devtoolset-3 headers, where {{unordered_set}} is 56 bytes.  This results in runtime crashes when calling into code-generated functions (codegen-test segfaults reliably).

Kudu compiled with devtoolset-6 gcc on Centos 7: Kudu will be compiled against the devtoolset-6 headers, where {{unordered_set}} is 56 bytes. The precompiled code will be compiled by clang against the system headers, where {{unordered_set}} is 48 bytes (clang 4.0 will not find the devtoolset-6 installation). This results in runtime crashes when calling into code-generated functions (codegen-test segfaults reliably).
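
To make the failure mode concrete, here is a rough model (not Kudu code) of why a 48-byte vs 56-byte disagreement is fatal. It uses Python's {{ctypes}} to lay out a hypothetical struct holding an opaque blob that stands in for the {{unordered_set}} field, followed by one more member. The struct and field names are assumptions made up for illustration; the real layout of kudu::Schema is not shown here.

```python
import ctypes


def make_schema_like(set_size):
    """Build a stand-in struct: an opaque blob of set_size bytes, then a pointer.

    This is a hypothetical layout for illustration only; it is not the real
    layout of kudu::Schema.
    """
    class SchemaLike(ctypes.Structure):
        _fields_ = [
            # Stands in for the unordered_set member (48 or 56 bytes).
            ("name_set", ctypes.c_ubyte * set_size),
            # Stands in for whatever member happens to follow it.
            ("next_field", ctypes.c_void_p),
        ]
    return SchemaLike


# Layout as seen by code built against libstdcxx 4.8 headers.
LayoutOld = make_schema_like(48)
# Layout as seen by code built against libstdcxx 4.9+ headers.
LayoutNew = make_schema_like(56)

# The two sides disagree about where the following member lives.
offset_old = LayoutOld.next_field.offset
offset_new = LayoutNew.next_field.offset
print("next_field offset: %d vs %d" % (offset_old, offset_new))
```

In this model, any member placed after the set sits 8 bytes further along in one view than in the other, so code compiled against one set of headers would read or write the wrong bytes of an object built by the other.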

Passing the {{--gcc-toolchain}} flag to clang, with a value appropriate for the currently enabled g++, when precompiling sources will fix this issue, but I haven't found a clean way to determine the 'correct' value as part of a script.  For system gcc the value should be {{/usr}}, and for devtoolset-6 the value should be {{/opt/rh/devtoolset-6/root/usr}}.  This corresponds to the {{--prefix}} value in the configure flags that {{gcc -v}} prints, so maybe we can parse it from that.
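
A minimal sketch of the parsing idea, assuming the configure flags appear on a line beginning with {{Configured with:}} and that {{gcc -v}} writes its report to stderr. The function names are hypothetical, and this is not something Kudu's build currently does.

```python
import subprocess


def parse_gcc_prefix(gcc_v_output):
    """Return the --prefix value from `gcc -v` output, or None if absent."""
    for line in gcc_v_output.splitlines():
        if not line.startswith("Configured with:"):
            continue
        # Everything after the first colon is the list of configure flags.
        for token in line.split(":", 1)[1].split():
            if token.startswith("--prefix="):
                return token[len("--prefix="):]
    return None


def detect_gcc_toolchain(gcc="gcc"):
    """Run `gcc -v` and extract the prefix it was configured with."""
    result = subprocess.run(
        [gcc, "-v"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_gcc_prefix(result.stderr)


if __name__ == "__main__":
    prefix = detect_gcc_toolchain()
    if prefix is not None:
        # Flag to hand to clang when precompiling sources.
        print("--gcc-toolchain=" + prefix)
```

Returning None when no prefix is found lets the caller fall back to clang's default toolchain search rather than passing an empty value.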

Another option is to replace {{unordered_set}}/{{unordered_map}} fields in any objects passed through codegen boundaries, but obviously this is brittle (and there may be other types out there whose ABI is similarly unstable).

Finally, I'll note that this issue is a good reason to maintain the no-C++11-types-in-public-APIs rule that we currently follow with the client.  Also related: [devtoolset ABI guarantees|https://access.redhat.com/documentation/en-US/Red_Hat_Developer_Toolset/6/html/User_Guide/sect-GCC-CPP.html#sect-GCC-CPP-Compatibility].

  was:
There are a number of issues related to building Kudu on Centos/RHEL 7 with devtoolset, and on Centos/RHEL 7 machines with devtoolset installed (but not enabled).

A couple of bits of background info:

1) RHEL 7 ships with system GCC/libstdcxx version 4.8.
2) devtoolset-3 ships with GCC/libstdcxx version 4.9.
3) devtoolset-6 ships with GCC/libstdcxx version 6.2.
4) GCC incompatibly changed the {{unordered_set}} and {{unordered_map}} ABIs between libstdcxx version 4.7 and 4.8.  In 4.7, {{sizeof(unordered_set<int>)}} is 48 bytes, whereas in 4.8 and above it is 56 bytes (and similarly for {{unordered_map}}).
5) Clang will automatically search for devtoolset installations, and if found, use those headers as opposed to the system headers.
6) Clang 4.0 (the current version bundled by Kudu) will find devtoolset versions [up to devtoolset-4|https://github.com/llvm-mirror/clang/blob/release_40/lib/Driver/ToolChains.cpp#L1445-L1449]. The next version of Clang (Clang 5) will find devtoolset version [up to devtoolset-6|https://github.com/llvm-mirror/clang/blob/9a973f3ee99d42a283cafc26407f081daaa8ac21/lib/Driver/ToolChains/Gnu.cpp#L1717-L1720].
7) G++ compilers provided by a devtoolset will use the libstdcxx headers provided by that devtoolset.
8) Among other things, Kudu uses its bundled clang to pre-compile C++ source files into clang bitcode. One of the classes included in this pre-compilation is kudu::Schema, which includes an {{unordered_set}} field.

As a result, with certain configurations, the precompiled codegen will cause crashes at runtime:

Kudu compiled with system gcc on a Centos 7 machine with devtoolset-3 installed: Kudu will be compiled against the system headers, where {{unordered_set}} is 48 bytes. The precompiled code will be compiled by clang against the devtoolset-3 headers, where {{unordered_set}} is 56 bytes.  This results in runtime crashes when calling in to codegenerated functions (codegen-test segfaults reliably).

Kudu compiled with devtoolset-6 gcc on Centos7: Kudu will be compiled against the devtoolset-6 headers, where {{unordered_set}} is 56 bytes. The precompiled code will be compiled by clang against the system headers, where {{unordered_set}} is 48 bytes (clang 4.0 will not find the devtoolset-6 installation). This results in runtime crashes when calling in to code generated functions (codegen-test segfaults reliably).

Passing the {{--gcc-toolchain}} flag to clang with a value appropriate for the currently enabled g++ when precompiling sources will fix this issue, but I haven't found a clean way to figure out how to determine the 'correct' value as part of a script.  For system gcc the value should be {{/usr}}, and for devtoolset-6 the value should be {{/opt/rh/devtoolset-6/root/usr}}.  This corresponds to the {{--prefix}} flag value under the configure flags that {{gcc -v}} spits out, so maybe we can parse it from that.

Another option is to replace {{unordered_set}}/{{unordered_map}} fields in any objects passed through codegen boundaries, but obviously this is brittle (and there may be other types out there whose ABI is similarly unstable).

Finally, I'll note that this issue is a good reason why we should maintain the no-c++11 types in public-APIs rule that we currently follow with the client.  Also related: [devtoolset ABI guarantees|https://access.redhat.com/documentation/en-US/Red_Hat_Developer_Toolset/6/html/User_Guide/sect-GCC-CPP.html#sect-GCC-CPP-Compatibility].


> Kudu/Centos 7/devtoolset miscompilation
> ---------------------------------------
>
>                 Key: KUDU-2068
>                 URL: https://issues.apache.org/jira/browse/KUDU-2068
>             Project: Kudu
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 1.4.0
>            Reporter: Dan Burkert
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)