You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by "Dániel Bakai (Jira)" <ji...@apache.org> on 2020/04/30 04:36:00 UTC

[jira] [Commented] (MINIFICPP-824) MacOS build uses RocksDB from brew, if installed, causing segfaults

    [ https://issues.apache.org/jira/browse/MINIFICPP-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096123#comment-17096123 ] 

Dániel Bakai commented on MINIFICPP-824:
----------------------------------------

[~aboda] [~aldrin]

The fix for this is not trivial. The issue is if that a single third party dependency is included from /usr/include (or /usr/local/include), it will "pollute" the include paths, and headers installed to the system will get priority over our headers, if /usr/include comes before our include paths in the compile command.

When we are defining the transitive include usage requirements for a target, we are populating the INTERFACE_INCLUDE_DIRECTORIES property of that target. This could normally be done with `target_include_directories(<targetname> INTERFACE /path/to/headers)`, but unfortunately until about 2 years ago this did not work for IMPORTED targets because of an oversight in CMake, so currently, to keep supporting older CMake versions, we are manually appending to this list with `set_property(TARGET <targetname> APPEND PROPERTY INTERFACE_INCLUDE_DIRECTORIES /path/to/headers)` - but the end result is the same. Now, `target_include_directories` has a BEFORE directive, which places the include path at the beginning of the INTERFACE_INCLUDE_DIRECTORIES list of the target, not the end. But this does not solve our problem, because this is not a magically transitive property - that is, this will just mean that it will place the include path to the beginning of the include paths list transitively required by our IMPORTED target (libfoobar). When we link with libfoobar (let's assume we are linking minifi), we will inherit its list of INTERFACE_INCLUDE_DIRECTORIES, but that sublist will just be placed in our list of include directories with all other ones inherited from other targets we depend on - the BEFORE directive in libfoobar's INTERFACE_INCLUDE_DIRECTORIES won't influence the placement of that sublist in our include directories list, it just influences the place of the include directory in libfoobar's sublist. So if another target's sublist comes before us that requires /usr/include - and we can't really influence the order in which these sublists will be added - then every further include path sublist will have a lesser priority, meaning that every further third party header will be sourced from /usr/include, if it exists there.

To solve this, I see 3 main options, each with its advantages and disadvantages:

1. Install every bundled third party's include directory in a way that is incompatible with the ones normally installed on the system, and include them that way.
For example, when we are building libfoobar, it would normally install its headers into `libfoobar-install/include/foobar`. We would be adding `-I libfoobar-install/include/` to our compile command, and include the header with `#include <foobar/foobar.h>`.
Instead of this, we can do the following: make libfoobar install its headers into `libfoobar-install/include/thirdparty/foobar`, add `-I libfoobar-install/include/` to our compile command, and include the header with `#include <thirdparty/foobar/foobar.h>`.
This has the advantage of:
 - not having to deal with CMake's include path order in any way: there is no chance of a collision with headers installed on the system.
The disadvantages of this approach is:
 - we have to change every bundled third party build to install the include directories in a special way. This means either patching the third parties, or manually moving the includes after installation.
 - we would lose the ability to easily change between bundled and system dependencies - as the include directives are different. This can be worked around by creating a "proxy" include if we want to use a system dependency, and adding it to our include path in that case - this would just be a "thirdparty/foobar/foobar.h" header file that would include the real "foobar/foobar.h" header in the system case.

2. Install every bundled third party's includes to the same directory and force CMake to use that directory as the first include path.
Currently, each bundled third party has its own install directory. libfoobar libs and headers will be installed to lifoobar-install/lib and libfoobar-install/include, respectively, while libbuzz will be installed to libbuzz-install/lib and libbuzz-install/include.
In our compile commands we have all of these include directories added as include paths.
Instead of this, we can make all third parties install their headers to the same directory - lets call it thirdpary-include/ -, thereby creating our very own /usr/include.
After this, we can force CMake to use that singular include path as the first include path for every command (for example, by adding it to the compiler flags). This is much easier to do with a single directory than with multiple, conditionally existing and used include directories.
This has the advantage of:
 - third parties almost without exception support installing headers to a different place (this is easier then #1, as there we would have to change to structure of how headers get installed, in this case we only have to change the location).
 - we override CMake's include path order resolution, but in a minimally invasive way
This has the disadvantage of:
 - we have to make sure we force this directory to the be the very first in all cases

3. Use only bundled dependencies AND sysroot.
If we only use bundled dependencies, there will be no need to add the real /usr/include to our include paths. We will still have to use system headers, which are in /usr/include, but if we use a sysroot, we will use the system headers from the sysroot's /usr/include, which, unlike the real /usr/include on our system does not contain installed third parties.
This has the advantage of:
 - not having to deal with dependencies installed on the system at all
This has the disadvantage of:
 - completely prevents us from using installed third parties. Even if we don't, some extensions might still want to do it, and they would still be broken then.
 - we have to have a sysroot - macOS compilations already use one by default, Windows is not really affected by this whole issue, but we would have to create one on Linux. It would be generally advantageous to do so, because it gives better control over what type and level of system APIs (libc version, for example) we end up using, but it is not a trivial undertaking.


It seems to me that option #2 is the most feasible right now.

> MacOS build uses RocksDB from brew, if installed, causing segfaults
> -------------------------------------------------------------------
>
>                 Key: MINIFICPP-824
>                 URL: https://issues.apache.org/jira/browse/MINIFICPP-824
>             Project: Apache NiFi MiNiFi C++
>          Issue Type: Bug
>            Reporter: Dániel Bakai
>            Assignee: Dániel Bakai
>            Priority: Major
>
> If RocksDB is installed from brew, libminifi/test/rocksdb-tests/RepoTests segfaults, because of a wrong call to rocksdb::DBImpl's destructor (instead of the destructor, an another member function is called).
> It seems like somehow the shipped and the system versions mix up, probably causing an ODR violation and ultimately the segfault. (Or the version shipped in brew is _very_ incompatible.)
> After RocksDB was uninstalled from brew, and minifi was rebuilt, the test ran correctly.
> This might affect other platforms as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)