Posted to issues@arrow.apache.org by "Krisztian Szucs (JIRA)" <ji...@apache.org> on 2019/07/09 14:32:00 UTC

[jira] [Updated] (ARROW-5886) [Python][Packaging] Manylinux1/2010 compliance issue with libz

     [ https://issues.apache.org/jira/browse/ARROW-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krisztian Szucs updated ARROW-5886:
-----------------------------------
    Summary: [Python][Packaging] Manylinux1/2010 compliance issue with libz  (was: [Python][Packaging] Manylinux1/2010 complience issue with libz)

> [Python][Packaging] Manylinux1/2010 compliance issue with libz
> --------------------------------------------------------------
>
>                 Key: ARROW-5886
>                 URL: https://issues.apache.org/jira/browse/ARROW-5886
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Packaging, Python
>    Affects Versions: 0.14.0
>            Reporter: Krisztian Szucs
>            Priority: Major
>
> We statically link liblz4 in the manylinux1 wheels:
> {code}
> # ldd pyarrow-manylinux1/libarrow.so.14 | grep z
>         libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fc28cef4000)
> {code}
> but link it dynamically in the manylinux2010 wheels:
> {code}
> # ldd pyarrow-manylinux2010/libarrow.so.14 | grep z
>         liblz4.so.1 => not found  (already deleted to reproduce the issue)
>         libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f56f7440000)
> {code}
> This is what this PR resolves.
> What I find strange is that auditwheel seems to bundle libz for manylinux1:
> {code}
> # ls -lah pyarrow-manylinux1/*z*so.*
> -rwxr-xr-x 1 root root 115K Jun 29 00:14 pyarrow-manylinux1/libz-7f57503f.so.1.2.11
> {code}
> while ldd still uses the system libz:
> {code}
> # ldd pyarrow-manylinux1/libarrow.so.14 | grep z
>         libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f91fcf3f000)
> {code}
> For manylinux2010 we also have liblz4:
> {code}
> #  ls -lah pyarrow-manylinux2010/*z*so.*
> -rwxr-xr-x 1 root root 191K Jun 28 23:38 pyarrow-manylinux2010/liblz4-8cb8bdde.so.1.8.3
> -rwxr-xr-x 1 root root 115K Jun 28 23:38 pyarrow-manylinux2010/libz-c69b9943.so.1.2.11
> {code}
> and ldd similarly tries to load the system libs:
> {code}
> # ldd pyarrow-manylinux2010/libarrow.so.14 | grep z
>         liblz4.so.1 => not found
>         libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fd72764e000)
> {code}
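The `ldd` checks above can be automated. The following is a minimal sketch, not part of the Arrow build scripts: it parses `ldd` output and lists every dependency that is either missing or resolved outside the wheel directory. The function names `parse_ldd` and `not_loaded_from` are hypothetical, and the parser assumes the `soname => path (address)` line format shown in this report.

```python
import re

# One dependency line of `ldd` output, for example:
#   libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fd72764e000)
#   liblz4.so.1 => not found
_LDD_LINE = re.compile(r"^\s*(?P<soname>\S+)\s+=>\s+(?P<target>not found|\S+)")


def parse_ldd(output):
    """Map each soname in `ldd` output to its resolved path (None if not found)."""
    resolved = {}
    for line in output.splitlines():
        match = _LDD_LINE.match(line)
        if match is None:
            continue  # lines without "=>" are skipped
        target = match.group("target")
        resolved[match.group("soname")] = None if target == "not found" else target
    return resolved


def not_loaded_from(resolved, wheel_dir):
    """Return sonames that are missing or resolved outside `wheel_dir`."""
    prefix = wheel_dir.rstrip("/") + "/"
    return sorted(
        soname
        for soname, path in resolved.items()
        if path is None or not path.startswith(prefix)
    )


# Sample taken from the manylinux2010 output in this report.
SAMPLE = """\
        liblz4.so.1 => not found
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fd72764e000)
"""

if __name__ == "__main__":
    print(not_loaded_from(parse_ldd(SAMPLE), "/tmp/pyarrow-manylinux2010"))
```

For the sample, both `liblz4.so.1` and `libz.so.1` are reported, since neither is loaded from the wheel directory.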
> Inspecting manylinux1 with `LD_DEBUG=files,libs ldd libarrow.so.14`, the loader seems to search the right path, but it cannot find the hashed version of libz, `libz-7f57503f.so.1.2.11`:
> {code}
>        463:     file=libz.so.1 [0];  needed by ./libarrow.so.14 [0]
>        463:     find library=libz.so.1 [0]; searching
>        463:      search path=/tmp/pyarrow-manylinux1/.          (RPATH from file ./libarrow.so.14)
>        463:       trying file=/tmp/pyarrow-manylinux1/./libz.so.1
>        463:      search cache=/etc/ld.so.cache
>        463:       trying file=/lib/x86_64-linux-gnu/libz.so.1
> {code}
> There is no `libz.so.1`, just `libz-7f57503f.so.1.2.11`.
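To make the mismatch concrete, here is a small sketch that compares the sonames a library asks for with the file names actually present in the wheel directory. It assumes the renamed copies follow the pattern seen in the listings above (`<stem>-<8 hex digits>.so.<version>`); that pattern is inferred from this report, not taken from auditwheel's documentation, and the function names are hypothetical.

```python
import re


def bundled_candidates(soname, filenames):
    """Return files that look like hash-renamed copies of `soname`."""
    stem, _, version = soname.partition(".so")
    pattern = re.compile(
        "^" + re.escape(stem) + r"-[0-9a-f]{8}"  # e.g. "libz-7f57503f"
        + re.escape(".so" + version)             # e.g. ".so.1"
        + r"(\.\d+)*$"                           # e.g. ".2.11"
    )
    return sorted(name for name in filenames if pattern.match(name))


def diagnose(needed, filenames):
    """Classify each needed soname as 'exact', 'renamed' or 'missing'."""
    report = {}
    for soname in needed:
        if soname in filenames:
            report[soname] = "exact"
        elif bundled_candidates(soname, filenames):
            report[soname] = "renamed"
        else:
            report[soname] = "missing"
    return report


# File names taken from the manylinux2010 listing in this report.
FILES = ["liblz4-8cb8bdde.so.1.8.3", "libz-c69b9943.so.1.2.11"]

if __name__ == "__main__":
    print(diagnose(["libz.so.1", "liblz4.so.1"], FILES))
```

A result of `renamed` means the file is bundled, but the loader cannot find it under the soname it searches for.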
> Similarly for manylinux2010 and libz:
> {code}
>        470:     file=libz.so.1 [0];  needed by ./libarrow.so.14 [0]
>        470:     find library=libz.so.1 [0]; searching
>        470:      search path=/tmp/pyarrow-manylinux2010/.               (RPATH from file ./libarrow.so.14)
>        470:       trying file=/tmp/pyarrow-manylinux2010/./libz.so.1
>        470:      search cache=/etc/ld.so.cache
>        470:       trying file=/lib/x86_64-linux-gnu/libz.so.1
> {code}
> for liblz4 (again, I've deleted the system one):
> {code}
>        470:     file=liblz4.so.1 [0];  needed by ./libarrow.so.14 [0]
>        470:     find library=liblz4.so.1 [0]; searching
>        470:      search path=/tmp/pyarrow-manylinux2010/.               (RPATH from file ./libarrow.so.14)
>        470:       trying file=/tmp/pyarrow-manylinux2010/./liblz4.so.1
>        470:      search cache=/etc/ld.so.cache
>        470:      search path=/lib/x86_64-linux-gnu/tls/x86_64:/lib/x86_64-linux-gnu/tls:/lib/x86_64-linux-gnu/x86_64:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu/tls/x86_64:/usr/lib/x86_64-linux-gnu/tls:/usr/lib/x86_64-linux-gnu/x86_6$
> :/usr/lib/x86_64-linux-gnu:/lib/tls/x86_64:/lib/tls:/lib/x86_64:/lib:/usr/lib/tls/x86_64:/usr/lib/tls:/usr/lib/x86_64:/usr/lib          (system search path)
> {code}
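Reading these traces by eye is tedious, so here is a rough sketch that groups the `trying file=` entries of an `LD_DEBUG=libs` trace by the library being searched for. It relies only on the `find library=` and `trying file=` markers visible in the traces above; `tried_paths` is a hypothetical helper name.

```python
import re

_FIND = re.compile(r"find library=(\S+)")
_TRY = re.compile(r"trying file=(\S+)")


def tried_paths(trace):
    """Map each searched library to the list of paths the loader tried."""
    tried = {}
    current = None
    for line in trace.splitlines():
        found = _FIND.search(line)
        if found:
            current = found.group(1)
            tried.setdefault(current, [])
            continue
        attempt = _TRY.search(line)
        if attempt and current is not None:
            tried[current].append(attempt.group(1))
    return tried


# Sample taken from the manylinux1 trace in this report.
TRACE = """\
       463:     file=libz.so.1 [0];  needed by ./libarrow.so.14 [0]
       463:     find library=libz.so.1 [0]; searching
       463:      search path=/tmp/pyarrow-manylinux1/.          (RPATH from file ./libarrow.so.14)
       463:       trying file=/tmp/pyarrow-manylinux1/./libz.so.1
       463:      search cache=/etc/ld.so.cache
       463:       trying file=/lib/x86_64-linux-gnu/libz.so.1
"""

if __name__ == "__main__":
    for library, paths in tried_paths(TRACE).items():
        print(library, paths)
```

In the sample, the loader first tries the RPATH location and then falls back to the system copy, which matches the behavior described in this report.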
> There is neither `libz.so.1` nor `liblz4.so.1`, just `libz-c69b9943.so.1.2.11` and `liblz4-8cb8bdde.so.1.8.3`.
> According to https://www.python.org/dev/peps/pep-0571/ neither `liblz4` nor `libz` is part of the whitelist, and while these are bundled with the wheel, they seemingly cannot be found - perhaps because of the hash in the library name?
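The whitelist check can be expressed as a simple set difference. The sketch below assumes the manylinux2010 whitelist as transcribed here from PEP 571; it should be verified against the PEP before being relied on. `non_whitelisted` is a hypothetical helper name, and the sample input is the list of external libraries that auditwheel reports for the manylinux1 wheel later in this report.

```python
# Assumed transcription of the PEP 571 (manylinux2010) whitelist; verify against the PEP.
MANYLINUX2010_WHITELIST = frozenset({
    "libgcc_s.so.1", "libstdc++.so.6", "libm.so.6", "libdl.so.2",
    "librt.so.1", "libcrypt.so.1", "libc.so.6", "libnsl.so.1",
    "libutil.so.1", "libpthread.so.0", "libresolv.so.2", "libX11.so.6",
    "libXext.so.6", "libXrender.so.1", "libICE.so.6", "libSM.so.6",
    "libGL.so.1", "libgobject-2.0.so.0", "libgthread-2.0.so.0",
    "libglib-2.0.so.0",
})


def non_whitelisted(sonames, whitelist=MANYLINUX2010_WHITELIST):
    """Return the external sonames a compliant wheel may not depend on."""
    return sorted(set(sonames) - whitelist)


# External libraries reported by auditwheel for the manylinux1 wheel.
EXTERNAL = [
    "libc.so.6", "libcrypt.so.1", "libdl.so.2", "libgcc_s.so.1",
    "libm.so.6", "libnsl.so.1", "libpthread.so.0", "librt.so.1",
    "libstdc++.so.6", "libutil.so.1", "libz.so.1",
]

if __name__ == "__main__":
    print(non_whitelisted(EXTERNAL))
```

Under that assumption, only `libz.so.1` remains, which agrees with the library auditwheel asks to eliminate.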
> I've tried to inspect the wheels with `auditwheel show`, using versions `2` and `1.10`; both say the following:
> {code}
> # auditwheel show pyarrow-0.14.0-cp37-cp37m-manylinux2010_x86_64.whl
> pyarrow-0.14.0-cp37-cp37m-manylinux2010_x86_64.whl is consistent with
> the following platform tag: "linux_x86_64".
> The wheel references external versioned symbols in these system-
> provided shared libraries: libgcc_s.so.1 with versions {'GCC_3.3',
> 'GCC_3.4', 'GCC_3.0'}, libpthread.so.0 with versions {'GLIBC_2.3.3',
> 'GLIBC_2.12', 'GLIBC_2.2.5', 'GLIBC_2.3.2'}, libc.so.6 with versions
> {'GLIBC_2.4', 'GLIBC_2.6', 'GLIBC_2.2.5', 'GLIBC_2.7', 'GLIBC_2.3.4',
> 'GLIBC_2.3.2', 'GLIBC_2.3'}, libstdc++.so.6 with versions
> {'CXXABI_1.3', 'GLIBCXX_3.4.10', 'GLIBCXX_3.4.9', 'GLIBCXX_3.4.11',
> 'GLIBCXX_3.4.5', 'GLIBCXX_3.4', 'CXXABI_1.3.2', 'CXXABI_1.3.3'},
> librt.so.1 with versions {'GLIBC_2.2.5'}, libm.so.6 with versions
> {'GLIBC_2.2.5'}, libdl.so.2 with versions {'GLIBC_2.2.5'}, libz.so.1
> with versions {'ZLIB_1.2.0'}
> This constrains the platform tag to "manylinux2010_x86_64". In order
> to achieve a more compatible tag, you would need to recompile a new
> wheel from source on a system with earlier versions of these
> libraries, such as a recent manylinux image.
> {code}
> {code}
> # auditwheel show pyarrow-0.14.0-cp37-cp37m-manylinux1_x86_64.whl
> pyarrow-0.14.0-cp37-cp37m-manylinux1_x86_64.whl is consistent with the
> following platform tag: "linux_x86_64".
> The wheel references external versioned symbols in these system-
> provided shared libraries: libgcc_s.so.1 with versions {'GCC_3.4',
> 'GCC_3.0', 'GCC_3.3'}, libc.so.6 with versions {'GLIBC_2.3',
> 'GLIBC_2.2.5', 'GLIBC_2.3.4', 'GLIBC_2.4', 'GLIBC_2.3.2'},
> libstdc++.so.6 with versions {'CXXABI_1.3', 'GLIBCXX_3.4.5',
> 'GLIBCXX_3.4'}, librt.so.1 with versions {'GLIBC_2.2.5'}, libm.so.6
> with versions {'GLIBC_2.2.5'}, libpthread.so.0 with versions
> {'GLIBC_2.3.3', 'GLIBC_2.3.2', 'GLIBC_2.2.5'}, libdl.so.2 with
> versions {'GLIBC_2.2.5'}, libz.so.1 with versions {'ZLIB_1.2.0'}
> The following external shared libraries are required by the wheel:
> {
>     "libc.so.6": "/lib/x86_64-linux-gnu/libc-2.24.so",
>     "libcrypt.so.1": "/lib/x86_64-linux-gnu/libcrypt-2.24.so",
>     "libdl.so.2": "/lib/x86_64-linux-gnu/libdl-2.24.so",
>     "libgcc_s.so.1": "/lib/x86_64-linux-gnu/libgcc_s.so.1",
>     "libm.so.6": "/lib/x86_64-linux-gnu/libm-2.24.so",
>     "libnsl.so.1": "/lib/x86_64-linux-gnu/libnsl-2.24.so",
>     "libpthread.so.0": "/lib/x86_64-linux-gnu/libpthread-2.24.so",
>     "librt.so.1": "/lib/x86_64-linux-gnu/librt-2.24.so",
>     "libstdc++.so.6": "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22",
>     "libutil.so.1": "/lib/x86_64-linux-gnu/libutil-2.24.so",
>     "libz.so.1": "/lib/x86_64-linux-gnu/libz.so.1.2.8"
> }
> In order to achieve the tag platform tag "manylinux2010_x86_64" the
> following shared library dependencies will need to be eliminated:
> libz.so.1
> In order to achieve the tag platform tag "manylinux1_x86_64" the
> following shared library dependencies will need to be eliminated:
> libz.so.1
> {code}
> I think there is more work left to do on the wheels. IMO the manylinux1 wheels are not compliant because of `libz`, and the manylinux2010 wheels are not compliant because of both `libz` and `liblz4` (but incorrectly reported by auditwheel?).
> We also need to make sure that we run {{auditwheel show}} on the produced wheels in the manylinux-test script: https://github.com/apache/arrow/blob/master/dev/tasks/python-wheels/manylinux-test.sh
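One possible shape for such a check, as a sketch only: compare the platform tag in the wheel's file name with the tag that `auditwheel show` reports the wheel to be consistent with. The helpers below are hypothetical and only parse text; they assume the output wording shown in this report ("is consistent with the following platform tag: ..."), which may differ in other auditwheel versions. The caller would obtain `output` by running `auditwheel show <wheel>` as above.

```python
import re


def tag_from_filename(wheel_filename):
    """Return the platform tag, the last dash-separated field of the wheel name."""
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    return stem.rsplit("-", 1)[-1]


def tag_from_show_output(output):
    """Return the tag auditwheel says the wheel is consistent with, or None."""
    text = " ".join(output.split())  # undo the line wrapping in the output
    match = re.search(
        r'consistent with the following platform tag: "([^"]+)"', text
    )
    return match.group(1) if match else None


def is_consistent(wheel_filename, output):
    """True if the file name tag equals the tag reported by auditwheel."""
    return tag_from_filename(wheel_filename) == tag_from_show_output(output)


# First lines of the manylinux2010 output in this report.
WHEEL = "pyarrow-0.14.0-cp37-cp37m-manylinux2010_x86_64.whl"
OUTPUT = """\
pyarrow-0.14.0-cp37-cp37m-manylinux2010_x86_64.whl is consistent with
the following platform tag: "linux_x86_64".
"""

if __name__ == "__main__":
    print(tag_from_filename(WHEEL), tag_from_show_output(OUTPUT))
    print(is_consistent(WHEEL, OUTPUT))
```

For the sample, the file name claims `manylinux2010_x86_64` while auditwheel reports `linux_x86_64`, so a test script built on this check would fail for that wheel.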



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)