Posted to issues@arrow.apache.org by "clamydo (via GitHub)" <gi...@apache.org> on 2023/05/01 11:32:20 UTC

[GitHub] [arrow] clamydo opened a new issue, #35381: self-compiled libarrow.a contains multiple undefined symbol

clamydo opened a new issue, #35381:
URL: https://github.com/apache/arrow/issues/35381

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I am compiling Arrow 11 myself as a static library using a CMake script (see below). It produces a `libarrow.a` that contains many undefined symbols in the `arrow::` namespace when inspected with `nm`. Some symbols are reported as undefined **and** defined multiple times. Example output:
   ```bash
    nm -C libarrow.a | grep "arrow::float32"
                    U arrow::float32()
   0000000000000280 b guard variable for arrow::float32()::result
   00000000000115a0 T arrow::float32()
   00000000000016d9 t arrow::float32() [clone .cold]
   00000000000115a0 t arrow::float32() [clone .localalias]
   0000000000000290 b arrow::float32()::result
                    U arrow::float32()
                    U arrow::float32()
                    U arrow::float32()
                    U arrow::float32()
                    U arrow::float32()
                    U arrow::float32()
                    U arrow::float32()
                    U arrow::float32()
                    U arrow::float32()
   ```
   
   How can this be? I am not sure whether this is a bug or whether I am doing something wrong.
   
   Notably, I am compiling with the cxx11 ABI.
   
   `CMakeLists.txt`
   ```cmake
   cmake_minimum_required(VERSION 3.16)
   include(ExternalProject)
   
   
   ExternalProject_Add(Arrow
       DOWNLOAD_EXTRACT_TIMESTAMP TRUE
       URL https://github.com/apache/arrow/archive/refs/tags/apache-arrow-11.0.0.tar.gz
       URL_HASH SHA1=978e2ae160f8f1ddb680619e2de13d2e78b71467
       SOURCE_SUBDIR cpp
       CMAKE_ARGS
           -DCMAKE_BUILD_TYPE=Release
           -DCMAKE_INSTALL_LIBDIR=lib
           -DCMAKE_INSTALL_PREFIX=./install
           -DCMAKE_POSITION_INDEPENDENT_CODE=ON
           -DCMAKE_CXX_FLAGS='-D_GLIBCXX_USE_CXX11_ABI=1'
           -DARROW_BUILD_STATIC=ON
           -DARROW_FILESYSTEM=ON
           -DARROW_S3=ON
           -DARROW_DATASET=ON
           -DARROW_PARQUET=ON
           -DARROW_JSON=ON
           -DARROW_CSV=ON
           -DARROW_IPC=ON
           -DARROW_WITH_BZ2=ON
           -DARROW_WITH_ZLIB=ON
           -DARROW_WITH_ZSTD=ON
           -DARROW_WITH_LZ4=ON
           -DARROW_WITH_BROTLI=ON
           -DARROW_WITH_SNAPPY=ON
           -DARROW_COMPUTE=ON
   )
   ```
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] clamydo commented on issue #35381: [C++] self-compiled libarrow.a contains multiple undefined symbol

Posted by "clamydo (via GitHub)" <gi...@apache.org>.
clamydo commented on issue #35381:
URL: https://github.com/apache/arrow/issues/35381#issuecomment-1530949559

   Sure. I have a bigger project that requires the cxx11 ABI, so I am compiling Arrow myself and linking arrow and pyarrow-cpp statically. But in the big project this leads to hard-to-debug segmentation faults when it is used from Python together with vanilla pyarrow.
   
   Hence, I am writing an MVP C++ Python extension to get an attack vector. With that MVP I get an undefined symbol error (pointing to `arrow::float32()`) when importing the Python module.
   
   ```
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ImportError: test.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN5arrow7float32Ev
   ```
   
   ```python
   from setuptools import Extension, setup
   
   setup(
       ext_modules=[
           Extension(
               name="test", 
               include_dirs = [
                   'include',
               ],
               extra_compile_args=['-std=c++17'],
               extra_link_args=[
                   '-l:libarrow_bundled_dependencies.a',
                   '-l:libarrow.a',
                   '-l:libarrow_python.a',
               ],
               library_dirs = [
                   'lib',
               ],
               sources=["test.cpp"], 
           ),
       ]
   )
   ```
   
   ```c++
   #define PY_SSIZE_T_CLEAN
   #include <Python.h>
   #include <arrow/api.h>
   #include <arrow/python/pyarrow.h>
   #include <stdio.h>
   
   static auto load_arrow_table(PyObject *pyobj, PyObject *args) -> PyObject * {
       auto tbl = arrow::py::unwrap_table(pyobj);
       Py_INCREF(Py_None);
       return Py_None;
   }
   
   static PyMethodDef TestMethods[] = {
       {"load_arrow_table", load_arrow_table, METH_VARARGS, "Some description"},
       {NULL, NULL, 0, NULL} /* Sentinel */
   };
   
   static struct PyModuleDef testmodule = {
       PyModuleDef_HEAD_INIT, "test", /* name of module */
       NULL,                          /* module documentation, may be NULL */
       -1,                            /* size of per-interpreter state of the module,
                                         or -1 if the module keeps state in global variables. */
       TestMethods};
   
   PyMODINIT_FUNC PyInit_test(void) { return PyModule_Create(&testmodule); }
   
   auto main(int argc, char *argv[]) -> int {
       // must be called to initialize pyarrow
       arrow::py::import_pyarrow();
   
       wchar_t *program = Py_DecodeLocale(argv[0], NULL);
       if (program == NULL) {
           fprintf(stderr, "Fatal error: cannot decode argv[0]\n");
           exit(1);
       }
   
       /* Add a built-in module, before Py_Initialize */
       if (PyImport_AppendInittab("test", PyInit_test) == -1) {
           fprintf(stderr, "Error: could not extend in-built modules table\n");
           exit(1);
       }
   
       /* Pass argv[0] to the Python interpreter */
       Py_SetProgramName(program);
   
       /* Initialize the Python interpreter.  Required.
          If this step fails, it will be a fatal error. */
       Py_Initialize();
   
       /* Optionally import the module; alternatively,
          import can be deferred until the embedded script
          imports it. */
       PyObject *pmodule = PyImport_ImportModule("test");
       if (!pmodule) {
           PyErr_Print();
           fprintf(stderr, "Error: could not import module 'test'\n");
       }
   
       PyMem_RawFree(program);
       return 0;
   }
   ```




[GitHub] [arrow] kou closed issue #35381: [C++] self-compiled libarrow.a contains multiple undefined symbol

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou closed issue #35381: [C++] self-compiled libarrow.a contains multiple undefined symbol
URL: https://github.com/apache/arrow/issues/35381




[GitHub] [arrow] kou commented on issue #35381: [C++] self-compiled libarrow.a contains multiple undefined symbol

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on issue #35381:
URL: https://github.com/apache/arrow/issues/35381#issuecomment-1530965228

   Thanks.
   
   Could you try the following order?
   
   ```python
               extra_link_args=[
                   '-l:libarrow_python.a',
                   '-l:libarrow.a',
                   '-l:libarrow_bundled_dependencies.a',
               ],
   ```




[GitHub] [arrow] kou commented on issue #35381: [C++] self-compiled libarrow.a contains multiple undefined symbol

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on issue #35381:
URL: https://github.com/apache/arrow/issues/35381#issuecomment-1531001524

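   In C++, link order matters when linking static libraries.
   A traditional linker such as GNU ld scans archives from left to right and only pulls in the members that resolve symbols still undefined at that point, so each library must come before the libraries it depends on.
   In this case, `libarrow_python.a` depends on `libarrow.a`, and `libarrow.a` depends on `libarrow_bundled_dependencies.a`.
   So we must use the order `libarrow_python.a`, `libarrow.a`, `libarrow_bundled_dependencies.a`.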
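
The effect of link order can be illustrated with a small model of a single left-to-right pass over static archives. This is only a simplified sketch, not how any real linker is implemented: the function `unresolved_after_link`, the one-member-per-archive layout, and the symbol `bundled_dep_symbol` are hypothetical stand-ins chosen for illustration. Only the three library names and the two `arrow::` symbols come from the thread.

```python
def unresolved_after_link(object_undefined, archives):
    """Model one left-to-right pass over static archives.

    object_undefined: symbols the extension's own object file still needs.
    archives: list of archives in command-line order; each archive is a list
              of members given as (defined_symbols, undefined_symbols) sets.
    Returns the symbols that are still undefined after the pass.
    """
    undefined = set(object_undefined)
    defined = set()
    for members in archives:
        pending = list(members)
        progress = True
        # Rescan the current archive while pulling a member adds new needs.
        while progress:
            progress = False
            still_pending = []
            for member_defined, member_undefined in pending:
                if undefined & member_defined:
                    # This member satisfies a current need, so it is pulled in.
                    defined |= member_defined
                    undefined |= member_undefined
                    undefined -= defined
                    progress = True
                else:
                    still_pending.append((member_defined, member_undefined))
            pending = still_pending
        # Members not pulled by now are never revisited for this archive.
    return undefined


# Hypothetical, heavily reduced symbol tables (one member per archive).
arrow_python = [({"arrow::py::unwrap_table"}, {"arrow::float32"})]
arrow = [({"arrow::float32"}, {"bundled_dep_symbol"})]
bundled = [({"bundled_dep_symbol"}, set())]

needs = {"arrow::py::unwrap_table"}  # what the extension object references

# Order from the original setup.py: dependencies first.
print(sorted(unresolved_after_link(needs, [bundled, arrow, arrow_python])))
# ['arrow::float32']

# Suggested order: dependents first, dependencies last.
print(sorted(unresolved_after_link(needs, [arrow_python, arrow, bundled])))
# []
```

In this model the original order leaves `arrow::float32` unresolved, because `libarrow.a` is scanned before anything has asked for that symbol. When building a shared object, an unresolved symbol is typically not an error at link time, which is consistent with the failure only showing up as an `ImportError` when the module is loaded.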




[GitHub] [arrow] kou commented on issue #35381: [C++] self-compiled libarrow.a contains multiple undefined symbol

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on issue #35381:
URL: https://github.com/apache/arrow/issues/35381#issuecomment-1530259845

   Could you share what problem you faced too?
   Did you get a link error with the `libarrow.a`?




[GitHub] [arrow] clamydo commented on issue #35381: [C++] self-compiled libarrow.a contains multiple undefined symbol

Posted by "clamydo (via GitHub)" <gi...@apache.org>.
clamydo commented on issue #35381:
URL: https://github.com/apache/arrow/issues/35381#issuecomment-1530976623

   Cool, the python import stopped complaining now! Thank you!
   
   Can you please elaborate?

