Posted to user@arrow.apache.org by Marc Allen <al...@gmail.com> on 2020/04/29 15:49:40 UTC

import_pyarrow() in c++ test code, segfault

Hi all,

I am debugging a segfault in C++ code that populates an Arrow table with
the intent of returning the table to Python. This is using arrow-0.15.1. I
wrote a unit test to try to isolate the issue, and I'm seeing that a
segfault occurs whenever I call "arrow::py::import_pyarrow()". A short
example reproduces it in my environment when I compile this and execute it
from the command line:

#include <arrow/python/pyarrow.h>
int main() {
    arrow::py::import_pyarrow();
    return 0;
}

The backtrace looks like this:

#0  0x00007ffff3fe87ff in PyList_New (size=size@entry=0) at Objects/listobject.c:179
#1  0x00007ffff40b4207 in PyImport_Import (module_name=module_name@entry=0x7ffff7f74030) at Python/import.c:1910
#2  0x00007ffff40b432c in PyImport_ImportModule (name=name@entry=0x7ffff7f73030 "datetime") at Python/import.c:1389
#3  0x00007ffff4011118 in PyCapsule_Import (name=0x7ffff6abdb60 "datetime.datetime_CAPI", no_block=0) at Objects/capsule.c:220
#4  0x00007ffff6a2f4cc in arrow::py::internal::InitDatetime() () from libarrow_python.so.15
#5  0x00007ffff6aabeaa in arrow::py::import_pyarrow() () from libarrow_python.so.15
#6  0x0000000000401f8b in main () at src/lib/data/data_main.cc:11

However, I have another similar "hello world" that works, but only if I use
it from within a Python context, i.e. loading and executing the function
from a Python REPL. This leads me to believe that it may not be possible to
call import_pyarrow() in a pure C++ process that wasn't started from
Python. Is that correct? And is there a way to check whether I'm in a
supported environment for using pyarrow, so that I can handle this
situation more gracefully in my code?

-Marc

Re: import_pyarrow() in c++ test code, segfault

Posted by Wes McKinney <we...@gmail.com>.
Hi Marc,

You haven't initialized the Python interpreter -- you need to do that
before you call import_pyarrow. You may also need to hold the GIL when
calling the function, but I'm not sure.

- Wes
