Posted to jira@arrow.apache.org by "Krisztian Szucs (Jira)" <ji...@apache.org> on 2020/10/06 20:18:00 UTC

[jira] [Resolved] (ARROW-10192) [C++][Python] Segfault when converting nested struct array with dictionary field to pandas series

     [ https://issues.apache.org/jira/browse/ARROW-10192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krisztian Szucs resolved ARROW-10192.
-------------------------------------
    Resolution: Fixed

Issue resolved by pull request 8361
[https://github.com/apache/arrow/pull/8361]

> [C++][Python] Segfault when converting nested struct array with dictionary field to pandas series
> -------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-10192
>                 URL: https://issues.apache.org/jira/browse/ARROW-10192
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>            Reporter: Krisztian Szucs
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Reproducer:
> {code:python}
> import pyarrow as pa
>
>
> def test_struct_array_with_dictionary_field_to_pandas():
>     ty = pa.struct([
>         pa.field('dict', pa.dictionary(pa.int64(), pa.int32())),
>     ])
>     data = [
>         {'dict': -1859762450}
>     ]
>     arr = pa.array(data, type=ty)
>     arr.to_pandas()
> {code}
> Raises SIGSTOP:
> {code}
> * thread #1, stop reason = signal SIGSTOP
>   * frame #0: 0x00007fff6e2b733a libsystem_kernel.dylib`__pthread_kill + 10
>     frame #1: 0x00007fff6e373e60 libsystem_pthread.dylib`pthread_kill + 430
>     frame #2: 0x00007fff6e1ce93e libsystem_c.dylib`raise + 26
>     frame #3: 0x00007fff6e3685fd libsystem_platform.dylib`_sigtramp + 29
>     frame #4: 0x000000011517adfd libarrow_python.200.0.0.dylib`arrow::py::ConvertStruct(options=0x00007f84fc5a0230, data=0x00007f84fc59ef18, out_values=0x00007f84fc53d140) at arrow_to_pandas.cc:685:54
>     frame #5: 0x000000011514c642 libarrow_python.200.0.0.dylib`arrow::py::ObjectWriterVisitor::Visit(this=0x00007ffee06a1a88, type=0x00007f84fc5a00e8) at arrow_to_pandas.cc:1031:12
>     frame #6: 0x00000001151499c4 libarrow_python.200.0.0.dylib`arrow::Status arrow::VisitTypeInline<arrow::py::ObjectWriterVisitor>(type=0x00007f84fc5a00e8, visitor=0x00007ffee06a1a88) at visitor_inline.h:88:5
>     frame #7: 0x0000000115149305 libarrow_python.200.0.0.dylib`arrow::py::ObjectWriter::CopyInto(this=0x00007f84fc5a0228, data=std::__1::shared_ptr<arrow::ChunkedArray>::element_type @ 0x00007f84fc59ef18 strong=2 weak=1, rel_placement=0) at arrow_to_pandas.cc:1055:12
> {code}
> {code:cpp}
> frame #4: 0x000000011517adfd libarrow_python.200.0.0.dylib`arrow::py::ConvertStruct(options=0x00007f84fc5a0230, data=0x00007f84fc59ef18, out_values=0x00007f84fc53d140) at arrow_to_pandas.cc:685:54
>    682            if (!arr->field(static_cast<int>(field_idx))->IsNull(i)) {
>    683              // Value exists in child array, obtain it
>    684              auto array = reinterpret_cast<PyArrayObject*>(fields_data[field_idx].obj());
> -> 685              auto ptr = reinterpret_cast<const char*>(PyArray_GETPTR1(array, i));
>    686              field_value.reset(PyArray_GETITEM(array, ptr));
>    687              RETURN_IF_PYERROR();
>    688            } else {
> {code}
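
For readers unfamiliar with the type used in the reproducer: pa.dictionary(pa.int64(), pa.int32()) declares a dictionary-encoded field, that is, int64 indices pointing into a dictionary of int32 values. The sketch below illustrates that layout in plain Python only. It is not pyarrow's implementation and says nothing about the cause of the crash; the helper names dictionary_encode and dictionary_decode are hypothetical and exist only for this illustration.

```python
from typing import List, Optional, Sequence, Tuple


def dictionary_encode(
    values: Sequence[Optional[int]],
) -> Tuple[List[Optional[int]], List[int]]:
    """Split values into (indices, dictionary); None marks a null slot."""
    dictionary: List[int] = []   # unique values, in order of first appearance
    positions = {}               # value -> its position in `dictionary`
    indices: List[Optional[int]] = []
    for value in values:
        if value is None:
            indices.append(None)
            continue
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


def dictionary_decode(
    indices: Sequence[Optional[int]], dictionary: Sequence[int]
) -> List[Optional[int]]:
    """Rebuild the plain values by looking each index up in the dictionary."""
    return [None if i is None else dictionary[i] for i in indices]


# The single value from the reproducer becomes one dictionary entry
# referenced by one index.
indices, dictionary = dictionary_encode([-1859762450])
decoded = dictionary_decode(indices, dictionary)
```

A consumer that expects one plain value per row has to perform a lookup like dictionary_decode before reading the field; the indices on their own are not the values.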


