Posted to jira@arrow.apache.org by "David Li (Jira)" <ji...@apache.org> on 2021/09/27 13:50:00 UTC

[jira] [Commented] (ARROW-14129) [C++] An empty dictionary array crashes on `unique` and `value_counts`.

    [ https://issues.apache.org/jira/browse/ARROW-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420752#comment-17420752 ] 

David Li commented on ARROW-14129:
----------------------------------

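I can confirm this occurs on master. Part of the traceback is below; the abort is raised from the {{DictionaryArray}} constructor, reached via {{MakeArray}} while the Python bindings wrap the result datum: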
{noformat}
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff6e56921 in __GI_abort () at abort.c:79
#2  0x00007ffff467d7ae in arrow::util::CerrLog::~CerrLog (this=0x555555ddf980)
    at /home/lidavidm/Code/upstream/arrow-14129/cpp/src/arrow/util/logging.cc:72
#3  0x00007ffff467d7e9 in arrow::util::CerrLog::~CerrLog (this=0x555555ddf980)
    at /home/lidavidm/Code/upstream/arrow-14129/cpp/src/arrow/util/logging.cc:66
#4  0x00007ffff467d400 in arrow::util::ArrowLog::~ArrowLog (this=0x7fffffffac80)
    at /home/lidavidm/Code/upstream/arrow-14129/cpp/src/arrow/util/logging.cc:250
#5  0x00007ffff3bed3c1 in arrow::DictionaryArray::DictionaryArray (this=0x555555f088f0, data=...)
    at /home/lidavidm/Code/upstream/arrow-14129/cpp/src/arrow/array/array_dict.cc:83
#6  0x00007ffff3d3b561 in __gnu_cxx::new_allocator<arrow::DictionaryArray>::construct<arrow::DictionaryArray, std::shared_ptr<arrow::ArrayData> const&> (this=0x7fffffffadf0, __p=0x555555f088f0, __args=...)
    at /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/ext/new_allocator.h:136
#7  0x00007ffff3d3b51d in std::allocator_traits<std::allocator<arrow::DictionaryArray> >::construct<arrow::DictionaryArray, std::shared_ptr<arrow::ArrayData> const&> (__a=..., __p=0x555555f088f0, __args=...)
    at /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/alloc_traits.h:475
#8  0x00007ffff3d3b4a1 in std::_Sp_counted_ptr_inplace<arrow::DictionaryArray, std::allocator<arrow::DictionaryArray>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<std::shared_ptr<arrow::ArrayData> const&> (this=0x555555f088e0, 
    __a=..., __args=...) at /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:526
#9  0x00007ffff3d3b370 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<arrow::DictionaryArray, std::allocator<arrow::DictionaryArray>, std::shared_ptr<arrow::ArrayData> const&> (this=0x7fffffffafc8, __a=..., __args=...)
    at /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:637
#10 0x00007ffff3d3b286 in std::__shared_ptr<arrow::DictionaryArray, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<arrow::DictionaryArray>, std::shared_ptr<arrow::ArrayData> const&> (this=0x7fffffffafc0, __tag=..., __a=..., 
    __args=...) at /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr_base.h:1294
#11 0x00007ffff3d3b21d in std::shared_ptr<arrow::DictionaryArray>::shared_ptr<std::allocator<arrow::DictionaryArray>, std::shared_ptr<arrow::ArrayData> const&> (this=0x7fffffffafc0, __tag=..., __a=..., __args=...)
    at /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr.h:344
#12 0x00007ffff3d3b1cf in std::allocate_shared<arrow::DictionaryArray, std::allocator<arrow::DictionaryArray>, std::shared_ptr<arrow::ArrayData> const&> (__a=..., __args=...)
    at /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr.h:690
#13 0x00007ffff3d3b150 in std::make_shared<arrow::DictionaryArray, std::shared_ptr<arrow::ArrayData> const&> (__args=...)
    at /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/shared_ptr.h:706
#14 0x00007ffff3d19e82 in arrow::(anonymous namespace)::ArrayDataWrapper::Visit<arrow::DictionaryType> (
    this=0x7fffffffb250) at /home/lidavidm/Code/upstream/arrow-14129/cpp/src/arrow/array/util.cc:66
#15 0x00007ffff3d09eed in arrow::VisitTypeInline<arrow::(anonymous namespace)::ArrayDataWrapper> (type=..., 
    visitor=0x7fffffffb250) at /home/lidavidm/Code/upstream/arrow-14129/cpp/src/arrow/visitor_inline.h:90
#16 0x00007ffff3d094d8 in arrow::MakeArray (data=...)
    at /home/lidavidm/Code/upstream/arrow-14129/cpp/src/arrow/array/util.cc:311
#17 0x00007ffff645fcae in __pyx_f_7pyarrow_3lib_wrap_datum (__pyx_v_datum=...)
    at /home/lidavidm/Code/upstream/arrow-14129/python/build/temp.linux-x86_64-3.9/lib.cpp:68653
#18 0x00007fff6a7927cc in __pyx_pf_7pyarrow_8_compute_8Function_6call (__pyx_v_self=0x7ffff67a34f0, 
    __pyx_v_args=0x7fff6a9b14c0, __pyx_v_options=0x555555914c80 <_Py_NoneStruct>, 
    __pyx_v_memory_pool=0x555555914c80 <_Py_NoneStruct>)
    at /home/lidavidm/Code/upstream/arrow-14129/python/build/temp.linux-x86_64-3.9/_compute.cpp:10347
#19 0x00007fff6a792473 in __pyx_pw_7pyarrow_8_compute_8Function_7call (__pyx_v_self=0x7ffff67a34f0, 
    __pyx_args=0x7ffff7f072b0, __pyx_kwds=0x7ffff7f03240)
    at /home/lidavidm/Code/upstream/arrow-14129/python/build/temp.linux-x86_64-3.9/_compute.cpp:10182
#20 0x00005555556b3f85 in cfunction_call ()
    at /home/conda/feedstock_root/build_artifacts/python-split_1625973859697/work/Objects/methodobject.c:539
#21 0x00007fff6a7eccc5 in __Pyx_PyObject_Call (func=0x7ffff255df40, arg=0x7ffff7f072b0, kw=0x7ffff7f03240)
    at /home/lidavidm/Code/upstream/arrow-14129/python/build/temp.linux-x86_64-3.9/_compute.cpp:44364
#22 0x00007fff6a796f0c in __pyx_pf_7pyarrow_8_compute_6call_function (__pyx_self=0x0, __pyx_v_name=0x7ffff683b8b0, 
    __pyx_v_args=0x7fff6a9b14c0, __pyx_v_options=0x555555914c80 <_Py_NoneStruct>, 
    __pyx_v_memory_pool=0x555555914c80 <_Py_NoneStruct>)
    at /home/lidavidm/Code/upstream/arrow-14129/python/build/temp.linux-x86_64-3.9/_compute.cpp:12425
#23 0x00007fff6a796ad0 in __pyx_pw_7pyarrow_8_compute_7call_function (__pyx_self=0x0, __pyx_args=0x7fff6a8b8f00, 
    __pyx_kwds=0x0) at /home/lidavidm/Code/upstream/arrow-14129/python/build/temp.linux-x86_64-3.9/_compute.cpp:12359
{noformat}

> [C++] An empty dictionary array crashes on `unique` and `value_counts`.
> -----------------------------------------------------------------------
>
>                 Key: ARROW-14129
>                 URL: https://issues.apache.org/jira/browse/ARROW-14129
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 5.0.0
>            Reporter: A. Coady
>            Assignee: David Li
>            Priority: Critical
>
> {code:python}
> import pyarrow as pa
> arr = pa.array(range(3)).dictionary_encode()
> assert not arr[:0]
> arr[:0].unique() # Check failed: (data->dictionary) != (nullptr) 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)