Posted to jira@arrow.apache.org by "David Li (Jira)" <ji...@apache.org> on 2021/03/19 15:41:00 UTC

[jira] [Commented] (ARROW-12011) [Python] Crashes and incorrect results when converting large integers to dates

    [ https://issues.apache.org/jira/browse/ARROW-12011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304974#comment-17304974 ] 

David Li commented on ARROW-12011:
----------------------------------

Thanks for the report! I can confirm this happens on the main branch (commit 43d00e9629fe34dc40c78ea96c008de186726a39).

In all cases, it's because the given date either overflows or is an invalid value for the underlying C++ date type. I'm not sure if we should disallow these values entirely, since the format (as far as I can see) says nothing about the range of valid values, and the underlying value is valid, if extreme - but at least you'd expect it to not crash when printing. I see [~bkietz] and [~jorisvandenbossche] have looked at similar issues before - what do you think?

Here is a trimmed backtrace for the crash. The main issue is that the date::year_month_day value is invalid: in particular, its year is -32768, which is not a valid year.
{noformat}
(gdb) bt
#0  0x00007ffff6e54fb7 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff6e56921 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff40b3892 in __gnu_cxx::__verbose_terminate_handler () at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007ffff40b1f69 in __cxxabiv1::__terminate (handler=<optimized out>) at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
#4  0x00007ffff40b1fab in std::terminate () at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
#5  0x00007ffff40b2194 in __cxxabiv1::__cxa_throw (obj=obj@entry=0x555555de36d0, tinfo=tinfo@entry=0x7ffff416d1a8 <typeinfo for std::__ios_failure>, dest=dest@entry=0x7ffff40d11d4 <std::__ios_failure::~__ios_failure()>)
    at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:95
#6  0x00007ffff40af3a2 in std::__throw_ios_failure (__s=__s@entry=0x7ffff412e067 "basic_ios::clear")
    at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/src/c++11/cxx11-ios_failure.cc:115
#7  0x00007ffff40eb0aa in std::basic_ios<char, std::char_traits<char> >::clear (this=<optimized out>, __state=<optimized out>)
    at /home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/build/build-cc-gcc-final/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/ios_base.h:166
#8  0x00007ffff5289af3 in arrow_vendored::date::to_stream<char, std::char_traits<char>, std::chrono::duration<long, std::ratio<1l, 1l> > > (os=..., fmt=0x7ffff5cf063d "F", fds=..., abbrev=0x7fffffffb170, offset_sec=0x7fffffffb168)
    at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/vendored/datetime/date.h:5078
#9  0x00007ffff527678c in arrow_vendored::date::to_stream<char, std::char_traits<char>, std::chrono::duration<int, std::ratio<86400l, 1l> > > (os=..., fmt=0x7ffff5cf063c "%F", tp=...)
    at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/vendored/datetime/date.h:5995
#10 0x00007ffff52718f4 in arrow_vendored::date::format<char, std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<int, std::ratio<86400l, 1l> > > > (fmt=0x7ffff5cf063c "%F", tp=...)
    at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/vendored/datetime/date.h:6021
#11 0x00007ffff5353770 in arrow::ArrayPrinter::FormatDateTime<std::chrono::duration<int, std::ratio<86400l, 1l> > > (this=0x7fffffffb610, fmt=0x7ffff5cf063c "%F", value=-1448879500, add_epoch=true)
    at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:398
#12 0x00007ffff5350d86 in std::enable_if<std::is_base_of<arrow::DateType, arrow::NumericArray<arrow::Date32Type>::TypeClass>::value, arrow::Status>::type arrow::ArrayPrinter::WriteDataValues<arrow::NumericArray<arrow::Date32Type> >(arrow::NumericArray<arrow::Date32Type> const&)::{lambda(long)#1}::operator()(long) const (this=0x7fffffffb610, i=0) at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:170
#13 0x00007ffff535395b in arrow::ArrayPrinter::WriteValues<std::enable_if<std::is_base_of<arrow::DateType, arrow::NumericArray<arrow::Date32Type>::TypeClass>::value, arrow::Status>::type arrow::ArrayPrinter::WriteDataValues<arrow::NumericArray<arrow::Date32Type> >(arrow::NumericArray<arrow::Date32Type> const&)::{lambda(long)#1}>(arrow::Array const&, std::enable_if<std::is_base_of<arrow::DateType, arrow::NumericArray<arrow::Date32Type>::TypeClass>::value, arrow::Status>::type arrow::ArrayPrinter::WriteDataValues<arrow::NumericArray<arrow::Date32Type> >(arrow::NumericArray<arrow::Date32Type> const&)::{lambda(long)#1}&&) (this=0x7fffffffb610, array=..., func=...)
    at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:137
#14 0x00007ffff5350dd5 in arrow::ArrayPrinter::WriteDataValues<arrow::NumericArray<arrow::Date32Type> > (this=0x7fffffffb610, array=...) at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:170
#15 0x00007ffff534ee4f in arrow::ArrayPrinter::Visit<arrow::NumericArray<arrow::Date32Type> > (this=0x7fffffffb610, array=...) at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:314
#16 0x00007ffff534ccd1 in arrow::VisitArrayInline<arrow::ArrayPrinter> (array=..., visitor=0x7fffffffb610) at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/visitor_inline.h:126
#17 0x00007ffff534b352 in arrow::ArrayPrinter::Print (this=0x7fffffffb610, array=...) at /home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:389
 {noformat}
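
To illustrate how the year can end up as -32768, here is a rough pure-Python sketch, not Arrow's actual code. It assumes the vendored date library converts the day count to a civil date with the usual proleptic Gregorian algorithm, stores the year in a 16-bit signed integer, and treats only the minimum 16-bit value as an invalid year. The helper names (civil_from_days, wrap_int16, emulate_date32_print) are made up for this example. Under those assumptions it reproduces the wrapped dates listed in the report, and the crashing input lands on the one year value that is considered invalid.

```python
def civil_from_days(days):
    """Proleptic Gregorian (year, month, day) for a day count since 1970-01-01."""
    z = days + 719468                      # shift the epoch to 0000-03-01
    era, doe = divmod(z, 146097)           # 400-year era, day within that era
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)
    mp = (5 * doy + 2) // 153              # month index with March = 0
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    year = yoe + era * 400 + (1 if month <= 2 else 0)
    return year, month, day


def wrap_int16(value):
    """Emulate truncating an integer to a 16-bit signed value."""
    return (value + 2**15) % 2**16 - 2**15


def emulate_date32_print(days):
    """Return the date string this model predicts, or None for an invalid year."""
    year, month, day = civil_from_days(days)
    stored = wrap_int16(year)              # assumption: year is kept as int16
    if stored == -2**15:
        # Assumption: this is the only year value rejected as invalid,
        # which is where formatting fails in the backtrace above.
        return None
    return f"{stored:04d}-{month:02d}-{day:02d}"


for days in (-2000000000, -1000000000, 2000000000, 1000000000, -1448879500):
    print(days, "->", civil_from_days(days), "->", emulate_date32_print(days))
```

In this model the true year for -1448879500 is -3964928, which wraps to exactly -32768, so the value cannot be formatted. The other inputs wrap to a valid but wrong year, which would explain why they print an incorrect date instead of crashing.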

> [Python] Crashes and incorrect results when converting large integers to dates
> ------------------------------------------------------------------------------
>
>                 Key: ARROW-12011
>                 URL: https://issues.apache.org/jira/browse/ARROW-12011
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 3.0.0
>         Environment: OS: Windows 10 Pro (Version 20H2)
> CPU: AMD Ryzen 5 1600 Six-Core Processor 3.20 GHz
> Python: 3.8.8 AMD64
> pyarrow is latest version installed with pip
>            Reporter: Tim Evans
>            Priority: Major
>
> Running this code snippet will cause a crash. This happens for a range of numbers around this one as well:
>  
> {code:python}
> import pyarrow
> date = pyarrow.array([-1448879500], pyarrow.date32())
> print(date)
> {code}
> I don't know where this crash is coming from, so it might be in the C++ code rather than the Python bindings.
> For other extreme numbers you get the wrong result. It looks like something is overflowing. Here is the input and result for a few different examples:
>  * -2000000000 -> 31179-12-27
>  * -1000000000 -> 16574-12-29
>  * 2000000000 -> -27240-01-06
>  * 1000000000 -> -12635-01-03
> I would prefer if these gave errors rather than silently overflowing.
>  


