Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2021/04/28 13:56:00 UTC

[jira] [Commented] (ARROW-12539) [C++] Unable to read date64 or date32 in specific format

    [ https://issues.apache.org/jira/browse/ARROW-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17334749#comment-17334749 ] 

Joris Van den Bossche commented on ARROW-12539:
-----------------------------------------------

Relatedly, it would also be good (and is probably a prerequisite for supporting this in the CSV reader) to be able to cast strings to date types. This cast currently works for timestamp, but not for date:

{code}
In [3]: pa.array(["2012-01-01"]).cast(pa.timestamp('ms'))
Out[3]: 
<pyarrow.lib.TimestampArray object at 0x7fae22d778e0>
[
  2012-01-01 00:00:00.000
]

In [4]: pa.array(["2012-01-01"]).cast(pa.date32())
...
ArrowNotImplementedError: Unsupported cast from string to date32 using function cast_date32
{code}

> [C++] Unable to read date64 or date32 in specific format
> --------------------------------------------------------
>
>                 Key: ARROW-12539
>                 URL: https://issues.apache.org/jira/browse/ARROW-12539
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 3.0.0
>            Reporter: Stephen Bias
>            Priority: Major
>              Labels: csv, date
>
> When importing CSV data with dates in the format {{"%d-%b-%y"}} or {{"%d-%b-%Y"}}, the conversion fails with an error.
> Example:
> {code:python}
> import pyarrow as pa
> from pyarrow import csv 
> data = b"a,b\n1,15-OCT-15\n2,18-JUN-90\n"
> tp = ["%d-%b-%y"]
> try:
>     schema_d64 = pa.schema([pa.field("a", pa.int64()), pa.field("b", pa.date64())])
>     co_d64 = csv.ConvertOptions(timestamp_parsers=tp, column_types=schema_d64)
>     a_d64 = csv.read_csv(pa.py_buffer(data), convert_options=co_d64)
> except Exception as e:
>     print(e)
> try:
>     schema_d32 = pa.schema([pa.field("a", pa.int64()), pa.field("b", pa.date32())])
>     co_d32 = csv.ConvertOptions(timestamp_parsers=tp, column_types=schema_d32)
>     a_d32 = csv.read_csv(pa.py_buffer(data), convert_options=co_d32)
> except Exception as e:
>     print(e)
> {code}
>  


