Posted to issues@arrow.apache.org by "Stephen Bias (Jira)" <ji...@apache.org> on 2021/04/26 10:42:00 UTC

[jira] [Created] (ARROW-12539) Unable to read date64 or date32 in specific format

Stephen Bias created ARROW-12539:
------------------------------------

             Summary: Unable to read date64 or date32 in specific format
                 Key: ARROW-12539
                 URL: https://issues.apache.org/jira/browse/ARROW-12539
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 3.0.0
            Reporter: Stephen Bias


When importing CSV data with dates in the format `%d-%b-%y` or `%d-%b-%Y`, the conversion fails with the following error:

`pyarrow.lib.ArrowInvalid: In CSV column #1: CSV conversion error to date64[ms]: invalid value '15-JAN-16'`

 

Example:

```
import pyarrow as pa
from pyarrow import csv

# Two rows; column "b" holds dates as day, abbreviated month name, two-digit year.
data = b"a,b\n1,15-OCT-15\n2,18-JUN-90\n"
tp = ["%d-%b-%y"]

# Attempt 1: request date64 for column "b" and print the conversion error.
try:
    schema_d64 = pa.schema([pa.field("a", pa.int64()), pa.field("b", pa.date64())])
    co_d64 = csv.ConvertOptions(timestamp_parsers=tp, column_types=schema_d64)
    a_d64 = csv.read_csv(pa.py_buffer(data), convert_options=co_d64)
except Exception as e:
    print(e)

# Attempt 2: same input and parser, but request date32 for column "b".
try:
    schema_d32 = pa.schema([pa.field("a", pa.int64()), pa.field("b", pa.date32())])
    co_d32 = csv.ConvertOptions(timestamp_parsers=tp, column_types=schema_d32)
    a_d32 = csv.read_csv(pa.py_buffer(data), convert_options=co_d32)
except Exception as e:
    print(e)
```
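
For reference, the values I would expect from that format string can be sketched with the Python standard library alone. This is only an illustration of the expected result, not how Arrow parses dates: the helper name `parse_date_column` is made up for this example, and it assumes the default C/English locale so that `%b` matches `OCT` and `JUN`.

```python
import csv
import io
from datetime import date, datetime

# Same input and format string as in the report.
data = b"a,b\n1,15-OCT-15\n2,18-JUN-90\n"
fmt = "%d-%b-%y"


def parse_date_column(raw, column, fmt):
    """Hypothetical helper: parse one CSV column into datetime.date values."""
    reader = csv.DictReader(io.StringIO(raw.decode("utf-8")))
    # datetime.strptime matches month abbreviations case-insensitively,
    # and maps two-digit years 69-99 to 19xx and 00-68 to 20xx.
    return [datetime.strptime(row[column], fmt).date() for row in reader]


dates = parse_date_column(data, "b", fmt)
print(dates)
```

As a possible interim workaround, the resulting `datetime.date` objects could presumably be passed to `pa.array(dates, type=pa.date32())` after reading column `b` as strings, though I have not checked how that performs on large files.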



--
This message was sent by Atlassian Jira
(v8.3.4#803005)