Posted to jira@arrow.apache.org by "Gary (Jira)" <ji...@apache.org> on 2020/09/03 14:06:00 UTC

[jira] [Created] (ARROW-9907) [Python] Failed to parse string into timestamp

Gary created ARROW-9907:
---------------------------

             Summary: [Python] Failed to parse string into timestamp
                 Key: ARROW-9907
                 URL: https://issues.apache.org/jira/browse/ARROW-9907
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Gary


Hi,

I am not sure whether I am missing something, but I cannot get pyarrow to parse my datetime values as timestamps; they are inferred as strings instead.

My strings are arriving in CSVs with this format: '2015-01-09 00:00:00.000'

I have tried:
```
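from pyarrow import csv
# Ask the CSV reader to try this strptime-style format when inferring timestamps
convert_opts = csv.ConvertOptions(timestamp_parsers=['%Y-%m-%d %H:%M:%S.%f'])
df = csv.read_csv('path_to_csv', convert_options=convert_opts)
print(df.schema)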
```


This makes no difference: the columns containing these timestamps still show up as strings in the schema.
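As a sanity check on the format string itself, here is a minimal sketch using only Python's standard library. It assumes nothing about pyarrow's own parser, which may handle `%f` differently from `datetime.strptime`; it only shows that the directives line up with the sample value quoted above.

```python
from datetime import datetime

# Sample value and format string taken from the report above
sample = '2015-01-09 00:00:00.000'
fmt = '%Y-%m-%d %H:%M:%S.%f'

# Python's strptime accepts 1 to 6 fractional digits for %f
parsed = datetime.strptime(sample, fmt)
print(parsed.isoformat())
```

If this parses cleanly, the format string is at least consistent with the data, so the problem is more likely in how the CSV reader applies it.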

I have also tried casting:
```
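import pyarrow as pa
from pyarrow import csv

# Target schema: treat 'date_column' as a millisecond timestamp
dfschema = pa.schema([
    ('date_column', pa.timestamp('ms'))
])
df = csv.read_csv('path_to_csv')
df.cast(target_schema=dfschema)  # this call raises the error quoted below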
```

This raises the error: "pyarrow.lib.ArrowInvalid: Failed to parse string: 2015-01-09 00:00:00.000"

I am using pyarrow 1.0.1 in a Linux Docker container.

I tried to email the user mailing list, but the message keeps bouncing with a Mailer Daemon error.


