Posted to jira@arrow.apache.org by "Alessandro Molina (Jira)" <ji...@apache.org> on 2022/04/06 14:25:00 UTC

[jira] [Updated] (ARROW-13625) [C++][CSV] Timestamp parsing should accept any valid ISO 8601 without requiring custom parse strings

     [ https://issues.apache.org/jira/browse/ARROW-13625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alessandro Molina updated ARROW-13625:
--------------------------------------
    Fix Version/s:     (was: 7.0.0)

> [C++][CSV] Timestamp parsing should accept any valid ISO 8601 without requiring custom parse strings
> ----------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-13625
>                 URL: https://issues.apache.org/jira/browse/ARROW-13625
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Neal Richardson
>            Priority: Major
>
> I was trying to read in some git logs and got this parse error for a column I had declared as timestamp type:
> Error: Invalid: In CSV column #0: CSV conversion error to timestamp[s]: invalid value '2021-08-11T17:39:50-04:00'
> This is valid ISO 8601 and is what git log produces with the {{I}} "strict ISO 8601 format" option (https://git-scm.com/docs/pretty-formats).
> I see it mentioned on ARROW-10343 that timezone indicators are not supported. Is that still true? I recognize that supporting them is not trivial, because a timestamp array must have the same timezone for all values: if rows in this CSV listed different timezones, we would have to handle that (converting everything to UTC is probably the most useful approach, though it technically loses information).
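
To illustrate the UTC trade-off raised in the description, here is a minimal sketch using only the Python standard library. It is not Arrow's parser and says nothing about how Arrow would implement this; the helper name to_utc and the second sample row are hypothetical, added purely for illustration. It shows that two values with different offsets can normalize to the same UTC instant, which is the sense in which the original offset is lost.

```python
from datetime import datetime, timezone


def to_utc(value: str) -> datetime:
    """Parse an ISO 8601 string that carries a UTC offset and return the same instant in UTC.

    Hypothetical helper for illustration only. datetime.fromisoformat accepts
    offsets such as "-04:00"; a trailing "Z" is only accepted on Python 3.11+.
    """
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Without an offset the instant is ambiguous, so refuse to guess.
        raise ValueError(f"no UTC offset in {value!r}")
    return parsed.astimezone(timezone.utc)


# The first value is the one from the error message; the second is an
# invented row with a different offset that denotes the same instant.
rows = ["2021-08-11T17:39:50-04:00", "2021-08-11T23:39:50+02:00"]
normalized = [to_utc(v) for v in rows]

for original, utc in zip(rows, normalized):
    print(original, "->", utc.isoformat())
```

Both rows normalize to the same UTC value, so a single-timezone timestamp column can hold them, but the fact that they were written with different offsets is no longer recoverable from the column alone.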



--
This message was sent by Atlassian Jira
(v8.20.1#820001)