Posted to jira@arrow.apache.org by "Jonathan Keane (Jira)" <ji...@apache.org> on 2022/03/02 14:03:00 UTC

[jira] [Resolved] (ARROW-15599) [R] Convert a column as a sub-second timestamp from CSV file with the `T` col type option

     [ https://issues.apache.org/jira/browse/ARROW-15599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Keane resolved ARROW-15599.
------------------------------------
    Fix Version/s: 8.0.0
       Resolution: Fixed

Issue resolved by pull request 12474
[https://github.com/apache/arrow/pull/12474]

> [R] Convert a column as a sub-second timestamp from CSV file with the `T` col type option
> -----------------------------------------------------------------------------------------
>
>                 Key: ARROW-15599
>                 URL: https://issues.apache.org/jira/browse/ARROW-15599
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 6.0.1
>         Environment: R version 4.1.2 (2021-11-01)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 20.04.3 LTS
>            Reporter: SHIMA Tatsuya
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 8.0.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I tried to read a CSV column as a timestamp with sub-second precision, but it only worked when `col_types` was left unspecified or set to `?`; with `col_types = "T"` the read fails.
> I'm sorry if I missed something and this is the expected behavior. (It would be great if you could add an example with `col_types` to the documentation.)
> {code:r}
> library(arrow)
> #>
> #> Attaching package: 'arrow'
> #> The following object is masked from 'package:utils':
> #>
> #>     timestamp
> t_string <- tibble::tibble(
>   x = "2018-10-07 19:04:05.005"
> )
> write_csv_arrow(t_string, "tmp.csv")
> read_csv_arrow(
>   "tmp.csv",
>   as_data_frame = FALSE
> )
> #> Table
> #> 1 rows x 1 columns
> #> $x <timestamp[ns]>
> read_csv_arrow(
>   "tmp.csv",
>   col_names = "x",
>   col_types = "?",
>   skip = 1,
>   as_data_frame = FALSE
> )
> #> Table
> #> 1 rows x 1 columns
> #> $x <timestamp[ns]>
> read_csv_arrow(
>   "tmp.csv",
>   col_names = "x",
>   col_types = "T",
>   skip = 1,
>   as_data_frame = FALSE
> )
> #> Error: Invalid: In CSV column #0: CSV conversion error to timestamp[s]: invalid value '2018-10-07 19:04:05.005'
> read_csv_arrow(
>   "tmp.csv",
>   col_names = "x",
>   col_types = "T",
>   as_data_frame = FALSE,
>   skip = 1,
>   timestamp_parsers = "%Y-%m-%d %H:%M:%S"
> )
> #> Error: Invalid: In CSV column #0: CSV conversion error to timestamp[s]: invalid value '2018-10-07 19:04:05.005'
> {code}
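
The errors in the report mention a conversion to timestamp[s], which suggests that in the affected version the `T` code requested a second-resolution timestamp, so the fractional part could not be parsed. As a rough sketch of a possible workaround (not necessarily what the linked pull request does), one could pass an Arrow Schema as `col_types` and pick a finer unit explicitly. The temporary file path and the choice of the "ms" unit are assumptions made for illustration.

```r
library(arrow)

# Recreate a one-row CSV equivalent to the one in the report
# (csv_path is a hypothetical temporary file, not from the original issue).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("x", "2018-10-07 19:04:05.005"), csv_path)

# Assumption: a Schema passed as col_types lets us choose the timestamp
# unit ourselves instead of relying on the compact "T" code.
tbl <- read_csv_arrow(
  csv_path,
  col_names = "x",
  col_types = schema(x = timestamp(unit = "ms")),
  skip = 1,               # skip the header row, since col_names is given
  as_data_frame = FALSE
)

# Inspect the resulting column type.
tbl$schema
```

Millisecond resolution is enough for the three fractional digits in the sample value; a finer unit such as "us" or "ns" could be substituted if the data carries more digits.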


