Posted to jira@arrow.apache.org by "SHIMA Tatsuya (Jira)" <ji...@apache.org> on 2022/10/01 09:44:00 UTC

[jira] [Resolved] (ARROW-17429) [R] Error messages are not helpful of read_csv_arrow with col_types option

     [ https://issues.apache.org/jira/browse/ARROW-17429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

SHIMA Tatsuya resolved ARROW-17429.
-----------------------------------
    Resolution: Fixed

Seems to be fixed by ARROW-17355.

> [R] Error messages are not helpful of read_csv_arrow with col_types option
> --------------------------------------------------------------------------
>
>                 Key: ARROW-17429
>                 URL: https://issues.apache.org/jira/browse/ARROW-17429
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>            Reporter: SHIMA Tatsuya
>            Assignee: SHIMA Tatsuya
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> In the development version, specifying a col_types value that the column cannot be converted to produces an unhelpful error message.
> {code:r}
> tbl <- tibble::tibble(time = c("1970-01-01T12:00:00+12:00"))
> csv_file <- tempfile()
> on.exit(unlink(csv_file))
> write.csv(tbl, csv_file, row.names = FALSE)
> arrow::read_csv_arrow(csv_file, col_types = "?", col_names = "x", skip = 1)
> #> # A tibble: 1 × 1
> #>   x
> #>   <dttm>
> #> 1 1970-01-01 00:00:00
> arrow::read_csv_arrow(csv_file, col_types = "c", col_names = "x", skip = 1)
> #> # A tibble: 1 × 1
> #>   x
> #>   <chr>
> #> 1 1970-01-01T12:00:00+12:00
> arrow::read_csv_arrow(csv_file, col_types = "i", col_names = "x", skip = 1)
> #> Error in as.data.frame(tab): object 'tab' not found
> arrow::read_csv_arrow(csv_file, col_types = "T", col_names = "x", skip = 1)
> #> Error in as.data.frame(tab): object 'tab' not found
> {code}
> In arrow 9.0.0, the same calls report the conversion error:
> {code:r}
> tbl <- tibble::tibble(time = c("1970-01-01T12:00:00+12:00"))
> csv_file <- tempfile()
> on.exit(unlink(csv_file))
> write.csv(tbl, csv_file, row.names = FALSE)
> arrow::read_csv_arrow(csv_file, col_types = "?", col_names = "x", skip = 1)
> #> # A tibble: 1 × 1
> #>   x
> #>   <dttm>
> #> 1 1970-01-01 00:00:00
> arrow::read_csv_arrow(csv_file, col_types = "c", col_names = "x", skip = 1)
> #> # A tibble: 1 × 1
> #>   x
> #>   <chr>
> #> 1 1970-01-01T12:00:00+12:00
> arrow::read_csv_arrow(csv_file, col_types = "i", col_names = "x", skip = 1)
> #> Error:
> #> ! Invalid: In CSV column #0: CSV conversion error to int32: invalid value '1970-01-01T12:00:00+12:00'
> arrow::read_csv_arrow(csv_file, col_types = "T", col_names = "x", skip = 1)
> #> Error:
> #> ! Invalid: In CSV column #0: CSV conversion error to timestamp[ns]: expected no zone offset in '1970-01-01T12:00:00+12:00'
> {code}


