Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2022/07/31 14:40:00 UTC
[jira] [Commented] (ARROW-17241) [R] Support scientific notation for integers in csv reader
[ https://issues.apache.org/jira/browse/ARROW-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17573442#comment-17573442 ]
Antoine Pitrou commented on ARROW-17241:
----------------------------------------
You can easily cast floating-point columns to integer as desired.
cc [~npr] for opinions.
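
One possible reading of this workaround, sketched in R under stated assumptions: the column is read as double with the same read_csv_arrow() call used in the reprex quoted below, then converted with base R's as.integer() once the data frame is in memory. The example file and the pre-cast checks are illustrative additions, not part of the original comment, and a cast performed on the Arrow side may be what was actually intended.

```r
# Hypothetical example file mirroring the reprex in the quoted issue.
testcsv <- tempfile(fileext = ".csv")
write.csv(data.frame(int = c(1, 2, 1e6)), testcsv, row.names = FALSE)

# Read the column as double ("d"), which accepts scientific notation.
df <- arrow::read_csv_arrow(testcsv, col_types = "d", col_names = "int", skip = 1)

# Guard against silent truncation or overflow before casting.
# (Assumes the column has no missing values; NA would fail these checks.)
stopifnot(
  all(df$int == round(df$int)),
  all(abs(df$int) <= .Machine$integer.max)
)

# Convert the double column to integer in R.
df$int <- as.integer(df$int)
```

The checks are there because as.integer() truncates fractional values toward zero and returns NA (with a warning) for values outside the 32-bit range, so a plain cast could otherwise change the data unnoticed.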
> [R] Support scientific notation for integers in csv reader
> ----------------------------------------------------------
>
> Key: ARROW-17241
> URL: https://issues.apache.org/jira/browse/ARROW-17241
> Project: Apache Arrow
> Issue Type: New Feature
> Components: R
> Environment: arrow R package 8.0.0
> Reporter: Hugo Gruson
> Priority: Minor
>
> It looks like the csv reader doesn't support scientific notation for integers, as shown in the following reprex. However, it works fine for floats/doubles.
> Could support for scientific notation for integers be added please?
>
> {noformat}
> testcsv <- tempfile(fileext = ".csv")
> c(1, 2, 1e6) |>
>   as.data.frame() |>
>   setNames("int") |>
>   write.csv(testcsv, row.names = FALSE)
> arrow::read_csv_arrow(testcsv, col_types = "i", col_names = "int", skip = 1)
> #> Error:
> #> ! Invalid: In CSV column #0: CSV conversion error to int32: invalid value '1e+06'
> #> Backtrace:
> #> ▆
> #> 1. └─arrow (local) `<fn>`(...)
> #> 2. └─base::tryCatch(...)
> #> 3. └─base (local) tryCatchList(expr, classes, parentenv, handlers)
> #> 4. └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
> #> 5. └─value[[3L]](cond)
> #> 6. └─arrow:::handle_csv_read_error(e, schema, call)
> #> 7. └─rlang::abort(msg, call = call)
> arrow::read_csv_arrow(testcsv, col_types = "d", col_names = "int", skip = 1)
> #> # A tibble: 3 × 1
> #>       int
> #>     <dbl>
> #> 1       1
> #> 2       2
> #> 3 1000000
> {noformat}
>