Posted to issues@arrow.apache.org by "Hugo Gruson (Jira)" <ji...@apache.org> on 2022/07/28 14:58:00 UTC
[jira] [Created] (ARROW-17241) Support scientific notation for integers in csv reader
Hugo Gruson created ARROW-17241:
-----------------------------------
Summary: Support scientific notation for integers in csv reader
Key: ARROW-17241
URL: https://issues.apache.org/jira/browse/ARROW-17241
Project: Apache Arrow
Issue Type: New Feature
Environment: arrow R package 8.0.0
Reporter: Hugo Gruson
It looks like the CSV reader doesn't support scientific notation for integers. As the following reprex shows, a value written as 1e+06 fails to convert to int32, although the same file reads fine when the column is typed as float/double.
Could support for scientific notation in integer columns be added, please?
{noformat}
testcsv <- tempfile(fileext = ".csv")
c(1, 2, 1e6) |>
  as.data.frame() |>
  setNames("int") |>
  write.csv(testcsv, row.names = FALSE)
arrow::read_csv_arrow(testcsv, col_types = "i", col_names = "int", skip = 1)
#> Error:
#> ! Invalid: In CSV column #0: CSV conversion error to int32: invalid value '1e+06'
#> Backtrace:
#>     ▆
#>  1. └─arrow (local) `<fn>`(...)
#>  2.   └─base::tryCatch(...)
#>  3.     └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>  4.       └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>  5.         └─value[[3L]](cond)
#>  6.           └─arrow:::handle_csv_read_error(e, schema, call)
#>  7.             └─rlang::abort(msg, call = call)
arrow::read_csv_arrow(testcsv, col_types = "d", col_names = "int", skip = 1)
#> # A tibble: 3 × 1
#>       int
#>     <dbl>
#> 1       1
#> 2       2
#> 3 1000000
{noformat}
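In the meantime, one possible workaround is to read the column as double, which the reprex above shows is accepted, and convert it to integer in R afterwards. The sketch below is only an illustration under that assumption: `as_whole_integer()` is a hypothetical helper, not part of arrow, and the temporary file simply recreates the data from the reprex.

```r
# Recreate the file from the reprex: header plus three values, one of them
# written in scientific notation.
testcsv <- tempfile(fileext = ".csv")
writeLines(c("int", "1", "2", "1e+06"), testcsv)

# Hypothetical helper (not part of arrow): convert a double vector to integer,
# refusing values that are fractional or outside R's 32-bit integer range.
as_whole_integer <- function(x) {
  stopifnot(is.numeric(x))
  ok <- is.na(x) | (x == trunc(x) & abs(x) <= .Machine$integer.max)
  if (!all(ok)) {
    stop("values cannot be represented as integers: ",
         paste(x[!ok], collapse = ", "))
  }
  as.integer(x)
}

# Read the column as double ("d"), then convert it on the R side.
df <- arrow::read_csv_arrow(testcsv, col_types = "d", col_names = "int", skip = 1)
df$int <- as_whole_integer(df$int)
```

The explicit check is there because `as.integer()` on its own silently truncates fractional values, so a stray 1.5 in the file would otherwise go unnoticed.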