Posted to jira@arrow.apache.org by "Nicola Crane (Jira)" <ji...@apache.org> on 2021/11/29 17:44:00 UTC
[jira] [Updated] (ARROW-14909) [R] List column containing data frames with varying numbers of columns
[ https://issues.apache.org/jira/browse/ARROW-14909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicola Crane updated ARROW-14909:
---------------------------------
Summary: [R] List column containing data frames with varying numbers of columns (was: List column containing data frames with varying numbers of columns)
> [R] List column containing data frames with varying numbers of columns
> ----------------------------------------------------------------------
>
> Key: ARROW-14909
> URL: https://issues.apache.org/jira/browse/ARROW-14909
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Affects Versions: 6.0.1
> Environment: R 4.1.0, arrow 6.0.1, macOS Big Sur 11.6
> Reporter: Dan Hicks
> Priority: Major
>
> I'm brand new to arrow, but didn't seem to find anything like this issue in this bug tracker; apologies if this is a known issue.
> Arrow gives me an error when I try to write Parquet or Feather files for a data frame with a list column ({{df}} in the MWE) whose elements are data frames with varying numbers of columns:
> {code:r}
> library(tibble)
> library(arrow)
> df1 = data.frame(x = c(1, 2, 3),
>                  y = c('a', 'b', 'c'))
> df2 = data.frame(x = c(4),
>                  y = c('d'),
>                  z = c('foo'))
> comb_df = tibble(id = c(1, 2),
>                  df = c(list(df1), list(df2)))
> write_dataset(comb_df, 'mwe', format = 'feather')
> {code}
> This gives me
> {code:java}
> Error: Unknown: Number of fields in struct (2) incompatible with number of columns in the data frame (3)
> {code}
> Session info:
> {code}
> ─ Session info ────────────────────────────────────────────────────────────────────────
> setting  value
> version  R version 4.1.0 (2021-05-18)
> os       macOS Big Sur 11.6
> system   x86_64, darwin17.0
> ui       RStudio
> language (EN)
> collate  en_US.UTF-8
> ctype    en_US.UTF-8
> tz       America/Los_Angeles
> date     2021-11-29
> ─ Packages ────────────────────────────────────────────────────────────────────────────
> package     * version date       lib source
> arrow       * 6.0.1   2021-11-20 [1] CRAN (R 4.1.0)
> assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.1.0)
> bit           4.0.4   2020-08-04 [1] CRAN (R 4.1.0)
> bit64         4.0.5   2020-08-30 [1] CRAN (R 4.1.0)
> cli           3.0.1   2021-07-17 [1] CRAN (R 4.1.0)
> crayon        1.4.1   2021-02-08 [1] CRAN (R 4.1.0)
> ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
> fansi         0.5.0   2021-05-25 [1] CRAN (R 4.1.0)
> glue          1.4.2   2020-08-27 [1] CRAN (R 4.1.0)
> lifecycle     1.0.1   2021-09-24 [1] CRAN (R 4.1.0)
> magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.1.0)
> pillar        1.6.3   2021-09-26 [1] CRAN (R 4.1.0)
> pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.1.0)
> purrr         0.3.4   2020-04-17 [1] CRAN (R 4.1.0)
> R6            2.5.1   2021-08-19 [1] CRAN (R 4.1.0)
> rlang         0.4.11  2021-04-30 [1] CRAN (R 4.1.0)
> rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)
> sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.1.0)
> tibble      * 3.1.5   2021-09-30 [1] CRAN (R 4.1.0)
> tidyselect    1.1.1   2021-04-30 [1] CRAN (R 4.1.0)
> utf8          1.2.2   2021-07-24 [1] CRAN (R 4.1.0)
> vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.1.0)
> withr         2.4.2   2021-04-18 [1] CRAN (R 4.1.0)
> [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
> {code}
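
The error in the report says the struct type has 2 fields while a later data frame has 3 columns. One possible workaround, offered as a sketch and not a confirmed fix, is to pad the nested data frames so they all share the same columns before writing. The helper name align_columns is hypothetical and is not part of arrow. Whether arrow 6.0.1 then writes the result without error is an assumption that has not been verified here.

```r
# Hypothetical helper (not part of arrow): pad every nested data frame so that
# all of them end up with the same columns, in the same order.
align_columns <- function(dfs) {
  # Union of column names, in order of first appearance.
  all_cols <- unique(unlist(lapply(dfs, names)))
  lapply(dfs, function(d) {
    for (col in setdiff(all_cols, names(d))) {
      # Borrow the column type from the first data frame that has this column,
      # so the padding is a typed NA (e.g. NA_character_) rather than logical NA.
      donor <- Find(function(x) col %in% names(x), dfs)
      d[[col]] <- rep(donor[[col]][NA_integer_], nrow(d))
    }
    d[all_cols]
  })
}

# Same data frames as in the report.
df1 <- data.frame(x = c(1, 2, 3), y = c('a', 'b', 'c'))
df2 <- data.frame(x = c(4), y = c('d'), z = c('foo'))

aligned <- align_columns(list(df1, df2))

# Assumption: with identical columns in every element, the list column could
# then be built and written as in the report, for example:
#   comb_df <- tibble::tibble(id = c(1, 2), df = aligned)
#   arrow::write_dataset(comb_df, 'mwe', format = 'feather')
```

After padding, df1 carries a z column filled with NA_character_, so both elements expose the columns x, y, z. This changes the data (it adds all-NA columns), which may or may not be acceptable for a given use case.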