Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/25 15:20:14 UTC

[GitHub] [arrow] jonkeane commented on a change in pull request #10601: ARROW-13149 [R]: Convert named lists to structs instead of (unnamed) lists

jonkeane commented on a change in pull request #10601:
URL: https://github.com/apache/arrow/pull/10601#discussion_r658846244



##########
File path: r/R/metadata.R
##########
@@ -56,7 +56,21 @@ apply_arrow_r_metadata <- function(x, r_metadata) {
     if (is.data.frame(x)) {
       if (length(names(x)) && !is.null(columns_metadata)) {
         for (name in intersect(names(columns_metadata), names(x))) {
-          x[[name]] <- apply_arrow_r_metadata(x[[name]], columns_metadata[[name]])
+          x[[name]] <- tryCatch({
+            x[[name]] <- apply_arrow_r_metadata(x[[name]], columns_metadata[[name]])
+          },
+          error = function(e) {
+            # if we are erroring because of incompatible data, try and make this
+            # a tibble
+            # TODO: also check if this is a list?
+            # TODO: only if there are exactly as many sub-list elements as rows?
+            # TODO: decide if this obviates the need for the option
+            #   arrow.strucs_as_dfs (or if that is actually a better way to handle that)
+            if (grepl("must be compatible with existing data", e$message))
+              x[[name]] <- as.data.frame(x[[name]])
+              class(x[[name]]) <-  c("tbl_df", "tbl", "data.frame")
+              apply_arrow_r_metadata(x[[name]], columns_metadata[[name]])
+          })

Review comment:
       This chunk lets us read data saved in Parquet files that do not also store the data.frame metadata needed to reconstruct those columns (until this PR, we always stripped that metadata).
   
   If we go this route, we should probably implement the TODOs listed here and also warn that this setup is deprecated.
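
A minimal base-R sketch of what the fallback might look like with the TODOs addressed. Everything here is illustrative: `restore_column()` and `demo_apply()` are hypothetical names, `demo_apply()` only stands in for `apply_arrow_r_metadata()`, and the list check, the row-count check (read here as "every sub-list element has one value per row"), and the warning text are assumptions rather than what the PR does. Note the braces around the `if` bodies, so the conversion only runs when the error message matches.

```r
# Hypothetical helper: restore one column, falling back to a data.frame
# conversion when the stored metadata does not fit the data.
restore_column <- function(col, metadata, n_rows, apply_metadata) {
  tryCatch(
    apply_metadata(col, metadata),
    error = function(e) {
      # Only handle the specific incompatibility error; re-raise anything else.
      if (!grepl("must be compatible with existing data", conditionMessage(e))) {
        stop(e)
      }
      # Assumed guards (from the TODOs): the column must be a list whose
      # elements each hold exactly one value per row.
      if (!is.list(col) || !all(lengths(col) == n_rows)) {
        stop(e)
      }
      # Assumed wording for the deprecation warning.
      warning(
        "Column was written without data.frame metadata; ",
        "reading such files is deprecated.",
        call. = FALSE
      )
      col <- as.data.frame(col)
      class(col) <- c("tbl_df", "tbl", "data.frame")
      apply_metadata(col, metadata)
    }
  )
}

# Stand-in for apply_arrow_r_metadata(): fails on bare lists, for demo only.
demo_apply <- function(col, metadata) {
  if (!is.data.frame(col)) {
    stop("Assigned data must be compatible with existing data.")
  }
  attr(col, "note") <- metadata$note
  col
}

# Emits the deprecation warning, then returns the converted column.
out <- restore_column(
  list(a = 1:2, b = c("x", "y")),
  metadata = list(note = "restored"),
  n_rows = 2,
  apply_metadata = demo_apply
)
```

Re-raising unmatched errors with `stop(e)` keeps unrelated failures visible instead of silently converting the column.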



