Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/01/28 16:02:31 UTC

[GitHub] [arrow] jonkeane commented on a change in pull request #12277: ARROW-15480: [R] Expand on schema/colnames mismatch error messages

jonkeane commented on a change in pull request #12277:
URL: https://github.com/apache/arrow/pull/12277#discussion_r794636116



##########
File path: r/R/dataset-format.R
##########
@@ -133,10 +133,36 @@ CsvFileFormat$create <- function(...,
   schema_names <- names(schema)
 
   if (!is.null(schema) & !identical(schema_names, column_names)) {
+    missing_from_schema <- setdiff(column_names, schema_names)
+    missing_from_colnames <- setdiff(schema_names, column_names)
+    message_colnames <- NULL
+    message_schema <- NULL
+    message_order <- NULL
+
+    if (length(missing_from_colnames) > 0) {
+      message_colnames <- paste(
+        oxford_paste(missing_from_colnames, quote_symbol = "`"),
+        "not present in `column_names`"
+      )
+    }

Review comment:
       We don't need to do this as part of this PR, but I've seen this pattern a few times now:
   
   ```
   missing_from <- setdiff(set_a, set_b)
   if (length(missing_from) > 0) {
     # construct a message
     # sometimes also abort()
   }
   ```
   
   Maybe the message bits are too unique, and there are too many variants of them, for this to work, but could we write a function like `check_match(x = set_a, y = set_b, x_name = "column_names", y_name = "schema")` that would produce messages like "X, Y, and Z not present in `column_names`"?
   
   If you think that's feasible, would you mind making a Jira that links to this code as an improvement we could make?
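   
   To make the idea concrete, here is a rough sketch of what such a helper might look like. Everything in it is an assumption for illustration: `check_match()` does not exist in the package, and `quote_and_join()` is a simplified stand-in for the internal `oxford_paste()` so that the snippet runs on its own. It only builds the messages and leaves it to the caller to decide whether to abort.
   
   ```r
   # Hypothetical stand-in for the package's internal oxford_paste(); simplified
   # so this sketch is self-contained.
   quote_and_join <- function(x, quote_symbol = "`") {
     n <- length(x)
     if (n == 0) {
       return(character(0))
     }
     x <- paste0(quote_symbol, x, quote_symbol)
     if (n == 1) {
       return(x)
     }
     if (n == 2) {
       return(paste(x, collapse = " and "))
     }
     # Three or more items: comma-separated with an Oxford comma before "and".
     paste0(paste(x[-n], collapse = ", "), ", and ", x[n])
   }
   
   # Hypothetical helper (not part of the package): compare two sets of names and
   # return one message per direction of mismatch. Returns character(0) when both
   # sets contain the same elements.
   check_match <- function(x, y, x_name, y_name) {
     missing_from_x <- setdiff(y, x)
     missing_from_y <- setdiff(x, y)
     msgs <- character(0)
   
     if (length(missing_from_x) > 0) {
       msgs <- c(msgs, paste0(quote_and_join(missing_from_x), " not present in `", x_name, "`"))
     }
     if (length(missing_from_y) > 0) {
       msgs <- c(msgs, paste0(quote_and_join(missing_from_y), " not present in `", y_name, "`"))
     }
     msgs
   }
   
   # Example call: "c", "d", and "e" are in the schema but not in column_names.
   check_match(
     x = c("a", "b"),
     y = c("a", "b", "c", "d", "e"),
     x_name = "column_names",
     y_name = "schema"
   )
   ```
   
   Note that this sketch only looks at set membership, so two vectors with the same names in a different order produce no message; an ordering check would still need to be handled separately.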



