You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/04 15:52:09 UTC

[GitHub] [arrow] nealrichardson commented on a change in pull request #10326: ARROW-12791: [R] Better error handling for DatasetFactory$Finish() when no format specified

nealrichardson commented on a change in pull request #10326:
URL: https://github.com/apache/arrow/pull/10326#discussion_r645674370



##########
File path: r/tests/testthat/test-dataset.R
##########
@@ -1502,3 +1502,12 @@ test_that("Collecting zero columns from a dataset doesn't return entire dataset"
     c(32, 0)
   )
 })
+
+# see https://issues.apache.org/jira/browse/ARROW-12791
+test_that("Error if no format specified and files are not parquet", {
+  skip_if_not_available("parquet")
+  expect_error(
+    open_dataset(csv_dir, partitioning = "part"),
+    regexp = "Did you mean to specify a 'format'? other than the default (parquet)?"

Review comment:
       ```suggestion
       "Did you mean to specify a 'format' other than the default (parquet)?",
       fixed = TRUE
   ```

##########
File path: r/tests/testthat/test-dataset.R
##########
@@ -1502,3 +1502,12 @@ test_that("Collecting zero columns from a dataset doesn't return entire dataset"
     c(32, 0)
   )
 })
+
+# see https://issues.apache.org/jira/browse/ARROW-12791
+test_that("Error if no format specified and files are not parquet", {
+  skip_if_not_available("parquet")
+  expect_error(
+    open_dataset(csv_dir, partitioning = "part"),
+    regexp = "Did you mean to specify a 'format'? other than the default (parquet)?"
+  )
+})

Review comment:
       Technically, for completeness, we should test that the extra error message *isn't* shown if you explicitly provided `format = "parquet"`. Maybe this would work?
   
   ```suggestion
     expect_failure(
       expect_error(
         open_dataset(csv_dir, partitioning = "part", format = "parquet"),
         "Did you mean to specify a 'format'"
       )
     )
   })
   ```

##########
File path: r/R/util.R
##########
@@ -110,3 +110,15 @@ handle_embedded_nul_error <- function(e) {
   }
   stop(e)
 }
+
+handle_parquet_io_error <- function(e, format) {
+  msg <- conditionMessage(e)
+  if (grepl("Parquet magic bytes not found in footer", msg) && length(format) > 1 && is_character(format)) {
+    abort(c(

Review comment:
       ```suggestion
       # If length(format) > 1, that means it is (almost certainly) the default/not specified value
       # so let the user know that they should specify the actual (not parquet) format
       abort(c(
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org