You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/01/18 17:14:15 UTC

[GitHub] [arrow] jonkeane commented on a change in pull request #12170: ARROW-14461 [R] write_dataset() allows users to pass invalid additional arguments

jonkeane commented on a change in pull request #12170:
URL: https://github.com/apache/arrow/pull/12170#discussion_r786960865



##########
File path: r/tests/testthat/test-dataset-write.R
##########
@@ -505,3 +505,40 @@ test_that("Max partitions fails with non-integer values and less than required p
     "max_partitions must be a positive, non-missing integer"
   )
 })
+
+test_that("write_dataset checks for format-specific arguments", {
+  df <- tibble::tibble(
+    int = 1:10,
+    dbl = as.numeric(1:10),
+    lgl = rep(c(TRUE, FALSE, NA, TRUE, FALSE), 2),
+    chr = letters[1:10],
+  )
+  dst_dir <- make_temp_dir()
+  expect_error(
+    write_dataset(df, dst_dir, format = "feather", compression = "snappy"),

Review comment:
       It might be nice to suggest using `codec` in this case (feather doesn't support snappy, but _does_ support lz4)

##########
File path: r/tests/testthat/test-dataset-write.R
##########
@@ -505,3 +505,40 @@ test_that("Max partitions fails with non-integer values and less than required p
     "max_partitions must be a positive, non-missing integer"
   )
 })
+
+test_that("write_dataset checks for format-specific arguments", {
+  df <- tibble::tibble(
+    int = 1:10,
+    dbl = as.numeric(1:10),
+    lgl = rep(c(TRUE, FALSE, NA, TRUE, FALSE), 2),
+    chr = letters[1:10],
+  )
+  dst_dir <- make_temp_dir()
+  expect_error(
+    write_dataset(df, dst_dir, format = "feather", compression = "snappy"),
+    "The following argument is not valid for your chosen 'format': \"compression\""
+  )
+  expect_error(
+    write_dataset(df, dst_dir, format = "feather", nonsensical_arg = "blah-blah"),
+    "The following argument is not valid for your chosen 'format': \"nonsensical_arg\""
+  )
+
+  expect_error(
+    write_dataset(df, dst_dir, format = "arrow", nonsensical_arg = "blah-blah"),
+    "The following argument is not valid for your chosen 'format': \"nonsensical_arg\""
+  )
+  expect_error(
+    write_dataset(df, dst_dir, format = "ipc", nonsensical_arg = "blah-blah"),
+    "The following argument is not valid for your chosen 'format': \"nonsensical_arg\""
+  )
+
+  expect_error(
+    write_dataset(df, dst_dir, format = "csv", nonsensical_arg = "blah-blah"),
+    "nonsensical_arg = \"blah-blah\""
+  )
+
+  expect_error(
+    write_dataset(df, dst_dir, format = "parquet", nonsensical_arg = "blah-blah"),
+    "The following argument is not valid for your chosen 'format': \"non_sensical_arg\""
+  )

Review comment:
       Does this test pass? Looking at the code block above, it looks like you do validate the arguments for `format = "feather"` but I don't think you'll hit that block for `format = "parquet"` (and it would be a slightly different set of arguments since compression _is_ supported as an argument for parquet.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org