You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/09/11 15:16:17 UTC

[GitHub] [arrow] nealrichardson commented on a change in pull request #8125: ARROW-9387: [R] Use new C++ table select method

nealrichardson commented on a change in pull request #8125:
URL: https://github.com/apache/arrow/pull/8125#discussion_r487105657



##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org