You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/09/07 13:19:48 UTC

[GitHub] [arrow] romainfrancois opened a new pull request #8125: ARROW-9387: [R] Use new C++ table select method

romainfrancois opened a new pull request #8125:
URL: https://github.com/apache/arrow/pull/8125


   R follow up from #7272
   
   The current `$select()` uses a more familiar (though more expensive) tidyselect interface: 
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   
   tab <- Table$create(x1 = 1:2, x2 = 3:4, y = 5:6)
   # lower level 0-based indices
   tab$SelectColumns(0:1)
   #> Table
   #> 2 rows x 2 columns
   #> $x1 <int32>
   #> $x2 <int32>
   
   # higher level tidyselect based
   tab$select(starts_with("x"))
   #> Table
   #> 2 rows x 2 columns
   #> $x1 <int32>
   #> $x2 <int32>
   ```
   
   <sup>Created on 2020-09-07 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0.9001)</sup>
   
   Do we want both ? `$select()` is used e.g. by the `read_csv(col_select=)` argument. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson closed pull request #8125: ARROW-9387: [R] Use new C++ table select method

Posted by GitBox <gi...@apache.org>.
nealrichardson closed pull request #8125:
URL: https://github.com/apache/arrow/pull/8125






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #8125: ARROW-9387: [R] Use new C++ table select method

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #8125:
URL: https://github.com/apache/arrow/pull/8125#discussion_r487105657



##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #8125: ARROW-9387: [R] Use new C++ table select method

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #8125:
URL: https://github.com/apache/arrow/pull/8125#discussion_r487105657



##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson closed pull request #8125: ARROW-9387: [R] Use new C++ table select method

Posted by GitBox <gi...@apache.org>.
nealrichardson closed pull request #8125:
URL: https://github.com/apache/arrow/pull/8125


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8125: ARROW-9387: [R] Use new C++ table select method

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8125:
URL: https://github.com/apache/arrow/pull/8125#issuecomment-688322569


   https://issues.apache.org/jira/browse/ARROW-9387


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson closed pull request #8125: ARROW-9387: [R] Use new C++ table select method

Posted by GitBox <gi...@apache.org>.
nealrichardson closed pull request #8125:
URL: https://github.com/apache/arrow/pull/8125






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #8125: ARROW-9387: [R] Use new C++ table select method

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #8125:
URL: https://github.com/apache/arrow/pull/8125#discussion_r487105657



##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])

Review comment:
       Should we maintain the base R behavior for subsetting on column 0 (it is ignored)? I can't imagine why someone would do such a thing on purpose.

##########
File path: r/R/record-batch.R
##########
@@ -218,7 +207,12 @@ names.RecordBatch <- function(x) x$names()
   if (!missing(j)) {
     # Selecting columns is cheaper than filtering rows, so do it first.
     # That way, if we're filtering too, we have fewer arrays to filter/slice/take
-    x <- x$select(j)
+    if (is_integerish(j)) {
+      x <- x$SelectColumns(as.integer(j) - 1L)

Review comment:
       Judging from the tests, I think we want to add something like this (adapted from filter_rows in array.R):
   
   ```suggestion
         if (all(j < 0)) {
           # in R, negative j means "everything but j"
           j <- setdiff(seq_len(x$num_columns), -1 * j)
         }
         x <- x$SelectColumns(as.integer(j) - 1L)
   ```

##########
File path: r/tests/testthat/test-Table.R
##########
@@ -130,16 +130,16 @@ test_that("[, [[, $ for Table", {
   expect_null(tab[["asdf"]])
   # List-like column slicing
   expect_data_frame(tab[2:4], tbl[2:4])
-  expect_data_frame(tab[c(1, 0)], tbl[c(1, 0)])
+  expect_data_frame(tab[c(2, 1)], tbl[c(2, 1)])
 
   expect_error(tab[[c(4, 3)]])
   expect_error(tab[[NA]], "'i' must be character or numeric, not logical")
   expect_error(tab[[NULL]], "'i' must be character or numeric, not NULL")
   expect_error(tab[[c("asdf", "jkl;")]], 'length(name) not equal to 1', fixed = TRUE)
-  expect_error(tab[-3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[-3:3], "Selections can't have negative value") # From tidyselect
-  expect_error(tab[1000]) # This is caught in vctrs, assert more specifically when it stabilizes
-  expect_error(tab[1:1000]) # same as ^
+  expect_error(tab[-3], "Invalid column index")

Review comment:
       This may not be allowed in tidyselect but `[` supports this to drop a column, so I don't think this one should error. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org