Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/04 17:27:43 UTC

[GitHub] [arrow] jonkeane opened a new pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

jonkeane opened a new pull request #9092:
URL: https://github.com/apache/arrow/pull/9092


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #9092:
URL: https://github.com/apache/arrow/pull/9092#discussion_r551534161



##########
File path: r/R/record-batch.R
##########
@@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ...
 }
 
 .serialize_arrow_r_metadata <- function(x) {
+  # drop problems attributes (most likely from readr)
+  if ("attributes" %in% names(x) &&
+      "problems" %in% names(x[["attributes"]]) ) {
+    x[["attributes"]][["problems"]] <- NULL
+  }
+

Review comment:
       Should we validate/assert `is.list()` in this function?







[GitHub] [arrow] nealrichardson commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #9092:
URL: https://github.com/apache/arrow/pull/9092#discussion_r551483818



##########
File path: r/R/record-batch.R
##########
@@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ...
 }
 
 .serialize_arrow_r_metadata <- function(x) {
+  # drop problems attributes (most likely from readr)
+  if ("attributes" %in% names(x) &&
+      "problems" %in% names(x[["attributes"]]) ) {
+    x[["attributes"]][["problems"]] <- NULL
+  }
+

Review comment:
       I believe you can safely just
   ```suggestion
     x[["attributes"]][["problems"]] <- NULL
   ```
   
   ```
   > x <- list()
   > x
   list()
   > x$attributes$problems <- NULL
   > x
   list()
   > x[["attributes"]][["problems"]] <- NULL
   > x
   list()
   ```
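
A minimal sketch of the same behavior on populated lists, assuming a metadata list shaped like the one in the diff above. The field values are made up for illustration and are not taken from the pull request.

```r
# Hypothetical metadata list: an "attributes" element that carries a
# "problems" entry, as readr might leave behind
x <- list(attributes = list(class = "data.frame", problems = "parse issues"))

# Assigning NULL removes the entry and keeps the other attributes
x[["attributes"]][["problems"]] <- NULL
names(x[["attributes"]])

# When "attributes" exists but has no "problems" entry, nothing changes
z <- list(attributes = list(class = "data.frame"))
z_before <- z
z[["attributes"]][["problems"]] <- NULL
identical(z, z_before)
```

So for list inputs the explicit `%in% names(...)` checks add nothing beyond the plain assignment.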
   







[GitHub] [arrow] jonkeane commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

Posted by GitBox <gi...@apache.org>.
jonkeane commented on a change in pull request #9092:
URL: https://github.com/apache/arrow/pull/9092#discussion_r551560380



##########
File path: r/R/record-batch.R
##########
@@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ...
 }
 
 .serialize_arrow_r_metadata <- function(x) {
+  # drop problems attributes (most likely from readr)
+  if ("attributes" %in% names(x) &&
+      "problems" %in% names(x[["attributes"]]) ) {
+    x[["attributes"]][["problems"]] <- NULL
+  }
+

Review comment:
       Ok, cool — I'll rewrite the second test to avoid using `.serialize_arrow_r_metadata()` so that we don't error there, but still keep the testing-serialized-data aspect of that test.







[GitHub] [arrow] nealrichardson commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #9092:
URL: https://github.com/apache/arrow/pull/9092#discussion_r551556251



##########
File path: r/R/record-batch.R
##########
@@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ...
 }
 
 .serialize_arrow_r_metadata <- function(x) {
+  # drop problems attributes (most likely from readr)
+  if ("attributes" %in% names(x) &&
+      "problems" %in% names(x[["attributes"]]) ) {
+    x[["attributes"]][["problems"]] <- NULL
+  }
+

Review comment:
       I would have to look into the implementation to be sure, but I think the intent of the first test should be that we don't save garbage `r` metadata, and the intent of the second should be that, if someone happened to have bad metadata stored in the `r` key (we can't prevent some other data-generating process from doing that), we don't crash on loading it. It's not clear that that's actually what those tests are doing, but that's what I think they should do.
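
The two intents described above can be sketched in base R. The helpers `save_metadata()` and `load_metadata()` are hypothetical stand-ins written for this illustration, not the package's functions, and returning `NULL` on bad input is an assumption about what "don't crash" could mean.

```r
# Hypothetical writer: refuse non-list input, drop "problems", then serialize
save_metadata <- function(x) {
  stopifnot(is.list(x))
  if (is.list(x[["attributes"]])) {
    x[["attributes"]][["problems"]] <- NULL
  }
  rawToChar(serialize(x, NULL, ascii = TRUE))
}

# Hypothetical reader: fall back to NULL instead of erroring on bad input
load_metadata <- function(s) {
  tryCatch(unserialize(charToRaw(s)), error = function(e) NULL)
}

# Intent 1: what we save does not contain the "problems" attribute
meta <- list(attributes = list(class = "tbl", problems = "parse issues"))
round_trip <- load_metadata(save_metadata(meta))
names(round_trip[["attributes"]])

# Intent 2: garbage stored by some other process does not crash the reader
load_metadata("garbage")
```

Keeping the two checks separate means the writer can be strict about its input while the reader stays tolerant of whatever it finds.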







[GitHub] [arrow] github-actions[bot] commented on pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9092:
URL: https://github.com/apache/arrow/pull/9092#issuecomment-754108091


   https://issues.apache.org/jira/browse/ARROW-10624





[GitHub] [arrow] nealrichardson closed pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

Posted by GitBox <gi...@apache.org>.
nealrichardson closed pull request #9092:
URL: https://github.com/apache/arrow/pull/9092


   





[GitHub] [arrow] jonkeane commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

Posted by GitBox <gi...@apache.org>.
jonkeane commented on a change in pull request #9092:
URL: https://github.com/apache/arrow/pull/9092#discussion_r551517846



##########
File path: r/R/record-batch.R
##########
@@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ...
 }
 
 .serialize_arrow_r_metadata <- function(x) {
+  # drop problems attributes (most likely from readr)
+  if ("attributes" %in% names(x) &&
+      "problems" %in% names(x[["attributes"]]) ) {
+    x[["attributes"]][["problems"]] <- NULL
+  }
+

Review comment:
       _It turns out_ that just `NULL`ing doesn't quite work with the tests for garbage metadata (`"garbage"[["attributes"]][["problems"]] <- NULL` errors). I can either test that the attributes are a list before this, or alter our garbage tests to send in a list.
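
A minimal sketch of the first option, checking for a list before assigning. `drop_problems()` is a hypothetical helper used only for illustration; it is not the code from this pull request.

```r
# Hypothetical helper: only touch "problems" when both levels are lists
drop_problems <- function(x) {
  if (is.list(x) && is.list(x[["attributes"]])) {
    x[["attributes"]][["problems"]] <- NULL
  }
  x
}

# The unguarded assignment fails on a character vector
garbage <- "garbage"
outcome <- tryCatch({
  garbage[["attributes"]][["problems"]] <- NULL
  "no error"
}, error = function(e) "error")
outcome

# The guarded version passes non-list input through untouched
drop_problems("garbage")

# and still drops "problems" from a well-formed list
cleaned <- drop_problems(list(attributes = list(class = "tbl", problems = 1)))
names(cleaned[["attributes"]])
```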







[GitHub] [arrow] jonkeane commented on pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #9092:
URL: https://github.com/apache/arrow/pull/9092#issuecomment-754243679


   Updated, passing actions on my fork:
   https://github.com/jonkeane/arrow/actions/runs/461948379
   https://github.com/jonkeane/arrow/actions/runs/461948380





[GitHub] [arrow] jonkeane commented on a change in pull request #9092: ARROW-10624: [R] Proactively remove "problems" attributes

Posted by GitBox <gi...@apache.org>.
jonkeane commented on a change in pull request #9092:
URL: https://github.com/apache/arrow/pull/9092#discussion_r551549190



##########
File path: r/R/record-batch.R
##########
@@ -274,6 +274,12 @@ as.data.frame.RecordBatch <- function(x, row.names = NULL, optional = FALSE, ...
 }
 
 .serialize_arrow_r_metadata <- function(x) {
+  # drop problems attributes (most likely from readr)
+  if ("attributes" %in% names(x) &&
+      "problems" %in% names(x[["attributes"]]) ) {
+    x[["attributes"]][["problems"]] <- NULL
+  }
+

Review comment:
       Yeah, I can do that. Though [one of the existing tests](https://github.com/apache/arrow/blob/master/r/tests/testthat/test-metadata.R#L76-L81) fails here. The one right above it works, which might be sufficient: is there something that lines 76-81 are testing that we aren't covering with lines 70-75?



