Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/08 15:42:49 UTC

[GitHub] [arrow] thisisnic commented on a change in pull request #10269: ARROW-11705: [R] Support scalar value recycling in RecordBatch/Table$create()

thisisnic commented on a change in pull request #10269:
URL: https://github.com/apache/arrow/pull/10269#discussion_r647568522



##########
File path: r/R/util.R
##########
@@ -110,3 +110,36 @@ handle_embedded_nul_error <- function(e) {
   }
   stop(e)
 }
+
+#' Take an object of length 1 and repeat it.
+#' 
+#' @param object Object of length 1 to be repeated - vector, `Scalar`, `Array`, or `ChunkedArray`
+#' @param n Number of repetitions
+#' 
+#' @return `Array` of length `n`
+#' 
+#' @keywords internal
+repeat_value_as_array <- function(object, n) {
+  if (inherits(object, "ChunkedArray")) {
+    return(MakeArrayFromScalar(Scalar$create(object$chunks[[1]]), n))
+  }
+  return(MakeArrayFromScalar(Scalar$create(object), n))
+}
+
+#' Recycle scalar values in a list of arrays
+#' 
+#' @param arrays List of arrays
+#' @return List of arrays with any vector/Scalar/Array/ChunkedArray values of length 1 recycled 
+#' @keywords internal
+recycle_scalars <- function(arrays){
+  # Get lengths of items in arrays
+  is_df <- map_lgl(arrays, ~inherits(.x, "data.frame"))
+  arr_lens <- lengths(arrays)
+  arr_lens[is_df] <- map_int(arrays[is_df], nrow)
+  
+  if (length(arrays) > 1 && any(arr_lens == 1) && !all(arr_lens==1)) {
+    max_array_len <- max(arr_lens)
+    arrays[arr_lens == 1 & !is_df] <- lapply(arrays[arr_lens == 1 & !is_df], repeat_value_as_array, max_array_len)

Review comment:
       I've removed the extra code that tried to handle mixed-length tibbles/data.frames, since it was a lot of effort for what doesn't seem to be a particularly common use case. Instead, I've added an error message for the case where the inputs are tibbles with different numbers of rows.
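
       For illustration only, the simplified behaviour described above might look roughly like the sketch below. It is not the code from the PR: it uses base R vectors in place of Arrow arrays, `rep()` in place of `MakeArrayFromScalar()`, and the name `recycle_scalars_sketch` and the error wording are hypothetical.

```r
# Hypothetical base-R sketch; NOT the code from the PR.
# Plain vectors stand in for Arrow arrays, and rep() stands in for
# MakeArrayFromScalar().
recycle_scalars_sketch <- function(columns) {
  is_df <- vapply(columns, is.data.frame, logical(1))

  # Use nrow() for data frames and length() for everything else
  lens <- vapply(
    columns,
    function(x) if (is.data.frame(x)) nrow(x) else length(x),
    integer(1)
  )

  # Data frames are never recycled: differing row counts are an error
  if (length(unique(lens[is_df])) > 1) {
    stop("All data frame inputs must have the same number of rows", call. = FALSE)
  }

  # Recycle only when length-1 inputs are mixed with longer ones
  to_recycle <- lens == 1 & !is_df
  if (length(columns) > 1 && any(to_recycle) && !all(lens == 1)) {
    columns[to_recycle] <- lapply(columns[to_recycle], rep, times = max(lens))
  }
  columns
}

# Example: the length-1 element "b" is repeated to match "a"
str(recycle_scalars_sketch(list(a = 1:3, b = "x")))
```

       Raising an error for mismatched data frames keeps the recycling rule limited to length-1 values, which is the case the PR title is about.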



