Posted to commits@spark.apache.org by gu...@apache.org on 2020/08/11 02:02:31 UTC

[spark] branch branch-3.0 updated: [SPARK-32543][R] Remove arrow::as_tibble usage in SparkR

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new bfe9489  [SPARK-32543][R] Remove arrow::as_tibble usage in SparkR
bfe9489 is described below

commit bfe94890061138515ec7b6bcb27165e61e150ea7
Author: HyukjinKwon <gu...@apache.org>
AuthorDate: Wed Aug 5 10:35:03 2020 -0700

    [SPARK-32543][R] Remove arrow::as_tibble usage in SparkR
    
    SparkR raised the minimum required Arrow R version to 1.0.0 in SPARK-32452, and Arrow R dropped `as_tibble` in 0.14.0 (see ARROW-5190). The compatibility fallback in SparkR can therefore be removed.
    
    Why are the changes needed?
    To remove code that is no longer used.
    
    Does this PR introduce any user-facing change?
    No.
    
    How was this patch tested?
    GitHub Actions will test it out.
    
    Closes #29361 from HyukjinKwon/SPARK-32543.
    
    Authored-by: HyukjinKwon <gu...@apache.org>
    Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
 R/pkg/R/DataFrame.R   |  7 +------
 R/pkg/R/deserialize.R | 13 +------------
 2 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 15b3ce2..f9a07f6 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -1234,12 +1234,7 @@ setMethod("collect",
                 output <- tryCatch({
                   doServerAuth(conn, authSecret)
                   arrowTable <- arrow::read_arrow(readRaw(conn))
-                  # Arrow drops `as_tibble` since 0.14.0, see ARROW-5190.
-                  if (exists("as_tibble", envir = asNamespace("arrow"))) {
-                    as.data.frame(arrow::as_tibble(arrowTable), stringsAsFactors = stringsAsFactors)
-                  } else {
-                    as.data.frame(arrowTable, stringsAsFactors = stringsAsFactors)
-                  }
+                  as.data.frame(arrowTable, stringsAsFactors = stringsAsFactors)
                 }, finally = {
                   close(conn)
                 })
diff --git a/R/pkg/R/deserialize.R b/R/pkg/R/deserialize.R
index 3e7c456..5d22340 100644
--- a/R/pkg/R/deserialize.R
+++ b/R/pkg/R/deserialize.R
@@ -233,24 +233,13 @@ readMultipleObjectsWithKeys <- function(inputCon) {
 
 readDeserializeInArrow <- function(inputCon) {
   if (requireNamespace("arrow", quietly = TRUE)) {
-    # Arrow drops `as_tibble` since 0.14.0, see ARROW-5190.
-    useAsTibble <- exists("as_tibble", envir = asNamespace("arrow"))
-
-
     # Currently, there looks no way to read batch by batch by socket connection in R side,
     # See ARROW-4512. Therefore, it reads the whole Arrow streaming-formatted binary at once
     # for now.
     dataLen <- readInt(inputCon)
     arrowData <- readBin(inputCon, raw(), as.integer(dataLen), endian = "big")
     batches <- arrow::RecordBatchStreamReader$create(arrowData)$batches()
-
-    if (useAsTibble) {
-      as_tibble <- get("as_tibble", envir = asNamespace("arrow"))
-      # Read all groupped batches. Tibble -> data.frame is cheap.
-      lapply(batches, function(batch) as.data.frame(as_tibble(batch)))
-    } else {
-      lapply(batches, function(batch) as.data.frame(batch))
-    }
+    lapply(batches, function(batch) as.data.frame(batch))
   } else {
     stop("'arrow' package should be installed.")
   }

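For context, both call sites in the diff follow the same pattern. A standalone sketch of the change (hypothetical helper name `to_df`; `arrowTable` stands in for a Table read from the socket connection, as in `collect` above):

```r
# Before: feature-detect `as_tibble`, because Arrow R < 0.14.0 required
# converting a Table to a tibble before as.data.frame() worked on it.
to_df <- function(arrowTable, stringsAsFactors = FALSE) {
  if (exists("as_tibble", envir = asNamespace("arrow"))) {
    as.data.frame(arrow::as_tibble(arrowTable), stringsAsFactors = stringsAsFactors)
  } else {
    as.data.frame(arrowTable, stringsAsFactors = stringsAsFactors)
  }
}

# After: with Arrow R >= 1.0.0 guaranteed by SPARK-32452, the Table's own
# as.data.frame() method is always available, so the branch collapses.
to_df <- function(arrowTable, stringsAsFactors = FALSE) {
  as.data.frame(arrowTable, stringsAsFactors = stringsAsFactors)
}
```

The same simplification applies per record batch in `readDeserializeInArrow`, where the `lapply` over `batches` no longer needs the `useAsTibble` branch.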

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org