Posted to commits@spark.apache.org by gu...@apache.org on 2020/08/11 02:02:31 UTC
[spark] branch branch-3.0 updated: [SPARK-32543][R] Remove arrow::as_tibble usage in SparkR
This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new bfe9489 [SPARK-32543][R] Remove arrow::as_tibble usage in SparkR
bfe9489 is described below
commit bfe94890061138515ec7b6bcb27165e61e150ea7
Author: HyukjinKwon <gu...@apache.org>
AuthorDate: Wed Aug 5 10:35:03 2020 -0700
[SPARK-32543][R] Remove arrow::as_tibble usage in SparkR
SPARK-32452 raised the minimum Arrow R version required by SparkR to 1.0.0, and Arrow R dropped `as_tibble` in 0.14. The compatibility code in SparkR that still calls `arrow::as_tibble` can therefore be removed.
This removes code that is no longer used.
There is no user-facing change.
GitHub Actions will test the change.
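The simplified conversion path can be illustrated with a minimal sketch. This is not SparkR's actual code: `to_data_frame` is a hypothetical helper name introduced only for illustration, and the Arrow part assumes the `arrow` package (1.0.0 or later) is installed, so it is skipped otherwise.

```r
# Hypothetical helper (not part of SparkR): convert a table-like object
# straight to a data.frame, with no as_tibble step in between.
to_data_frame <- function(tbl, stringsAsFactors = FALSE) {
  as.data.frame(tbl, stringsAsFactors = stringsAsFactors)
}

# Run the Arrow example only when the package is available (assumption:
# arrow >= 1.0.0, which no longer exports as_tibble).
if (requireNamespace("arrow", quietly = TRUE)) {
  # Build a small Arrow Table from an ordinary data.frame.
  arrowTable <- arrow::Table$create(data.frame(x = 1:3, y = c("a", "b", "c")))
  # Direct conversion, mirroring the single call kept by this commit.
  df <- to_data_frame(arrowTable)
  stopifnot(is.data.frame(df), nrow(df) == 3L)
}
```

Because the helper only calls `as.data.frame`, it behaves the same for an Arrow Table, a record batch, or a plain data.frame, which is why the version check on `as_tibble` is no longer needed.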
Closes #29361 from HyukjinKwon/SPARK-32543.
Authored-by: HyukjinKwon <gu...@apache.org>
Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
R/pkg/R/DataFrame.R | 7 +------
R/pkg/R/deserialize.R | 13 +------------
2 files changed, 2 insertions(+), 18 deletions(-)
diff --git a/R/pkg/R/DataFrame.R b/R/pkg/R/DataFrame.R
index 15b3ce2..f9a07f6 100644
--- a/R/pkg/R/DataFrame.R
+++ b/R/pkg/R/DataFrame.R
@@ -1234,12 +1234,7 @@ setMethod("collect",
output <- tryCatch({
doServerAuth(conn, authSecret)
arrowTable <- arrow::read_arrow(readRaw(conn))
- # Arrow drops `as_tibble` since 0.14.0, see ARROW-5190.
- if (exists("as_tibble", envir = asNamespace("arrow"))) {
- as.data.frame(arrow::as_tibble(arrowTable), stringsAsFactors = stringsAsFactors)
- } else {
- as.data.frame(arrowTable, stringsAsFactors = stringsAsFactors)
- }
+ as.data.frame(arrowTable, stringsAsFactors = stringsAsFactors)
}, finally = {
close(conn)
})
diff --git a/R/pkg/R/deserialize.R b/R/pkg/R/deserialize.R
index 3e7c456..5d22340 100644
--- a/R/pkg/R/deserialize.R
+++ b/R/pkg/R/deserialize.R
@@ -233,24 +233,13 @@ readMultipleObjectsWithKeys <- function(inputCon) {
readDeserializeInArrow <- function(inputCon) {
if (requireNamespace("arrow", quietly = TRUE)) {
- # Arrow drops `as_tibble` since 0.14.0, see ARROW-5190.
- useAsTibble <- exists("as_tibble", envir = asNamespace("arrow"))
-
-
# Currently, there looks no way to read batch by batch by socket connection in R side,
# See ARROW-4512. Therefore, it reads the whole Arrow streaming-formatted binary at once
# for now.
dataLen <- readInt(inputCon)
arrowData <- readBin(inputCon, raw(), as.integer(dataLen), endian = "big")
batches <- arrow::RecordBatchStreamReader$create(arrowData)$batches()
-
- if (useAsTibble) {
- as_tibble <- get("as_tibble", envir = asNamespace("arrow"))
- # Read all groupped batches. Tibble -> data.frame is cheap.
- lapply(batches, function(batch) as.data.frame(as_tibble(batch)))
- } else {
- lapply(batches, function(batch) as.data.frame(batch))
- }
+ lapply(batches, function(batch) as.data.frame(batch))
} else {
stop("'arrow' package should be installed.")
}