Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2018/11/09 01:08:17 UTC
[GitHub] spark pull request #22954: [SPARK-25981][R] Enables Arrow optimization from ...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22954#discussion_r232115966
--- Diff: R/pkg/R/SQLContext.R ---
@@ -147,6 +147,55 @@ getDefaultSqlSource <- function() {
l[["spark.sql.sources.default"]]
}
+writeToTempFileInArrow <- function(rdf, numPartitions) {
+ # The Arrow R API is not yet released on CRAN. CRAN requires any package referenced
+ # via requireNamespace to be declared in DESCRIPTION, and then checks that the package
+ # is available. Work around this by avoiding a direct requireNamespace call.
+ requireNamespace1 <- requireNamespace
+ if (requireNamespace1("arrow", quietly = TRUE)) {
+ record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+ record_batch_stream_writer <- get(
+ "record_batch_stream_writer", envir = asNamespace("arrow"), inherits = FALSE)
+ file_output_stream <- get(
+ "file_output_stream", envir = asNamespace("arrow"), inherits = FALSE)
+ write_record_batch <- get(
+ "write_record_batch", envir = asNamespace("arrow"), inherits = FALSE)
+
+ # Currently, arrow requires withr to be attached; otherwise its write APIs fail.
+ # A direct 'require' call is flagged by CRAN checks, so the same indirection is used here.
+ require1 <- require
+ if (require1("withr", quietly = TRUE)) {
--- End diff --
It's actually a bit odd that I need to require this package manually. Otherwise it complains, for instance, here: https://github.com/apache/arrow/blob/d3ec69069649013229366ebe01e22f389597dc19/r/R/on_exit.R#L20
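For context, the indirection trick in the diff can be sketched in isolation as follows. This is a minimal illustration, not the PR's actual code; it assumes the arrow package exposes a `record_batch` function that accepts a data.frame, as the diff above suggests:

```r
# Assigning requireNamespace to a local name hides the call from CRAN's
# static check, which only flags direct requireNamespace("pkg") calls for
# packages not declared in DESCRIPTION.
requireNamespace1 <- requireNamespace

if (requireNamespace1("arrow", quietly = TRUE)) {
  # Look up the function at run time instead of writing arrow::record_batch,
  # again to avoid a hard, statically visible dependency on the package.
  record_batch <- get("record_batch", envir = asNamespace("arrow"), inherits = FALSE)
  batch <- record_batch(data.frame(x = 1:3))
} else {
  stop("the 'arrow' package is required for Arrow optimization")
}
```

The same pattern is applied to `require` for withr: the function is first bound to a local name (`require1`) and then called through that name, so the CRAN checker does not see a direct `require("withr")` call.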
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org