Posted to commits@arrow.apache.org by ro...@apache.org on 2019/06/25 09:39:29 UTC

[arrow] branch master updated: ARROW-5555: [R] Add install_arrow() function to assist the user in obtaining C++ runtime libraries

This is an automated email from the ASF dual-hosted git repository.

romainfrancois pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new c9290cb  ARROW-5555: [R] Add install_arrow() function to assist the user in obtaining C++ runtime libraries
c9290cb is described below

commit c9290cb9ff8d756bdabcff30b776b297198a7a8d
Author: Neal Richardson <ne...@gmail.com>
AuthorDate: Tue Jun 25 11:39:16 2019 +0200

    ARROW-5555: [R] Add install_arrow() function to assist the user in obtaining C++ runtime libraries
    
    Note that this function doesn't install anything: it provides context-dependent advice and for some platforms recommends commands to run. In future versions, we may want to make it do more, but this is a useful starting point.
    
    In addition to this function, this patch also:
    
    * Updates the README with installation instructions, including for Windows (assuming https://github.com/apache/arrow/pull/4622)
    * Adds a Makefile to facilitate local package dev, building, testing, etc. (https://issues.apache.org/jira/browse/ARROW-5328)
    * Fixes a UTF-8 issue in the DESCRIPTION
    * Makes a few other cleanups
    
    Author: Neal Richardson <ne...@gmail.com>
    
    Closes #4654 from nealrichardson/r-install-arrow and squashes the following commits:
    
    90c2e5e8 <Neal Richardson> Simplify installation instructions
    a2145629 <Neal Richardson> Add ARROW_R_DEV to Makefile
    b2870489 <Neal Richardson> if (
    60150149 <Neal Richardson> Update authors, including ASCII fix
    3327cb1b <Neal Richardson> Implement messages for install_arrow, write tests
    af34401c <Neal Richardson> Sketch of install_arrow()
    4c756bb3 <Neal Richardson> Update installation docs; add Makefile for CLI-oriented folks
---
 r/.Rbuildignore                       |   1 +
 r/DESCRIPTION                         |   5 +-
 r/Makefile                            |  45 +++++++++
 r/NAMESPACE                           |   2 +
 r/R/R6.R                              |   6 +-
 r/R/RecordBatchWriter.R               |   2 +-
 r/R/array.R                           |   4 +-
 r/R/arrow-package.R                   |   7 +-
 r/R/feather.R                         |   6 +-
 r/R/install-arrow.R                   | 137 +++++++++++++++++++++++++
 r/README.Rmd                          |  90 ++++++++++-------
 r/README.md                           | 182 ++++++++++++++++++++++------------
 r/configure                           |   8 +-
 r/man/arrow_available.Rd              |  10 +-
 r/man/install_arrow.Rd                |  14 +++
 r/tests/testthat/test-install-arrow.R |  80 +++++++++++++++
 16 files changed, 486 insertions(+), 113 deletions(-)
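
The commit message above says that `install_arrow()` installs nothing and only prints advice that depends on the platform and package version. A minimal sketch of that selection logic follows, loosely mirroring `install_arrow_msg()` from this patch. The helper names `is_dev_version()` and `advice_kind()` and the short labels they return are hypothetical, introduced here for illustration only; the real function assembles longer message strings.

```r
# Hypothetical helper: a release such as "0.13.0" has three version
# components, while a development version such as "0.13.0.9000" has four.
is_dev_version <- function(version) {
  length(unclass(package_version(version))[[1]]) > 3
}

# Hypothetical helper: returns a short label for the kind of advice,
# following the same branch order as install_arrow_msg() in the patch.
advice_kind <- function(has_arrow, version, from_cran, os) {
  dev <- is_dev_version(version)
  if (has_arrow) {
    "already installed"
  } else if (os == "sunos") {
    "build from source"
  } else if (os == "linux") {
    if (dev) "build from source" else "PPA or build from source"
  } else if (!dev && !from_cran) {
    # Windows or macOS, released version that did not come from CRAN
    "install from CRAN"
  } else if (os == "windows") {
    "Appveyor binary or build from source"
  } else {
    # macOS
    "Homebrew --HEAD or build from source"
  }
}

advice_kind(FALSE, "0.13.0", FALSE, "darwin")      # "install from CRAN"
advice_kind(FALSE, "0.13.0.9000", FALSE, "linux")  # "build from source"
```

Under these assumptions, a released version on macOS or Windows that did not come from CRAN is pointed back to CRAN, while any development version is directed toward a development binary or a source build.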

diff --git a/r/.Rbuildignore b/r/.Rbuildignore
index b157f70..af5457a 100644
--- a/r/.Rbuildignore
+++ b/r/.Rbuildignore
@@ -14,3 +14,4 @@ clang_format.sh
 ^_pkgdown\.yml$
 ^docs$
 ^pkgdown$
+^Makefile$
diff --git a/r/DESCRIPTION b/r/DESCRIPTION
index 9bec314..70a6654 100644
--- a/r/DESCRIPTION
+++ b/r/DESCRIPTION
@@ -2,10 +2,11 @@ Package: arrow
 Title: Integration to 'Apache' 'Arrow'
 Version: 0.13.0.9000
 Authors@R: c(
-    person("Romain", "François", email = "romain@rstudio.com", role = c("aut", "cre")),
+    person("Romain", "Fran\u00e7ois", email = "romain@rstudio.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-2444-4226")),
     person("Javier", "Luraschi", email = "javier@rstudio.com", role = c("ctb")),
     person("Jeffrey", "Wong", email = "jeffreyw@netflix.com", role = c("ctb")),
     person("Jeroen", "Ooms", email = "jeroen@berkeley.edu", role = c("aut")),
+    person("Neal", "Richardson", email = "neal@ursalabs.org", role = c("aut")),
     person("Apache Arrow", email = "dev@arrow.apache.org", role = c("aut", "cph"))
   )
 Description: 'Apache' 'Arrow' <https://arrow.apache.org/> is a cross-language
@@ -23,6 +24,7 @@ SystemRequirements: C++11
 LinkingTo:
     Rcpp (>= 1.0.1)
 Imports:
+    utils,
     Rcpp (>= 1.0.1),
     rlang,
     purrr,
@@ -66,6 +68,7 @@ Collate:
     'csv.R'
     'dictionary.R'
     'feather.R'
+    'install-arrow.R'
     'json.R'
     'memory_pool.R'
     'message.R'
diff --git a/r/Makefile b/r/Makefile
new file mode 100644
index 0000000..f3bdbd6
--- /dev/null
+++ b/r/Makefile
@@ -0,0 +1,45 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+VERSION=$(shell grep ^Version DESCRIPTION | sed s/Version:\ //)
+ARROW_R_DEV="TRUE"
+
+doc:
+	R --slave -e 'devtools::document(); rmarkdown::render("README.Rmd")'
+	-git add --all man/*.Rd
+
+test:
+	export ARROW_R_DEV=$(ARROW_R_DEV) && R CMD INSTALL --install-tests .
+	export NOT_CRAN=true && R --slave -e 'library(testthat); setwd(file.path(.libPaths()[1], "arrow", "tests")); system.time(test_check("arrow", filter="${file}", reporter=ifelse(nchar("${r}"), "${r}", "summary")))'
+
+deps:
+	R --slave -e 'lib <- Sys.getenv("R_LIB", .libPaths()[1]); install.packages("devtools", repo="https://cloud.r-project.org", lib=lib); devtools::install_dev_deps(lib=lib)'
+
+build: doc
+	R CMD build .
+
+check: build
+	-export _R_CHECK_CRAN_INCOMING_REMOTE_=FALSE && export ARROW_R_DEV=$(ARROW_R_DEV) && R CMD check --as-cran arrow_$(VERSION).tar.gz
+	rm -rf arrow.Rcheck/
+
+clean:
+	-rm src/*.o
+	-rm src/*.so
+	-rm src/*.dll
+	-rm src/Makevars
+	-rm src/Makevars.win
+	-rm -rf arrow.Rcheck/
diff --git a/r/NAMESPACE b/r/NAMESPACE
index 281d27b..66f2004 100644
--- a/r/NAMESPACE
+++ b/r/NAMESPACE
@@ -133,6 +133,7 @@ export(field)
 export(float16)
 export(float32)
 export(float64)
+export(install_arrow)
 export(int16)
 export(int32)
 export(int64)
@@ -184,4 +185,5 @@ importFrom(rlang,dots_n)
 importFrom(rlang,is_false)
 importFrom(rlang,list2)
 importFrom(rlang,warn)
+importFrom(utils,packageVersion)
 useDynLib(arrow, .registration = TRUE)
diff --git a/r/R/R6.R b/r/R/R6.R
index 41169f3..06dd6f0 100644
--- a/r/R/R6.R
+++ b/r/R/R6.R
@@ -27,7 +27,7 @@
     },
     print = function(...){
       cat(class(self)[[1]], "\n")
-      if(!is.null(self$ToString)){
+      if (!is.null(self$ToString)){
         cat(self$ToString(), "\n")
       }
       invisible(self)
@@ -36,11 +36,11 @@
 )
 
 shared_ptr <- function(class, xp) {
-  if(!shared_ptr_is_null(xp)) class$new(xp)
+  if (!shared_ptr_is_null(xp)) class$new(xp)
 }
 
 unique_ptr <- function(class, xp) {
-  if(!unique_ptr_is_null(xp)) class$new(xp)
+  if (!unique_ptr_is_null(xp)) class$new(xp)
 }
 
 #' @export
diff --git a/r/R/RecordBatchWriter.R b/r/R/RecordBatchWriter.R
index 7730511..59aa984 100644
--- a/r/R/RecordBatchWriter.R
+++ b/r/R/RecordBatchWriter.R
@@ -44,7 +44,7 @@
     write = function(x) {
       if (inherits(x, "arrow::RecordBatch")) {
         self$write_batch(x)
-      } else if(inherits(x, "arrow::Table")) {
+      } else if (inherits(x, "arrow::Table")) {
         self$write_table(x)
       } else if (inherits(x, "data.frame")) {
         self$write_table(table(x))
diff --git a/r/R/array.R b/r/R/array.R
index 7e5e955..deb3bc5 100644
--- a/r/R/array.R
+++ b/r/R/array.R
@@ -132,11 +132,11 @@
 
 `arrow::Array`$dispatch <- function(xp){
   a <- shared_ptr(`arrow::Array`, xp)
-  if(a$type_id() == Type$DICTIONARY){
+  if (a$type_id() == Type$DICTIONARY){
     a <- shared_ptr(`arrow::DictionaryArray`, xp)
   } else if (a$type_id() == Type$STRUCT) {
     a <- shared_ptr(`arrow::StructArray`, xp)
-  } else if(a$type_id() == Type$LIST) {
+  } else if (a$type_id() == Type$LIST) {
     a <- shared_ptr(`arrow::ListArray`, xp)
   }
   a
diff --git a/r/R/arrow-package.R b/r/R/arrow-package.R
index faaaf2a..77b3a31 100644
--- a/r/R/arrow-package.R
+++ b/r/R/arrow-package.R
@@ -24,8 +24,13 @@
 #' @keywords internal
 "_PACKAGE"
 
-#' Is the C++ Arrow library available
+#' Is the C++ Arrow library available?
 #'
+#' You won't generally need to call this function, but it's here in case it
+#' helps for development purposes.
+#' @return `TRUE` or `FALSE` depending on whether the package was installed
+#' with the Arrow C++ library. If `FALSE`, you'll need to install the C++
+#' library and then reinstall the R package. See [install_arrow()] for help.
 #' @export
 arrow_available <- function() {
   .Call(`_arrow_available`)
diff --git a/r/R/feather.R b/r/R/feather.R
index 998f39b..f197fd0 100644
--- a/r/R/feather.R
+++ b/r/R/feather.R
@@ -141,7 +141,11 @@ FeatherTableReader.character <- function(file, mmap = TRUE, ...) {
 
 #' @export
 FeatherTableReader.fs_path <- function(file, mmap = TRUE, ...) {
-  stream <- if(isTRUE(mmap)) mmap_open(file, ...) else ReadableFile(file, ...)
+  if (isTRUE(mmap)) {
+    stream <- mmap_open(file, ...)
+  } else {
+    stream <- ReadableFile(file, ...)
+  }
   FeatherTableReader(stream)
 }
 
diff --git a/r/R/install-arrow.R b/r/R/install-arrow.R
new file mode 100644
index 0000000..5b30277
--- /dev/null
+++ b/r/R/install-arrow.R
@@ -0,0 +1,137 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#' Help installing the Arrow C++ library
+#'
+#' Binary package installations should come with a working Arrow C++ library,
+#' but when installing from source, you'll need to obtain the C++ library
+#' first. This function offers guidance on how to get the C++ library depending
+#' on your operating system and package version.
+#' @export
+#' @importFrom utils packageVersion
+install_arrow <- function() {
+  os <- tolower(Sys.info()[["sysname"]])
+  # c("windows", "darwin", "linux", "sunos") # win/mac/linux/solaris
+  version <- packageVersion("arrow")
+  # From CRAN check:
+  rep <- installed.packages(fields="Repository")["arrow", "Repository"]
+  from_cran <- identical(rep, "CRAN")
+  # Is it possible to tell if it was a binary install from CRAN vs. source?
+
+  message(install_arrow_msg(arrow_available(), version, from_cran, os))
+}
+
+install_arrow_msg <- function(has_arrow, version, from_cran, os) {
+  # TODO: check if there is a newer version on CRAN?
+
+  # install_arrow() sends "version" as a "package_version" class, but for
+  # convenience, this also accepts a string like "0.13.0". Calling
+  # `package_version` is idempotent so do it again, and then `unclass` to get
+  # the integers. Then see how many there are.
+  dev_version <- length(unclass(package_version(version))[[1]]) > 3
+  # Based on these parameters, assemble a string with installation advice
+  if (has_arrow) {
+    # Respond that you already have it
+    msg <- ALREADY_HAVE
+  } else if (os == "sunos") {
+    # Good luck with that.
+    msg <- c(SEE_DEV_GUIDE, THEN_REINSTALL)
+  } else if (os == "linux") {
+    if (dev_version) {
+      # Point to compilation instructions on readme
+      msg <- c(SEE_DEV_GUIDE, THEN_REINSTALL)
+    } else {
+      # Suggest arrow.apache.org/install for PPAs, or compilation instructions
+      msg <- c(paste(SEE_ARROW_INSTALL, OR_SEE_DEV_GUIDE), THEN_REINSTALL)
+    }
+  } else if (!dev_version && !from_cran) {
+    # Windows or Mac with a released version but not from CRAN
+    # Recommend installing released binary package from CRAN
+    msg <- INSTALL_FROM_CRAN
+  } else {
+    # Windows or Mac, most likely a dev version
+    # for each OS, recommend dev installation, refer to readme
+    # TODO: if there is a newer version on CRAN, recommend CRAN
+    if (os == "windows") {
+      msg <- c(paste(FIND_WIN_BINARY, OR_SEE_DEV_GUIDE), THEN_REINSTALL)
+    } else {
+      # macOS
+      msg <- c(paste(FIND_MAC_BINARY, OR_SEE_DEV_GUIDE), THEN_REINSTALL)
+    }
+  }
+  # Common postscript
+  msg <- c(msg, SEE_README, REPORT_ISSUE)
+  paste(msg, collapse="\n\n")
+}
+
+ALREADY_HAVE <- paste(
+  "It appears you already have Arrow installed successfully:",
+  "are you trying to install a different version of the library?"
+)
+
+SEE_DEV_GUIDE <- paste(
+  "See the Arrow C++ developer guide",
+  "<https://arrow.apache.org/docs/developers/cpp.html>",
+  "for instructions on building the library from source."
+)
+# Variation of that
+OR_SEE_DEV_GUIDE <- paste0(
+  "Or, s",
+  substr(SEE_DEV_GUIDE, 2, nchar(SEE_DEV_GUIDE))
+)
+
+SEE_ARROW_INSTALL <- paste(
+  "See the Apache Arrow project installation page",
+  "<https://arrow.apache.org/install/>",
+  "for how to install the C++ package from a PPA."
+)
+
+THEN_REINSTALL <- paste(
+  "After you've installed the C++ library,",
+  "you'll need to reinstall the R package from source to find it."
+)
+
+SEE_README <- paste(
+  "Refer to the R package README",
+  "<https://github.com/apache/arrow/blob/master/r/README.md>",
+  "for further details."
+)
+
+REPORT_ISSUE <- paste(
+  "If you have other trouble, or if you think this message could be improved,",
+  "please report an issue here:",
+  "<https://issues.apache.org/jira/projects/ARROW/issues>"
+)
+
+INSTALL_FROM_CRAN <- paste(
+  'Try installing the package from CRAN:',
+  '`install.packages("arrow")`'
+)
+
+FIND_WIN_BINARY <- paste(
+  "You may be able to download a development version of the C++ binary",
+  "from the Apache Arrow project's Appveyor:",
+  "<https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow>.",
+  "Select an R job from a recent build,",
+  'and download the `build\\arrow-*.zip` file from the "Artifacts" tab.',
+  "Then, set the RWINLIB_LOCAL environment variable to point to that file."
+)
+
+FIND_MAC_BINARY <- paste(
+  "You may be able to get a development version of the Arrow C++ library",
+  "using Homebrew: `brew install apache-arrow --HEAD`"
+)
diff --git a/r/README.Rmd b/r/README.Rmd
index f718732..6b83817 100644
--- a/r/README.Rmd
+++ b/r/README.Rmd
@@ -16,42 +16,75 @@ knitr::opts_chunk$set(
 ```
 # arrow
 
-[![conda-forge](https://img.shields.io/conda/vn/conda-forge/r-arrow.svg)](https://anaconda.org/conda-forge/r-arrow)
+[![cran](https://www.r-pkg.org/badges/version-last-release/arrow)](https://cran.r-project.org/package=arrow) [![conda-forge](https://img.shields.io/conda/vn/conda-forge/r-arrow.svg)](https://anaconda.org/conda-forge/r-arrow)
 
 [Apache Arrow](https://arrow.apache.org/) is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication.
 
-The `arrow` package exposes an interface to the Arrow C++ library.
+The `arrow` package exposes an interface to the Arrow C++ library to access many of its features in R. This includes support for working with Parquet (`read_parquet()`, `write_parquet()`) and Feather (`read_feather()`, `write_feather()`) files, as well as lower-level access to Arrow memory and messages.
 
 ## Installation
 
-Installing `arrow` is a two-step process: first install the Arrow C++ library, then the R package.
+Install the latest release of `arrow` from CRAN with
 
-### Release version
+```r
+install.packages("arrow")
+```
+
+On macOS and Windows, installing a binary package from CRAN will handle Arrow's C++ dependencies for you. On Linux, you'll need to first install the C++ library. See the [Arrow project installation page](https://arrow.apache.org/install/) for a list of PPAs from which you can obtain it.
 
-The current release of Apache Arrow is 0.13.
+If you install the `arrow` package from source and the C++ library is not found, the R package functions will notify you that Arrow is not available. Call
+
+```r
+arrow::install_arrow()
+```
 
-On macOS, you may install the C++ library using [Homebrew](https://brew.sh/):
+for version- and platform-specific guidance on installing the Arrow C++ library.
+
+## Example
+
+```{r}
+library(arrow)
+set.seed(24)
+
+tab <- arrow::table(x = 1:10, y = rnorm(10))
+tab$schema
+tab
+as.data.frame(tab)
+```
+
+## Installing a development version
+
+To use the development version of the R package, you'll need to install it from source, which requires the additional C++ library setup. On macOS, you may install the C++ library using [Homebrew](https://brew.sh/):
 
 ```shell
+# For the released version:
 brew install apache-arrow
+# Or for a development version, you can try:
+brew install apache-arrow --HEAD
 ```
 
-On Linux, see the [Arrow project installation page](https://arrow.apache.org/install/).
+On Windows, you can download a .zip file with the arrow dependencies from the [rwinlib](https://github.com/rwinlib/arrow/releases) project, and then set the `RWINLIB_LOCAL` environment variable to point to that zip file before installing the `arrow` R package. That project contains released versions of the C++ library; for a development version, Windows users may be able to find a binary by going to the [Apache Arrow project's Appveyor](https://ci.appveyor.com/project/ApacheSoftwareFound [...]
 
-Then, you can install the release version of the package from GitHub using the [`remotes`](https://remotes.r-lib.org/) package. From within an R session,
+Linux users can get a released version of the library from our PPAs, as described above. If you need a development version of the C++ library, you will likely need to build it from source. See "Developing" below.
+
+Once you have the C++ library, you can install the R package from GitHub using the [`remotes`](https://remotes.r-lib.org/) package. From within an R session,
 
 ```r
 # install.packages("remotes") # Or install "devtools", which includes remotes
-remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")
+remotes::install_github("apache/arrow/r")
 ```
 
 or if you prefer to stay at the command line,
 
 ```shell
-R -e 'remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")'
+R -e 'remotes::install_github("apache/arrow/r")'
 ```
 
-### Development version
+You can specify a particular commit, branch, or [release](https://github.com/apache/arrow/releases) to install by including a `ref` argument to `install_github()`.
+
+## Developing
+
+If you need to alter both the Arrow C++ library and the R package code, or if you can't get a binary version of the latest C++ library elsewhere, you'll need to build it from source too.
 
 First, clone the repository and install a release build of the C++ library.
 
@@ -62,11 +95,13 @@ cmake .. -DARROW_PARQUET=ON -DARROW_BOOST_USE_SHARED:BOOL=Off -DARROW_INSTALL_NA
 make install
 ```
 
-Then, you can install the R package and its dependencies from the git checkout:
+This likely will require additional system libraries to be installed, the specifics of which are platform dependent. See the [C++ developer guide](https://arrow.apache.org/docs/developers/cpp.html) for details.
+
+Once you've built the C++ library, you can install the R package and its dependencies, along with additional dev dependencies, from the git checkout:
 
 ```shell
 cd ../../r
-R -e 'install.packages("remotes"); remotes::install_deps()'
+R -e 'install.packages("devtools"); devtools::install_dev_deps()'
 R CMD INSTALL .
 ```
 
@@ -83,40 +118,27 @@ try setting the environment variable `LD_LIBRARY_PATH` (or `DYLD_LIBRARY_PATH` o
 
 For any other build/configuration challenges, see the [C++ developer guide](https://arrow.apache.org/docs/developers/cpp.html#building).
 
-## Example
-
-```{r}
-library(arrow)
-
-tab <- arrow::table(x = 1:10, y = rnorm(10))
-tab$schema
-tab
-as.data.frame(tab)
-```
-
-## Developing
-
-It is recommended to use the `devtools` package to help with some package-related manipulations. You'll need a few more R packages, which you can install using
+### Editing Rcpp code
 
-```r
-install.packages("devtools")
-devtools::install_dev_deps()
-```
+The `arrow` package uses some customized tools on top of `Rcpp` to prepare its C++ code in `src/`. If you change C++ code in the R package, you will need to set the `ARROW_R_DEV` environment variable to `TRUE` (optionally, add it to your `~/.Renviron` file to persist across sessions) so that the `data-raw/codegen.R` file is used for code generation.
 
-If you change C++ code, you need to set the `ARROW_R_DEV` environment variable to `TRUE`, e.g. 
-in your `~/.Renviron` so that the `data-raw/codegen.R` file is used for code generation. 
+You'll also need `remotes::install_github("romainfrancois/decor")`.
 
 ### Useful functions
 
+Within an R session, these can help with package development:
+
 ```r
 devtools::load_all() # Load the dev package
 devtools::test(filter="^regexp$") # Run the test suite, optionally filtering file names
 devtools::document() # Update roxygen documentation
 rmarkdown::render("README.Rmd") # To rebuild README.md
-pkgdown::build_site() # To preview the documentation website
+pkgdown::build_site(run_dont_run=TRUE) # To preview the documentation website
 devtools::check() # All package checks; see also below
 ```
 
+Any of those can be run from the command line by wrapping them in `R -e '$COMMAND'`. There's also a `Makefile` to help with some common tasks from the command line (`make test`, `make doc`, `make clean`, etc.)
+
 ### Full package validation
 
 ```shell
diff --git a/r/README.md b/r/README.md
index b584486..6fc2e33 100644
--- a/r/README.md
+++ b/r/README.md
@@ -3,6 +3,7 @@
 
 # arrow
 
+[![cran](https://www.r-pkg.org/badges/version-last-release/arrow)](https://cran.r-project.org/package=arrow)
 [![conda-forge](https://img.shields.io/conda/vn/conda-forge/r-arrow.svg)](https://anaconda.org/conda-forge/r-arrow)
 
 [Apache Arrow](https://arrow.apache.org/) is a cross-language
@@ -12,43 +13,125 @@ data, organized for efficient analytic operations on modern hardware. It
 also provides computational libraries and zero-copy streaming messaging
 and interprocess communication.
 
-The `arrow` package exposes an interface to the Arrow C++ library.
+The `arrow` package exposes an interface to the Arrow C++ library to
+access many of its features in R. This includes support for working with
+Parquet (`read_parquet()`, `write_parquet()`) and Feather
+(`read_feather()`, `write_feather()`) files, as well as lower-level
+access to Arrow memory and messages.
 
 ## Installation
 
-Installing `arrow` is a two-step process: first install the Arrow C++
-library, then the R package.
+Install the latest release of `arrow` from CRAN with
 
-### Release version
+``` r
+install.packages("arrow")
+```
+
+On macOS and Windows, installing a binary package from CRAN will handle
+Arrow’s C++ dependencies for you. On Linux, you’ll need to first install
+the C++ library. See the [Arrow project installation
+page](https://arrow.apache.org/install/) for a list of PPAs from which
+you can obtain it.
 
-The current release of Apache Arrow is 0.13.
+If you install the `arrow` package from source and the C++ library is
+not found, the R package functions will notify you that Arrow is not
+available. Call
 
-On macOS, you may install the C++ library using
+``` r
+arrow::install_arrow()
+```
+
+for version- and platform-specific guidance on installing the Arrow C++
+library.
+
+## Example
+
+``` r
+library(arrow)
+#> 
+#> Attaching package: 'arrow'
+#> The following object is masked from 'package:utils':
+#> 
+#>     timestamp
+#> The following objects are masked from 'package:base':
+#> 
+#>     array, table
+set.seed(24)
+
+tab <- arrow::table(x = 1:10, y = rnorm(10))
+tab$schema
+#> arrow::Schema 
+#> x: int32
+#> y: double
+tab
+#> arrow::Table
+as.data.frame(tab)
+#>     x            y
+#> 1   1 -0.545880758
+#> 2   2  0.536585304
+#> 3   3  0.419623149
+#> 4   4 -0.583627199
+#> 5   5  0.847460017
+#> 6   6  0.266021979
+#> 7   7  0.444585270
+#> 8   8 -0.466495124
+#> 9   9 -0.848370044
+#> 10 10  0.002311942
+```
+
+## Installing a development version
+
+To use the development version of the R package, you’ll need to install
+it from source, which requires the additional C++ library setup. On
+macOS, you may install the C++ library using
 [Homebrew](https://brew.sh/):
 
 ``` shell
+# For the released version:
 brew install apache-arrow
+# Or for a development version, you can try:
+brew install apache-arrow --HEAD
 ```
 
-On Linux, see the [Arrow project installation
-page](https://arrow.apache.org/install/).
-
-Then, you can install the release version of the package from GitHub
+On Windows, you can download a .zip file with the arrow dependencies
+from the [rwinlib](https://github.com/rwinlib/arrow/releases) project,
+and then set the `RWINLIB_LOCAL` environment variable to point to that
+zip file before installing the `arrow` R package. That project contains
+released versions of the C++ library; for a development version, Windows
+users may be able to find a binary by going to the [Apache Arrow
+project’s
+Appveyor](https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow),
+selecting an R job from a recent build, and downloading the
+`build\arrow-*.zip` file from the “Artifacts” tab.
+
+Linux users can get a released version of the library from our PPAs, as
+described above. If you need a development version of the C++ library,
+you will likely need to build it from source. See “Developing” below.
+
+Once you have the C++ library, you can install the R package from GitHub
 using the [`remotes`](https://remotes.r-lib.org/) package. From within
 an R session,
 
 ``` r
 # install.packages("remotes") # Or install "devtools", which includes remotes
-remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")
+remotes::install_github("apache/arrow/r")
 ```
 
 or if you prefer to stay at the command line,
 
 ``` shell
-R -e 'remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")'
+R -e 'remotes::install_github("apache/arrow/r")'
 ```
 
-### Development version
+You can specify a particular commit, branch, or
+[release](https://github.com/apache/arrow/releases) to install by
+including a `ref` argument to `install_github()`.
+
+## Developing
+
+If you need to alter both the Arrow C++ library and the R package code,
+or if you can’t get a binary version of the latest C++ library
+elsewhere, you’ll need to build it from source too.
 
 First, clone the repository and install a release build of the C++
 library.
@@ -60,12 +143,17 @@ cmake .. -DARROW_PARQUET=ON -DARROW_BOOST_USE_SHARED:BOOL=Off -DARROW_INSTALL_NA
 make install
 ```
 
-Then, you can install the R package and its dependencies from the git
+This likely will require additional system libraries to be installed,
+the specifics of which are platform dependent. See the [C++ developer
+guide](https://arrow.apache.org/docs/developers/cpp.html) for details.
+
+Once you’ve built the C++ library, you can install the R package and its
+dependencies, along with additional dev dependencies, from the git
 checkout:
 
 ``` shell
 cd ../../r
-R -e 'install.packages("remotes"); remotes::install_deps()'
+R -e 'install.packages("devtools"); devtools::install_dev_deps()'
 R CMD INSTALL .
 ```
 
@@ -84,68 +172,34 @@ installing the R package.
 For any other build/configuration challenges, see the [C++ developer
 guide](https://arrow.apache.org/docs/developers/cpp.html#building).
 
-## Example
+### Editing Rcpp code
 
-``` r
-library(arrow)
-#> 
-#> Attaching package: 'arrow'
-#> The following object is masked from 'package:utils':
-#> 
-#>     timestamp
-#> The following objects are masked from 'package:base':
-#> 
-#>     array, table
+The `arrow` package uses some customized tools on top of `Rcpp` to
+prepare its C++ code in `src/`. If you change C++ code in the R package,
+you will need to set the `ARROW_R_DEV` environment variable to `TRUE`
+(optionally, add it to your `~/.Renviron` file to persist across
+sessions) so that the `data-raw/codegen.R` file is used for code
+generation.
 
-tab <- arrow::table(x = 1:10, y = rnorm(10))
-tab$schema
-#> arrow::Schema 
-#> x: int32
-#> y: double
-tab
-#> arrow::Table
-as.data.frame(tab)
-#> # A tibble: 10 x 2
-#>        x      y
-#>    <int>  <dbl>
-#>  1     1 -1.56 
-#>  2     2 -0.147
-#>  3     3 -1.16 
-#>  4     4  0.106
-#>  5     5  1.14 
-#>  6     6  0.340
-#>  7     7  0.184
-#>  8     8 -1.01 
-#>  9     9  1.77 
-#> 10    10  0.344
-```
-
-## Developing
-
-It is recommended to use the `devtools` package to help with some
-package-related manipulations. You’ll need a few more R packages, which
-you can install using
-
-``` r
-install.packages("devtools")
-devtools::install_dev_deps()
-```
-
-If you change C++ code, you need to set the `ARROW_R_DEV` environment
-variable to `TRUE`, e.g.  in your `~/.Renviron` so that the
-`data-raw/codegen.R` file is used for code generation.
+You’ll also need `remotes::install_github("romainfrancois/decor")`.
 
 ### Useful functions
 
+Within an R session, these can help with package development:
+
 ``` r
 devtools::load_all() # Load the dev package
 devtools::test(filter="^regexp$") # Run the test suite, optionally filtering file names
 devtools::document() # Update roxygen documentation
 rmarkdown::render("README.Rmd") # To rebuild README.md
-pkgdown::build_site() # To preview the documentation website
+pkgdown::build_site(run_dont_run=TRUE) # To preview the documentation website
 devtools::check() # All package checks; see also below
 ```
 
+Any of those can be run from the command line by wrapping them in `R -e
+'$COMMAND'`. There’s also a `Makefile` to help with some common tasks
+from the command line (`make test`, `make doc`, `make clean`, etc.)
+
 ### Full package validation
 
 ``` shell
diff --git a/r/configure b/r/configure
index 4b3484f..d5a2688 100755
--- a/r/configure
+++ b/r/configure
@@ -69,8 +69,8 @@ else
         BREWDIR=$(brew --prefix)
 
         if brew ls --versions apache-arrow > /dev/null; then
-          # right now, we need HEAD version
-          brew install apache-arrow --HEAD
+          # Install apache-arrow via Homebrew if we don't already have it
+          brew install apache-arrow
         fi
 
         PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include"
@@ -104,8 +104,8 @@ echo "#include $PKG_TEST_HEADER" | ${CXXCPP} ${CPPFLAGS} ${PKG_CFLAGS} ${CXX11FL
 # Customize the error
 if [ $? -ne 0 ]; then
   echo "------------------------- NOTE ---------------------------"
-  echo "After installation, please run arrow::install_arrow() to install"
-  echo "required runtime libraries"
+  echo "After installation, please run arrow::install_arrow()"
+  echo "for help installing required runtime libraries"
   echo "---------------------------------------------------------"
   PKG_LIBS=""
   PKG_CFLAGS=""
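
The new comment in the first r/configure hunk above describes installing apache-arrow via Homebrew only when it is not already present. A minimal sketch of that kind of check is below. It assumes `brew` is on the PATH, and `ensure_brew_formula` is a hypothetical helper name that is not part of r/configure. `brew ls --versions <formula>` exits non-zero when the formula is not installed, so the install step sits in the `else` branch.

```shell
# Hypothetical helper, not part of r/configure: install a Homebrew formula
# only when `brew ls --versions` reports that it is missing.
ensure_brew_formula() {
  formula="$1"
  if brew ls --versions "$formula" > /dev/null 2>&1; then
    # Exit status 0 means at least one version is installed; do nothing.
    echo "$formula already installed"
  else
    # Non-zero exit status means the formula is absent; install it.
    brew install "$formula"
  fi
}
```

It could be called as `ensure_brew_formula apache-arrow` before the script reads `$BREWDIR/opt/...` to set its compiler flags.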
diff --git a/r/man/arrow_available.Rd b/r/man/arrow_available.Rd
index 26a01ca..ea2c54a 100644
--- a/r/man/arrow_available.Rd
+++ b/r/man/arrow_available.Rd
@@ -2,10 +2,16 @@
 % Please edit documentation in R/arrow-package.R
 \name{arrow_available}
 \alias{arrow_available}
-\title{Is the C++ Arrow library available}
+\title{Is the C++ Arrow library available?}
 \usage{
 arrow_available()
 }
+\value{
+\code{TRUE} or \code{FALSE} depending on whether the package was installed
+with the Arrow C++ library. If \code{FALSE}, you'll need to install the C++
+library and then reinstall the R package. See \code{\link[=install_arrow]{install_arrow()}} for help.
+}
 \description{
-Is the C++ Arrow library available
+You won't generally need to call this function, but it's here in case it
+helps for development purposes.
 }
diff --git a/r/man/install_arrow.Rd b/r/man/install_arrow.Rd
new file mode 100644
index 0000000..4393258
--- /dev/null
+++ b/r/man/install_arrow.Rd
@@ -0,0 +1,14 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/install-arrow.R
+\name{install_arrow}
+\alias{install_arrow}
+\title{Help installing the Arrow C++ library}
+\usage{
+install_arrow()
+}
+\description{
+Binary package installations should come with a working Arrow C++ library,
+but when installing from source, you'll need to obtain the C++ library
+first. This function offers guidance on how to get the C++ library depending
+on your operating system and package version.
+}
diff --git a/r/tests/testthat/test-install-arrow.R b/r/tests/testthat/test-install-arrow.R
new file mode 100644
index 0000000..318e8e0
--- /dev/null
+++ b/r/tests/testthat/test-install-arrow.R
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+context("install_arrow()")
+
+test_that("install_arrow() prints a message", {
+  expect_message(install_arrow())
+})
+
+i_have_arrow_msg <- "It appears you already have Arrow installed successfully: are you trying to install a different version of the library?
+
+Refer to the R package README <https://github.com/apache/arrow/blob/master/r/README.md> for further details.
+
+If you have other trouble, or if you think this message could be improved, please report an issue here: <https://issues.apache.org/jira/projects/ARROW/issues>"
+
+test_that("Messages get the standard postscript appended", {
+  expect_identical(
+    install_arrow_msg(has_arrow = TRUE, "0.13.0"),
+    i_have_arrow_msg
+  )
+})
+
+test_that("Solaris and Linux dev version get pointed to C++ guide", {
+  expect_match(
+    install_arrow_msg(FALSE, "0.13.0", os="sunos"),
+    "See the Arrow C++ developer guide",
+    fixed = TRUE
+  )
+  expect_match(
+    install_arrow_msg(FALSE, "0.13.0.9000", os="linux"),
+    "See the Arrow C++ developer guide",
+    fixed = TRUE
+  )
+})
+
+test_that("Linux on release version gets pointed to PPA first, then C++", {
+  expect_match(
+    install_arrow_msg(FALSE, "0.13.0", os="linux"),
+    "PPA. Or, see the Arrow C++ developer guide",
+    fixed = TRUE
+  )
+})
+
+test_that("Win/mac release version get pointed to CRAN", {
+  expect_match(
+    install_arrow_msg(FALSE, "0.13.0", os="darwin", from_cran=FALSE),
+    "install.packages",
+    fixed = TRUE
+  )
+  expect_match(
+    install_arrow_msg(FALSE, "0.13.0", os="windows", from_cran=FALSE),
+    "install.packages",
+    fixed = TRUE
+  )
+})
+
+test_that("Win/mac dev version get recommendations", {
+  expect_match(
+    install_arrow_msg(FALSE, "0.13.0.9000", os="darwin", from_cran=FALSE),
+    "Homebrew"
+  )
+  expect_match(
+    install_arrow_msg(FALSE, "0.13.0.9000", os="windows", from_cran=FALSE),
+    "RWINLIB_LOCAL"
+  )
+})