Posted to commits@arrow.apache.org by np...@apache.org on 2023/05/03 14:37:18 UTC

[arrow] branch main updated: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow (#35147)

This is an automated email from the ASF dual-hosted git repository.

npr pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new ec89360212 GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow (#35147)
ec89360212 is described below

commit ec893602124b776fb42261361d1a2d21a6d61f06
Author: Neal Richardson <ne...@gmail.com>
AuthorDate: Wed May 3 10:37:09 2023 -0400

    GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow (#35147)
    
    I've significantly rewritten `r/configure` to make it easier to reason about and harder for issues like https://github.com/apache/arrow/pull/34229 and #35140 to happen. I've also added a version check so that we don't try to use a system C++ library whose version obviously doesn't match the R package version. Making sure this check was applied in all of the right places, and deciding what to do when the versions didn't match, was the impetus for the whole refactor.
    
    `configure` has been broken up into functions. The flow of the script, as documented at the top of the file, is:
    
    ```
    # * Find libarrow on the system. If it is present, make sure
    #   that its version is compatible with the R package.
    # * If no suitable libarrow is found, download it (where allowed)
    #   or build it from source.
    # * Determine what features this libarrow has and what other
    #   flags it requires, and set them in src/Makevars for use when
    #   compiling the bindings.
    # * Run a test program to confirm that arrow headers are found
    ```
    
    All of the detection of CFLAGS, `-L` dirs, etc. now happens in one place, and it prefers using `pkg-config` to read from the libarrow build what libraries and flags it requires, rather than hard-coding them. (autobrew is the only remaining exception, but I didn't feel like messing with that today.) This should make the builds more future-proof and should make more build configurations work (e.g. I suspect that a static build in ARROW_HOME wouldn't have gotten picked up correctly b [...]
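    
    As a rough illustration of the "prefer `pkg-config`, fall back to inference" idea, here is a minimal sketch. `flags_for_prefix`, `CFLAGS_OUT`, and `LIBS_OUT` are hypothetical names chosen for this example and do not appear in `r/configure`; the sketch also overrides `PKG_CONFIG_PATH` for the query instead of prepending to it, which is a simplification.
    
    ```shell
    # Illustrative sketch only; flags_for_prefix is a hypothetical helper,
    # not a function in r/configure.
    flags_for_prefix () {
      prefix="$1"
      pc="${PKG_CONFIG:-pkg-config}"
      if PKG_CONFIG_PATH="$prefix/lib/pkgconfig" $pc --exists arrow 2>/dev/null; then
        # pkg-config knows about arrow: ask the libarrow build what it needs
        CFLAGS_OUT=`PKG_CONFIG_PATH="$prefix/lib/pkgconfig" $pc --cflags arrow`
        LIBS_OUT=`PKG_CONFIG_PATH="$prefix/lib/pkgconfig" $pc --libs arrow`
      else
        # No usable pkg-config result: guess from the directory layout
        CFLAGS_OUT="-I$prefix/include"
        LIBS_OUT="-L$prefix/lib -larrow"
      fi
    }
    ```
    
    Calling `flags_for_prefix "$ARROW_HOME"` would then leave the compile and link flags in the two output variables. The fallback branch only covers the simplest layout, which is why reading the flags from `pkg-config` is the preferred route.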
    
    Version checking has been added in an R script for ease of testing (and for easier handling of arithmetic), and there is an accompanying `test-check-versions.R` added. These are run on all the builds that use `ci/scripts/r_test.sh`.
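    
    To make the matching rule concrete, here is a small shell sketch of the kind of comparison involved. It is not the code in `tools/check-versions.R`, which is written in R. `versions_compatible` is a hypothetical name, and treating a fourth version component of 9000 or more as "dev" is an assumption made for this example.
    
    ```shell
    # Hypothetical helper, not the logic in tools/check-versions.R.
    # Returns 0 if the two versions look compatible, 1 otherwise.
    versions_compatible () {
      r_version="$1"
      cpp_version="$2"
      # Exact match: a release R package with the same libarrow release
      if [ "$r_version" = "$cpp_version" ]; then
        return 0
      fi
      # Dev match: R is x.y.z.9000 (or later), C++ is (x+1).y.z-SNAPSHOT
      r_major=`echo "$r_version" | cut -d. -f1`
      r_dev=`echo "$r_version" | cut -d. -f4`
      cpp_major=`echo "$cpp_version" | cut -d. -f1`
      case "$cpp_version" in
        *-SNAPSHOT) cpp_is_dev="true" ;;
        *) cpp_is_dev="false" ;;
      esac
      if [ "$cpp_is_dev" = "true" ] && [ -n "$r_dev" ] && [ "$r_dev" -ge 9000 ] \
         && [ `expr "$r_major" + 1` -eq "$cpp_major" ]; then
        echo "*** dev versions match; rebuild libarrow if you hit problems"
        return 0
      fi
      return 1
    }
    ```
    
    Under these assumptions, `versions_compatible 12.0.0.9000 13.0.0-SNAPSHOT` succeeds with a message, while `versions_compatible 12.0.0 11.0.0` fails, which is the case where `configure` would discard the found library.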
    
    ### Behavior changes
    
    * If libarrow is found on the system (via ARROW_HOME, pkg-config, or brew), but the version does not match, it will not be used, and we will try a bundled build. This should mean that users installing a released version will never have libarrow version problems.
    * If both the found C++ library and R package are on matching dev versions (i.e. not identical given the x.y.z.9000 vs x+1.y.z-SNAPSHOT difference), it will proceed with a warning that you may need to rebuild if there are issues. This means that regular developers will see an extra message in the build output.
    * autobrew is only used on a release version unless you set FORCE_AUTOBREW=true. This eliminates another source of version mismatches (C++ release version, R dev version).
    * The path where you could set `LIB_DIR` and `INCLUDE_DIR` env vars has been removed. Use `ARROW_HOME` instead.
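    
    The autobrew rule above can be sketched as a small decision function. This is a simplified illustration for the case where no suitable libarrow was found; `choose_fallback` is a hypothetical name, and `FORCE_BUNDLED_BUILD` (which takes precedence in the script) is left out.
    
    ```shell
    # Hypothetical helper: which build path is used when no suitable
    # libarrow was found on the system.
    # $1 = R package version, $2 = output of `uname -s`, $3 = FORCE_AUTOBREW
    choose_fallback () {
      version="$1"
      uname_s="$2"
      force_autobrew="$3"
      if [ "$force_autobrew" = "true" ]; then
        echo "autobrew"
      elif [ "$uname_s" = "Darwin" ] && ! echo "$version" | grep -q "000"; then
        # Release versions on macOS use autobrew; dev versions
        # (x.y.z.9000 and similar) contain "000" and are excluded
        echo "autobrew"
      else
        echo "bundled"
      fi
    }
    ```
    
    For example, `choose_fallback 12.0.0.9000 Darwin ""` selects the bundled build, while the same call with `true` as the third argument selects autobrew.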
    
    * Closes: #35140
    * Closes: #31989
    
    Lead-authored-by: Neal Richardson <ne...@gmail.com>
    Co-authored-by: Sutou Kouhei <ko...@cozmixng.org>
    Signed-off-by: Neal Richardson <ne...@gmail.com>
---
 dev/tasks/conda-recipes/r-arrow/meta.yaml  |   4 +-
 dev/tasks/r/github.macos.autobrew.yml      |   1 +
 dev/tasks/r/github.packages.yml            |   3 +-
 r/Makefile                                 |   2 +-
 r/configure                                | 555 ++++++++++++++++-------------
 r/inst/build_arrow_static.sh               |   8 +-
 r/tools/check-versions.R                   |  59 +++
 r/tools/nixlibs.R                          |  30 +-
 r/tools/test-check-versions.R              |  62 ++++
 r/vignettes/developers/install_details.Rmd |  42 ++-
 r/vignettes/developers/install_nix.png     | Bin 99333 -> 0 bytes
 r/vignettes/install.Rmd                    |   9 +-
 r/vignettes/install_nightly.Rmd            |   2 +-
 13 files changed, 502 insertions(+), 275 deletions(-)

diff --git a/dev/tasks/conda-recipes/r-arrow/meta.yaml b/dev/tasks/conda-recipes/r-arrow/meta.yaml
index 4c86dc9280..28ee8eb92c 100644
--- a/dev/tasks/conda-recipes/r-arrow/meta.yaml
+++ b/dev/tasks/conda-recipes/r-arrow/meta.yaml
@@ -59,8 +59,8 @@ requirements:
 
 test:
   commands:
-    - $R -e "library('arrow')"           # [not win]
-    - "\"%R%\" -e \"library('arrow'); data(mtcars); write_parquet(mtcars, 'test.parquet')\""  # [win]
+    - $R -e "library('arrow'); stopifnot(arrow_with_acero(), arrow_with_dataset(), arrow_with_parquet(), arrow_with_s3())"           # [not win]
+    - "\"%R%\" -e \"library('arrow'); stopifnot(arrow_with_acero(), arrow_with_dataset(), arrow_with_parquet(), arrow_with_s3())\""  # [win]
 
 about:
   home: https://github.com/apache/arrow
diff --git a/dev/tasks/r/github.macos.autobrew.yml b/dev/tasks/r/github.macos.autobrew.yml
index 67157e854f..28733dbfef 100644
--- a/dev/tasks/r/github.macos.autobrew.yml
+++ b/dev/tasks/r/github.macos.autobrew.yml
@@ -71,6 +71,7 @@ jobs:
           NOT_CRAN: true
           ARROW_USE_PKG_CONFIG: false
           ARROW_R_DEV: true
+          FORCE_AUTOBREW: true
         {{ macros.github_set_sccache_envvars()|indent(8)}}  
         run: arrow/ci/scripts/r_test.sh arrow
       - name: Dump install logs
diff --git a/dev/tasks/r/github.packages.yml b/dev/tasks/r/github.packages.yml
index 32ad6f3ab3..e3e3d34e15 100644
--- a/dev/tasks/r/github.packages.yml
+++ b/dev/tasks/r/github.packages.yml
@@ -186,7 +186,8 @@ jobs:
         shell: Rscript {0}
         env:
           NOT_CRAN: "true" # actions/setup-r sets this implicitly
-          ARROW_R_DEV: TRUE
+          ARROW_R_DEV: "true"
+          FORCE_AUTOBREW: "true" # this is ignored on windows
           # sccache for macos
         {{ macros.github_set_sccache_envvars()|indent(8) }}
         run: |
diff --git a/r/Makefile b/r/Makefile
index 07a350e4e9..3679840ca9 100644
--- a/r/Makefile
+++ b/r/Makefile
@@ -31,7 +31,7 @@ doc: style
 	-git add --all man/*.Rd
 
 test:
-	export ARROW_R_DEV=$(ARROW_R_DEV) && R CMD INSTALL --install-tests --no-test-load --no-docs --no-help --no-byte-compile .
+	export NOT_CRAN=true && export ARROW_R_DEV=$(ARROW_R_DEV) && R CMD INSTALL --install-tests --no-test-load --no-docs --no-help --no-byte-compile .
 	export NOT_CRAN=true && export ARROW_R_DEV=$(ARROW_R_DEV) && export AWS_EC2_METADATA_DISABLED=TRUE && export ARROW_LARGE_MEMORY_TESTS=$(ARROW_LARGE_MEMORY_TESTS) && R -s -e 'library(testthat); setwd(file.path(.libPaths()[1], "arrow", "tests")); system.time(test_check("arrow", filter="${file}", reporter=ifelse(nchar("${r}"), "${r}", "summary")))'
 
 deps:
diff --git a/r/configure b/r/configure
index 65528bdc38..a6d6f37037 100755
--- a/r/configure
+++ b/r/configure
@@ -17,39 +17,81 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# Anticonf (tm) script by Jeroen Ooms, Jim Hester (2017)
-# License: MIT
+# This script makes sure we have our C++ dependencies in order
+# so that we can compile the bindings in src/ and build the
+# R package's shared library.
 #
-# This script will query 'pkg-config' for the required cflags and ldflags.
-# If pkg-config is unavailable or does not find the library, try setting
-# INCLUDE_DIR and LIB_DIR manually via e.g:
-# R CMD INSTALL --configure-vars='INCLUDE_DIR=/.../include LIB_DIR=/.../lib'
+# The core logic is:
+#
+# * Find libarrow on the system. If it is present, make sure
+#   that its version is compatible with the R package.
+# * If no suitable libarrow is found, download it (where allowed)
+#   or build it from source.
+# * Determine what features this libarrow has and what other
+#   flags it requires, and set them in src/Makevars for use when
+#   compiling the bindings.
+# * Run a test program to confirm that arrow headers are found
+#
+# It is intended to be flexible enough to account for several
+# workflows, and should be expected to always result in a successful
+# build as long as you haven't set environment variables that
+# would limit its options.
+#
+# * Installing a released version from source, as from CRAN, with
+#   no other prior setup
+#   * On macOS, autobrew is used to retrieve libarrow and dependencies
+#   * On Linux, the nixlibs.R build script will download or build
+# * Installing a released version but first installing libarrow.
+#   It will use pkg-config and brew to search for libraries.
+# * Installing a development version from source as a user.
+#   Any libarrow found on the system corresponding to a release will not match
+#   the dev version and will thus be skipped. nixlibs.R will be used to build libarrow.
+# * TODO(GH-35049): Installing a dev version as an infrequent contributor.
+#   Currently the configure script doesn't offer much to make this easy.
+#   If you expect to rebuild multiple times, you should set up a dev
+#   environment.
+# * Installing a dev version as a regular developer. 
+#   The best way is to maintain your own cmake build and install it
+#   to a directory (not system) that you set as the env var
+#   $ARROW_HOME.  
+#
+# For more information, see the various installation and developer vignettes.
 
 # Library settings
 PKG_CONFIG_NAME="arrow"
-PKG_DEB_NAME="(unsuppored)"
-PKG_RPM_NAME="(unsuppored)"
 PKG_BREW_NAME="apache-arrow"
 PKG_TEST_HEADER="<arrow/api.h>"
 
-# Make some env vars case-insensitive
+# Some env vars that control the build (all logical, case insensitive)
+# Development mode, also increases verbosity in the bundled build
 ARROW_R_DEV=`echo $ARROW_R_DEV | tr '[:upper:]' '[:lower:]'`
+# autobrew is how mac binaries are built on CRAN; FORCE ensures we use it here
 FORCE_AUTOBREW=`echo $FORCE_AUTOBREW | tr '[:upper:]' '[:lower:]'`
+# The bundled build compiles arrow C++ from source; FORCE ensures we don't pick up
+# any other packages that may be found on the system
 FORCE_BUNDLED_BUILD=`echo $FORCE_BUNDLED_BUILD | tr '[:upper:]' '[:lower:]'`
+# If present, `pkg-config` will be used to find libarrow on the system,
+# unless this is set to false
 ARROW_USE_PKG_CONFIG=`echo $ARROW_USE_PKG_CONFIG | tr '[:upper:]' '[:lower:]'`
-LIBARROW_MINIMAL=`echo $LIBARROW_MINIMAL | tr '[:upper:]' '[:lower:]'`
+# Just used in testing: whether or not it is ok to download dependencies (in the
+# bundled build)
 TEST_OFFLINE_BUILD=`echo $TEST_OFFLINE_BUILD | tr '[:upper:]' '[:lower:]'`
-NOT_CRAN=`echo $NOT_CRAN | tr '[:upper:]' '[:lower:]'`
 
 VERSION=`grep '^Version' DESCRIPTION | sed s/Version:\ //`
 UNAME=`uname -s`
+: ${PKG_CONFIG:="pkg-config"}
 
-# generate code
+# These will only be set in the bundled build
+S3_LIBS=""
+GCS_LIBS=""
+
+# If in development mode, run the codegen script to render arrowExports.*
 if [ "$ARROW_R_DEV" = "true" ] && [ -f "data-raw/codegen.R" ]; then
   echo "*** Generating code with data-raw/codegen.R"
   ${R_HOME}/bin/Rscript data-raw/codegen.R
 fi
 
+# Arrow requires C++17, so check for it
 if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   echo "------------------------- NOTE ---------------------------"
   echo "Cannot install arrow: a C++17 compiler is required."
@@ -58,191 +100,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "${OPENSSL_ROOT_DIR}" = "" ] && brew --prefix openssl >/dev/null 2>&1; then
+  export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+  export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
-  if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
+#############
+# Functions #
+#############
+
+find_or_build_libarrow () {
+  if [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+    do_bundled_build
+  elif [ "$FORCE_AUTOBREW" = "true" ]; then
+    do_autobrew
   else
-    PKG_CONFIG_AVAILABLE=false
+    find_arrow
+    if [ "$_LIBARROW_FOUND" = "false" ]; then
+      # If we haven't found a suitable version of libarrow, build it
+      if [ "$UNAME" = "Darwin" ] && ! echo $VERSION | grep -q "000"; then
+        # Only autobrew on release version (for testing, use FORCE_AUTOBREW above)
+        # (dev versions end in .9000, and nightly gets something like .10000xxx)
+        do_autobrew
+      else
+        do_bundled_build
+      fi
+    fi
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+}
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --variable=prefix --silence-errors ${PKG_CONFIG_NAME}`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif brew --prefix ${PKG_BREW_NAME} > /dev/null 2>&1; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion ${PKG_CONFIG_NAME}`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    if ! ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    export PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # These would be identified by pkg-config, in Requires.private and Libs.private.
+      # Rather than try to re-implement pkg-config, we can just hard-code them here.
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    if ! curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+      # Fall back to the local copy
+      cp tools/autobrew .
     fi
   fi
-fi
+  if ! . autobrew; then
+    echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
+  fi
+  # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+  # TODO: move PKG_LIBS and PKG_CFLAGS out of autobrew and use set_pkg_vars
+}
 
-# If on Raspberry Pi, need to manually link against latomic
-# See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
-if grep raspbian /etc/os-release >/dev/null 2>&1; then
-  PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
-  PKG_LIBS="-latomic $PKG_LIBS"
-fi
+# Once libarrow is obtained, this function sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+# either from pkg-config or by inferring things about the directory in $1
+set_pkg_vars () {
+  LIB_DIR="$1/lib"
+  if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+    set_pkg_vars_with_pc
+  else
+    set_pkg_vars_without_pc $1
+  fi
+
+  # Set any user-defined CXXFLAGS
+  if [ "$ARROW_R_CXXFLAGS" ]; then
+    PKG_CFLAGS="$PKG_CFLAGS $ARROW_R_CXXFLAGS"
+  fi
+
+  # Finally, check cmake options for enabled features
+  add_feature_flags
+}
+
+# If we have pkg-config, it will tell us what libarrow needs
+set_pkg_vars_with_pc () {
+  PKG_CFLAGS="`${PKG_CONFIG} --cflags --silence-errors ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
+  PKG_LIBS=`${PKG_CONFIG} --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
+  PKG_DIRS=`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+}
+
+# If we don't have pkg-config, we can make some inferences
+set_pkg_vars_without_pc () {
+  PKG_CFLAGS="-I$1/include $PKG_CFLAGS"
+  if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
+    PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
+  fi
+  PKG_DIRS="-L${LIB_DIR}"
+  if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
+    PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
+  fi
+  PKG_LIBS="-larrow"
+  if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
+    PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
+  fi
 
-# Set any user-defined CXXFLAGS
-if [ "$ARROW_R_CXXFLAGS" ]; then
-  PKG_CFLAGS="$PKG_CFLAGS $ARROW_R_CXXFLAGS"
+  # If on Raspberry Pi, need to manually link against latomic
+  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
+  # pkg-config will handle this for us automatically, see ARROW-6312
+  if grep raspbian /etc/os-release >/dev/null 2>&1; then
+    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
+    PKG_LIBS="$PKG_LIBS -latomic"
+  fi
+}
+
+add_feature_flags () {
+  # Now we need to check what features it was built with and enable
+  # the corresponding feature flags in the R bindings (-DARROW_R_WITH_stuff).
+  # We do this by inspecting ArrowOptions.cmake, which the libarrow build
+  # generates.
+  ARROW_OPTS_CMAKE="$LIB_DIR/cmake/Arrow/ArrowOptions.cmake"
+  if [ ! -f "${ARROW_OPTS_CMAKE}" ]; then
+    echo "*** $ARROW_OPTS_CMAKE not found; some features will not be enabled"
+  else
+    if arrow_built_with ARROW_PARQUET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_PARQUET"
+      PKG_LIBS="-lparquet $PKG_LIBS"
+      # NOTE: parquet is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_DATASET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_DATASET"
+      PKG_LIBS="-larrow_dataset $PKG_LIBS"
+      # NOTE: arrow-dataset is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_ACERO; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_ACERO"
+      PKG_LIBS="-larrow_acero $PKG_LIBS"
+      # NOTE: arrow-acero is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_SUBSTRAIT; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_SUBSTRAIT"
+      PKG_LIBS="-larrow_substrait $PKG_LIBS"
+      # NOTE: arrow-substrait is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_JSON; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_JSON"
+    fi
+    if arrow_built_with ARROW_S3; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_S3"
+      PKG_LIBS="$PKG_LIBS $S3_LIBS"
+    fi
+    if arrow_built_with ARROW_GCS; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_GCS"
+      PKG_LIBS="$PKG_LIBS $GCS_LIBS"
+    fi
+  fi
+}
+
+arrow_built_with() {
+  # Function to check cmake options for features
+  grep -i 'set('"$1"' "ON")' $ARROW_OPTS_CMAKE >/dev/null 2>&1
+}
+
+##############
+# Main logic #
+##############
+
+# First:
+find_or_build_libarrow
+
+# We should have a valid libarrow build in $_LIBARROW_FOUND
+# Now set `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS` based on that.
+# _LIBARROW_FOUND should only be unset in the FORCE_AUTOBREW case,
+# and autobrew sets its own PKG_* vars. (See the TODO about changing that.)
+if [ "$_LIBARROW_FOUND" != "false" ] && [ "$_LIBARROW_FOUND" != "" ]; then
+  set_pkg_vars ${_LIBARROW_FOUND}
 fi
 
-# Test that we can find libarrow
+# Test that we can compile something with those flags
 CXX17="`${R_HOME}/bin/R CMD config CXX17` -E"
 CXX17FLAGS=`"${R_HOME}"/bin/R CMD config CXX17FLAGS`
 CXX17STD=`"${R_HOME}"/bin/R CMD config CXX17STD`
@@ -251,54 +386,14 @@ TEST_CMD="${CXX17} ${CPPFLAGS} ${PKG_CFLAGS} ${CXX17FLAGS} ${CXX17STD} -xc++ -"
 echo "#include $PKG_TEST_HEADER" | ${TEST_CMD} >/dev/null 2>&1
 
 if [ $? -eq 0 ]; then
-  # Check for features
-  ARROW_OPTS_CMAKE="$LIB_DIR/cmake/Arrow/ArrowOptions.cmake"
-
-  arrow_built_with() {
-    # Function to check cmake options for features
-    grep -i 'set('"$1"' "ON")' $ARROW_OPTS_CMAKE >/dev/null 2>&1
-  }
-
-  if arrow_built_with ARROW_PARQUET; then
-    PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_PARQUET"
-    PKG_LIBS="-lparquet $PKG_LIBS"
-    # NOTE: parquet is assumed to have the same -L flag as arrow
-    # so there is no need to add its location to PKG_DIRS
-  fi
-  if arrow_built_with ARROW_DATASET; then
-    PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_DATASET"
-    PKG_LIBS="-larrow_dataset $PKG_LIBS"
-    # NOTE: arrow-dataset is assumed to have the same -L flag as arrow
-    # so there is no need to add its location to PKG_DIRS
-  fi
-  if arrow_built_with ARROW_ACERO; then
-    PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_ACERO"
-    PKG_LIBS="-larrow_acero $PKG_LIBS"
-    # NOTE: arrow-acero is assumed to have the same -L flag as arrow
-    # so there is no need to add its location to PKG_DIRS
-  fi
-  if arrow_built_with ARROW_SUBSTRAIT; then
-    PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_SUBSTRAIT"
-    PKG_LIBS="-larrow_substrait $PKG_LIBS"
-    # NOTE: arrow-substrait is assumed to have the same -L flag as arrow
-    # so there is no need to add its location to PKG_DIRS
-  fi
-  if arrow_built_with ARROW_JSON; then
-    PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_JSON"
-  fi
-  if arrow_built_with ARROW_S3; then
-    PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_S3"
-    PKG_LIBS="$PKG_LIBS $S3_LIBS"
-  fi
-  if arrow_built_with ARROW_GCS; then
-    PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_GCS"
-    PKG_LIBS="$PKG_LIBS $GCS_LIBS"
-  fi
-
-  # prepend PKG_DIRS to PKG_LIBS
+  # Prepend PKG_DIRS to PKG_LIBS and write to Makevars
   PKG_LIBS="$PKG_DIRS $PKG_LIBS"
   echo "PKG_CFLAGS=$PKG_CFLAGS"
   echo "PKG_LIBS=$PKG_LIBS"
+  sed -e "s|@cflags@|$PKG_CFLAGS|" -e "s|@libs@|$PKG_LIBS|" src/Makevars.in > src/Makevars
+  # Success
+  exit 0
+
 else
   echo "------------------------- NOTE ---------------------------"
   echo "There was an issue preparing the Arrow C++ libraries."
@@ -309,21 +404,3 @@ else
   exit 1
 
 fi
-
-# Write to Makevars
-sed -e "s|@cflags@|$PKG_CFLAGS|" -e "s|@libs@|$PKG_LIBS|" src/Makevars.in > src/Makevars
-
-# This is removed because a (bad?) CRAN check fails when arrow.so is stripped
-# # Add stripping
-# if [ "$R_STRIP_SHARED_LIB" != "" ]; then
-#   # R_STRIP_SHARED_LIB is set in the global Renviron and should be available here
-#   echo "
-# strip: \$(SHLIB)
-# 	$R_STRIP_SHARED_LIB \$(SHLIB) >/dev/null 2>&1 || true
-#
-# .phony: strip
-# " >> src/Makevars
-# fi
-
-# Success
-exit 0
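
Note on the feature detection in the r/configure changes above: `arrow_built_with` greps the installed ArrowOptions.cmake for an option recorded as "ON", and each hit adds an `-DARROW_R_WITH_*` define to `PKG_CFLAGS`. The following is a minimal standalone sketch of that pattern, not the script itself. The temporary file and its two option lines are made up for illustration, and the loop over feature names is a simplification of the explicit `if` blocks in the diff.

```shell
#!/bin/sh
# Sketch only: a made-up stand-in for $LIB_DIR/cmake/Arrow/ArrowOptions.cmake
ARROW_OPTS_CMAKE="$(mktemp)"
cat > "$ARROW_OPTS_CMAKE" <<'EOF'
set(ARROW_PARQUET "ON")
set(ARROW_S3 "OFF")
EOF

arrow_built_with() {
  # Succeeds only when the named option is recorded as "ON"
  grep -i 'set('"$1"' "ON")' "$ARROW_OPTS_CMAKE" >/dev/null 2>&1
}

PKG_CFLAGS=""
# PARQUET is ON, S3 is OFF, GCS is absent from the sample file
for feature in PARQUET S3 GCS; do
  if arrow_built_with "ARROW_$feature"; then
    PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_$feature"
  fi
done

echo "PKG_CFLAGS=$PKG_CFLAGS"
rm -f "$ARROW_OPTS_CMAKE"
```

In this sketch only the Parquet define ends up in `PKG_CFLAGS`, because an option that is "OFF" or missing from the file does not match the pattern.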
diff --git a/r/inst/build_arrow_static.sh b/r/inst/build_arrow_static.sh
index e5a9f127ed..4c7f705708 100755
--- a/r/inst/build_arrow_static.sh
+++ b/r/inst/build_arrow_static.sh
@@ -36,7 +36,13 @@ set -x
 SOURCE_DIR="$(cd "${SOURCE_DIR}" && pwd)"
 DEST_DIR="$(mkdir -p "${DEST_DIR}" && cd "${DEST_DIR}" && pwd)"
 
-: ${N_JOBS:="$(nproc)"}
+if [ "$N_JOBS" = "" ]; then
+  if [ "`uname -s`" = "Darwin" ]; then
+    N_JOBS="$(sysctl -n hw.logicalcpu)"
+  else
+    N_JOBS="$(nproc)"
+  fi
+fi
 
 # Make some env vars case-insensitive
 if [ "$LIBARROW_MINIMAL" != "" ]; then
diff --git a/r/tools/check-versions.R b/r/tools/check-versions.R
new file mode 100644
index 0000000000..265c207fa9
--- /dev/null
+++ b/r/tools/check-versions.R
@@ -0,0 +1,59 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+args <- commandArgs(TRUE)
+
+# TESTING is set in test-check-versions.R; it won't be set when called from configure
+test_mode <- exists("TESTING")
+
+check_versions <- function(r_version, cpp_version) {
+  r_parsed <- package_version(r_version)
+  r_dev_version <- r_parsed[1, 4]
+  r_is_dev <- !is.na(r_dev_version) && r_dev_version > 100
+  cpp_is_dev <- grepl("SNAPSHOT$", cpp_version)
+  cpp_parsed <- package_version(sub("-SNAPSHOT$", "", cpp_version))
+
+  major <- function(x) as.numeric(x[1, 1])
+  # R and C++ denote dev versions differently
+  # R is current.release.9000, C++ is next.release-SNAPSHOT
+  # So a "match" is if the R major version + 1 = C++ major version
+  if (r_is_dev && cpp_is_dev && major(r_parsed) + 1 == major(cpp_parsed)) {
+    msg <- c(
+      sprintf("*** > Packages are both on development versions (%s, %s)", cpp_version, r_version),
+      "*** > If installation fails, rebuild the C++ library to match the R version",
+      "*** > or retry with FORCE_BUNDLED_BUILD=true"
+    )
+    cat(paste0(msg, "\n", collapse = ""))
+  } else if (r_version != cpp_version) {
+    cat(
+      sprintf(
+        "**** Not using: C++ library version (%s) does not match R package (%s)\n",
+        cpp_version,
+        r_version
+      )
+    )
+    stop("version mismatch")
+    # Add ALLOW_VERSION_MISMATCH env var to override stop()? (Could be useful for debugging)
+  } else {
+    # OK
+    cat(sprintf("**** C++ and R library versions match: %s\n", cpp_version))
+  }
+}
+
+if (!test_mode) {
+  check_versions(args[1], args[2])
+}
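
Note on the matching rule in r/tools/check-versions.R above: two development versions are accepted when the R major version plus one equals the C++ major version; otherwise the two version strings must be identical. The sketch below restates that rule in POSIX shell purely for illustration. The function name `versions_compatible` is hypothetical, it returns an exit status instead of printing the messages, and it is not how `configure` performs the check (the commit delegates to the R script).

```shell
#!/bin/sh
# Hypothetical helper: exit status 0 if the versions are acceptable, 1 otherwise.
versions_compatible() {
  r_version="$1"
  cpp_version="$2"

  r_major="${r_version%%.*}"
  cpp_major="${cpp_version%%.*}"
  # Fourth component of the R version, empty if there is none (e.g. 9000)
  r_dev="$(echo "$r_version" | cut -d. -f4)"

  cpp_is_dev=false
  case "$cpp_version" in
    *SNAPSHOT) cpp_is_dev=true ;;
  esac

  if [ -n "$r_dev" ] && [ "$r_dev" -gt 100 ] &&
     [ "$cpp_is_dev" = "true" ] &&
     [ $((r_major + 1)) -eq "$cpp_major" ]; then
    # Both sides are development versions of the same upcoming release
    return 0
  elif [ "$r_version" != "$cpp_version" ]; then
    return 1
  fi
  return 0
}

if versions_compatible "10.0.0.9000" "11.0.0-SNAPSHOT"; then
  echo "dev versions accepted"
fi
if ! versions_compatible "10.0.0.9000" "10.0.0"; then
  echo "mismatch rejected"
fi
```

The cases exercised here mirror those in test-check-versions.R below: a release R package never matches a SNAPSHOT library, and a development R package never matches a release library.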
diff --git a/r/tools/nixlibs.R b/r/tools/nixlibs.R
index b1af28272a..f8e42ecead 100644
--- a/r/tools/nixlibs.R
+++ b/r/tools/nixlibs.R
@@ -53,6 +53,17 @@ try_download <- function(from_url, to_file, hush = quietly) {
   !inherits(status, "try-error") && status == 0
 }
 
+not_cran <- env_is("NOT_CRAN", "true")
+if (not_cran) {
+  # Set more eager defaults
+  if (env_is("LIBARROW_BINARY", "")) {
+    Sys.setenv(LIBARROW_BINARY = "true")
+  }
+  if (env_is("LIBARROW_MINIMAL", "")) {
+    Sys.setenv(LIBARROW_MINIMAL = "false")
+  }
+}
+
 # For local debugging, set ARROW_R_DEV=TRUE to make this script print more
 quietly <- !env_is("ARROW_R_DEV", "true")
 
@@ -374,11 +385,14 @@ build_libarrow <- function(src_dir, dst_dir) {
   # We'll need to compile R bindings with these libs, so delete any .o files
   system("rm src/*.o", ignore.stdout = TRUE, ignore.stderr = TRUE)
   # Set up make for parallel building
+  # CRAN policy says not to use more than 2 cores during checks
+  # If you have more and want to use more, set MAKEFLAGS or NOT_CRAN
+  ncores <- parallel::detectCores()
+  if (!not_cran) {
+    ncores <- min(ncores, 2)
+  }
   makeflags <- Sys.getenv("MAKEFLAGS")
   if (makeflags == "") {
-    # CRAN policy says not to use more than 2 cores during checks
-    # If you have more and want to use more, set MAKEFLAGS
-    ncores <- min(parallel::detectCores(), 2)
     makeflags <- sprintf("-j%s", ncores)
     Sys.setenv(MAKEFLAGS = makeflags)
   }
@@ -416,8 +430,16 @@ build_libarrow <- function(src_dir, dst_dir) {
     CC = sub("^.*ccache", "", R_CMD_config("CC")),
     CXX = paste(sub("^.*ccache", "", R_CMD_config("CXX17")), R_CMD_config("CXX17STD")),
     # CXXFLAGS = R_CMD_config("CXX17FLAGS"), # We don't want the same debug symbols
-    LDFLAGS = R_CMD_config("LDFLAGS")
+    LDFLAGS = R_CMD_config("LDFLAGS"),
+    N_JOBS = ncores
   )
+
+  dep_source <- Sys.getenv("ARROW_DEPENDENCY_SOURCE")
+  if (dep_source %in% c("", "AUTO") && !nzchar(Sys.which("pkg-config"))) {
+    cat("**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED\n")
+    env_var_list <- c(env_var_list, ARROW_DEPENDENCY_SOURCE = "BUNDLED")
+  }
+
   env_var_list <- with_cloud_support(env_var_list)
 
   # turn_off_all_optional_features() needs to happen after
diff --git a/r/tools/test-check-versions.R b/r/tools/test-check-versions.R
new file mode 100644
index 0000000000..5a26bf87db
--- /dev/null
+++ b/r/tools/test-check-versions.R
@@ -0,0 +1,62 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+
+# Usage: run testthat::test_dir(".") inside of this directory
+
+# Flag so that we just load the functions and don't evaluate them like we do
+# when called from configure
+TESTING <- TRUE
+
+source("check-versions.R", local = TRUE)
+
+test_that("check_versions", {
+  expect_output(
+    check_versions("10.0.0", "10.0.0"),
+    "**** C++ and R library versions match: 10.0.0",
+    fixed = TRUE
+  )
+  expect_output(
+    expect_error(
+      check_versions("10.0.0", "10.0.0-SNAPSHOT"),
+      "version mismatch"
+    ),
+    "**** Not using: C++ library version (10.0.0-SNAPSHOT) does not match R package (10.0.0)",
+    fixed = TRUE
+  )
+  expect_output(
+    expect_error(
+      check_versions("10.0.0.9000", "10.0.0-SNAPSHOT"),
+      "version mismatch"
+    ),
+    "**** Not using: C++ library version (10.0.0-SNAPSHOT) does not match R package (10.0.0.9000)",
+    fixed = TRUE
+  )
+  expect_output(
+    expect_error(
+      check_versions("10.0.0.9000", "10.0.0"),
+      "version mismatch"
+    ),
+    "**** Not using: C++ library version (10.0.0) does not match R package (10.0.0.9000)",
+    fixed = TRUE
+  )
+  expect_output(
+    check_versions("10.0.0.9000", "11.0.0-SNAPSHOT"),
+    "*** > Packages are both on development versions (11.0.0-SNAPSHOT, 10.0.0.9000)\n",
+    fixed = TRUE
+  )
+})
diff --git a/r/vignettes/developers/install_details.Rmd b/r/vignettes/developers/install_details.Rmd
index 3ab6265cb1..2f2f126b61 100644
--- a/r/vignettes/developers/install_details.Rmd
+++ b/r/vignettes/developers/install_details.Rmd
@@ -35,14 +35,14 @@ handle finding the libarrow, setting up the build variables necessary, and
 writing the package Makevars file that is used to compile the C++ code in the R
 package.
 
-* `tools/nixlibs.R` - this script is sometimes called by `configure` on Linux
+* `tools/nixlibs.R` - this script is called by `configure` on Linux
 (or on any non-windows OS with the environment variable
 `FORCE_BUNDLED_BUILD=true`) if an existing libarrow installation cannot be found.
 This sets up the build process for our bundled builds (which is the default on 
 linux) and checks for binaries or downloads libarrow from source depending on 
 dependency availability and build configuration.
 
-* `tools/winlibs.R` - this script is sometimes called by `configure.win` on Windows
+* `tools/winlibs.R` - this script is called by `configure.win` on Windows
 when environment variable `ARROW_HOME` is not set.  It looks for an existing libarrow
 installation, and if it can't find one downloads an appropriate libarrow binary.
 
@@ -74,24 +74,30 @@ If no existing libarrow installations can be found, the script proceeds to try t
 
 ### Non-Windows
 
-The diagram below shows how the R package finds a libarrow installation on non-Windows systems.
+On Linux and macOS, the core logic is:
 
-```{r, echo=FALSE, out.width="70%", fig.alt = "Flowchart of libarrow installation on non-Windows systems - find full description in sections 'Using pkg-config', 'Prebuilt binaries' and 'Building from source' below"}
-knitr::include_graphics("./install_nix.png")
-```
+1. If `FORCE_BUNDLED_BUILD=true`, skip to step 3.
+2. Find libarrow on the system. If it is present, make sure that its version
+   is compatible with the R package.
+3. If no suitable libarrow is found, download it (where allowed) or build it from source.
+4. Determine what features this libarrow has and what other flags it requires, 
+   and set them in `src/Makevars` for use when compiling the bindings.
 
-More information about these steps can be found below.
+#### Finding libarrow on the system
 
-#### Using pkg-config
+The `configure` script will look for libarrow in three places:
 
-When you install the arrow R package on non-Windows systems, if no environment variables 
-relating to the location of an existing libarrow installation have already by 
-set, the installation code will attempt to find libarrow on
-your system using the `pkg-config` command.
+1. The path in environment variable `ARROW_HOME`, if set
+2. Whatever `pkg-config` finds, unless `ARROW_USE_PKG_CONFIG=false`
+3. Homebrew, if you have done `brew install apache-arrow`
 
-This will find either installed system packages or libraries you've built yourself.
-In order for `install.packages("arrow")` to work with these system packages,
-you'll need to install them before installing the R package.
+If a libarrow build is found, it will then check that the version of that C++ library
+matches that of the R package. If the versions do not match, like when you've installed
+a system package for a release version but you have a development version of the 
+R package, that libarrow will not be used. If both the C++ library and R package are
+on development versions, you will see a warning message advising you that if you do have
+trouble, you should ensure that the C++ library was built from the same commit as the R
+package, as development version numbers do not change with every commit.
 
 #### Prebuilt binaries
 
@@ -108,7 +114,7 @@ downloaded and bundled when your R package compiles.
 
 #### Building from source
 
-If no libarrow binary is found, it will attempt to build it locally.
+If no suitable libarrow binary is found, it will attempt to build it locally.
 First, it will also look to see if you are in a checkout of the `apache/arrow` 
 git repository and thus have the libarrow source files there.
 Otherwise, it builds from the source files included in the package.
@@ -126,8 +132,8 @@ See the [Arrow project installation page](https://arrow.apache.org/install/)
 to find pre-compiled binary packages for some common Linux distributions,
 including Debian, Ubuntu, and CentOS.
 
-Generally, we do not recommend this method of working with libarrow with the R 
-package unless you have a specific reason to do so. 
+If you are a developer contributing to the R package, system libarrow packages won't
+be useful because the versions will not match.
 
 # Using the R package with an existing libarrow build
 
diff --git a/r/vignettes/developers/install_nix.png b/r/vignettes/developers/install_nix.png
deleted file mode 100644
index e8ddef94c1..0000000000
Binary files a/r/vignettes/developers/install_nix.png and /dev/null differ
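
Note on the lookup order documented in the install_details.Rmd changes above (`ARROW_HOME`, then `pkg-config`, then Homebrew): the sketch below shows one way that order could be expressed in shell. It is an illustration under assumptions, not the logic in `r/configure`. The function name `find_libarrow_sketch` is hypothetical, the pkg-config module name `arrow` and the `brew ls --versions` check are assumptions, and the version check that follows a successful find is left out.

```shell
#!/bin/sh
# Hypothetical helper: report where a libarrow would be picked up from.
find_libarrow_sketch() {
  if [ -n "$ARROW_HOME" ]; then
    # 1. An explicit ARROW_HOME takes priority
    echo "ARROW_HOME: $ARROW_HOME"
  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] &&
       pkg-config --exists arrow 2>/dev/null; then
    # 2. Whatever pkg-config knows about, unless opted out
    echo "pkg-config: $(pkg-config --variable=libdir arrow)"
  elif command -v brew >/dev/null 2>&1 &&
       [ -n "$(brew ls --versions apache-arrow 2>/dev/null)" ]; then
    # 3. A Homebrew-installed apache-arrow
    echo "homebrew: $(brew --prefix apache-arrow)"
  else
    echo "none"
    return 1
  fi
}

ARROW_HOME=/opt/arrow find_libarrow_sketch
```

With `ARROW_HOME` set, the later branches are never consulted; with it unset, the result depends on what is installed on the machine.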
diff --git a/r/vignettes/install.Rmd b/r/vignettes/install.Rmd
index b33c439236..20fe4ed896 100644
--- a/r/vignettes/install.Rmd
+++ b/r/vignettes/install.Rmd
@@ -520,17 +520,10 @@ so that we can improve the script.
 
 * On CentOS, building the package requires a more modern `devtoolset` than the default system compilers. See "System dependencies" above.
 
-* If you have multiple versions of `zstd` installed on your system,
-installation by building libarrow from source may fail with an "undefined symbols"
-error. Workarounds include (1) setting `LIBARROW_BINARY` to use a C++ binary; (2)
-setting `ARROW_WITH_ZSTD=OFF` to build without `zstd`; or (3) uninstalling
-the conflicting `zstd`.
-See discussion [here](https://issues.apache.org/jira/browse/ARROW-8556).
-
 ## Contributing
 
 We are constantly working to make the installation process as painless as 
-possible. If you find ways to improve the process, please [report an issue](https://issues.apache.org/jira/projects/ARROW/issues) so that we can
+possible. If you find ways to improve the process, please [report an issue](https://github.com/apache/arrow/issues) so that we can
 document it. Similarly, if you find that your Linux distribution
 or version is not supported, we would welcome the contribution of Docker 
 images (hosted on Docker Hub) that we can use in our continuous integration 
diff --git a/r/vignettes/install_nightly.Rmd b/r/vignettes/install_nightly.Rmd
index 2562cdf6e8..af1940204b 100644
--- a/r/vignettes/install_nightly.Rmd
+++ b/r/vignettes/install_nightly.Rmd
@@ -46,7 +46,7 @@ R CMD INSTALL .
 
 If you don't already have libarrow on your system,
 when installing the R package from source, it will also download and build
-libarrow for you. See the section above on build environment
+libarrow for you. See the links below for build environment
 variables for options for configuring the build source and enabled features.
 
 ## Further reading