Posted to github@arrow.apache.org by "nealrichardson (via GitHub)" <gi...@apache.org> on 2023/04/14 21:14:28 UTC

[GitHub] [arrow] nealrichardson opened a new pull request, #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

nealrichardson opened a new pull request, #35147:
URL: https://github.com/apache/arrow/pull/35147

   See #35140. We do not currently have any CI jobs that test installing system arrow packages and using them in the R package.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181569617


##########
r/configure:
##########
@@ -58,191 +100,282 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  if brew --prefix openssl >/dev/null 2>&1; then
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  fi

Review Comment:
   @kou can we simplify this one further?
   
   ```suggestion
   if [ "${OPENSSL_ROOT_DIR}" = "" ] && brew --prefix openssl >/dev/null 2>&1; then
     export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
     export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   ```
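
   For illustration only (not part of the PR): a minimal sketch of the two shell behaviors the suggestion relies on. The helper `prepend_path` and the command name `no-such-tool-for-this-sketch` are made up for this example; the paths are placeholders.

   ```shell
   # Hypothetical helper: prepend a directory to a colon-separated search path.
   # $1 = directory to add, $2 = existing path (may be empty)
   prepend_path () {
     # ${2:+:$2} expands to ":$2" only when $2 is non-empty,
     # so an empty path does not gain a trailing colon.
     printf '%s\n' "$1${2:+:$2}"
   }

   # A command that is not installed just fails (non-zero exit status),
   # so `if cmd ...; then` is false without a separate `command -v` guard.
   # The command name below is deliberately made up.
   if no-such-tool-for-this-sketch --prefix openssl >/dev/null 2>&1; then
     tool_found="true"
   else
     tool_found="false"
   fi

   empty_case="$(prepend_path /opt/openssl/lib/pkgconfig "")"
   nonempty_case="$(prepend_path /opt/openssl/lib/pkgconfig /usr/lib/pkgconfig)"
   echo "tool_found=$tool_found"
   echo "$empty_case"
   echo "$nonempty_case"
   ```

   One difference to keep in mind: without the `$UNAME` check, the branch would presumably also run on a non-macOS machine that happens to have `brew` and its openssl installed.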





[GitHub] [arrow] kou commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531512012

   I think that this problem is related to `-I` not `-L`/`-l`: https://github.com/apache/arrow/issues/34523#issuecomment-1463069526




[GitHub] [arrow] kou commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181984625


##########
r/configure:
##########
@@ -58,191 +100,282 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  if brew --prefix openssl >/dev/null 2>&1; then
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  fi

Review Comment:
   Yes!
   Sorry, I didn't notice this in the previous review. :<





[GitHub] [arrow] h-vetinari commented on a diff in pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "h-vetinari (via GitHub)" <gi...@apache.org>.
h-vetinari commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1176024056


##########
r/configure:
##########
@@ -252,6 +252,10 @@ echo "#include $PKG_TEST_HEADER" | ${TEST_CMD} >/dev/null 2>&1
 
 if [ $? -eq 0 ]; then
   # Check for features
+  if [ "$LIB_DIR" = "" ]; then
+    # In case LIB_DIR wasn't previously set, infer from PKG_DIRS
+    LIB_DIR=`echo $PKG_DIRS | sed -e 's/^-L//'`
+  fi

Review Comment:
   From the conda-side, we were able to fix it by specifying `LIB_DIR=$PREFIX/lib` in our build script, see https://github.com/conda-forge/r-arrow-feedstock/pull/63
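
   As a rough sketch of what the fallback in the diff does (the `/opt/conda/lib` and `/a/lib`, `/b/lib` values are hypothetical, chosen only for illustration), the `sed` expression strips a single leading `-L`, and the result is then used to locate `ArrowOptions.cmake`:

   ```shell
   # Hypothetical value; pkg-config or a build script would normally supply it.
   PKG_DIRS="-L/opt/conda/lib"
   LIB_DIR=""

   if [ "$LIB_DIR" = "" ]; then
     # Same substitution as in the diff: strip one leading -L
     LIB_DIR=$(echo $PKG_DIRS | sed -e 's/^-L//')
   fi
   ARROW_OPTS_CMAKE="$LIB_DIR/cmake/Arrow/ArrowOptions.cmake"
   echo "$ARROW_OPTS_CMAKE"

   # With several -L flags only the first prefix is removed,
   # so the result would not be a usable directory.
   multi=$(echo "-L/a/lib -L/b/lib" | sed -e 's/^-L//')
   echo "$multi"
   ```

   Setting `LIB_DIR` explicitly, as in the conda-forge build script, sidesteps the inference entirely.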





[GitHub] [arrow] kou commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531517943

   Ah, wait. The job passed at https://github.com/apache/arrow/pull/35147#issuecomment-1529017959 .
   It may not be related to `-I`. I'll take a look at this.




[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181983536


##########
r/configure:
##########
@@ -58,191 +100,282 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  if brew --prefix openssl >/dev/null 2>&1; then
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  fi
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
-  if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --variable=prefix --silence-errors ${PKG_CONFIG_NAME}`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif brew --prefix ${PKG_BREW_NAME} > /dev/null 2>&1; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   else
-    PKG_CONFIG_AVAILABLE=false
-  fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion ${PKG_CONFIG_NAME}`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    if ! ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    export PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # These would be identified by pkg-config, in Requires.private and Libs.private.
+      # Rather than try to re-implement pkg-config, we can just hard-code them here.
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    if ! curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+      # Fall back to the local copy
+      cp tools/autobrew .
     fi
   fi
-fi
+  if ! . autobrew; then
+    echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
+  fi
+  # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+  # TODO: move PKG_LIBS and PKG_CFLAGS out of autobrew and use set_pkg_vars
+}
 
-# If on Raspberry Pi, need to manually link against latomic
-# See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
-if grep raspbian /etc/os-release >/dev/null 2>&1; then
-  PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
-  PKG_LIBS="-latomic $PKG_LIBS"
+# Once libarrow is obtained, this function sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+# either from pkg-config or by inferring things about the directory in $1
+set_pkg_vars () {
+  LIB_DIR="$1/lib"
+  if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+    set_pkg_vars_with_pc
+  else
+    set_pkg_vars_without_pc $1
+  fi
+
+  # Set any user-defined CXXFLAGS
+  if [ "$ARROW_R_CXXFLAGS" ]; then
+    PKG_CFLAGS="$PKG_CFLAGS $ARROW_R_CXXFLAGS"
+  fi
+
+  # Finally, check cmake options for enabled features
+  add_feature_flags
+}
+
+# If we have pkg-config, it will tell us what libarrow needs
+set_pkg_vars_with_pc () {
+  PKG_CFLAGS="`${PKG_CONFIG} --cflags --silence-errors ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
+  PKG_LIBS=`${PKG_CONFIG} --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
+  PKG_DIRS=`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+}
+
+# If we don't have pkg-config, we can make some inferences
+set_pkg_vars_without_pc () {
+  PKG_CFLAGS="-I$1/include $PKG_CFLAGS"
+  if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
+    PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
+  fi
+  PKG_DIRS="-L${LIB_DIR}"
+  if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
+    PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
+  fi
+  PKG_LIBS="-larrow"
+  if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
+    PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
+  fi
+
+  # If on Raspberry Pi, need to manually link against latomic
+  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
+  # pkg-config will handle this for us automatically, see ARROW-6312
+  if grep raspbian /etc/os-release >/dev/null 2>&1; then
+    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
+    PKG_LIBS="-latomic $PKG_LIBS"
+  fi
+}
+
+add_feature_flags () {
+  # Now we need to check what features it was built with and enable
+  # the corresponding feature flags in the R bindings (-DARROW_R_WITH_stuff).
+  # We do this by inspecting ArrowOptions.cmake, which the libarrow build
+  # generates.
+  ARROW_OPTS_CMAKE="$LIB_DIR/cmake/Arrow/ArrowOptions.cmake"
+  if [ ! -f "${ARROW_OPTS_CMAKE}" ]; then
+    echo "*** $ARROW_OPTS_CMAKE not found; some features will not be enabled"
+  else
+    if arrow_built_with ARROW_PARQUET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_PARQUET"
+      PKG_LIBS="-lparquet $PKG_LIBS"
+      # NOTE: parquet is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_DATASET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_DATASET"
+      PKG_LIBS="-larrow_dataset $PKG_LIBS"
+      # NOTE: arrow-dataset is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_ACERO; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_ACERO"
+      PKG_LIBS="-larrow_acero $PKG_LIBS"
+      # NOTE: arrow-acero is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_SUBSTRAIT; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_SUBSTRAIT"
+      PKG_LIBS="-larrow_substrait $PKG_LIBS"
+      # NOTE: arrow-substrait is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_JSON; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_JSON"
+    fi
+    if arrow_built_with ARROW_S3; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_S3"
+      PKG_LIBS="$PKG_LIBS $S3_LIBS"
+    fi
+    if arrow_built_with ARROW_GCS; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_GCS"
+      PKG_LIBS="$PKG_LIBS $GCS_LIBS"
+    fi
+  fi
+}
+
+arrow_built_with() {
+  # Function to check cmake options for features
+  grep -i 'set('"$1"' "ON")' $ARROW_OPTS_CMAKE >/dev/null 2>&1
+}
+
+##############
+# Main logic #
+##############
+
+# First, find or build libarrow
+if [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+  do_bundled_build
+elif [ "$FORCE_AUTOBREW" = "true" ]; then
+  do_autobrew
+else
+  find_arrow
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # If we haven't found a suitable version of libarrow, build it
+    if [ "$UNAME" = "Darwin" ] && ! echo $VERSION | grep -q "000"; then
+      # Only autobrew on release version (for testing, use FORCE_AUTOBREW above)
+      # (dev versions end in .9000, and nightly gets something like .10000xxx)
+      do_autobrew
+    else
+      do_bundled_build
+    fi
+  fi

Review Comment:
   IMO it's overkill, and moving this function to the top will make it more visible too. I'm pretty sure the only use of FORCE_AUTOBREW is in our CI. 
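
   For reference, the order of decisions in the main logic quoted above can be sketched as a pure function. `choose_strategy` is a hypothetical helper written for this illustration, not something in `r/configure`; it only reports which branch would be taken and does not build anything.

   ```shell
   # Hypothetical helper mirroring the branch order of the main logic.
   # $1 = FORCE_BUNDLED_BUILD, $2 = FORCE_AUTOBREW,
   # $3 = result of the libarrow search (a path, or "false"),
   # $4 = uname output, $5 = R package version
   choose_strategy () {
     if [ "$1" = "true" ]; then
       echo "bundled"
     elif [ "$2" = "true" ]; then
       echo "autobrew"
     elif [ "$3" != "false" ]; then
       echo "system"
     elif [ "$4" = "Darwin" ] && ! echo "$5" | grep -q "000"; then
       # Release versions only: dev versions end in .9000 and
       # nightlies get something like .10000xxx, so both contain "000".
       echo "autobrew"
     else
       echo "bundled"
     fi
   }

   choose_strategy "" "" false Darwin 12.0.0
   choose_strategy "" "" false Darwin 11.0.0.9000
   choose_strategy "" true false Linux 12.0.0
   ```

   Seen this way, `FORCE_AUTOBREW` only matters for the second branch, which is consistent with it being a testing switch.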





[GitHub] [arrow] ursabot commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1533453710

   ['Python', 'R'] benchmarks have high level of regressions.
   [test-mac-arm](https://conbench.ursa.dev/compare/runs/8d0be02fd53c451985848e1288179e55...80ef86bb96f34eea9ed62b51335a9976/)
   




[GitHub] [arrow] h-vetinari commented on a diff in pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "h-vetinari (via GitHub)" <gi...@apache.org>.
h-vetinari commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1174820664


##########
dev/tasks/conda-recipes/r-arrow/build.sh:
##########
@@ -14,3 +14,5 @@ fi
 
 # ${R_ARGS} necessary to support cross-compilation
 ${R} CMD INSTALL --build r/. ${R_ARGS}
+# Ensure that features are enabled in the R build (feel free to add others)
+${R} -s -e 'library(arrow); stopifnot(arrow_with_dataset(), arrow_with_parquet(), arrow_with_s3())'

Review Comment:
   That part is fine and already working (as in: shows the build fails) in https://github.com/conda-forge/r-arrow-feedstock/pull/63.
   
   However, I don't see how the Homebrew-specific patch above would apply to conda-forge (we'd need `$PREFIX`), nor do I know what changed between Arrow 10 and 11. While taking a first look, my comment in that conda-forge PR was:
   > Do we need to do something like in the upstream R [Makefile](https://github.com/apache/arrow/blob/apache-arrow-11.0.0/r/Makefile#L37-L38) to discover the deps correctly?
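
   To make the failure mode concrete, here is a minimal sketch of the feature detection in `r/configure`, run against a temporary stand-in for `ArrowOptions.cmake`. The file contents are an assumption inferred from the `grep` pattern used by `arrow_built_with`; a real file generated by the libarrow build will contain much more.

   ```shell
   # Stand-in for $LIB_DIR/cmake/Arrow/ArrowOptions.cmake (contents assumed).
   ARROW_OPTS_CMAKE="$(mktemp)"
   printf 'set(ARROW_PARQUET "ON")\nset(ARROW_S3 "OFF")\n' > "$ARROW_OPTS_CMAKE"

   # Same check as in the configure script: succeed if the option is ON.
   arrow_built_with () {
     grep -i 'set('"$1"' "ON")' "$ARROW_OPTS_CMAKE" >/dev/null 2>&1
   }

   PKG_CFLAGS=""
   for feature in PARQUET S3; do
     if arrow_built_with "ARROW_$feature"; then
       PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_$feature"
     fi
   done

   rm -f "$ARROW_OPTS_CMAKE"
   echo "$PKG_CFLAGS"
   ```

   If `LIB_DIR` is unset, the path to the options file is wrong, every check fails, and no `-DARROW_R_WITH_*` flag is added, which matches the symptom of features being silently disabled.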
   
   





[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181565154


##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion arrow`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null
+    if [ $? -eq 1 ]; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # TODO: is there no way we can infer from ArrowOptions.cmake or arrow.pc if these are needed?
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
+    if [ $? -ne 0 ]; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+    else

Review Comment:
   Ah, good catch





[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1529017557

   @github-actions crossbow submit r-binary-packages




[GitHub] [arrow] ursabot commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1533451992

   Benchmark runs are scheduled for baseline = f794b208ce70682f95bdce2339c4832c28045a4c and contender = ec893602124b776fb42261361d1a2d21a6d61f06. ec893602124b776fb42261361d1a2d21a6d61f06 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/c097e5fed88642f8815e24c9a494a12e...4d409f0d91b2459d8694f9081f47a528/)
   [Finished :arrow_down:1.22% :arrow_up:0.0%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/8d0be02fd53c451985848e1288179e55...80ef86bb96f34eea9ed62b51335a9976/)
   [Finished :arrow_down:0.26% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/ac9065653bce4562a3dcb873beddfb00...d9147009233f427badf3ac646ecb8a48/)
   [Finished :arrow_down:0.66% :arrow_up:0.03%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/2c4c9ee8fc524c6ebb74af3652c64d73...081ecdde29024a3883922356ea1f9b9d/)
   Buildkite builds:
   [Finished] [`ec893602` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2812)
   [Finished] [`ec893602` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2848)
   [Finished] [`ec893602` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2812)
   [Finished] [`ec893602` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2838)
   [Finished] [`f794b208` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2811)
   [Finished] [`f794b208` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2847)
   [Finished] [`f794b208` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2811)
   [Finished] [`f794b208` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2837)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   




[GitHub] [arrow] kou commented on a diff in pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1174794249


##########
r/configure:
##########
@@ -252,6 +252,10 @@ echo "#include $PKG_TEST_HEADER" | ${TEST_CMD} >/dev/null 2>&1
 
 if [ $? -eq 0 ]; then
   # Check for features
+  if [ "$LIB_DIR" = "" ]; then
+    # In case LIB_DIR wasn't previously set, infer from PKG_DIRS
+    LIB_DIR=`echo $PKG_DIRS | sed -e 's/^-L//'`
+  fi

Review Comment:
   I think the following approach (always setting `LIB_DIR` wherever the `PKG_*` variables are set) is better than inferring it afterwards:
   
   ```diff
   diff --git a/r/configure b/r/configure
   index 65528bdc3..f452fffb1 100755
   --- a/r/configure
   +++ b/r/configure
   @@ -126,10 +126,11 @@ else
        if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
          if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
            echo "*** Using Homebrew ${PKG_BREW_NAME}"
   -        BREWDIR=`brew --prefix`
   +        BREWDIR=`brew --prefix ${PKG_BREW_NAME}`
   +        LIB_DIR="$BREWDIR/lib"
            PKG_LIBS="-larrow -larrow_bundled_dependencies"
   -        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
   -        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
   +        PKG_DIRS="-L$LIB_DIR $PKG_DIRS"
   +        PKG_CFLAGS="-I$BREWDIR/include $PKG_CFLAGS"
          else
            echo "*** Downloading ${PKG_BREW_NAME}"
            if [ -f "autobrew" ]; then
   @@ -144,7 +145,9 @@ else
            if [ $? -ne 0 ]; then
              echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
            fi
   -        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
   +        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, `PKG_CFLAGS`, and `BREWDIR`
   +        # but `LIB_DIR` isn't set.
   +        LIB_DIR="$BREWDIR/lib"
          fi
        else
          if [ "${NOT_CRAN}" = "true" ]; then
   ```
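   
   One possible reason to prefer setting `LIB_DIR` up front, sketched below under assumptions: stripping a leading `-L` from `PKG_DIRS` only yields a usable directory when `PKG_DIRS` holds exactly one entry. The paths and the `PREFIX_DIR` variable are made up for illustration (`PREFIX_DIR` stands in for the output of `brew --prefix ${PKG_BREW_NAME}`); none of this is taken from `r/configure` itself.
   
   ```sh
   # Made-up paths: PKG_DIRS with more than one -L entry
   PKG_DIRS="-L/opt/arrow/lib -L/opt/openssl/lib"
   # Inferring LIB_DIR afterwards only strips the first -L, so the
   # result still contains the second entry and is not a directory.
   INFERRED=`echo $PKG_DIRS | sed -e 's/^-L//'`
   
   # Setting LIB_DIR first and deriving the flags from it avoids that.
   PREFIX_DIR="/opt/arrow"   # hypothetical stand-in for the brew prefix
   LIB_DIR="$PREFIX_DIR/lib"
   PKG_DIRS="-L$LIB_DIR"
   PKG_CFLAGS="-I$PREFIX_DIR/include"
   
   echo "inferred: $INFERRED"
   echo "explicit: $LIB_DIR"
   ```
   
   With a single entry in `PKG_DIRS` both forms give the same directory, so the difference only matters in branches that prepend to an existing `PKG_DIRS`.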



##########
dev/tasks/conda-recipes/r-arrow/build.sh:
##########
@@ -14,3 +14,5 @@ fi
 
 # ${R_ARGS} necessary to support cross-compilation
 ${R} CMD INSTALL --build r/. ${R_ARGS}
+# Ensure that features are enabled in the R build (feel free to add others)
+${R} -s -e 'library(arrow); stopifnot(arrow_with_dataset(), arrow_with_parquet(), arrow_with_s3())'

Review Comment:
   @h-vetinari Could you review the conda part?





[GitHub] [arrow] kou commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181402350


##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then

Review Comment:
   We can simplify this.
   
   ```suggestion
   if ${PKG_CONFIG} --version >/dev/null 2>&1; then
   ```
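   
   For illustration, a minimal sketch of why the separate `$?` check is not needed: `if` tests the exit status of the command it runs. `check_tool` is a hypothetical helper, not something in `r/configure`, and `true`/`false` stand in for a working and a failing `pkg-config`.
   
   ```sh
   # Hypothetical helper (not part of r/configure): report whether a
   # command runs successfully by letting `if` test its exit status.
   check_tool () {
     if "$@" >/dev/null 2>&1; then
       echo "true"
     else
       echo "false"
     fi
   }
   
   # `true` and `false` stand in for a working and a failing pkg-config
   AVAILABLE_OK="`check_tool true --version`"
   AVAILABLE_FAILING="`check_tool false --version`"
   # A command that does not exist also counts as unavailable
   AVAILABLE_ABSENT="`check_tool no-such-pkg-config-binary --version`"
   
   echo "$AVAILABLE_OK $AVAILABLE_FAILING $AVAILABLE_ABSENT"
   ```
   
   The same pattern would apply to the `brew --prefix openssl` check a few lines further down, assuming nothing else in the script relies on `$?` after it.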



##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then

Review Comment:
   How about simplifying this like the following?
   
   ```suggestion
     elif brew --prefix ${PKG_BREW_NAME} > /dev/null 2>&1; then
   ```



##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion arrow`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null
+    if [ $? -eq 1 ]; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # TODO: is there no way we can infer from ArrowOptions.cmake or arrow.pc if these are needed?
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
+    if [ $? -ne 0 ]; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+    else

Review Comment:
   I think the `else` is unnecessary: if we test `curl` directly with `if !`, the failure message is the only branch we need here.
   
   ```suggestion
       if ! curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew; then
         echo "Failed to download manifest for ${PKG_BREW_NAME}"
   ```
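
   For reference, a minimal sketch of the pattern, not the actual script: `download_manifest` is a hypothetical wrapper, and `true`/`false` stand in for the `curl` call so it runs without network access. Testing the command in the `if` also avoids reading a stale `$?` if another command is ever added between the download and the check.

   ```shell
   # Hypothetical wrapper: "$@" is the command whose exit status we branch on.
   # In r/configure this would be the curl call that writes the manifest.
   download_manifest () {
     if ! "$@" > /dev/null 2>&1; then
       echo "Failed to download manifest"
       return 1
     fi
     echo "Downloaded manifest"
   }
   ```

   Calling `download_manifest false` takes the failure branch and `download_manifest true` takes the success path, with no `else` and no separate `$?` test.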



##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"

Review Comment:
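   We can simplify this by asking pkg-config for the `prefix` variable directly instead of parsing the `-L` flag with `sed`.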
   
   ```suggestion
       _LIBARROW_FOUND="`${PKG_CONFIG} --variable=prefix --silence-errors ${PKG_CONFIG_NAME}`"
   ```
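
   Besides being shorter, this avoids guessing the prefix from the library directory. A small sketch of where the `sed` approach can go wrong: `prefix_from_libs_flag` is a hypothetical helper that mimics the pipeline in the diff, and both input paths are made-up examples (the second assumes a multiarch-style layout where the library directory does not end in `/lib`).

   ```shell
   # Mimics the sed pipeline from the diff:
   # strip a leading -L, then strip a trailing /lib.
   prefix_from_libs_flag () {
     echo "$1" | sed -e 's/^-L//' -e 's@/lib$@@'
   }

   # Library directory ends in /lib, so the prefix is recovered.
   prefix_from_libs_flag "-L/opt/arrow/lib"

   # Library directory does not end in /lib, so the result is
   # still the library directory, not the install prefix.
   prefix_from_libs_flag "-L/usr/lib/x86_64-linux-gnu"
   ```

   With `--variable=prefix`, pkg-config reports whatever `prefix` the `.pc` file defines, so no assumption about the directory layout is needed.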



##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion arrow`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null
+    if [ $? -eq 1 ]; then

Review Comment:
   ```suggestion
       if ! ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null; then
   ```
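
   Note that this is slightly stricter than the original, not just shorter: `[ $? -eq 1 ]` only rejects the library when the exit status is exactly 1, while `if !` rejects it on any non-zero status (for example, if `Rscript` itself cannot be run). A sketch of the difference, using hypothetical helper names and `sh -c 'exit 2'` as a stand-in for the version check; whether `check-versions.R` can actually exit with a status other than 0 or 1 is an assumption I have not verified.

   ```shell
   # Original form: only an exit status of exactly 1 counts as a mismatch.
   # (The status is captured with || so the sketch also works under `set -e`.)
   check_eq_1 () {
     status=0
     "$@" || status=$?
     if [ "$status" -eq 1 ]; then echo "rejected"; else echo "accepted"; fi
   }

   # Suggested form: any non-zero exit status counts as a mismatch.
   check_not () {
     if ! "$@"; then echo "rejected"; else echo "accepted"; fi
   }

   check_eq_1 sh -c 'exit 2'   # prints "accepted"
   check_not sh -c 'exit 2'    # prints "rejected"
   ```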



##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion arrow`

Review Comment:
   ```suggestion
         PC_LIB_VERSION=`${PKG_CONFIG} --modversion ${PKG_CONFIG_NAME}`
   ```
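
   The fallback branch that greps `arrow.pc` hardcodes the name as well. If we wanted to parameterize that too, it might look like the sketch below. `pc_version` is a hypothetical helper, and the demo writes a throwaway `.pc` file with a made-up version so it runs without pkg-config installed.

   ```shell
   # Hypothetical helper: read the Version field from
   # <prefix>/lib/pkgconfig/<name>.pc without calling pkg-config.
   pc_version () {
     sed -n 's/^Version: *//p' "$1/lib/pkgconfig/$2.pc"
   }

   # Self-contained demo against a temporary .pc file (values are made up).
   demo_prefix=`mktemp -d`
   mkdir -p "$demo_prefix/lib/pkgconfig"
   printf 'Name: Apache Arrow\nVersion: 12.0.0\n' > "$demo_prefix/lib/pkgconfig/arrow.pc"
   pc_version "$demo_prefix" arrow
   rm -rf "$demo_prefix"
   ```

   In the script this would be called with `${_LIBARROW_FOUND}` and `${PKG_CONFIG_NAME}` in place of the demo values.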



##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then

Review Comment:
   We can simplify this by testing the exit status of `brew` directly in the `if`.
   
   ```suggestion
     if brew --prefix openssl >/dev/null 2>&1; then
   ```
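
   Going one step further (optional, and only a sketch): the block currently runs `brew --prefix openssl` twice, once for the check and once to capture the path. A plain variable assignment takes the exit status of its command substitution, so both can be done in one call. `fake_brew_prefix` below is a hypothetical stand-in for `brew --prefix openssl`, and the path it prints is made up.

   ```shell
   # Stand-in for `brew --prefix openssl` so the sketch runs anywhere.
   fake_brew_prefix () { echo "/opt/demo/openssl"; }

   # The assignment succeeds or fails with the command inside the backticks,
   # so one call both tests for openssl and captures its prefix.
   # (Assign first, export afterwards: `export VAR=...` would hide the status.)
   if prefix=`fake_brew_prefix 2>/dev/null`; then
     OPENSSL_ROOT_DIR="$prefix"
     export OPENSSL_ROOT_DIR
   fi
   echo "$OPENSSL_ROOT_DIR"
   ```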



##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion arrow`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null
+    if [ $? -eq 1 ]; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # TODO: is there no way we can infer from ArrowOptions.cmake or arrow.pc if these are needed?
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
+    if [ $? -ne 0 ]; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+    else
+      # Fall back to the local copy
+      cp tools/autobrew .
     fi
   fi
-fi
+  . autobrew
+  if [ $? -ne 0 ]; then
+    echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
+  fi
+  # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+  # TODO: move PKG_LIBS and PKG_CFLAGS out of autobrew and use set_pkg_vars
+}
 
-# If on Raspberry Pi, need to manually link against latomic
-# See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
-if grep raspbian /etc/os-release >/dev/null 2>&1; then
-  PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
-  PKG_LIBS="-latomic $PKG_LIBS"
+# Once libarrow is obtained, this function sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+# either from pkg-config or by inferring things about the directory in $1
+set_pkg_vars () {
+  LIB_DIR="$1/lib"
+  if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+    set_pkg_vars_with_pc
+  else
+    set_pkg_vars_without_pc $1
+  fi
+
+  # If on Raspberry Pi, need to manually link against latomic
+  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
+  if grep raspbian /etc/os-release >/dev/null 2>&1; then
+    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
+    PKG_LIBS="-latomic $PKG_LIBS"
+  fi

Review Comment:
   We need this only when `pkg-config` isn't available, because the C++ build already takes care of it: https://github.com/apache/arrow/commit/74af39d486b5a750a8099abc163935e5c8d911bf
   
   ```suggestion
       # If on Raspberry Pi, need to manually link against latomic
       # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
       if grep raspbian /etc/os-release >/dev/null 2>&1; then
         PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
         PKG_LIBS="$PKG_LIBS -latomic"
       fi
     fi
   ```
   
   Or we may want to move this to `set_pkg_vars_without_pc`.
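
The second option, moving the check into `set_pkg_vars_without_pc`, might look roughly like the sketch below. The real body of `set_pkg_vars_without_pc` is not shown in this hunk, so the flag assignments here are assumptions modeled on the removed non-pkg-config branch, and `OS_RELEASE_FILE` is a hypothetical override added only so the check can be exercised off a Raspberry Pi.

```shell
# Sketch only: not the actual function from r/configure.
# OS_RELEASE_FILE is a hypothetical variable used to make the check testable.
set_pkg_vars_without_pc () {
  LIB_DIR="$1/lib"
  PKG_CFLAGS="-I$1/include $PKG_CFLAGS"
  PKG_DIRS="-L${LIB_DIR}"
  PKG_LIBS="-larrow"

  # If on Raspberry Pi, need to manually link against latomic.
  # Per the review comment, this is only needed when pkg-config is not used.
  if grep raspbian "${OS_RELEASE_FILE:-/etc/os-release}" >/dev/null 2>&1; then
    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
    PKG_LIBS="$PKG_LIBS -latomic"
  fi
}

PKG_CFLAGS=""
set_pkg_vars_without_pc "/path/to/arrow"
echo "PKG_LIBS=${PKG_LIBS}"
```

Appending `-latomic` after `-larrow`, as in the suggestion, keeps the library that needs the symbols ahead of the one that provides them, which matters for linkers that resolve left to right.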



##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion arrow`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null
+    if [ $? -eq 1 ]; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # TODO: is there no way we can infer from ArrowOptions.cmake or arrow.pc if these are needed?
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
+    if [ $? -ne 0 ]; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+    else
+      # Fall back to the local copy
+      cp tools/autobrew .
     fi
   fi
-fi
+  . autobrew
+  if [ $? -ne 0 ]; then

Review Comment:
   ```suggestion
     if ! . autobrew; then
   ```
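
As a small illustration of why the suggested form is safer: testing the `.` command directly means no other command can overwrite `$?` between sourcing the script and checking its status. The temporary file below is a hypothetical stand-in for `autobrew`, not the real script.

```shell
# Hypothetical stand-in for the autobrew script: it sets a variable and
# then its last command fails, so `.` returns a non-zero status.
script="$(mktemp)"
printf 'PKG_LIBS="-larrow"\nfalse\n' > "$script"

# Test the sourced script directly rather than inspecting $? afterwards.
if ! . "$script"; then
  AUTOBREW_STATUS="failed"
else
  AUTOBREW_STATUS="ok"
fi
rm -f "$script"

# Variables assigned by the sourced script are still visible to the caller.
echo "autobrew: ${AUTOBREW_STATUS}, PKG_LIBS=${PKG_LIBS}"
```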



##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion arrow`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null
+    if [ $? -eq 1 ]; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # TODO: is there no way we can infer from ArrowOptions.cmake or arrow.pc if these are needed?

Review Comment:
   We could read `Requires.private` and `Libs.private` from `arrow.pc` for this, but that would mean re-implementing pkg-config ourselves, so I don't recommend it.
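
To make the trade-off concrete, here is a minimal sketch of what reading those fields by hand might look like. The `arrow.pc` contents below are hypothetical and written to a temporary directory purely for illustration; they are not the contents of a real Arrow install.

```shell
# Hypothetical arrow.pc, created in a temp dir for illustration only.
pc_dir="$(mktemp -d)"
cat > "${pc_dir}/arrow.pc" <<'EOF'
prefix=/usr/local
libdir=${prefix}/lib
Name: Apache Arrow
Version: 12.0.0
Requires.private: libcurl openssl
Libs: -L${libdir} -larrow
Libs.private: -lpthread -ldl
EOF

# Hand-rolled extraction: the "re-implementing pkg-config" route.
# It does not expand ${prefix}-style variables and does not follow the
# Requires.private entries into their own .pc files.
requires_private="$(sed -n 's/^Requires\.private: *//p' "${pc_dir}/arrow.pc")"
libs_private="$(sed -n 's/^Libs\.private: *//p' "${pc_dir}/arrow.pc")"
echo "Requires.private: ${requires_private}"
echo "Libs.private: ${libs_private}"

# What pkg-config resolves for us when it is installed:
#   PKG_CONFIG_PATH="${pc_dir}" pkg-config --static --libs arrow
```

The manual route only recovers the raw strings. Variable expansion and the transitive dependency closure are the parts pkg-config already handles, and they would have to be rebuilt.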





[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531709457

   I don't fully understand the devdocs failure, but it is related to the fact that arrow 12.0.0 has been released (compare the failing run https://github.com/ursacomputing/crossbow/actions/runs/4862485406/jobs/8668915213#step:9:3878 with https://github.com/ursacomputing/crossbow/actions/runs/4833910722/jobs/8614497534#step:9:3880, where it passed before). The cflags/libs are identical between the two builds. Rebasing to pull in the new version number would probably fix it, though I'm not sure it's worth the electricity to confirm: that's an expensive build for very little value.




[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1522233383

   I've applied the patches and added a bunch of comments to `r/configure` to help with any upcoming refactoring. I want to do a bit more testing of the various branches locally before we move forward, so I've marked this as draft for now.




[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531932426

   Made #35391 for the segfault on the ubuntu force tests job, which I see on `main` as well. 




[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1175256010


##########
r/configure:
##########
@@ -252,6 +252,10 @@ echo "#include $PKG_TEST_HEADER" | ${TEST_CMD} >/dev/null 2>&1
 
 if [ $? -eq 0 ]; then
   # Check for features
+  if [ "$LIB_DIR" = "" ]; then
+    # In case LIB_DIR wasn't previously set, infer from PKG_DIRS
+    LIB_DIR=`echo $PKG_DIRS | sed -e 's/^-L//'`
+  fi

Review Comment:
   The main reason I didn't take this approach is that `r/configure` has gotten so complex that I didn't think I could reason about all of the different branches of the code to know that I had found them all (see also https://github.com/apache/arrow/issues/35049). Other, more concrete issues:
   
   * I did [confirm](https://github.com/apache/arrow/issues/35140#issuecomment-1509271404) what @paleolimbot suggested: if you have `pkg-config` available and have installed `apache-arrow` via `brew`, you never reach this `*** Using Homebrew` block; instead, the script reports that it is using the arrow found by pkg-config. So at a minimum, we (also) need `LIB_DIR` set in the pkg-config section above this.
   * Likewise, there have been similar reports, like the conda-forge one and https://github.com/apache/arrow/issues/31989, not on macOS/brew, so a brew-only fix wouldn't be sufficient.
   * We actually *don't* need the autobrew one because (for "historical reasons", I assume) we explicitly set all of the feature flags in the autobrew script https://github.com/apache/arrow/blob/apache-arrow-11.0.0/r/tools/autobrew#L77
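
The fallback in the diff above can be sketched as follows. `infer_lib_dir` is a hypothetical helper, not something in `r/configure`, and the Homebrew paths are made-up examples. The one-line `sed -e 's/^-L//'` in the diff strips only the leading `-L`, so this sketch assumes `PKG_DIRS` may hold several `-L` entries and takes the first.

```shell
# Hypothetical helper (not part of r/configure): print the directory from
# the first -L entry in a PKG_DIRS-style string.
infer_lib_dir () {
  for flag in $1; do
    case "$flag" in
      -L*) echo "${flag#-L}"; return 0 ;;
    esac
  done
  return 1
}

# Example values, assumed for illustration only.
LIB_DIR=""
PKG_DIRS="-L/opt/homebrew/opt/apache-arrow/lib -L/opt/homebrew/lib"

if [ "$LIB_DIR" = "" ]; then
  # In case LIB_DIR wasn't previously set, infer it from PKG_DIRS
  LIB_DIR="$(infer_lib_dir "$PKG_DIRS")"
fi
echo "LIB_DIR=${LIB_DIR}"
```

This relies on word splitting, so it would not handle library paths that contain spaces. It only illustrates the inference and does not replace setting `LIB_DIR` in each branch.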





[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181558736


##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion arrow`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null
+    if [ $? -eq 1 ]; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # TODO: is there no way we can infer from ArrowOptions.cmake or arrow.pc if these are needed?

Review Comment:
   Thanks, I'll update the comment and remove the TODO





[GitHub] [arrow] paleolimbot commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "paleolimbot (via GitHub)" <gi...@apache.org>.
paleolimbot commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1530754841

   I did get a build error after unsetting ARROW_HOME (see below); however, it worked beautifully on macOS x86! I don't think my M1 problem should block this. This rewrite makes it much easier to fix if it does turn out to be a script issue.
   
   I'll try on Windows and on a few suitable Ubuntu docker images tomorrow.
   
   ```
   unable to load shared object '/Users/deweydunnington/Library/R/arm64/4.2/library/00LOCK-r/00new/arrow/libs/arrow.so':
     dlopen(/Users/deweydunnington/Library/R/arm64/4.2/library/00LOCK-r/00new/arrow/libs/arrow.so, 0x0006): symbol not found in flat namespace (__ZN4absl12lts_2023012510FormatTimeENSt3__117basic_string_viewIcNS1_11char_traitsIcEEEENS0_4TimeENS0_8TimeZoneE)
   Error: loading failed
   Execution halted
   ERROR: loading failed
   * removing ‘/Users/deweydunnington/Library/R/arm64/4.2/library/arrow’
   * restoring previous ‘/Users/deweydunnington/Library/R/arm64/4.2/library/arrow’
   
   Exited with status 1.
   ```
   
   




[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1182550426


##########
r/tools/check-versions.R:
##########
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+args <- commandArgs(TRUE)
+
+# TESTING is set in test-check-version.R; it won't be set when called from configure
+test_mode <- exists("TESTING")
+
+check_versions <- function(r_version, cpp_version) {
+  r_parsed <- package_version(r_version)
+  r_dev_version <- r_parsed[1, 4]
+  r_is_dev <- !is.na(r_dev_version) && r_dev_version > 100
+  cpp_is_dev <- grepl("SNAPSHOT$", cpp_version)
+  cpp_parsed <- package_version(sub("-SNAPSHOT$", "", cpp_version))
+
+  major <- function(x) as.numeric(x[1, 1])
+  # R and C++ denote dev versions differently
+  # R is current.release.9000, C++ is next.release-SNAPSHOT
+  # So a "match" is if the R major version + 1 = C++ major version
+  if (r_is_dev && cpp_is_dev && major(r_parsed) + 1 == major(cpp_parsed)) {
+    msg <- c(
+      sprintf("*** > Packages are both on development versions (%s, %s)", cpp_version, r_version),
+      "*** > If installation fails, rebuild the C++ library to match the R version",
+      "*** > or retry with FORCE_BUNDLED_BUILD=true"
+    )
+    cat(paste0(msg, "\n", collapse = ""))
+  } else if (r_version != cpp_version) {
+    cat(
+      sprintf(
+        "**** Not using found C++ library: version (%s) does not match R package (%s)\n",

Review Comment:
   Yeah it should be in the line immediately above this in the logs.





[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531917921

   > It doesn't seem to be pulling the hosted static libraries for development, which I expected on Ubuntu/x86 (but maybe it's not supposed to?).
   
   We don't host binaries for x.y.z.9000; the nightly versions look like x.y.z.1000123. So if you wanted it to pull a nightly for a .9000 dev build, we'd have to teach it how to find the most recent one. Something for you (or whoever) to consider on #35049, I suppose.
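
As a rough illustration of the distinction, the sketch below classifies an R package version string by its fourth component. `classify_r_version` is a hypothetical helper, and the rule that a nightly is any fourth component above 9000 is an assumption drawn only from the `x.y.z.1000123` example above. It does not attempt to find the most recent nightly.

```shell
# Hypothetical helper: classify an R package version by its fourth component.
# Assumption: x.y.z is a release, x.y.z.9000 is a dev build, and anything
# with a larger fourth component (e.g. x.y.z.1000123) is a nightly.
classify_r_version () {
  fourth="$(echo "$1" | cut -d. -f4)"
  case "$fourth" in
    "") echo "release" ;;
    *[!0-9]*) echo "unknown" ;;
    9000) echo "dev" ;;
    *)
      if [ "$fourth" -gt 9000 ]; then
        echo "nightly"
      else
        echo "unknown"
      fi
      ;;
  esac
}

for v in 12.0.0 12.0.0.9000 12.0.0.1000123; do
  echo "$v: $(classify_r_version "$v")"
done
```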




[GitHub] [arrow] github-actions[bot] commented on pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1509279534

   * Closes: #35140




[GitHub] [arrow] github-actions[bot] commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1527976612

   Revision: 9dfa86d337e8428818148320c84881b3e89cb935
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-8ccc8bc92d](https://github.com/ursacomputing/crossbow/branches/all?query=actions-8ccc8bc92d)
   
   |Task|Status|
   |----|------|
   |conda-linux-aarch64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-conda-linux-aarch64-cpu-r42)](https://github.com/ursacomputing/crossbow/tree/actions-8ccc8bc92d-azure-conda-linux-aarch64-cpu-r42)|
   |conda-linux-x64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-conda-linux-x64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/13108697752)|
   |conda-osx-arm64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-conda-osx-arm64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/13108689686)|
   |conda-osx-x64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-conda-osx-x64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/13108709012)|
   |conda-win-x64-cpu-r41|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-conda-win-x64-cpu-r41)](https://github.com/ursacomputing/crossbow/runs/13108716531)|
   |homebrew-r-autobrew|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-homebrew-r-autobrew)](https://github.com/ursacomputing/crossbow/actions/runs/4833913949/jobs/8614504663)|
   |homebrew-r-brew|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-homebrew-r-brew)](https://github.com/ursacomputing/crossbow/actions/runs/4833909642/jobs/8614494216)|
   |r-binary-packages|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-r-binary-packages)](https://github.com/ursacomputing/crossbow/actions/runs/4833911390/jobs/8614512476)|
   |test-fedora-r-clang-sanitizer|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-fedora-r-clang-sanitizer)](https://github.com/ursacomputing/crossbow/runs/13108706917)|
   |test-r-arrow-backwards-compatibility|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-arrow-backwards-compatibility)](https://github.com/ursacomputing/crossbow/actions/runs/4833911218/jobs/8614498685)|
   |test-r-depsource-bundled|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-depsource-bundled)](https://github.com/ursacomputing/crossbow/runs/13108690920)|
   |test-r-depsource-system|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-depsource-system)](https://github.com/ursacomputing/crossbow/actions/runs/4833908723/jobs/8614491977)|
   |test-r-dev-duckdb|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-dev-duckdb)](https://github.com/ursacomputing/crossbow/actions/runs/4833914621/jobs/8614505330)|
   |test-r-devdocs|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-devdocs)](https://github.com/ursacomputing/crossbow/actions/runs/4833910722/jobs/8614497643)|
   |test-r-gcc-11|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-gcc-11)](https://github.com/ursacomputing/crossbow/actions/runs/4833907171/jobs/8614489160)|
   |test-r-gcc-12|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-gcc-12)](https://github.com/ursacomputing/crossbow/actions/runs/4833908286/jobs/8614491256)|
   |test-r-install-local|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-install-local)](https://github.com/ursacomputing/crossbow/actions/runs/4833910084/jobs/8614495300)|
   |test-r-install-local-minsizerel|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-install-local-minsizerel)](https://github.com/ursacomputing/crossbow/actions/runs/4833915009/jobs/8614506423)|
   |test-r-library-r-base-latest|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-library-r-base-latest)](https://github.com/ursacomputing/crossbow/runs/13108693341)|
   |test-r-linux-as-cran|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-linux-as-cran)](https://github.com/ursacomputing/crossbow/actions/runs/4833911697/jobs/8614500657)|
   |test-r-linux-rchk|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-linux-rchk)](https://github.com/ursacomputing/crossbow/actions/runs/4833909896/jobs/8614496247)|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-linux-valgrind)](https://github.com/ursacomputing/crossbow/runs/13108690386)|
   |test-r-minimal-build|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-minimal-build)](https://github.com/ursacomputing/crossbow/runs/13108707792)|
   |test-r-offline-maximal|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-offline-maximal)](https://github.com/ursacomputing/crossbow/actions/runs/4833915427/jobs/8614507086)|
   |test-r-offline-minimal|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-offline-minimal)](https://github.com/ursacomputing/crossbow/runs/13108701550)|
   |test-r-rhub-debian-gcc-devel-lto-latest|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-rhub-debian-gcc-devel-lto-latest)](https://github.com/ursacomputing/crossbow/runs/13108703385)|
   |test-r-rhub-debian-gcc-release-custom-ccache|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-rhub-debian-gcc-release-custom-ccache)](https://github.com/ursacomputing/crossbow/runs/13108704172)|
   |test-r-rhub-ubuntu-gcc-release-latest|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-rhub-ubuntu-gcc-release-latest)](https://github.com/ursacomputing/crossbow/runs/13108697186)|
   |test-r-rstudio-r-base-4.1-opensuse153|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-rstudio-r-base-4.1-opensuse153)](https://github.com/ursacomputing/crossbow/runs/13108712470)|
   |test-r-rstudio-r-base-4.2-centos7-devtoolset-8|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-rstudio-r-base-4.2-centos7-devtoolset-8)](https://github.com/ursacomputing/crossbow/runs/13108700792)|
   |test-r-rstudio-r-base-4.2-focal|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-r-rstudio-r-base-4.2-focal)](https://github.com/ursacomputing/crossbow/runs/13108710239)|
   |test-r-ubuntu-22.04|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-ubuntu-22.04)](https://github.com/ursacomputing/crossbow/actions/runs/4833906686/jobs/8614488173)|
   |test-r-versions|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-8ccc8bc92d-github-test-r-versions)](https://github.com/ursacomputing/crossbow/actions/runs/4833905084/jobs/8614486197)|
   |test-ubuntu-r-sanitizer|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-8ccc8bc92d-azure-test-ubuntu-r-sanitizer)](https://github.com/ursacomputing/crossbow/runs/13108694642)|




[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181982483


##########
r/configure:
##########
@@ -58,191 +100,282 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  if brew --prefix openssl >/dev/null 2>&1; then
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  fi
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
-  if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --variable=prefix --silence-errors ${PKG_CONFIG_NAME}`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif brew --prefix ${PKG_BREW_NAME} > /dev/null 2>&1; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   else
-    PKG_CONFIG_AVAILABLE=false
-  fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion ${PKG_CONFIG_NAME}`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    if ! ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    export PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # These would be identified by pkg-config, in Requires.private and Libs.private.
+      # Rather than try to re-implement pkg-config, we can just hard-code them here.
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    if ! curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+      # Fall back to the local copy
+      cp tools/autobrew .
     fi
   fi
-fi
+  if ! . autobrew; then
+    echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
+  fi
+  # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+  # TODO: move PKG_LIBS and PKG_CFLAGS out of autobrew and use set_pkg_vars
+}
 
-# If on Raspberry Pi, need to manually link against latomic
-# See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
-if grep raspbian /etc/os-release >/dev/null 2>&1; then
-  PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
-  PKG_LIBS="-latomic $PKG_LIBS"
+# Once libarrow is obtained, this function sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+# either from pkg-config or by inferring things about the directory in $1
+set_pkg_vars () {
+  LIB_DIR="$1/lib"
+  if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+    set_pkg_vars_with_pc
+  else
+    set_pkg_vars_without_pc $1
+  fi
+
+  # Set any user-defined CXXFLAGS
+  if [ "$ARROW_R_CXXFLAGS" ]; then
+    PKG_CFLAGS="$PKG_CFLAGS $ARROW_R_CXXFLAGS"
+  fi
+
+  # Finally, check cmake options for enabled features
+  add_feature_flags
+}
+
+# If we have pkg-config, it will tell us what libarrow needs
+set_pkg_vars_with_pc () {
+  PKG_CFLAGS="`${PKG_CONFIG} --cflags --silence-errors ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
+  PKG_LIBS=`${PKG_CONFIG} --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
+  PKG_DIRS=`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+}
+
+# If we don't have pkg-config, we can make some inferences
+set_pkg_vars_without_pc () {
+  PKG_CFLAGS="-I$1/include $PKG_CFLAGS"
+  if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
+    PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
+  fi
+  PKG_DIRS="-L${LIB_DIR}"
+  if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
+    PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
+  fi
+  PKG_LIBS="-larrow"
+  if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
+    PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
+  fi
+
+  # If on Raspberry Pi, need to manually link against latomic
+  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
+  # pkg-config will handle this for us automatically, see ARROW-6312
+  if grep raspbian /etc/os-release >/dev/null 2>&1; then
+    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
+    PKG_LIBS="-latomic $PKG_LIBS"
+  fi
+}
+
+add_feature_flags () {
+  # Now we need to check what features it was built with and enable
+  # the corresponding feature flags in the R bindings (-DARROW_R_WITH_stuff).
+  # We do this by inspecting ArrowOptions.cmake, which the libarrow build
+  # generates.
+  ARROW_OPTS_CMAKE="$LIB_DIR/cmake/Arrow/ArrowOptions.cmake"
+  if [ ! -f "${ARROW_OPTS_CMAKE}" ]; then
+    echo "*** $ARROW_OPTS_CMAKE not found; some features will not be enabled"
+  else
+    if arrow_built_with ARROW_PARQUET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_PARQUET"
+      PKG_LIBS="-lparquet $PKG_LIBS"
+      # NOTE: parquet is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_DATASET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_DATASET"
+      PKG_LIBS="-larrow_dataset $PKG_LIBS"
+      # NOTE: arrow-dataset is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_ACERO; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_ACERO"
+      PKG_LIBS="-larrow_acero $PKG_LIBS"
+      # NOTE: arrow-acero is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_SUBSTRAIT; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_SUBSTRAIT"
+      PKG_LIBS="-larrow_substrait $PKG_LIBS"
+      # NOTE: arrow-substrait is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_JSON; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_JSON"
+    fi
+    if arrow_built_with ARROW_S3; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_S3"
+      PKG_LIBS="$PKG_LIBS $S3_LIBS"
+    fi
+    if arrow_built_with ARROW_GCS; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_GCS"
+      PKG_LIBS="$PKG_LIBS $GCS_LIBS"
+    fi
+  fi
+}
+
+arrow_built_with() {
+  # Function to check cmake options for features
+  grep -i 'set('"$1"' "ON")' $ARROW_OPTS_CMAKE >/dev/null 2>&1
+}
+
+##############
+# Main logic #
+##############

Review Comment:
   Yeah I agree that would be nice.
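
The feature detection at the end of the diff above (`add_feature_flags` calling `arrow_built_with`) can be tried in isolation. The following is a minimal sketch, not the script itself: the temporary file and its two `set(...)` lines are assumed sample content standing in for a real `ArrowOptions.cmake`, and the loop over `PARQUET` and `S3` is only for illustration.

```shell
#!/bin/sh
# Stand-in for ArrowOptions.cmake; the two lines below are assumed sample content.
ARROW_OPTS_CMAKE=`mktemp`
cat > "$ARROW_OPTS_CMAKE" <<'EOF'
set(ARROW_PARQUET "ON")
set(ARROW_S3 "OFF")
EOF

# Same check as in the diff: succeeds only if the option is set to "ON"
arrow_built_with () {
  grep -i 'set('"$1"' "ON")' "$ARROW_OPTS_CMAKE" >/dev/null 2>&1
}

# Add one -DARROW_R_WITH_<feature> flag per enabled feature
PKG_CFLAGS=""
for feature in PARQUET S3; do
  if arrow_built_with "ARROW_$feature"; then
    PKG_CFLAGS="${PKG_CFLAGS:+$PKG_CFLAGS }-DARROW_R_WITH_$feature"
  fi
done

rm -f "$ARROW_OPTS_CMAKE"
echo "PKG_CFLAGS=$PKG_CFLAGS"
```

With this sample file, only the Parquet flag is added, because `ARROW_S3` is set to `"OFF"` and so does not match the pattern.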





[GitHub] [arrow] github-actions[bot] commented on pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1509279564

   :warning: GitHub issue #35140 **has been automatically assigned in GitHub** to PR creator.




[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531526489

   In the failed build, `PKG_CFLAGS=-DARROW_STATIC -DUTF8PROC_EXPORTS -I/Users/voltrondata/tmp/RtmpQAJbFi/R.INSTALLca181e5538f5/arrow/libarrow/arrow-11.0.0.100000506/include -I/opt/homebrew/include`




[GitHub] [arrow] kou commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531525952

   Ah, OK. Thanks.




[GitHub] [arrow] kou commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531556879

   I think that this is the same problem as https://github.com/apache/arrow/issues/34523 .
   This is a problem in the C++ part, not the R part,
   so I think we can work on it in a separate pull request.
     
   `__ZN4absl12lts_2023012510FormatTimeENSt3__117basic_string_viewIcNS1_11char_traitsIcEEEENS0_4TimeENS0_8TimeZoneE` is `absl::lts_20230125::FormatTime(std::__1::basic_string_view<char, std::__1::char_traits<char> >, absl::lts_20230125::Time, absl::lts_20230125::TimeZone)`. It's used at https://github.com/apache/arrow/blob/a3a01726b1ca195c3d26fec084c0ff19453e6866/cpp/src/arrow/filesystem/gcsfs_internal.cc#L241 .
   
   Abseil was mixed when we built C++'s `libarrow.a`, not when we built R's `arrow.so`.
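
For anyone who wants to reproduce the demangling in the comment above, here is a minimal sketch. It assumes a `c++filt` binary is on the `PATH` (GNU binutils or LLVM both provide one) and skips the step otherwise. The symbol carries the extra leading underscore that macOS adds, so the sketch removes one underscore itself and passes `-n` so that `c++filt` does not strip another.

```shell
#!/bin/sh
# Mangled symbol as quoted in the comment (macOS form, with one extra leading underscore)
MANGLED='__ZN4absl12lts_2023012510FormatTimeENSt3__117basic_string_viewIcNS1_11char_traitsIcEEEENS0_4TimeENS0_8TimeZoneE'

# Drop the platform underscore so the name starts with the Itanium ABI prefix "_Z"
STRIPPED=`printf '%s\n' "$MANGLED" | sed -e 's/^_//'`

if command -v c++filt >/dev/null 2>&1; then
  # -n: do not strip a further leading underscore
  DEMANGLED=`printf '%s\n' "$STRIPPED" | c++filt -n`
  echo "$DEMANGLED"
else
  DEMANGLED=""
  echo "c++filt not found; skipping demangling"
fi
```

The `lts_20230125` component in the demangled name is Abseil's versioned inline namespace, which is what identifies the Abseil release the object code was compiled against.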




[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531595020

   @github-actions crossbow submit -g r




[GitHub] [arrow] github-actions[bot] commented on pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1509279583

   :warning: GitHub issue #35140 **has no components**, please add labels for components.




[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1176615155


##########
r/configure:
##########
@@ -252,6 +252,10 @@ echo "#include $PKG_TEST_HEADER" | ${TEST_CMD} >/dev/null 2>&1
 
 if [ $? -eq 0 ]; then
   # Check for features
+  if [ "$LIB_DIR" = "" ]; then
+    # In case LIB_DIR wasn't previously set, infer from PKG_DIRS
+    LIB_DIR=`echo $PKG_DIRS | sed -e 's/^-L//'`
+  fi

Review Comment:
   🤦 My mistake, sorry: I missed #34229, which explains some of the confusion. I'll tidy this PR up with any remaining useful changes and get it merged.





[GitHub] [arrow] kou commented on a diff in pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1176000197


##########
r/configure:
##########
@@ -252,6 +252,10 @@ echo "#include $PKG_TEST_HEADER" | ${TEST_CMD} >/dev/null 2>&1
 
 if [ $? -eq 0 ]; then
   # Check for features
+  if [ "$LIB_DIR" = "" ]; then
+    # In case LIB_DIR wasn't previously set, infer from PKG_DIRS
+    LIB_DIR=`echo $PKG_DIRS | sed -e 's/^-L//'`
+  fi

Review Comment:
   Sorry, could you explain the context for this in more detail? Can I see a build log for this case? Is it https://github.com/apache/arrow/issues/35140, or another case?
   
   If this is for #35140, I think that this isn't related, because `FOUND_LIB_DIR` doesn't contain any spaces:
   
   ```text
   *** Arrow C++ libraries found via pkg-config at /opt/homebrew/Cellar/apache-arrow/11.0.0_3/lib
   ```
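
To make the point about spaces concrete, here is a minimal sketch of what the `sed -e 's/^-L//'` inference from the diff yields. `infer_lib_dir` is a hypothetical helper name, and the second input (two `-L` flags) is a made-up value, not taken from any build log.

```shell
#!/bin/sh
# Hypothetical helper mirroring the inference in the diff: strip a leading -L
infer_lib_dir () {
  printf '%s\n' "$1" | sed -e 's/^-L//'
}

# Single -L flag, as in the #35140 log line quoted above
SINGLE=`infer_lib_dir "-L/opt/homebrew/Cellar/apache-arrow/11.0.0_3/lib"`

# Made-up value with two -L flags separated by a space
MULTI=`infer_lib_dir "-L/usr/local/lib -L/opt/example/lib"`

echo "single: $SINGLE"
echo "multi:  $MULTI"
```

Only the first `-L` is removed, so the single-flag case gives a usable directory, while the two-flag case leaves `/usr/local/lib -L/opt/example/lib`, which is not a valid path.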





[GitHub] [arrow] github-actions[bot] commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1529017959

   Revision: d88f71a142b2cd6f68f942bcc5f874dfcc1a5e1c
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-58d272b132](https://github.com/ursacomputing/crossbow/branches/all?query=actions-58d272b132)
   
   |Task|Status|
   |----|------|
   |r-binary-packages|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-58d272b132-github-r-binary-packages)](https://github.com/ursacomputing/crossbow/actions/runs/4844397310/jobs/8632658902)|




[GitHub] [arrow] kou commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531524585

   Hmm. It seems that autobrew, not Homebrew, was used in the job that passed.




[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531548459

   autobrew does not use pkg-config to set `PKG_CFLAGS`, so `-I/opt/homebrew/include` is not present in that build.
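
A quick way to compare two builds on this point is to test whether a given include directory appears in the flags. This is a minimal sketch: `has_include_dir` is a hypothetical helper, `FAILED_CFLAGS` is the value reported for the failed build in this thread, and `OTHER_CFLAGS` is a made-up value standing in for a build whose flags did not come from pkg-config.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the flag string $1 contains -I$2 as a whole word
has_include_dir () {
  case " $1 " in
    *" -I$2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# PKG_CFLAGS as reported for the failed build in this thread
FAILED_CFLAGS="-DARROW_STATIC -DUTF8PROC_EXPORTS -I/Users/voltrondata/tmp/RtmpQAJbFi/R.INSTALLca181e5538f5/arrow/libarrow/arrow-11.0.0.100000506/include -I/opt/homebrew/include"

# Made-up flags for comparison (assumption, not from any log)
OTHER_CFLAGS="-DARROW_STATIC -I/tmp/example/include"

for flags in "$FAILED_CFLAGS" "$OTHER_CFLAGS"; do
  if has_include_dir "$flags" /opt/homebrew/include; then
    echo "contains -I/opt/homebrew/include"
  else
    echo "does not contain -I/opt/homebrew/include"
  fi
done
```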




[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1182615413


##########
r/configure:
##########
@@ -17,39 +17,81 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# Anticonf (tm) script by Jeroen Ooms, Jim Hester (2017)
-# License: MIT
+# This script makes sure we have our C++ dependencies in order
+# so that we can compile the bindings in src/ and build the
+# R package's shared library.
 #
-# This script will query 'pkg-config' for the required cflags and ldflags.
-# If pkg-config is unavailable or does not find the library, try setting
-# INCLUDE_DIR and LIB_DIR manually via e.g:
-# R CMD INSTALL --configure-vars='INCLUDE_DIR=/.../include LIB_DIR=/.../lib'
+# The core logic is:
+#
+# * Find libarrow on the system. If it is present, make sure
+#   that its version is compatible with the R package.
+# * If no suitable libarrow is found, download it (where allowed)
+#   or build it from source.
+# * Determine what features this libarrow has and what other
+#   flags it requires, and set them in src/Makevars for use when
+#   compiling the bindings.
+# * Run a test program to confirm that arrow headers are found
+#
+# It is intended to be flexible enough to account for several
+# workflows, and should be expected to always result in a successful
+# build as long as you haven't set environment variables that
+# would limit its options.
+#
+# * Installing a released version from source, as from CRAN, with
+#   no other prior setup
+#   * On macOS, autobrew is used to retrieve libarrow and dependencies
+#   * On Linux, the nixlibs.R build script will download or build
+# * Installing a released version but first installing libarrow.
+#   It will use pkg-config and brew to search for libraries.
+# * Installing a development version from source as a user.
+#   nixlibs.R will be used to build from source. System packages that
+#   are not on the same dev version will be skipped.

Review Comment:
   ```suggestion
   #   Any libarrow found on the system corresponding to a release will not match
   #   the dev version and will thus be skipped. nixlibs.R will be used to build libarrow.
   ```
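
The "core logic" described in the header comment of the diff above can be sketched as a short control flow. This is an illustration under stated assumptions, not the script's actual main section: the three functions reuse names from the PR, but their bodies here are stubs, and the paths are made up.

```shell
#!/bin/sh
# Stub: pretend no suitable libarrow was found on the system (assumption)
find_arrow () {
  _LIBARROW_FOUND="false"
}

# Stub: pretend the bundled build succeeded and installed here (made-up path)
do_bundled_build () {
  _LIBARROW_FOUND="/tmp/example/libarrow/arrow-0.0.0"
}

# Stub: derive compiler and linker flags from the install prefix in $1
set_pkg_vars () {
  PKG_CFLAGS="-I$1/include"
  PKG_DIRS="-L$1/lib"
  PKG_LIBS="-larrow"
}

# 1. Look for a compatible libarrow already on the system
find_arrow

# 2. If none was found, download or build one
if [ "$_LIBARROW_FOUND" = "false" ]; then
  do_bundled_build
fi

# 3. Set the flags that would go into src/Makevars
if [ "$_LIBARROW_FOUND" != "false" ]; then
  set_pkg_vars "$_LIBARROW_FOUND"
  echo "PKG_CFLAGS=$PKG_CFLAGS"
  echo "PKG_LIBS=$PKG_DIRS $PKG_LIBS"
else
  echo "No libarrow available"
fi
```

Using the string `"false"` as the "not found" marker follows the convention visible in the PR's `find_arrow` and `do_bundled_build` functions.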





[GitHub] [arrow] github-actions[bot] commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1528830778

   Revision: 8312ac4ebee5e5b11f814f48ad510087c6622cae
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-e67c203de4](https://github.com/ursacomputing/crossbow/branches/all?query=actions-e67c203de4)
   
   |Task|Status|
   |----|------|
   |conda-linux-aarch64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-conda-linux-aarch64-cpu-r42)](https://github.com/ursacomputing/crossbow/tree/actions-e67c203de4-azure-conda-linux-aarch64-cpu-r42)|
   |conda-linux-x64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-conda-linux-x64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/13121563498)|
   |conda-osx-arm64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-conda-osx-arm64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/13121571232)|
   |conda-osx-x64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-conda-osx-x64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/13121567684)|
   |conda-win-x64-cpu-r41|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-conda-win-x64-cpu-r41)](https://github.com/ursacomputing/crossbow/runs/13121570603)|
   |homebrew-r-autobrew|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-homebrew-r-autobrew)](https://github.com/ursacomputing/crossbow/actions/runs/4839866184/jobs/8625163987)|
   |homebrew-r-brew|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-homebrew-r-brew)](https://github.com/ursacomputing/crossbow/actions/runs/4839865491/jobs/8625162675)|
   |r-binary-packages|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-r-binary-packages)](https://github.com/ursacomputing/crossbow/actions/runs/4839864041/jobs/8625165299)|
   |test-fedora-r-clang-sanitizer|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-fedora-r-clang-sanitizer)](https://github.com/ursacomputing/crossbow/runs/13121566401)|
   |test-r-arrow-backwards-compatibility|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-arrow-backwards-compatibility)](https://github.com/ursacomputing/crossbow/actions/runs/4839864589/jobs/8625161313)|
   |test-r-depsource-bundled|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-depsource-bundled)](https://github.com/ursacomputing/crossbow/runs/13121570850)|
   |test-r-depsource-system|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-depsource-system)](https://github.com/ursacomputing/crossbow/actions/runs/4839866509/jobs/8625164211)|
   |test-r-dev-duckdb|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-dev-duckdb)](https://github.com/ursacomputing/crossbow/actions/runs/4839867099/jobs/8625165200)|
   |test-r-devdocs|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-devdocs)](https://github.com/ursacomputing/crossbow/actions/runs/4839863630/jobs/8625159434)|
   |test-r-gcc-11|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-gcc-11)](https://github.com/ursacomputing/crossbow/actions/runs/4839865278/jobs/8625162382)|
   |test-r-gcc-12|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-gcc-12)](https://github.com/ursacomputing/crossbow/actions/runs/4839863744/jobs/8625159633)|
   |test-r-install-local|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-install-local)](https://github.com/ursacomputing/crossbow/actions/runs/4839863481/jobs/8625159208)|
   |test-r-install-local-minsizerel|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-install-local-minsizerel)](https://github.com/ursacomputing/crossbow/actions/runs/4839863324/jobs/8625159026)|
   |test-r-library-r-base-latest|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-library-r-base-latest)](https://github.com/ursacomputing/crossbow/runs/13121563807)|
   |test-r-linux-as-cran|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-linux-as-cran)](https://github.com/ursacomputing/crossbow/actions/runs/4839864180/jobs/8625160635)|
   |test-r-linux-rchk|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-linux-rchk)](https://github.com/ursacomputing/crossbow/actions/runs/4839866614/jobs/8625164369)|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-linux-valgrind)](https://github.com/ursacomputing/crossbow/runs/13121566553)|
   |test-r-minimal-build|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-minimal-build)](https://github.com/ursacomputing/crossbow/runs/13121569110)|
   |test-r-offline-maximal|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-offline-maximal)](https://github.com/ursacomputing/crossbow/actions/runs/4839867375/jobs/8625165768)|
   |test-r-offline-minimal|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-offline-minimal)](https://github.com/ursacomputing/crossbow/runs/13121567457)|
   |test-r-rhub-debian-gcc-devel-lto-latest|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-rhub-debian-gcc-devel-lto-latest)](https://github.com/ursacomputing/crossbow/runs/13121563662)|
   |test-r-rhub-debian-gcc-release-custom-ccache|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-rhub-debian-gcc-release-custom-ccache)](https://github.com/ursacomputing/crossbow/runs/13121569857)|
   |test-r-rhub-ubuntu-gcc-release-latest|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-rhub-ubuntu-gcc-release-latest)](https://github.com/ursacomputing/crossbow/runs/13121565163)|
   |test-r-rstudio-r-base-4.1-opensuse153|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-rstudio-r-base-4.1-opensuse153)](https://github.com/ursacomputing/crossbow/runs/13121571501)|
   |test-r-rstudio-r-base-4.2-centos7-devtoolset-8|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-rstudio-r-base-4.2-centos7-devtoolset-8)](https://github.com/ursacomputing/crossbow/runs/13121563922)|
   |test-r-rstudio-r-base-4.2-focal|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-r-rstudio-r-base-4.2-focal)](https://github.com/ursacomputing/crossbow/runs/13121571670)|
   |test-r-ubuntu-22.04|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-ubuntu-22.04)](https://github.com/ursacomputing/crossbow/actions/runs/4839864124/jobs/8625160378)|
   |test-r-versions|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-e67c203de4-github-test-r-versions)](https://github.com/ursacomputing/crossbow/actions/runs/4839864432/jobs/8625161213)|
   |test-ubuntu-r-sanitizer|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-e67c203de4-azure-test-ubuntu-r-sanitizer)](https://github.com/ursacomputing/crossbow/runs/13121568100)|




[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1528856880

   Conda failures all appear to be in Python (perhaps the pandas-related tests). One task, `r-binary-packages`, is failing with something related to `LIB_DIR` again; I will investigate. Other than that, and inspecting the other jobs to make sure that features are correctly enabled, this is about done.




[GitHub] [arrow] paleolimbot commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "paleolimbot (via GitHub)" <gi...@apache.org>.
paleolimbot commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1529709446

   Just catching up on this after some PTO! I will give this a test ride on all my local environments today 🙂 




[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181558415


##########
r/configure:
##########
@@ -58,191 +99,284 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
-fi
-
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+# Test if pkg-config is available to use
+${PKG_CONFIG} --version >/dev/null 2>&1
+if [ $? -eq 0 ]; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
   ARROW_USE_PKG_CONFIG="false"
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  brew --prefix openssl >/dev/null 2>&1
   if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
-  else
-    PKG_CONFIG_AVAILABLE=false
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+fi
+
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME} | sed -e 's/^-L//' | sed -e 's@/lib\$@@'`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif [ "$UNAME" = "Darwin" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  else
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion arrow`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null
+    if [ $? -eq 1 ]; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # TODO: is there no way we can infer from ArrowOptions.cmake or arrow.pc if these are needed?
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
+    if [ $? -ne 0 ]; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+      # Fall back to the local copy
+      cp tools/autobrew .
     fi
   fi
-fi
+  . autobrew
+  if [ $? -ne 0 ]; then
+    echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
+  fi
+  # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+  # TODO: move PKG_LIBS and PKG_CFLAGS out of autobrew and use set_pkg_vars
+}
 
-# If on Raspberry Pi, need to manually link against latomic
-# See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
-if grep raspbian /etc/os-release >/dev/null 2>&1; then
-  PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
-  PKG_LIBS="-latomic $PKG_LIBS"
+# Once libarrow is obtained, this function sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+# either from pkg-config or by inferring things about the directory in $1
+set_pkg_vars () {
+  LIB_DIR="$1/lib"
+  if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+    set_pkg_vars_with_pc
+  else
+    set_pkg_vars_without_pc $1
+  fi
+
+  # If on Raspberry Pi, need to manually link against latomic
+  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
+  if grep raspbian /etc/os-release >/dev/null 2>&1; then
+    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
+    PKG_LIBS="-latomic $PKG_LIBS"
+  fi

Review Comment:
   👍 I'll move it into `set_pkg_vars_without_pc`
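
One way the Raspberry Pi check could be isolated before moving it is sketched below. The function name `add_latomic_if_raspbian` and the `OS_RELEASE_FILE` override are hypothetical and exist only so the check can be exercised on a machine that is not a Pi; the flags themselves are the ones in the hunk above.

```shell
# Hypothetical helper: append -latomic only when the OS release file
# mentions raspbian. OS_RELEASE_FILE is an assumed override for testing;
# by default the real /etc/os-release is read.
add_latomic_if_raspbian () {
  os_release="${OS_RELEASE_FILE:-/etc/os-release}"
  if grep raspbian "$os_release" >/dev/null 2>&1; then
    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
    PKG_LIBS="-latomic $PKG_LIBS"
  fi
}
```

Calling this at the end of `set_pkg_vars_without_pc` would keep the behavior of the hunk while limiting it to the case where pkg-config is not doing the linking.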





[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1529714385

   > Just catching up on this after some PTO! I will give this a test ride on all my local environments today 🙂
   
   🙏  Some combinations of things I tested locally, with a dev build of arrow in `ARROW_HOME` and having done `brew install apache-arrow`:
   
   * Usual way, with `ARROW_HOME` set: you see the new message warning that you're on dev versions and that, if the build fails, you should update your C++ build
   * Edit the version number in `"$ARROW_HOME/lib/pkgconfig/arrow.pc"` and see that it is rejected because the versions don't match.
   * Unset `ARROW_HOME`: Homebrew apache-arrow is found, but it fails the version check, so it isn't used. Since we're on a dev version, autobrew isn't attempted and instead the bundled build starts (I ctrl-C'd though, not wanting to wait for it).
   * Put `"$(brew --prefix apache-arrow)/lib/pkgconfig"` on `PKG_CONFIG_PATH` and see that `pkg-config` finds it (but then it is rejected because the versions don't match)
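
The `arrow.pc` check in the list above can be reproduced without touching a real installation. The sketch below is an assumption-laden illustration: `pc_version` is a hypothetical helper that mirrors the `grep`/`sed` fallback used by the configure script when pkg-config is unavailable, and the `arrow.pc` contents are made up.

```shell
# Hypothetical helper: read the Version field from
# <prefix>/lib/pkgconfig/arrow.pc without calling pkg-config.
pc_version () {
  grep '^Version' "$1/lib/pkgconfig/arrow.pc" | sed -e 's/^Version:[[:space:]]*//'
}

# Build a throwaway prefix with a minimal, made-up arrow.pc.
prefix="$(mktemp -d)"
mkdir -p "$prefix/lib/pkgconfig"
printf 'Name: Apache Arrow\nVersion: 12.0.0-SNAPSHOT\n' > "$prefix/lib/pkgconfig/arrow.pc"
echo "found: $(pc_version "$prefix")"

# Simulate editing the version number so that it no longer matches.
sed -i.bak -e 's/^Version: .*/Version: 0.0.1/' "$prefix/lib/pkgconfig/arrow.pc"
echo "after edit: $(pc_version "$prefix")"
```

Pointing `ARROW_HOME` at such a prefix would only exercise the version check; there is no library in it to link against.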




[GitHub] [arrow] paleolimbot commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "paleolimbot (via GitHub)" <gi...@apache.org>.
paleolimbot commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531894184

   Works great on Ubuntu via Docker (arm64 and amd64)!
   
   ```
   # docker run --rm -it ghcr.io/apache/arrow-nanoarrow:ubuntu-arm64 or amd64
   git clone https://github.com/nealrichardson/arrow.git
   cd arrow/r
   git switch lib-dir
   R CMD INSTALL .
   ```
   
   It doesn't seem to be pulling the hosted static libraries for development, which I expected on Ubuntu/x86 (but maybe it's not supposed to?).




[GitHub] [arrow] kou commented on a diff in pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1176005072


##########
r/configure:
##########
@@ -252,6 +252,10 @@ echo "#include $PKG_TEST_HEADER" | ${TEST_CMD} >/dev/null 2>&1
 
 if [ $? -eq 0 ]; then
   # Check for features
+  if [ "$LIB_DIR" = "" ]; then
+    # In case LIB_DIR wasn't previously set, infer from PKG_DIRS
+    LIB_DIR=`echo $PKG_DIRS | sed -e 's/^-L//'`
+  fi

Review Comment:
   Ah, #35140 uses Apache Arrow 11.0.0.
   I was looking at the main branch.
   
   Yes. The change https://github.com/apache/arrow/pull/35147#discussion_r1175945751 will fix #35140. And it's already in the main branch: https://github.com/apache/arrow/pull/34229
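
For reference, the fallback in the hunk under review infers `LIB_DIR` by stripping the leading `-L` from `PKG_DIRS`. The sketch below shows that inference and one limitation of it; `infer_lib_dir` and `first_lib_dir` are hypothetical names, and the second function is a possible variant, not something proposed in this PR.

```shell
# Same substitution as in the hunk: strip a leading -L.
infer_lib_dir () {
  echo "$1" | sed -e 's/^-L//'
}

# Hypothetical variant: keep only the first -L entry when PKG_DIRS
# holds several flags. Relies on word splitting, so paths containing
# spaces are not supported.
first_lib_dir () {
  for flag in $1; do
    case "$flag" in
      -L*) echo "${flag#-L}"; return 0 ;;
    esac
  done
  return 1
}
```

With a single flag both give the same directory. With `-L/opt/arrow/lib -L/usr/lib`, `infer_lib_dir` returns `/opt/arrow/lib -L/usr/lib`, which is not a usable directory, whereas `first_lib_dir` returns `/opt/arrow/lib`.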





[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1519064884

   If this is in fact the fix for the conda package issues (https://github.com/conda-forge/r-arrow-feedstock/issues/56#issuecomment-1518745545, https://github.com/conda-forge/r-arrow-feedstock/issues/62), then we would have a way to test this in the nightly conda packaging jobs. We currently build the packages but don't install them or assert anything about their capabilities.
   
   Perhaps we can add something to https://github.com/apache/arrow/blob/main/dev/tasks/conda-recipes/r-arrow/build.sh to load the installed package and test that it has parquet, dataset, etc.
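
A rough sketch of what such a check might look like follows. It is not taken from `build.sh`: the function name `assert_arrow_capabilities` and the `RSCRIPT` override are hypothetical, and the sketch assumes the package has already been installed into the library that `Rscript` uses. `arrow_with_parquet()` and `arrow_with_dataset()` are existing functions in the R package.

```shell
# Hypothetical check for a packaging script. RSCRIPT is an assumed
# override (defaults to Rscript) so the function can be tried with a stub.
# Fails (non-zero exit) if the installed arrow lacks Parquet or Dataset support.
assert_arrow_capabilities () {
  "${RSCRIPT:-Rscript}" -e 'library(arrow); stopifnot(arrow_with_parquet(), arrow_with_dataset())'
}
```

A packaging job could call `assert_arrow_capabilities || exit 1` after installation, so that a build that silently dropped features would fail the job rather than pass.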




[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1175945751


##########
r/configure:
##########
@@ -252,6 +252,10 @@ echo "#include $PKG_TEST_HEADER" | ${TEST_CMD} >/dev/null 2>&1
 
 if [ $? -eq 0 ]; then
   # Check for features
+  if [ "$LIB_DIR" = "" ]; then
+    # In case LIB_DIR wasn't previously set, infer from PKG_DIRS
+    LIB_DIR=`echo $PKG_DIRS | sed -e 's/^-L//'`
+  fi

Review Comment:
   @kou is this possibly relevant? `PKG_LIBS` and `PKG_DIRS` seem to be set correctly but for some reason `LIB_DIR` isn't. 
   
   ```diff
   diff --git a/r/configure b/r/configure
   index 65528bdc38..8e130c53b4 100755
   --- a/r/configure
   +++ b/r/configure
   @@ -110,7 +110,7 @@ else
        PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
        PKG_LIBS="${PKGCONFIG_LIBS}"
        PKG_DIRS="${PKGCONFIG_DIRS}"
   -    LIB_DIR=${FOUND_LIB_DIR}
   +    LIB_DIR="${FOUND_LIB_DIR}"
    
        # Check for version mismatch
        PC_LIB_VERSION=`pkg-config --modversion arrow`
   ```
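
As a side note rather than a diagnosis: POSIX shells do not apply field splitting to the right-hand side of a plain variable assignment, so the quoting change in the diff above would not by itself be expected to change the value of `LIB_DIR`. The sketch below illustrates this with a made-up `FOUND_LIB_DIR` value that contains a space.

```shell
#!/bin/sh
# Illustration only: FOUND_LIB_DIR is a made-up value containing a space.
FOUND_LIB_DIR="/opt/my arrow/lib"

# Both forms assign the full string; assignments are not word-split.
UNQUOTED=${FOUND_LIB_DIR}
QUOTED="${FOUND_LIB_DIR}"

if [ "$UNQUOTED" = "$QUOTED" ]; then
  echo "same value"
fi
```

Quoting is still good style, since the same expansion would be split if it were used as a command argument.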



[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1527970913

   @github-actions crossbow submit -g r


[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1528075220

   I will inspect the successful builds to make sure features are enabled as they should be. Of the failures:
   
   * `test-r-offline-maximal` is a download timeout for orc (which we don't even use). https://github.com/ursacomputing/crossbow/actions/runs/4833915427/jobs/8614507086#step:4:50
   * `test-r-arrow-backwards-compatibility` I don't quite understand at first glance and will have to inspect; it says it can't find libcurl https://github.com/ursacomputing/crossbow/actions/runs/4833911218/jobs/8614498685#step:7:27
   * `r-binary-packages` has at least one failure in parsing LIB_DIR: the code assumed the value contained only one `-L` flag, but there were two https://github.com/ursacomputing/crossbow/actions/runs/4833911390/jobs/8614512347#step:14:25930
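
To illustrate the last failure: with two flags, a substitution like `sed -e 's/^-L//'` strips only the leading `-L`, so `-L/opt/arrow/lib -L/usr/local/lib` becomes `/opt/arrow/lib -L/usr/local/lib`, which is not a usable directory. One possible way to handle this, shown as a sketch and not as the code that was merged, is to take the first `-L` entry. The helper name `first_lib_dir` and the example paths are assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not the code merged in the PR): print the first
# directory named by a -L flag in a string of linker flags.
first_lib_dir () {
  # $1 is intentionally unquoted so the flags split on whitespace.
  # This sketch does not handle directories that contain spaces.
  for flag in $1; do
    case "$flag" in
      -L*)
        echo "${flag#-L}"   # strip the -L prefix
        return 0
        ;;
    esac
  done
  return 1   # no -L flag present
}

first_lib_dir "-L/opt/arrow/lib -L/usr/local/lib"
```

Whether the first entry is the right one to use depends on how the flags were assembled, so this is a design choice to verify against the actual pkg-config output.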
   


[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1528829943

   @github-actions crossbow submit -g r


[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531474469

   Also @paleolimbot you don't need to test Windows because this PR doesn't modify `configure.win`.


[GitHub] [arrow] github-actions[bot] commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531601477

   Revision: a769f6ae5dd5f5b1e713e0917491a1f464b1ae97
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-674657ab73](https://github.com/ursacomputing/crossbow/branches/all?query=actions-674657ab73)
   
   |Task|Status|
   |----|------|
   |conda-linux-aarch64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-conda-linux-aarch64-cpu-r42)](https://github.com/ursacomputing/crossbow/tree/actions-674657ab73-azure-conda-linux-aarch64-cpu-r42)|
   |conda-linux-x64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-conda-linux-x64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/13175271441)|
   |conda-osx-arm64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-conda-osx-arm64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/13175267662)|
   |conda-osx-x64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-conda-osx-x64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/13175272384)|
   |conda-win-x64-cpu-r41|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-conda-win-x64-cpu-r41)](https://github.com/ursacomputing/crossbow/runs/13175269119)|
   |homebrew-r-autobrew|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-homebrew-r-autobrew)](https://github.com/ursacomputing/crossbow/actions/runs/4862491850/jobs/8668932385)|
   |homebrew-r-brew|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-homebrew-r-brew)](https://github.com/ursacomputing/crossbow/actions/runs/4862487434/jobs/8668920617)|
   |r-binary-packages|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-r-binary-packages)](https://github.com/ursacomputing/crossbow/actions/runs/4862484825/jobs/8668928476)|
   |test-fedora-r-clang-sanitizer|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-fedora-r-clang-sanitizer)](https://github.com/ursacomputing/crossbow/runs/13175258346)|
   |test-r-arrow-backwards-compatibility|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-arrow-backwards-compatibility)](https://github.com/ursacomputing/crossbow/actions/runs/4862490505/jobs/8668928395)|
   |test-r-depsource-bundled|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-depsource-bundled)](https://github.com/ursacomputing/crossbow/runs/13175282145)|
   |test-r-depsource-system|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-depsource-system)](https://github.com/ursacomputing/crossbow/actions/runs/4862485718/jobs/8668916186)|
   |test-r-dev-duckdb|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-dev-duckdb)](https://github.com/ursacomputing/crossbow/actions/runs/4862486482/jobs/8668918154)|
   |test-r-devdocs|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-devdocs)](https://github.com/ursacomputing/crossbow/actions/runs/4862485406/jobs/8668915379)|
   |test-r-gcc-11|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-gcc-11)](https://github.com/ursacomputing/crossbow/actions/runs/4862486289/jobs/8668917723)|
   |test-r-gcc-12|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-gcc-12)](https://github.com/ursacomputing/crossbow/actions/runs/4862489803/jobs/8668926425)|
   |test-r-install-local|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-install-local)](https://github.com/ursacomputing/crossbow/actions/runs/4862486738/jobs/8668918967)|
   |test-r-install-local-minsizerel|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-install-local-minsizerel)](https://github.com/ursacomputing/crossbow/actions/runs/4862491548/jobs/8668931279)|
   |test-r-library-r-base-latest|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-library-r-base-latest)](https://github.com/ursacomputing/crossbow/runs/13175261559)|
   |test-r-linux-as-cran|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-linux-as-cran)](https://github.com/ursacomputing/crossbow/actions/runs/4862489575/jobs/8668926888)|
   |test-r-linux-rchk|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-linux-rchk)](https://github.com/ursacomputing/crossbow/actions/runs/4862490782/jobs/8668929350)|
   |test-r-linux-valgrind|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-linux-valgrind)](https://github.com/ursacomputing/crossbow/runs/13175276202)|
   |test-r-minimal-build|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-minimal-build)](https://github.com/ursacomputing/crossbow/runs/13175279362)|
   |test-r-offline-maximal|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-offline-maximal)](https://github.com/ursacomputing/crossbow/actions/runs/4862491280/jobs/8668930625)|
   |test-r-offline-minimal|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-offline-minimal)](https://github.com/ursacomputing/crossbow/runs/13175268168)|
   |test-r-rhub-debian-gcc-devel-lto-latest|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-rhub-debian-gcc-devel-lto-latest)](https://github.com/ursacomputing/crossbow/runs/13175273799)|
   |test-r-rhub-debian-gcc-release-custom-ccache|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-rhub-debian-gcc-release-custom-ccache)](https://github.com/ursacomputing/crossbow/runs/13175264788)|
   |test-r-rhub-ubuntu-gcc-release-latest|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-rhub-ubuntu-gcc-release-latest)](https://github.com/ursacomputing/crossbow/runs/13175267101)|
   |test-r-rstudio-r-base-4.1-opensuse153|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-rstudio-r-base-4.1-opensuse153)](https://github.com/ursacomputing/crossbow/runs/13175260970)|
   |test-r-rstudio-r-base-4.2-centos7-devtoolset-8|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-rstudio-r-base-4.2-centos7-devtoolset-8)](https://github.com/ursacomputing/crossbow/runs/13175259556)|
   |test-r-rstudio-r-base-4.2-focal|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-r-rstudio-r-base-4.2-focal)](https://github.com/ursacomputing/crossbow/runs/13175278243)|
   |test-r-ubuntu-22.04|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-ubuntu-22.04)](https://github.com/ursacomputing/crossbow/actions/runs/4862488700/jobs/8668923613)|
   |test-r-versions|[![Github Actions](https://github.com/ursacomputing/crossbow/workflows/Crossbow/badge.svg?branch=actions-674657ab73-github-test-r-versions)](https://github.com/ursacomputing/crossbow/actions/runs/4862484628/jobs/8668914044)|
   |test-ubuntu-r-sanitizer|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-674657ab73-azure-test-ubuntu-r-sanitizer)](https://github.com/ursacomputing/crossbow/runs/13175270795)|


[GitHub] [arrow] nealrichardson merged pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson merged PR #35147:
URL: https://github.com/apache/arrow/pull/35147


[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1528830816

   The `r-binary-packages` failures the first time on macOS could be real issues with the bundled build, which was being triggered because we are on a dev version and so now skip autobrew unless FORCE_AUTOBREW is set. On macOS 10.13, snappy had a compilation error. I'm not inclined to care about that. But on macOS 11, the package built but couldn't find absl symbols, which is odd because `-larrow_bundled_dependencies` was included. Hopefully the other LIB_DIR issue was somehow responsible? In any case, if it's a real issue, it will probably reappear somewhere else.


[GitHub] [arrow] jonkeane commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "jonkeane (via GitHub)" <gi...@apache.org>.
jonkeane commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181692315


##########
r/configure:
##########
@@ -17,39 +17,81 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# Anticonf (tm) script by Jeroen Ooms, Jim Hester (2017)
-# License: MIT
+# This script makes sure we have our C++ dependencies in order
+# so that we can compile the bindings in src/ and build the
+# R package's shared library.
 #
-# This script will query 'pkg-config' for the required cflags and ldflags.
-# If pkg-config is unavailable or does not find the library, try setting
-# INCLUDE_DIR and LIB_DIR manually via e.g:
-# R CMD INSTALL --configure-vars='INCLUDE_DIR=/.../include LIB_DIR=/.../lib'
+# The core logic is:
+#
+# * Find libarrow on the system. If it is present, make sure
+#   that its version is compatible with the R package.
+# * If no suitable libarrow is found, download it (where allowed)
+#   or build it from source.
+# * Determine what features this libarrow has and what other
+#   flags it requires, and set them in src/Makevars for use when
+#   compiling the bindings.
+# * Run a test program to confirm that arrow headers are found
+#
+# It is intended to be flexible enough to account for several
+# workflows, and should be expected to always result in a successful
+# build as long as you haven't set environment variables that
+# would limit its options.
+#
+# * Installing a released version from source, as from CRAN, with
+#   no other prior setup
+#   * On macOS, autobrew is used to retrieve libarrow and dependencies
+#   * On Linux, the nixlibs.R build script will download or build
+# * Installing a released version but first installing libarrow.
+#   It will use pkg-config and brew to search for libraries.
+# * Installing a development version from source as a user.
+#   nixlibs.R will be used to build from source. System packages that
+#   are not on the same dev version will be skipped.
+# * TODO(GH-35049): Installing a dev version as an infrequent contributor.
+#   Currently the configure script doesn't offer much to make this easy.
+#   If you expect to rebuild multiple times, you should set up a dev
+#   environment.
+# * Installing a dev version as a regular developer. 
+#   The best way is to maintain your own cmake build and install it
+#   to a directory (not system) that you set as the env var
+#   $ARROW_HOME.  

Review Comment:
   ❤️  to the clarity on system versus not here



##########
r/tools/check-versions.R:
##########
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+args <- commandArgs(TRUE)
+
+# TESTING is set in test-check-version.R; it won't be set when called from configure
+test_mode <- exists("TESTING")
+
+check_versions <- function(r_version, cpp_version) {
+  r_parsed <- package_version(r_version)
+  r_dev_version <- r_parsed[1, 4]
+  r_is_dev <- !is.na(r_dev_version) && r_dev_version > 100
+  cpp_is_dev <- grepl("SNAPSHOT$", cpp_version)
+  cpp_parsed <- package_version(sub("-SNAPSHOT$", "", cpp_version))
+
+  major <- function(x) as.numeric(x[1, 1])
+  # R and C++ denote dev versions differently
+  # R is current.release.9000, C++ is next.release-SNAPSHOT
+  # So a "match" is if the R major version + 1 = C++ major version
+  if (r_is_dev && cpp_is_dev && major(r_parsed) + 1 == major(cpp_parsed)) {
+    msg <- c(
+      sprintf("*** > Packages are both on development versions (%s, %s)", cpp_version, r_version),
+      "*** > If installation fails, rebuild the C++ library to match the R version",
+      "*** > or retry with FORCE_BUNDLED_BUILD=true"
+    )
+    cat(paste0(msg, "\n", collapse = ""))
+  } else if (r_version != cpp_version) {
+    cat(
+      sprintf(
+        "**** Not using found C++ library: version (%s) does not match R package (%s)\n",

Review Comment:
   ```suggestion
           "**** C++ library conflict: found version (%s) does not match R package (%s)\n",
   ```
   
   Something about "Not using found C++" strikes me as a bit off — almost like malicious compliance style or something?
   
   Separately (and this might be too hard), would it be possible to say _where_ the library was found? Either the source (brew or system install) or a path might help folks identify what's up. It should be in an earlier log line, so it's probably not strictly necessary here, but we might already have that envvar available to use.
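
The dev-version matching rule described in the check-versions.R hunk above (R is current.release.9000, C++ is next.release-SNAPSHOT) can also be illustrated outside R. The following is a rough shell sketch of the same three-way outcome; the function name `classify_versions` and its output labels are made up for illustration and are not part of the PR.

```shell
#!/bin/sh
# Hypothetical helper, not part of r/configure: classify an R package
# version against a libarrow version using the rule in check-versions.R.
classify_versions () {
  r_version="$1"
  cpp_version="$2"

  # R dev versions carry a fourth component above 100, e.g. 11.0.0.9000
  r_dev_part=$(echo "$r_version" | cut -d. -f4)
  r_is_dev="false"
  if [ -n "$r_dev_part" ] && [ "$r_dev_part" -gt 100 ]; then
    r_is_dev="true"
  fi

  # C++ dev versions end in SNAPSHOT, e.g. 12.0.0-SNAPSHOT
  case "$cpp_version" in
    *SNAPSHOT) cpp_is_dev="true" ;;
    *) cpp_is_dev="false" ;;
  esac

  r_major=$(echo "$r_version" | cut -d. -f1)
  cpp_major=$(echo "$cpp_version" | cut -d. -f1)

  if [ "$r_is_dev" = "true" ] && [ "$cpp_is_dev" = "true" ] \
     && [ "$((r_major + 1))" -eq "$cpp_major" ]; then
    echo "dev-match"    # both dev, and R major + 1 equals C++ major
  elif [ "$r_version" != "$cpp_version" ]; then
    echo "mismatch"     # the found library would not be used
  else
    echo "match"        # identical release versions
  fi
}
```

For example, `classify_versions 11.0.0.9000 12.0.0-SNAPSHOT` prints `dev-match`, while `classify_versions 12.0.0 11.0.0` prints `mismatch`.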



##########
r/configure:
##########
@@ -58,191 +100,282 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  if brew --prefix openssl >/dev/null 2>&1; then
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  fi
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
-  if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --variable=prefix --silence-errors ${PKG_CONFIG_NAME}`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif brew --prefix ${PKG_BREW_NAME} > /dev/null 2>&1; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   else
-    PKG_CONFIG_AVAILABLE=false
-  fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+    _LIBARROW_FOUND="false"

Review Comment:
   Would it be helpful to have a message here too? Or is one thrown later? I really, really like that all the other branches here have a clear "this is the branch it's on now" message; that's been frustrating to diagnose before.
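
A simplified sketch of what a message in the fall-through branch could look like. This is a hypothetical model of `find_arrow()`, not the merged code: it keeps only the ARROW_HOME branch and the final `else`, omits the pkg-config and Homebrew branches, and the wording of the "not found" message is an assumption. It also shows the `${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}` idiom from the diff, which adds the `:` separator only when the variable was already non-empty.

```shell
#!/bin/sh
# Simplified, hypothetical model of find_arrow(): only the ARROW_HOME
# branch and the fall-through branch are kept.
find_arrow_sketch () {
  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
    _LIBARROW_FOUND="${ARROW_HOME}"
    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
    # Prepend to PKG_CONFIG_PATH; the :+ expansion emits ":<old value>"
    # only if PKG_CONFIG_PATH is set and non-empty, avoiding a stray ":".
    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
  else
    _LIBARROW_FOUND="false"
    # Assumed wording for the message the review comment asks about
    echo "*** Arrow C++ not found in any searched location"
  fi
}
```

With a message in every branch, the install log always records which path the script took, including the case where nothing was found.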



##########
r/configure:
##########
@@ -17,39 +17,81 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# Anticonf (tm) script by Jeroen Ooms, Jim Hester (2017)
-# License: MIT
+# This script makes sure we have our C++ dependencies in order
+# so that we can compile the bindings in src/ and build the
+# R package's shared library.
 #
-# This script will query 'pkg-config' for the required cflags and ldflags.
-# If pkg-config is unavailable or does not find the library, try setting
-# INCLUDE_DIR and LIB_DIR manually via e.g:
-# R CMD INSTALL --configure-vars='INCLUDE_DIR=/.../include LIB_DIR=/.../lib'
+# The core logic is:
+#
+# * Find libarrow on the system. If it is present, make sure
+#   that its version is compatible with the R package.
+# * If no suitable libarrow is found, download it (where allowed)
+#   or build it from source.
+# * Determine what features this libarrow has and what other
+#   flags it requires, and set them in src/Makevars for use when
+#   compiling the bindings.
+# * Run a test program to confirm that arrow headers are found
+#
+# It is intended to be flexible enough to account for several
+# workflows, and should be expected to always result in a successful
+# build as long as you haven't set environment variables that
+# would limit its options.
+#
+# * Installing a released version from source, as from CRAN, with
+#   no other prior setup
+#   * On macOS, autobrew is used to retrieve libarrow and dependencies
+#   * On Linux, the nixlibs.R build script will download or build
+# * Installing a released version but first installing libarrow.
+#   It will use pkg-config and brew to search for libraries.
+# * Installing a development version from source as a user.
+#   nixlibs.R will be used to build from source. System packages that
+#   are not on the same dev version will be skipped.
+# * TODO(GH-35049): Installing a dev version as an infrequent contributor.
+#   Currently the configure script doesn't offer much to make this easy.
+#   If you expect to rebuild multiple times, you should set up a dev
+#   environment.

Review Comment:
   IMO, this is probably fine. I think we've maintained a lot to try and make this easy, but it never ends up quite succeeding.



##########
r/configure:
##########
@@ -17,39 +17,81 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# Anticonf (tm) script by Jeroen Ooms, Jim Hester (2017)
-# License: MIT
+# This script makes sure we have our C++ dependencies in order
+# so that we can compile the bindings in src/ and build the
+# R package's shared library.
 #
-# This script will query 'pkg-config' for the required cflags and ldflags.
-# If pkg-config is unavailable or does not find the library, try setting
-# INCLUDE_DIR and LIB_DIR manually via e.g:
-# R CMD INSTALL --configure-vars='INCLUDE_DIR=/.../include LIB_DIR=/.../lib'
+# The core logic is:
+#
+# * Find libarrow on the system. If it is present, make sure
+#   that its version is compatible with the R package.
+# * If no suitable libarrow is found, download it (where allowed)
+#   or build it from source.
+# * Determine what features this libarrow has and what other
+#   flags it requires, and set them in src/Makevars for use when
+#   compiling the bindings.
+# * Run a test program to confirm that arrow headers are found
+#
+# It is intended to be flexible enough to account for several
+# workflows, and should be expected to always result in a successful
+# build as long as you haven't set environment variables that
+# would limit its options.
+#
+# * Installing a released version from source, as from CRAN, with
+#   no other prior setup
+#   * On macOS, autobrew is used to retrieve libarrow and dependencies
+#   * On Linux, the nixlibs.R build script will download or build
+# * Installing a released version but first installing libarrow.
+#   It will use pkg-config and brew to search for libraries.
+# * Installing a development version from source as a user.
+#   nixlibs.R will be used to build from source. System packages that
+#   are not on the same dev version will be skipped.

Review Comment:
   Does "skipped" mean that the features they are needed for will be turned off, or that they will be built bundled later? Clarifying that here might be nice.



##########
r/configure:
##########
@@ -58,191 +100,282 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  if brew --prefix openssl >/dev/null 2>&1; then
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  fi
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
-  if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --variable=prefix --silence-errors ${PKG_CONFIG_NAME}`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif brew --prefix ${PKG_BREW_NAME} > /dev/null 2>&1; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   else
-    PKG_CONFIG_AVAILABLE=false
-  fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion ${PKG_CONFIG_NAME}`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    if ! ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    export PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # These would be identified by pkg-config, in Requires.private and Libs.private.
+      # Rather than try to re-implement pkg-config, we can just hard-code them here.
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    if ! curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+      # Fall back to the local copy
+      cp tools/autobrew .
     fi
   fi
-fi
+  if ! . autobrew; then
+    echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
+  fi
+  # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+  # TODO: move PKG_LIBS and PKG_CFLAGS out of autobrew and use set_pkg_vars
+}
 
-# If on Raspberry Pi, need to manually link against latomic
-# See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
-if grep raspbian /etc/os-release >/dev/null 2>&1; then
-  PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
-  PKG_LIBS="-latomic $PKG_LIBS"
+# Once libarrow is obtained, this function sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+# either from pkg-config or by inferring things about the directory in $1
+set_pkg_vars () {
+  LIB_DIR="$1/lib"
+  if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+    set_pkg_vars_with_pc
+  else
+    set_pkg_vars_without_pc $1
+  fi
+
+  # Set any user-defined CXXFLAGS
+  if [ "$ARROW_R_CXXFLAGS" ]; then
+    PKG_CFLAGS="$PKG_CFLAGS $ARROW_R_CXXFLAGS"
+  fi
+
+  # Finally, check cmake options for enabled features
+  add_feature_flags
+}
+
+# If we have pkg-config, it will tell us what libarrow needs
+set_pkg_vars_with_pc () {
+  PKG_CFLAGS="`${PKG_CONFIG} --cflags --silence-errors ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
+  PKG_LIBS=`${PKG_CONFIG} --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
+  PKG_DIRS=`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+}
+
+# If we don't have pkg-config, we can make some inferences
+set_pkg_vars_without_pc () {
+  PKG_CFLAGS="-I$1/include $PKG_CFLAGS"
+  if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
+    PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
+  fi
+  PKG_DIRS="-L${LIB_DIR}"
+  if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
+    PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
+  fi
+  PKG_LIBS="-larrow"
+  if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
+    PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
+  fi
+
+  # If on Raspberry Pi, need to manually link against latomic
+  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
+  # pkg-config will handle this for us automatically, see ARROW-6312
+  if grep raspbian /etc/os-release >/dev/null 2>&1; then
+    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
+    PKG_LIBS="-latomic $PKG_LIBS"
+  fi
+}
+
+add_feature_flags () {
+  # Now we need to check what features it was built with and enable
+  # the corresponding feature flags in the R bindings (-DARROW_R_WITH_stuff).
+  # We do this by inspecting ArrowOptions.cmake, which the libarrow build
+  # generates.
+  ARROW_OPTS_CMAKE="$LIB_DIR/cmake/Arrow/ArrowOptions.cmake"
+  if [ ! -f "${ARROW_OPTS_CMAKE}" ]; then
+    echo "*** $ARROW_OPTS_CMAKE not found; some features will not be enabled"
+  else
+    if arrow_built_with ARROW_PARQUET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_PARQUET"
+      PKG_LIBS="-lparquet $PKG_LIBS"
+      # NOTE: parquet is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_DATASET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_DATASET"
+      PKG_LIBS="-larrow_dataset $PKG_LIBS"
+      # NOTE: arrow-dataset is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_ACERO; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_ACERO"
+      PKG_LIBS="-larrow_acero $PKG_LIBS"
+      # NOTE: arrow-acero is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_SUBSTRAIT; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_SUBSTRAIT"
+      PKG_LIBS="-larrow_substrait $PKG_LIBS"
+      # NOTE: arrow-substrait is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_JSON; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_JSON"
+    fi
+    if arrow_built_with ARROW_S3; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_S3"
+      PKG_LIBS="$PKG_LIBS $S3_LIBS"
+    fi
+    if arrow_built_with ARROW_GCS; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_GCS"
+      PKG_LIBS="$PKG_LIBS $GCS_LIBS"
+    fi
+  fi
+}
+
+arrow_built_with() {
+  # Function to check cmake options for features
+  grep -i 'set('"$1"' "ON")' $ARROW_OPTS_CMAKE >/dev/null 2>&1
+}
+
+##############
+# Main logic #
+##############

Review Comment:
   This is a little pedantic, but I wonder if it would be more straightforward to have the main logic up top and the functions down below (much as you define `arrow_built_with` after the function that uses it)?
   
   I'm not strongly attached to the idea, but putting the operative code first and the supporting pieces below it makes the script nicer to read.
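
One way the suggested layout could work in a POSIX shell script is sketched below. A shell function body is only resolved when the function is called, so the main logic can sit at the top of the file if it is wrapped in a function that is invoked on the last line. The names `main`, `find_library`, `build_bundled`, and `_FOUND` are hypothetical stand-ins for illustration, not the functions in this PR.

```shell
#!/bin/sh
# Sketch only: all function and variable names here are hypothetical.

main () {
  # The operative logic comes first in the file...
  find_library
  if [ "$_FOUND" = "false" ]; then
    build_bundled
  fi
  echo "using: $_FOUND"
}

# ...and the supporting functions follow.
find_library () {
  # Pretend that no suitable library is installed on the system.
  _FOUND="false"
}

build_bundled () {
  _FOUND="bundled"
}

# Nothing above has run yet. By the time main is called here,
# every function in the file has been defined.
main "$@"
```

The trade-off is one extra level of indentation for the main logic, in exchange for being able to read the script top-down.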



##########
r/tools/check-versions.R:
##########
@@ -0,0 +1,56 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+args <- commandArgs(TRUE)
+
+# TESTING is set in test-check-version.R; it won't be set when called from configure
+test_mode <- exists("TESTING")
+
+check_versions <- function(r_version, cpp_version) {
+  r_parsed <- package_version(r_version)
+  r_dev_version <- r_parsed[1, 4]
+  r_is_dev <- !is.na(r_dev_version) && r_dev_version > 100
+  cpp_is_dev <- grepl("SNAPSHOT$", cpp_version)
+  cpp_parsed <- package_version(sub("-SNAPSHOT$", "", cpp_version))
+
+  major <- function(x) as.numeric(x[1, 1])
+  # R and C++ denote dev versions differently
+  # R is current.release.9000, C++ is next.release-SNAPSHOT
+  # So a "match" is if the R major version + 1 = C++ major version
+  if (r_is_dev && cpp_is_dev && major(r_parsed) + 1 == major(cpp_parsed)) {
+    msg <- c(
+      sprintf("*** > Packages are both on development versions (%s, %s)", cpp_version, r_version),
+      "*** > If installation fails, rebuild the C++ library to match the R version",
+      "*** > or retry with FORCE_BUNDLED_BUILD=true"
+    )
+    cat(paste0(msg, "\n", collapse = ""))
+  } else if (r_version != cpp_version) {
+    cat(
+      sprintf(
+        "**** Not using found C++ library: version (%s) does not match R package (%s)\n",
+        cpp_version,
+        r_version
+      )
+    )
+    stop("version mismatch")
+    # Add ALLOW_VERSION_MISMATCH env var to override stop()? (Could be useful for debugging)

Review Comment:
   🙏 Not necessary to ship this, of course, but I will add a 🗳️ in favor of this: I have needed (or at least wanted) to do exactly this while debugging in the past.
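
For illustration, an override like the one floated in the code comment might look roughly like this if it were handled on the shell side. This is a hedged sketch, not what the PR does: the real check lives in `tools/check-versions.R`, and the functions `versions_match` and `accept_found_library`, along with any handling of `ALLOW_VERSION_MISMATCH`, are assumptions made up for this example.

```shell
#!/bin/sh
# Hypothetical sketch: neither these functions nor any handling of
# ALLOW_VERSION_MISMATCH exists in the PR as reviewed.

# Stand-in for the real version check: succeeds only on an exact match.
versions_match () {
  [ "$1" = "$2" ]
}

# Returns 0 if the found library may be used, 1 otherwise.
# $1 is the R package version, $2 is the C++ library version.
accept_found_library () {
  if versions_match "$1" "$2"; then
    return 0
  fi
  if [ "$ALLOW_VERSION_MISMATCH" = "true" ]; then
    # Debugging escape hatch: report the mismatch but carry on.
    echo "**** Version mismatch ($1 vs. $2) ignored: ALLOW_VERSION_MISMATCH=true"
    return 0
  fi
  return 1
}
```

With this in place, `ALLOW_VERSION_MISMATCH=true` in the environment would let a mismatched library through with a visible warning, while the default behavior (reject on mismatch) stays unchanged.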



##########
r/configure:
##########
@@ -58,191 +100,282 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  if brew --prefix openssl >/dev/null 2>&1; then
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  fi
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
-  if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --variable=prefix --silence-errors ${PKG_CONFIG_NAME}`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif brew --prefix ${PKG_BREW_NAME} > /dev/null 2>&1; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   else
-    PKG_CONFIG_AVAILABLE=false
-  fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion ${PKG_CONFIG_NAME}`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    if ! ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    export PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # These would be identified by pkg-config, in Requires.private and Libs.private.
+      # Rather than try to re-implement pkg-config, we can just hard-code them here.
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    if ! curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+      # Fall back to the local copy
+      cp tools/autobrew .
     fi
   fi
-fi
+  if ! . autobrew; then
+    echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
+  fi
+  # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+  # TODO: move PKG_LIBS and PKG_CFLAGS out of autobrew and use set_pkg_vars
+}
 
-# If on Raspberry Pi, need to manually link against latomic
-# See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
-if grep raspbian /etc/os-release >/dev/null 2>&1; then
-  PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
-  PKG_LIBS="-latomic $PKG_LIBS"
+# Once libarrow is obtained, this function sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+# either from pkg-config or by inferring things about the directory in $1
+set_pkg_vars () {
+  LIB_DIR="$1/lib"
+  if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+    set_pkg_vars_with_pc
+  else
+    set_pkg_vars_without_pc $1
+  fi
+
+  # Set any user-defined CXXFLAGS
+  if [ "$ARROW_R_CXXFLAGS" ]; then
+    PKG_CFLAGS="$PKG_CFLAGS $ARROW_R_CXXFLAGS"
+  fi
+
+  # Finally, check cmake options for enabled features
+  add_feature_flags
+}
+
+# If we have pkg-config, it will tell us what libarrow needs
+set_pkg_vars_with_pc () {
+  PKG_CFLAGS="`${PKG_CONFIG} --cflags --silence-errors ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
+  PKG_LIBS=`${PKG_CONFIG} --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
+  PKG_DIRS=`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+}
+
+# If we don't have pkg-config, we can make some inferences
+set_pkg_vars_without_pc () {
+  PKG_CFLAGS="-I$1/include $PKG_CFLAGS"
+  if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
+    PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
+  fi
+  PKG_DIRS="-L${LIB_DIR}"
+  if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
+    PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
+  fi
+  PKG_LIBS="-larrow"
+  if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
+    PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
+  fi
+
+  # If on Raspberry Pi, need to manually link against latomic
+  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
+  # pkg-config will handle this for us automatically, see ARROW-6312
+  if grep raspbian /etc/os-release >/dev/null 2>&1; then
+    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
+    PKG_LIBS="-latomic $PKG_LIBS"
+  fi
+}
+
+add_feature_flags () {
+  # Now we need to check what features it was built with and enable
+  # the corresponding feature flags in the R bindings (-DARROW_R_WITH_stuff).
+  # We do this by inspecting ArrowOptions.cmake, which the libarrow build
+  # generates.
+  ARROW_OPTS_CMAKE="$LIB_DIR/cmake/Arrow/ArrowOptions.cmake"
+  if [ ! -f "${ARROW_OPTS_CMAKE}" ]; then
+    echo "*** $ARROW_OPTS_CMAKE not found; some features will not be enabled"
+  else
+    if arrow_built_with ARROW_PARQUET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_PARQUET"
+      PKG_LIBS="-lparquet $PKG_LIBS"
+      # NOTE: parquet is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_DATASET; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_DATASET"
+      PKG_LIBS="-larrow_dataset $PKG_LIBS"
+      # NOTE: arrow-dataset is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_ACERO; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_ACERO"
+      PKG_LIBS="-larrow_acero $PKG_LIBS"
+      # NOTE: arrow-acero is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_SUBSTRAIT; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_SUBSTRAIT"
+      PKG_LIBS="-larrow_substrait $PKG_LIBS"
+      # NOTE: arrow-substrait is assumed to have the same -L flag as arrow
+      # so there is no need to add its location to PKG_DIRS
+    fi
+    if arrow_built_with ARROW_JSON; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_JSON"
+    fi
+    if arrow_built_with ARROW_S3; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_S3"
+      PKG_LIBS="$PKG_LIBS $S3_LIBS"
+    fi
+    if arrow_built_with ARROW_GCS; then
+      PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_GCS"
+      PKG_LIBS="$PKG_LIBS $GCS_LIBS"
+    fi
+  fi
+}
+
+arrow_built_with() {
+  # Function to check cmake options for features
+  grep -i 'set('"$1"' "ON")' $ARROW_OPTS_CMAKE >/dev/null 2>&1
+}
+
+##############
+# Main logic #
+##############
+
+# First, find or build libarrow
+if [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
+  do_bundled_build
+elif [ "$FORCE_AUTOBREW" = "true" ]; then
+  do_autobrew
+else
+  find_arrow
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # If we haven't found a suitable version of libarrow, build it
+    if [ "$UNAME" = "Darwin" ] && ! echo $VERSION | grep -q "000"; then
+      # Only autobrew on release version (for testing, use FORCE_AUTOBREW above)
+      # (dev versions end in .9000, and nightly gets something like .10000xxx)
+      do_autobrew
+    else
+      do_bundled_build
+    fi
+  fi

Review Comment:
   Is it overkill to call out this order for the envvars in the top-matter of this file? That is, `FORCE_BUNDLED_BUILD` takes precedence over `FORCE_AUTOBREW`, which in turn takes precedence over the natural order of things.
   
   To a certain extent it's "obvious" once you know in detail what those things do, but not everyone has that in their heads.
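
   For illustration, the precedence could be spelled out in a few lines of shell. This is only a sketch, not code from the PR: `describe_libarrow_source` is a hypothetical helper, and it just mirrors the `if`/`elif`/`else` order of the main logic in the diff above.

   ```shell
   # Hypothetical helper (not in r/configure): reports which path the main
   # logic would take, given the two FORCE_* environment variables.
   describe_libarrow_source () {
     if [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
       # Checked first, so it wins even if FORCE_AUTOBREW is also "true"
       echo "bundled build"
     elif [ "$FORCE_AUTOBREW" = "true" ]; then
       echo "autobrew"
     else
       # The "natural order": look for an existing libarrow, then fall back
       echo "find_arrow, then autobrew or bundled build"
     fi
   }
   ```

   A comment in the top-matter stating the same three cases would probably be enough.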





[GitHub] [arrow] nealrichardson commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181982269


##########
r/configure:
##########
@@ -58,191 +100,282 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  if brew --prefix openssl >/dev/null 2>&1; then
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  fi
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
-  if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --variable=prefix --silence-errors ${PKG_CONFIG_NAME}`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif brew --prefix ${PKG_BREW_NAME} > /dev/null 2>&1; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   else
-    PKG_CONFIG_AVAILABLE=false
-  fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+    _LIBARROW_FOUND="false"

Review Comment:
   If you get here, the next message you'll see is something about building libarrow, either with autobrew or something that says which directory the C++ source has been found in. I think it will be clear enough where we are in the flow, but I don't feel strongly about it.





[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1529102218

   I checked a handful of the builds and they all showed the correct features enabled. We could update https://github.com/apache/arrow/blob/main/dev/tasks/macros.jinja#L278 and perhaps other places to ensure that features other than just parquet are enabled, like what I added to the conda packaging. But I'll wait for review before poking this further.
   
   The segfault in the pyarrow interaction (this time on flight tests) is happening a lot on the `R / AMD64 Ubuntu 20.04 R 4.2 Force-Tests true` job, but it is also happening on `main`, so I don't think it is related to this PR.




[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531472870

   > I did get a build error after unsetting ARROW_HOME (see below); however, it worked beautifully on MacOS x86! I don't think my M1 problem should block this...this rewrite makes it much easier to fix if it does turn out to be a script issue.
   > 
   > I'll try on Windows and on a few suitable Ubuntu docker images tomorrow.
   > 
   > ```
   > unable to load shared object '/Users/deweydunnington/Library/R/arm64/4.2/library/00LOCK-r/00new/arrow/libs/arrow.so':
   >   dlopen(/Users/deweydunnington/Library/R/arm64/4.2/library/00LOCK-r/00new/arrow/libs/arrow.so, 0x0006): symbol not found in flat namespace (__ZN4absl12lts_2023012510FormatTimeENSt3__117basic_string_viewIcNS1_11char_traitsIcEEEENS0_4TimeENS0_8TimeZoneE)
   > Error: loading failed
   > Execution halted
   > ERROR: loading failed
   > * removing ‘/Users/deweydunnington/Library/R/arm64/4.2/library/arrow’
   > * restoring previous ‘/Users/deweydunnington/Library/R/arm64/4.2/library/arrow’
   > 
   > Exited with status 1.
   > ```
   
   That looks similar to the build failure I saw [here](https://github.com/ursacomputing/crossbow/actions/runs/4833911390/jobs/8614891010), where it appears that we are bundling absl because there is a (newer) version installed by homebrew but we require an exact version match (because absl is very unstable, it seems): https://github.com/ursacomputing/crossbow/actions/runs/4833911390/jobs/8614891010#step:13:416
   
   I don't understand well enough how the linker resolves things to understand how we would insist that we prefer the static symbols in `-larrow_bundled_dependencies` over shared libraries found in `-L/opt/homebrew/lib`. @kou would reordering the `-L` and `-l`s help? In that build, [PKG_LIBS](https://github.com/ursacomputing/crossbow/actions/runs/4833911390/jobs/8614891010#step:13:1638) is `-L/Users/voltrondata/tmp/RtmpQAJbFi/R.INSTALLca181e5538f5/arrow/libarrow/arrow-11.0.0.100000506/lib -L/opt/homebrew/lib -larrow /opt/homebrew/lib/libsnappy.1.1.10.dylib /opt/homebrew/lib/libre2.10.0.0.dylib /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/lib/libbz2.tbd /opt/homebrew/lib/libaws-cpp-sdk-config.dylib /opt/homebrew/lib/libaws-cpp-sdk-transfer.dylib /opt/homebrew/lib/libaws-cpp-sdk-identity-management.dylib /opt/homebrew/lib/libaws-cpp-sdk-cognito-identity.dylib /opt/homebrew/lib/libaws-cpp-sdk-sts.dylib /opt/homebrew/lib/libaws-cpp-sdk-s3.dylib /opt/homebrew/lib/libaws-cpp-sdk-core.dylib -larrow_bundled_dependencies -R/opt/homebrew/lib -lbrotlidec -R/opt/homebrew/lib -lbrotlienc -lthrift -lz -llz4 -lzstd -lutf8proc`
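
   To make the ordering in a long flags string like that easier to read, something like the following could help. It is a rough sketch: `show_link_order` is a hypothetical helper that is not part of `r/configure`, and the sample string is a shortened stand-in for the real `PKG_LIBS`.

   ```shell
   # Hypothetical helper: print each whitespace-separated flag with its
   # position, so it is easy to see what comes before and after a library.
   show_link_order () {
     i=0
     # $1 is intentionally unquoted so the shell splits it into flags
     for flag in $1; do
       i=$((i + 1))
       echo "$i $flag"
     done
   }

   show_link_order "-L/opt/homebrew/lib -larrow -larrow_bundled_dependencies -lz"
   ```

   This only shows the order; it does not say anything about which copy of a symbol the linker will end up choosing.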
   
   In any case, I doubt this is a regression introduced in this PR but I could be wrong.




[GitHub] [arrow] kou commented on a diff in pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #35147:
URL: https://github.com/apache/arrow/pull/35147#discussion_r1181987458


##########
r/configure:
##########
@@ -58,191 +100,282 @@ if [ ! "`${R_HOME}/bin/R CMD config CXX17`" ]; then
   exit 1
 fi
 
-if [ -f "tools/apache-arrow.rb" ]; then
-  # If you want to use a local apache-arrow.rb formula, do
-  # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
-  # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository)
-  cp tools/autobrew .
-  if [ "$FORCE_AUTOBREW" != "false" ]; then
-    # It is possible to turn off forced autobrew if the formula is included,
-    # but most likely you shouldn't because the included formula will reference
-    # the C++ library at the version that matches the R package.
-    FORCE_AUTOBREW="true"
-  fi
+# Test if pkg-config is available to use
+if ${PKG_CONFIG} --version >/dev/null 2>&1; then
+  PKG_CONFIG_AVAILABLE="true"
+else
+  PKG_CONFIG_AVAILABLE="false"
+  ARROW_USE_PKG_CONFIG="false"
 fi
 
-if [ "$FORCE_AUTOBREW" = "true" ] || [ "$FORCE_BUNDLED_BUILD" = "true" ]; then
-  ARROW_USE_PKG_CONFIG="false"
+# find openssl on macos. macOS ships with libressl. openssl is installable
+# with brew, but it is generally not linked. We can over-ride this and find
+# openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
+# the installation process). FWIW, arrow's cmake process uses this
+# same process to find openssl, but doing it now allows us to catch it in
+# nixlibs.R and throw a nicer error.
+if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ] && [ "`command -v brew`" ]; then
+  if brew --prefix openssl >/dev/null 2>&1; then
+    export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
+    export PKG_CONFIG_PATH="${OPENSSL_ROOT_DIR}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  fi
 fi
 
-S3_LIBS=""
-GCS_LIBS=""
-# Note that cflags may be empty in case of success
-if [ "$ARROW_HOME" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-  echo "*** Using ARROW_HOME as the source of libarrow"
-  PKG_CFLAGS="-I$ARROW_HOME/include $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  LIB_DIR="$ARROW_HOME/lib"
-  PKG_DIRS="-L$LIB_DIR"
-elif [ "$INCLUDE_DIR" ] && [ "$LIB_DIR" ]; then
-  echo "*** Using INCLUDE_DIR/LIB_DIR as the source of libarrow"
-  PKG_CFLAGS="-I$INCLUDE_DIR $PKG_CFLAGS"
-  PKG_LIBS="-larrow"
-  PKG_DIRS="-L$LIB_DIR"
-else
-  # Use pkg-config to find libarrow if available and allowed
-  pkg-config --version >/dev/null 2>&1
-  if [ $? -eq 0 ]; then
-    PKG_CONFIG_AVAILABLE=true
+#############
+# Functions #
+#############
+
+# This function looks in a few places for libarrow on the system already.
+# If the found library version is not compatible with the R package,
+# it won't be used.
+find_arrow () {
+  # Preserve original PKG_CONFIG_PATH. We'll add ${LIB_DIR}/pkgconfig to it if needed
+  OLD_PKG_CONFIG_PATH="${PKG_CONFIG_PATH}"
+
+  if [ "$ARROW_HOME" ] && [ -d "$ARROW_HOME" ]; then
+    # 1. ARROW_HOME is a directory you've built and installed libarrow into.
+    #    If the env var is set, we use it
+    _LIBARROW_FOUND="${ARROW_HOME}"
+    echo "*** Trying Arrow C++ in ARROW_HOME: $_LIBARROW_FOUND"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+  elif [ "$ARROW_USE_PKG_CONFIG" != "false" ] && ${PKG_CONFIG} ${PKG_CONFIG_NAME}; then
+    # 2. Use pkg-config to find arrow on the system
+    _LIBARROW_FOUND="`${PKG_CONFIG} --variable=prefix --silence-errors ${PKG_CONFIG_NAME}`"
+    echo "*** Trying Arrow C++ found by pkg-config: $_LIBARROW_FOUND"
+  elif brew --prefix ${PKG_BREW_NAME} > /dev/null 2>&1; then
+    # 3. On macOS, look for Homebrew apache-arrow
+    #    (note that if you have pkg-config, homebrew arrow may have already been found)
+    _LIBARROW_FOUND=`brew --prefix ${PKG_BREW_NAME}`
+    echo "*** Trying Arrow C++ found by Homebrew: ${_LIBARROW_FOUND}"
+    export PKG_CONFIG_PATH="${_LIBARROW_FOUND}/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
   else
-    PKG_CONFIG_AVAILABLE=false
-  fi
-  if [ "$PKG_CONFIG_AVAILABLE" = "true" ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then
-    # Set the search paths and compile flags
-    PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_LIBS=`pkg-config --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
-    PKGCONFIG_DIRS=`pkg-config --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+    _LIBARROW_FOUND="false"
   fi
 
-  if [ "$PKGCONFIG_LIBS" != "" ]; then
-    FOUND_LIB_DIR=`echo $PKGCONFIG_DIRS | sed -e 's/^-L//'`
-    echo "*** Arrow C++ libraries found via pkg-config at $FOUND_LIB_DIR"
-    PKG_CFLAGS="$PKGCONFIG_CFLAGS $PKG_CFLAGS"
-    PKG_LIBS="${PKGCONFIG_LIBS}"
-    PKG_DIRS="${PKGCONFIG_DIRS}"
-    LIB_DIR=${FOUND_LIB_DIR}
-
-    # Check for version mismatch
-    PC_LIB_VERSION=`pkg-config --modversion arrow`
-    echo $PC_LIB_VERSION | grep -e 'SNAPSHOT$' >/dev/null 2>&1
-    # If on a release (i.e. not SNAPSHOT) and version != R package version, warn
-    if [ $? -eq 1 ] && [ "$PC_LIB_VERSION" != "$VERSION" ]; then
-      echo "**** Warning: library version mismatch"
-      echo "**** C++ is $PC_LIB_VERSION but R is $VERSION"
-      echo "**** If installation fails, upgrade the C++ library to match"
-      echo "**** or retry with ARROW_USE_PKG_CONFIG=false"
+  if [ "$_LIBARROW_FOUND" != "false" ]; then
+    # We found a library, so check for version mismatch
+    if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+      PC_LIB_VERSION=`${PKG_CONFIG} --modversion ${PKG_CONFIG_NAME}`
+    else
+      PC_LIB_VERSION=`grep '^Version' ${_LIBARROW_FOUND}/lib/pkgconfig/arrow.pc | sed s/Version:\ //`
     fi
-  else
-    if [ "$UNAME" = "Darwin" ] && [ "$FORCE_BUNDLED_BUILD" != "true" ]; then
-      if [ "$FORCE_AUTOBREW" != "true" ] && [ "`command -v brew`" ] && [ "`brew ls --versions ${PKG_BREW_NAME}`" != "" ]; then
-        echo "*** Using Homebrew ${PKG_BREW_NAME}"
-        BREWDIR=`brew --prefix`
-        PKG_LIBS="-larrow -larrow_bundled_dependencies"
-        PKG_DIRS="-L$BREWDIR/opt/$PKG_BREW_NAME/lib $PKG_DIRS"
-        PKG_CFLAGS="-I$BREWDIR/opt/$PKG_BREW_NAME/include $PKG_CFLAGS"
+    # This is in an R script for convenience and testability. 
+    # Success means the found C++ library is ok to use.
+    # Error means the versions don't line up and we shouldn't use it.
+    # More specific messaging to the user is in the R script
+    if ! ${R_HOME}/bin/Rscript tools/check-versions.R $VERSION $PC_LIB_VERSION 2> /dev/null; then
+      _LIBARROW_FOUND="false"
+    fi
+  fi
+
+  if [ "$_LIBARROW_FOUND" = "false" ]; then
+    # We didn't find a suitable library, so reset the pkg-config search path
+    export PKG_CONFIG_PATH="${OLD_PKG_CONFIG_PATH}"
+  fi
+}
+
+do_bundled_build () {
+  ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
+
+  # Handle a few special cases, using what we know about the bundled build
+  # and our ability to make edits to it since we "own" it.
+  _LIBARROW_FOUND="`pwd`/libarrow/arrow-${VERSION}"
+  LIB_DIR="${_LIBARROW_FOUND}/lib"
+  if [ -d "$LIB_DIR" ]; then
+    if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
+      # Use pkg-config to do static linking of libarrow's dependencies
+      export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
+      # pkg-config on CentOS 7 doesn't have --define-prefix option.
+      if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
+        # --define-prefix is for binary packages. Binary packages
+        # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
+        # match the extracted path. --define-prefix uses a directory
+        # that arrow.pc exists as its prefix instead of
+        # "/arrow/r/libarrow/dist".
+        PKG_CONFIG="${PKG_CONFIG} --define-prefix"
       else
-        echo "*** Downloading ${PKG_BREW_NAME}"
-        if [ -f "autobrew" ]; then
-          echo "**** Using local manifest for ${PKG_BREW_NAME}"
-        else
-          curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew
-          if [ $? -ne 0 ]; then
-            echo "Failed to download manifest for ${PKG_BREW_NAME}"
-          fi
-        fi
-        . autobrew
-        if [ $? -ne 0 ]; then
-          echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
-        fi
-        # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+        # Rewrite prefix= in arrow.pc on CentOS 7.
+        sed \
+          -i.bak \
+          -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
+          ${LIB_DIR}/pkgconfig/*.pc
+        rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
       fi
     else
-      if [ "${NOT_CRAN}" = "true" ]; then
-        # Set some default values
-        if [ "${LIBARROW_BINARY}" = "" ]; then
-          export LIBARROW_BINARY=true
-        fi
-        if [ "${LIBARROW_MINIMAL}" = "" ]; then
-          export LIBARROW_MINIMAL=false
-        fi
-      fi
+      # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
+      # These would be identified by pkg-config, in Requires.private and Libs.private.
+      # Rather than try to re-implement pkg-config, we can just hard-code them here.
+      S3_LIBS="-lcurl -lssl -lcrypto"
+      GCS_LIBS="-lcurl -lssl -lcrypto"
+    fi
+  else
+    # If the library directory does not exist, the script must not have been successful
+    _LIBARROW_FOUND="false"
+  fi
+}
 
-      # find openssl on macos. macOS ships with libressl. openssl is installable
-      # with brew, but it is generally not linked. We can over-ride this and find
-      # openssl but setting OPENSSL_ROOT_DIR (which cmake will pick up later in
-      # the installation process). FWIW, arrow's cmake process uses this
-      # same process to find openssl, but doing it now allows us to catch it in
-      # nixlibs.R and throw a nicer error.
-      if [ "$UNAME" = "Darwin" ] && [ "${OPENSSL_ROOT_DIR}" = "" ]; then
-        brew --prefix openssl >/dev/null 2>&1
-        if [ $? -eq 0 ]; then
-          export OPENSSL_ROOT_DIR="`brew --prefix openssl`"
-          export PKG_CONFIG_PATH="`brew --prefix openssl`/lib/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-        fi
-      fi
+do_autobrew () {
+  echo "*** Downloading ${PKG_BREW_NAME}"
 
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
-        export ARROW_DEPENDENCY_SOURCE=AUTO
-      fi
-      if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ] && \
-           [ "${PKG_CONFIG_AVAILABLE}" = "false" ]; then
-        export ARROW_DEPENDENCY_SOURCE=BUNDLED
-        echo "**** pkg-config not installed, setting ARROW_DEPENDENCY_SOURCE=BUNDLED"
-      fi
+  # Setup for local autobrew testing
+  if [ -f "tools/apache-arrow.rb" ]; then
+    # If you want to use a local apache-arrow.rb formula, do
+    # $ cp ../dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb tools/apache-arrow.rb
+    # before R CMD build or INSTALL (assuming a local checkout of the apache/arrow repository).
+    # If you have this, you should use the local autobrew script so they match.
+    cp tools/autobrew .
+  fi
 
-      ${R_HOME}/bin/Rscript tools/nixlibs.R $VERSION
-
-      LIB_DIR="`pwd`/libarrow/arrow-${VERSION}/lib"
-      if [ -d "$LIB_DIR" ]; then
-        if [ "${PKG_CONFIG_AVAILABLE}" = "true" ]; then
-          # Use pkg-config to do static linking of libarrow's dependencies
-          export PKG_CONFIG_PATH="${LIB_DIR}/pkgconfig${PKG_CONFIG_PATH:+:${PKG_CONFIG_PATH}}"
-          PKG_CONFIG="pkg-config"
-          # pkg-config on CentOS 7 doesn't have --define-prefix option.
-          if ${PKG_CONFIG} --help | grep -- --define-prefix >/dev/null 2>&1; then
-            # --define-prefix is for binary packages. Binary packages
-            # uses "/arrow/r/libarrow/dist" as prefix but it doesn't
-            # match the extracted path. --define-prefix uses a directory
-            # that arrow.pc exists as its prefix instead of
-            # "/arrow/r/libarrow/dist".
-            PKG_CONFIG="${PKG_CONFIG} --define-prefix"
-          else
-            # Rewrite prefix= in arrow.pc on CentOS 7.
-            sed \
-              -i.bak \
-              -e "s,prefix=/arrow/r/libarrow/dist,prefix=${LIB_DIR}/..,g" \
-              ${LIB_DIR}/pkgconfig/*.pc
-            rm -f ${LIB_DIR}/pkgconfig/*.pc.bak
-          fi
-          PKG_CONFIG="${PKG_CONFIG} --silence-errors"
-          PKG_CFLAGS="`${PKG_CONFIG} --cflags ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
-          PKG_DIRS="`${PKG_CONFIG} --libs-only-L ${PKG_CONFIG_NAME}`"
-          PKG_LIBS="`${PKG_CONFIG} --libs-only-l --libs-only-other ${PKG_CONFIG_NAME}`"
-        else
-          # This case must be ARROW_DEPENDENCY_SOURCE=BUNDLED.
-          PKG_CFLAGS="-I${LIB_DIR}/../include $PKG_CFLAGS"
-          if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
-            PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
-          fi
-          PKG_DIRS="-L${LIB_DIR}"
-          if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
-            PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
-          fi
-          PKG_LIBS="-larrow"
-          if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
-            PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
-          fi
-          S3_LIBS="-lcurl -lssl -lcrypto"
-          GCS_LIBS="-lcurl -lssl -lcrypto"
-        fi
-      fi
+  if [ -f "autobrew" ]; then
+    echo "**** Using local manifest for ${PKG_BREW_NAME}"
+  else
+    if ! curl -sfL "https://autobrew.github.io/scripts/$PKG_BREW_NAME" > autobrew; then
+      echo "Failed to download manifest for ${PKG_BREW_NAME}"
+      # Fall back to the local copy
+      cp tools/autobrew .
     fi
   fi
-fi
+  if ! . autobrew; then
+    echo "Failed to retrieve binary for ${PKG_BREW_NAME}"
+  fi
+  # autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+  # TODO: move PKG_LIBS and PKG_CFLAGS out of autobrew and use set_pkg_vars
+}
 
-# If on Raspberry Pi, need to manually link against latomic
-# See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
-if grep raspbian /etc/os-release >/dev/null 2>&1; then
-  PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
-  PKG_LIBS="-latomic $PKG_LIBS"
+# Once libarrow is obtained, this function sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
+# either from pkg-config or by inferring things about the directory in $1
+set_pkg_vars () {
+  LIB_DIR="$1/lib"
+  if [ "$PKG_CONFIG_AVAILABLE" = "true" ]; then
+    set_pkg_vars_with_pc
+  else
+    set_pkg_vars_without_pc $1
+  fi
+
+  # Set any user-defined CXXFLAGS
+  if [ "$ARROW_R_CXXFLAGS" ]; then
+    PKG_CFLAGS="$PKG_CFLAGS $ARROW_R_CXXFLAGS"
+  fi
+
+  # Finally, check cmake options for enabled features
+  add_feature_flags
+}
+
+# If we have pkg-config, it will tell us what libarrow needs
+set_pkg_vars_with_pc () {
+  PKG_CFLAGS="`${PKG_CONFIG} --cflags --silence-errors ${PKG_CONFIG_NAME}` $PKG_CFLAGS"
+  PKG_LIBS=`${PKG_CONFIG} --libs-only-l --libs-only-other --silence-errors ${PKG_CONFIG_NAME}`
+  PKG_DIRS=`${PKG_CONFIG} --libs-only-L --silence-errors ${PKG_CONFIG_NAME}`
+}
+
+# If we don't have pkg-config, we can make some inferences
+set_pkg_vars_without_pc () {
+  PKG_CFLAGS="-I$1/include $PKG_CFLAGS"
+  if grep -q "_GLIBCXX_USE_CXX11_ABI=0" "${LIB_DIR}/pkgconfig/arrow.pc"; then
+    PKG_CFLAGS="${PKG_CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
+  fi
+  PKG_DIRS="-L${LIB_DIR}"
+  if [ "${OPENSSL_ROOT_DIR}" != "" ]; then
+    PKG_DIRS="${PKG_DIRS} -L${OPENSSL_ROOT_DIR}/lib"
+  fi
+  PKG_LIBS="-larrow"
+  if [ -n "$(find "$LIB_DIR" -name 'libarrow_bundled_dependencies.*')" ]; then
+    PKG_LIBS="$PKG_LIBS -larrow_bundled_dependencies"
+  fi
+
+  # If on Raspberry Pi, need to manually link against latomic
+  # See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 for similar example
+  # pkg-config will handle this for us automatically, see ARROW-6312
+  if grep raspbian /etc/os-release >/dev/null 2>&1; then
+    PKG_CFLAGS="$PKG_CFLAGS -DARROW_CXXFLAGS=-latomic"
+    PKG_LIBS="-latomic $PKG_LIBS"

Review Comment:
   I think we need to append `-latomic` rather than prepend it, because `-larrow` needs `-latomic` and so has to come before it on the link line:
   
   ```suggestion
       PKG_LIBS="$PKG_LIBS -latomic"
   ```
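
   A minimal sketch of the ordering, assuming a traditional linker that resolves static libraries left to right. `append_lib` is a hypothetical helper for illustration only and is not part of `r/configure`:

   ```shell
   # Hypothetical helper (not in r/configure): put a dependency at the END of
   # the library list so it follows the libraries that reference it.
   append_lib () {
     # $1 is the current library list, $2 is the flag to append
     if [ -n "$1" ]; then
       echo "$1 $2"
     else
       echo "$2"
     fi
   }

   # Start from the value set_pkg_vars_without_pc assigns in the diff above
   PKG_LIBS="-larrow"
   # -larrow references symbols from libatomic, so -latomic goes after it
   PKG_LIBS="$(append_lib "$PKG_LIBS" "-latomic")"
   echo "PKG_LIBS=$PKG_LIBS"
   ```

   With the prepended form, a linker that scans left to right may see `-latomic` before any undefined atomic symbols exist and skip it.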





[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] Rewrite configure script and ensure we don't use mismatched libarrow

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1531523689

   > Ah, wait. The job passed at [#35147 (comment)](https://github.com/apache/arrow/pull/35147#issuecomment-1529017959). It may not be related to `-I`. I'll take a look at this.
   
   It passed there because I changed it not to use the bundled build. That said, it didn't pick up the wrong absl, so maybe there's still something to learn from it.




[GitHub] [arrow] github-actions[bot] commented on pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1519073377

   Revision: 1bc1a0132178aff5ce942a18f7a62bda4a689454
   
   Submitted crossbow builds: [ursacomputing/crossbow @ actions-02dfd6461d](https://github.com/ursacomputing/crossbow/branches/all?query=actions-02dfd6461d)
   
   |Task|Status|
   |----|------|
   |conda-linux-aarch64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-02dfd6461d-azure-conda-linux-aarch64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/12956093116)|
   |conda-linux-x64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-02dfd6461d-azure-conda-linux-x64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/12956092966)|
   |conda-osx-arm64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-02dfd6461d-azure-conda-osx-arm64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/12956092444)|
   |conda-osx-x64-cpu-r42|[![Azure](https://dev.azure.com/ursacomputing/crossbow/_apis/build/status/ursacomputing.crossbow?branchName=actions-02dfd6461d-azure-conda-osx-x64-cpu-r42)](https://github.com/ursacomputing/crossbow/runs/12956092613)|




[GitHub] [arrow] nealrichardson commented on pull request #35147: GH-35140: [R] configure script needs LIB_DIR to find ArrowOptions.cmake

Posted by "nealrichardson (via GitHub)" <gi...@apache.org>.
nealrichardson commented on PR #35147:
URL: https://github.com/apache/arrow/pull/35147#issuecomment-1519072895

   @github-actions crossbow submit conda-linux-*-r4* conda-osx-*-r4*

