Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/03 16:39:44 UTC

[GitHub] [arrow] rolweber opened a new issue #10226: [R] Install from CRAN started to fail

rolweber opened a new issue #10226:
URL: https://github.com/apache/arrow/issues/10226


   Hello,
   
   I'm building container images with conda environments that include both pyarrow and R arrow from CRAN. The builds were stable for several weeks. Then there were problems two weeks ago, which eventually resolved (see [ARROW-12502](https://issues.apache.org/jira/browse/ARROW-12502) in JIRA). Today, builds are breaking again, and as far as I can tell this is not related to the previous problem. I'm looking for advice on how to:
   1. Get the builds working again.
   2. Make the installation less fragile for the future.
   
   On Linux x86, I first install pyarrow 3.0.0 from PyPI, which uses a wheel with pre-built native libs. Then I install arrow 3.0.0 from CRAN, which tries to build its own native libs, I think. Then I run a few unit tests to make sure that both pyarrow and R arrow are working and can exchange data. When there are problems with installing R arrow from CRAN, the build doesn't necessarily fail at that step, but only later in the unit tests. That's what's happening today.
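
Since a broken R arrow install only surfaces later in the unit tests, a smoke test right after the install step could make the build fail at the point of the problem. The following is a minimal sketch, not part of the original build: it assumes `Rscript` is on the `PATH` and relies on `arrow_available()` from the R arrow package. The helper names are hypothetical.

```python
import subprocess

# R expression that exits non-zero unless the arrow package loads and
# reports that its C++ library is available.
R_CHECK = "library(arrow); if (!arrow_available()) quit(status = 1)"


def arrow_r_check_command(rscript="Rscript"):
    """Build the command line for the post-install smoke test."""
    return [rscript, "-e", R_CHECK]


def verify_r_arrow(run=subprocess.run):
    """Return True if the smoke test passes.

    `run` defaults to subprocess.run and is injectable so the logic can be
    exercised without an R installation.
    """
    result = run(arrow_r_check_command(), capture_output=True, text=True)
    return result.returncode == 0
```

Calling `verify_r_arrow()` directly after the CRAN install and aborting the image build when it returns `False` would move the failure from the unit tests to the install step.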
   
   I've set ARROW_R_DEV=true to get some information about the installation problem, as the default output doesn't even print an error message. This is the problem today (builds were still working last Friday):
   ```txt
   -- thrift_ep configure command succeeded.  See also /tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-configure-*.log
   [ 50%] Performing build step for 'thrift_ep'
   -- thrift_ep build command succeeded.  See also /tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-build-*.log
   [ 51%] Performing install step for 'thrift_ep'
   CMake Error at /tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-install-RELEASE.cmake:37 (message):
     Command failed: 2
      'make' 'install'
     See also
       /tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-install-*.log
   -- stdout output is:
   -- stderr output is:
   make[3]: *** No rule to make target 'install'.  Stop.
   CMake Error at /tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-install-RELEASE.cmake:47 (message):
     Stopping after outputting logs.
   make[2]: *** [CMakeFiles/thrift_ep.dir/build.make:93: thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-install] Error 1
   make[1]: *** [CMakeFiles/Makefile2:758: CMakeFiles/thrift_ep.dir/all] Error 2
   gmake: *** [Makefile:160: all] Error 2
   + popd
   /tmp/RtmpFiYDeK/R.INSTALL19b11d5af287/arrow
   ------------------------- NOTE ---------------------------
   See https://arrow.apache.org/docs/r/articles/install.html
   for help installing Arrow C++ libraries
   ```
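
Because the install step itself does not fail, the verbose log is the only signal available at install time. Below is a minimal sketch, assuming the install output has been captured into a string, that scans for the failure markers visible in the log excerpts of this message. The pattern list is illustrative and certainly incomplete.

```python
import re

# Failure markers taken from the log excerpts in this thread.
FAILURE_PATTERNS = [
    re.compile(r"^CMake Error at "),
    re.compile(r"\*\*\* No rule to make target"),
    re.compile(r"cannot execute binary file: Exec format error"),
]


def find_build_failures(log_text):
    """Return the stripped log lines that match a known failure marker."""
    hits = []
    for line in log_text.splitlines():
        stripped = line.strip()
        if any(pattern.search(stripped) for pattern in FAILURE_PATTERNS):
            hits.append(stripped)
    return hits
```

A build script could treat a non-empty result as an error, even when the R installer itself exits successfully.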
   
   I'm also building images for Linux on Power (ppc64le). There, I couldn't install pyarrow from PyPI, because there are no wheels for that platform, and the source compilation failed. I eventually built custom versions of the conda packages pyarrow and arrow-cpp. Then I'm installing arrow from CRAN. This is still working today. I've enabled debug output here as well, to compare with x86. This is what I see there:
   ```txt
   inst/build_arrow_static.sh: line 54: /tmp/RtmpR54GjL/file28b55b5e1923/cmake-3.19.2-Linux-x86_64/bin/cmake: cannot execute binary file: Exec format error
   + /tmp/RtmpR54GjL/file28b55b5e1923/cmake-3.19.2-Linux-x86_64/bin/cmake --build . --target install
   inst/build_arrow_static.sh: line 84: /tmp/RtmpR54GjL/file28b55b5e1923/cmake-3.19.2-Linux-x86_64/bin/cmake: cannot execute binary file: Exec format error
   + popd
   /tmp/RtmpfCx9q8/R.INSTALL28965204fdde/arrow
   PKG_CFLAGS=-I/tmp/RtmpfCx9q8/R.INSTALL28965204fdde/arrow/libarrow/arrow-3.0.0/include  -DARROW_R_WITH_ARROW
   PKG_LIBS=-larrow_dataset -lparquet -larrow
   ** libs
   ```
   So `cmake` is not even running on that platform, yet I get Docker images that work and pass the unit tests. Apparently, the native libs from the arrow-cpp conda package are found automatically, and satisfy whatever the R installation needs to compile its `arrow.so` library.
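
The `Exec format error` above is consistent with an x86_64 `cmake` binary being run on a ppc64le host. As a rough illustration, the sketch below compares an architecture token in a binary's name with the host architecture. The token list and helper names are hypothetical, and aliases such as `amd64` for `x86_64` are deliberately not reconciled.

```python
import platform

# Architecture tokens that commonly appear in binary download names.
# Illustrative only, not exhaustive.
KNOWN_ARCH_TOKENS = ("x86_64", "amd64", "aarch64", "arm64", "ppc64le", "s390x")


def arch_in_name(name):
    """Return the first known architecture token found in `name`, or None."""
    lowered = name.lower()
    for token in KNOWN_ARCH_TOKENS:
        if token in lowered:
            return token
    return None


def runs_on_host(name, machine=None):
    """True if the name carries no arch token or the token equals the host's."""
    machine = (machine or platform.machine()).lower()
    token = arch_in_name(name)
    return token is None or token == machine
```

For the directory name in the log, `runs_on_host("cmake-3.19.2-Linux-x86_64", machine="ppc64le")` evaluates to `False`.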
   
   Ideally, I'd want the native libs from the PyPI wheel to be used by R arrow on the x86 platform. But symlinking the files into the lib directory where arrow-cpp puts them on ppc64le didn't do the trick.
   
   Is there a way to tell R Arrow to use existing libs, and where those libs are located?
   If I have to install from a source tarball instead of CRAN, that would be OK. I'm more concerned about robustness than comfort.
   
   I'll try to collect more information about the thrift_ep problem. Because the installation does not fail, temporary files get removed. Maybe the problem even resolves itself in a day or two, just as suddenly as it appeared. But I'm afraid this won't be the last time that something breaks during the R arrow installation, so I'd prefer to reduce the number of things that need to be downloaded and compiled at installation time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson edited a comment on issue #10226: [R] Install from CRAN started to fail

Posted by GitBox <gi...@apache.org>.
nealrichardson edited a comment on issue #10226:
URL: https://github.com/apache/arrow/issues/10226#issuecomment-831396026


   If thrift just started to fail to build, it's probably because bintray was shut down on May 1 (thrift requires boost, boost was hosted on bintray). Upgrading to arrow 4.0 indeed will fix that as we no longer refer to bintray anywhere. 
   
   pyarrow wheels are built using the old C++ ABI, so if you want to try to link R to the libarrow that ships in wheels, you'll need to set the env var `ARROW_USE_OLD_ABI` (https://github.com/apache/arrow/blob/master/r/configure#L185-L188). I haven't done this in a while (tried to link to pyarrow's libarrow) so I can't promise that it works 100%, but that flag is definitely required. 
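
A minimal sketch of how the environment for such an install could be prepared. Only `ARROW_USE_OLD_ABI` comes from the comment above. Its value (`"true"`), the `INCLUDE_DIR` and `LIB_DIR` variables, and the `LD_LIBRARY_PATH` handling are assumptions that should be checked against the linked `r/configure` script. The function name is hypothetical.

```python
import os


def r_install_env(include_dir, lib_dir, base_env=None):
    """Return a copy of the environment prepared for an R arrow install.

    ARROW_USE_OLD_ABI is named in this thread; its value and the
    INCLUDE_DIR / LIB_DIR variables are assumptions to verify against
    r/configure before relying on them.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ARROW_USE_OLD_ABI"] = "true"
    env["INCLUDE_DIR"] = include_dir
    env["LIB_DIR"] = lib_dir
    # Assumption: the loader also needs to find the shared libraries at run time.
    existing = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = lib_dir if not existing else lib_dir + os.pathsep + existing
    return env
```

With pyarrow installed, `pyarrow.get_include()` and the first entry of `pyarrow.get_library_dirs()` would be natural candidates for the two arguments, and the result could be passed as `env=` to a `subprocess.run(["R", "CMD", "INSTALL", ...])` call. As noted above, there is no promise that linking against the wheel's libarrow works.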
   
   Re: ppc64le, we have an open issue about it: https://issues.apache.org/jira/browse/ARROW-12085. Please comment there if you have any insight.





[GitHub] [arrow] rolweber closed issue #10226: [R] Install from CRAN started to fail

Posted by GitBox <gi...@apache.org>.
rolweber closed issue #10226:
URL: https://github.com/apache/arrow/issues/10226


   





[GitHub] [arrow] rolweber commented on issue #10226: [R] Install from CRAN started to fail

Posted by GitBox <gi...@apache.org>.
rolweber commented on issue #10226:
URL: https://github.com/apache/arrow/issues/10226#issuecomment-831420530


   Thanks, that's very helpful!





[GitHub] [arrow] rolweber commented on issue #10226: [R] Install from CRAN started to fail

Posted by GitBox <gi...@apache.org>.
rolweber commented on issue #10226:
URL: https://github.com/apache/arrow/issues/10226#issuecomment-831383728


   Grmpf... not as fine a solution as I thought. pyarrow from Anaconda is built against an old version of grpc-cpp, which doesn't allow disabling server certificate verification. That's something which pyarrow from PyPI supports, and which my team needs for tests.
   
   So I'm still interested in a way to let Arrow from CRAN use the libraries that come with a pyarrow wheel from PyPI.





[GitHub] [arrow] nealrichardson closed issue #10226: [R] Install from CRAN started to fail

Posted by GitBox <gi...@apache.org>.
nealrichardson closed issue #10226:
URL: https://github.com/apache/arrow/issues/10226


   





[GitHub] [arrow] rolweber commented on issue #10226: Install from CRAN started to fail

Posted by GitBox <gi...@apache.org>.
rolweber commented on issue #10226:
URL: https://github.com/apache/arrow/issues/10226#issuecomment-831272919


   I know that Arrow 4.0.0 was released last week. But upgrading is currently not an option for me.





[GitHub] [arrow] rolweber commented on issue #10226: [R] Install from CRAN started to fail

Posted by GitBox <gi...@apache.org>.
rolweber commented on issue #10226:
URL: https://github.com/apache/arrow/issues/10226#issuecomment-831359793


   I found a way out. When I install the pyarrow package from Anaconda instead of the one from PyPI, R arrow uses the provided native libraries. The conda package pulls in about two dozen dependencies, but that's preferable to chasing more build breaks.
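
To confirm before the R install that the conda environment actually provides the native libraries, a check along the following lines could be used. It is a sketch under assumptions: the library names are taken from the `PKG_LIBS` line in the ppc64le log of this thread, and the `lib` subdirectory of the directory named by `CONDA_PREFIX` is assumed to be where the conda packages place them. The function names are hypothetical.

```python
import os

# Library names taken from the PKG_LIBS line in the ppc64le log.
REQUIRED_LIBS = ("arrow_dataset", "parquet", "arrow")


def missing_libs(filenames, required=REQUIRED_LIBS):
    """Return the required library names with no matching lib<name>.so* file."""
    missing = []
    for name in required:
        prefix = "lib" + name + ".so"
        if not any(f.startswith(prefix) for f in filenames):
            missing.append(name)
    return missing


def missing_libs_in_prefix(prefix):
    """Check <prefix>/lib, for example the directory named by CONDA_PREFIX."""
    lib_dir = os.path.join(prefix, "lib")
    names = os.listdir(lib_dir) if os.path.isdir(lib_dir) else []
    return missing_libs(names)
```

An empty result from `missing_libs_in_prefix(os.environ["CONDA_PREFIX"])` would suggest the libraries are in place. It does not prove that the R installer will pick them up.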
   
   Thanks for letting me vent.

