Posted to commits@arrow.apache.org by ko...@apache.org on 2019/05/24 22:13:53 UTC

[arrow] branch master updated: ARROW-5332: [R] Update R package README with richer installation instructions

This is an automated email from the ASF dual-hosted git repository.

kou pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 184b8de  ARROW-5332: [R] Update R package README with richer installation instructions
184b8de is described below

commit 184b8deb651c6f6308c0fa2a595f5a40f5da8ce8
Author: Neal Richardson <ne...@gmail.com>
AuthorDate: Sat May 25 07:13:37 2019 +0900

    ARROW-5332: [R] Update R package README with richer installation instructions
    
    These extra flags got dropped in https://github.com/apache/arrow/commit/a44d1f553a3d1716dfafa7bb02753b8c4ed13d3f#diff-2816240325a8e9e6df50fb7c1faa8113R21. This change resolves the issue for me locally on macOS, and I'll confirm that it fixes the Ubuntu build for sparklyr and that it doesn't break our Travis build.
    
    Author: Neal Richardson <ne...@gmail.com>
    
    Closes #4322 from nealrichardson/r-pkg-rpath and squashes the following commits:
    
    132bd152 <Neal Richardson> https
    baa16156 <Neal Richardson> Tweak C++ debug instructions
    0cdd141b <Neal Richardson> Merge upstream/master
    e21732ac <Neal Richardson> Revise README to include more thorough installation and other dev instructions
---
 r/DESCRIPTION |  12 +++--
 r/README.Rmd  |  84 ++++++++++++++++++++++----------
 r/README.md   | 151 +++++++++++++++++++++++++++++++++++-----------------------
 3 files changed, 158 insertions(+), 89 deletions(-)

diff --git a/r/DESCRIPTION b/r/DESCRIPTION
index c48a1f0..2f0adc6 100644
--- a/r/DESCRIPTION
+++ b/r/DESCRIPTION
@@ -1,5 +1,5 @@
 Package: arrow
-Title: R Integration to 'Apache' 'Arrow'
+Title: Integration to 'Apache' 'Arrow'
 Version: 0.13.0.9000
 Authors@R: c(
     person("Romain", "François", email = "romain@rstudio.com", role = c("aut", "cre")),
@@ -8,7 +8,11 @@ Authors@R: c(
     person("Jeroen", "Ooms", email = "jeroen@berkeley.edu", role = c("aut")),
     person("Apache Arrow", email = "dev@arrow.apache.org", role = c("aut", "cph"))
   )
-Description: R Integration to 'Apache' 'Arrow'.
+Description: 'Apache' 'Arrow' <https://arrow.apache.org/> is a cross-language
+    development platform for in-memory data. It specifies a standardized
+    language-independent columnar memory format for flat and hierarchical data,
+    organized for efficient analytic operations on modern hardware. This
+    package provides an interface to the Arrow C++ library.
 Depends: R (>= 3.1)
 License: Apache License (>= 2.0)
 Encoding: UTF-8
@@ -31,8 +35,10 @@ Imports:
 Roxygen: list(markdown = TRUE)
 RoxygenNote: 6.1.1
 Suggests:
+    rmarkdown,
+    roxygen2,
     testthat,
-    lubridate, 
+    lubridate,
     hms
 Collate:
     'enums.R'
diff --git a/r/README.Rmd b/r/README.Rmd
index b5365c2..af20340 100644
--- a/r/README.Rmd
+++ b/r/README.Rmd
@@ -1,5 +1,7 @@
 ---
-output: github_document
+output:
+  github_document:
+    html_preview: false
 ---
 
 <!-- README.md is generated from README.Rmd. Please edit that file -->
@@ -16,48 +18,77 @@ knitr::opts_chunk$set(
 
 [![conda-forge](https://img.shields.io/conda/vn/conda-forge/r-arrow.svg)](https://anaconda.org/conda-forge/r-arrow)
 
-R integration with Apache Arrow.
+[Apache Arrow](https://arrow.apache.org/) is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication.
+
+The `arrow` package exposes an interface to the Arrow C++ library.
 
 ## Installation
 
-You first need to install the C++ library:
+Installing `arrow` is a two-step process: first install the Arrow C++ library, then the R package.
 
-### macOS
+### Release version
 
-On macOS, you may install using homebrew:
+The current release of Apache Arrow is 0.13.
 
-```
+On macOS, you may install the C++ library using [Homebrew](https://brew.sh/):
+
+```shell
 brew install apache-arrow
 ```
 
-### From source
+On Linux, see the [Arrow project installation page](https://arrow.apache.org/install/).
+
+Then, you can install the release version of the package from GitHub using the [`remotes`](https://remotes.r-lib.org/) package. From within an R session,
+
+```r
+# install.packages("remotes") # Or install "devtools", which includes remotes
+remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")
+```
 
-First install a release build of the C++ bindings to arrow.
+or if you prefer to stay at the command line,
 
 ```shell
-git clone https://github.com/apache/arrow.git
-cd arrow/cpp && mkdir release && cd release
+R -e 'remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")'
+```
+
+### Development version
+
+First, clone the repository and install a release build of the C++ library.
 
-# It is important to statically link to boost libraries
-cmake .. -DARROW_PARQUET=ON -DCMAKE_BUILD_TYPE=Release -DARROW_BOOST_USE_SHARED:BOOL=Off
+```shell
+git clone https://github.com/apache/arrow.git
+mkdir arrow/cpp/build && cd arrow/cpp/build
+cmake .. -DARROW_PARQUET=ON -DARROW_BOOST_USE_SHARED:BOOL=Off -DARROW_INSTALL_NAME_RPATH=OFF
 make install
 ```
 
-## Then the R package
+Then, you can install the R package and its dependencies from the git checkout:
+
+```shell
+cd ../../r
+R -e 'install.packages("remotes"); remotes::install_deps()'
+R CMD INSTALL .
+```
 
-### From github (defaults to master branch)
+If the package fails to install/load with an error like this:
 
-```r
-library("devtools")
-devtools::install_github("apache/arrow/r")
 ```
+** testing if installed package can be loaded from temporary location
+Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
+unable to load shared object '/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so':
+dlopen(/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: @rpath/libarrow.14.dylib
+```
+
+try setting the environment variable `LD_LIBRARY_PATH` (or `DYLD_LIBRARY_PATH` on macOS) to wherever Arrow C++ was put in `make install`, e.g. `export LD_LIBRARY_PATH=/usr/local/lib`, and retry installing the R package.
+
+For any other build/configuration challenges, see the [C++ developer guide](https://arrow.apache.org/docs/developers/cpp.html#building).
 
 ## Example
 
 ```{r}
 library(arrow)
 
-(tib <- tibble::tibble(x = 1:10, y = rnorm(10)))
+tib <- tibble::tibble(x = 1:10, y = rnorm(10))
 tab <- table(tib)
 tab$schema
 tab
@@ -66,20 +97,21 @@ as_tibble(tab)
 
 ## Developing
 
-The arrow package is using devtools for package related manipulations. To install:
+It is recommended to use the `devtools` package to help with some package-related manipulations. You'll need a few more R packages, which you can install using
 
 ```r
-library.packages("devtools")
+install.packages("devtools")
+devtools::install_dev_deps()
 ```
 
-Within a local clone, one can test changes by running the test suite.
-
-### Running the test suite
+### Useful functions
 
 ```r
-library("devtools")
-devtools::install_dev_deps()
-devtools::test()
+devtools::load_all() # Load the dev package
+devtools::test(filter="^regexp$") # Run the test suite, optionally filtering file names
+devtools::document() # Update roxygen documentation
+rmarkdown::render("README.Rmd") # To rebuild README.md
+devtools::check() # All package checks; see also below
 ```
 
 ### Full package validation
diff --git a/r/README.md b/r/README.md
index 79fa301..2ed3feb 100644
--- a/r/README.md
+++ b/r/README.md
@@ -1,48 +1,90 @@
 
 <!-- README.md is generated from README.Rmd. Please edit that file -->
-arrow
-=====
+
+# arrow
 
 [![conda-forge](https://img.shields.io/conda/vn/conda-forge/r-arrow.svg)](https://anaconda.org/conda-forge/r-arrow)
 
-R integration with Apache Arrow.
+[Apache Arrow](https://arrow.apache.org/) is a cross-language
+development platform for in-memory data. It specifies a standardized
+language-independent columnar memory format for flat and hierarchical
+data, organized for efficient analytic operations on modern hardware. It
+also provides computational libraries and zero-copy streaming messaging
+and interprocess communication.
+
+The `arrow` package exposes an interface to the Arrow C++ library.
 
-Installation
-------------
+## Installation
 
-You first need to install the C++ library:
+Installing `arrow` is a two-step process: first install the Arrow C++
+library, then the R package.
 
-### macOS
+### Release version
 
-On macOS, you may install using homebrew:
+The current release of Apache Arrow is 0.13.
 
-    brew install apache-arrow
+On macOS, you may install the C++ library using
+[Homebrew](https://brew.sh/):
 
-### From source
+``` shell
+brew install apache-arrow
+```
 
-First install a release build of the C++ bindings to arrow.
+On Linux, see the [Arrow project installation
+page](https://arrow.apache.org/install/).
+
+Then, you can install the release version of the package from GitHub
+using the [`remotes`](https://remotes.r-lib.org/) package. From within
+an R session,
+
+``` r
+# install.packages("remotes") # Or install "devtools", which includes remotes
+remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")
+```
+
+or if you prefer to stay at the command line,
 
 ``` shell
-git clone https://github.com/apache/arrow.git
-cd arrow/cpp && mkdir release && cd release
+R -e 'remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")'
+```
+
+### Development version
+
+First, clone the repository and install a release build of the C++
+library.
 
-# It is important to statically link to boost libraries
-cmake .. -DARROW_PARQUET=ON -DCMAKE_BUILD_TYPE=Release -DARROW_BOOST_USE_SHARED:BOOL=Off
+``` shell
+git clone https://github.com/apache/arrow.git
+mkdir arrow/cpp/build && cd arrow/cpp/build
+cmake .. -DARROW_PARQUET=ON -DARROW_BOOST_USE_SHARED:BOOL=Off -DARROW_INSTALL_NAME_RPATH=OFF
 make install
 ```
 
-Then the R package
-------------------
+Then, you can install the R package and its dependencies from the git
+checkout:
 
-### From github (defaults to master branch)
-
-``` r
-library("devtools")
-devtools::install_github("apache/arrow/r")
+``` shell
+cd ../../r
+R -e 'install.packages("remotes"); remotes::install_deps()'
+R CMD INSTALL .
 ```
 
-Example
--------
+If the package fails to install/load with an error like this:
+
+    ** testing if installed package can be loaded from temporary location
+    Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
+    unable to load shared object '/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so':
+    dlopen(/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: @rpath/libarrow.14.dylib
+
+try setting the environment variable `LD_LIBRARY_PATH` (or
+`DYLD_LIBRARY_PATH` on macOS) to wherever Arrow C++ was put in `make
+install`, e.g. `export LD_LIBRARY_PATH=/usr/local/lib`, and retry
+installing the R package.
+
+For any other build/configuration challenges, see the [C++ developer
+guide](https://arrow.apache.org/docs/developers/cpp.html#building).
+
+## Example
 
 ``` r
 library(arrow)
@@ -55,20 +97,7 @@ library(arrow)
 #> 
 #>     array, table
 
-(tib <- tibble::tibble(x = 1:10, y = rnorm(10)))
-#> # A tibble: 10 x 2
-#>        x      y
-#>    <int>  <dbl>
-#>  1     1 -1.59 
-#>  2     2  1.31 
-#>  3     3 -0.513
-#>  4     4  2.64 
-#>  5     5  0.389
-#>  6     6  0.660
-#>  7     7  1.66 
-#>  8     8 -0.120
-#>  9     9  2.43 
-#> 10    10  1.53
+tib <- tibble::tibble(x = 1:10, y = rnorm(10))
 tab <- table(tib)
 tab$schema
 #> arrow::Schema 
@@ -78,37 +107,39 @@ tab
 #> arrow::Table
 as_tibble(tab)
 #> # A tibble: 10 x 2
-#>        x      y
-#>    <int>  <dbl>
-#>  1     1 -1.59 
-#>  2     2  1.31 
-#>  3     3 -0.513
-#>  4     4  2.64 
-#>  5     5  0.389
-#>  6     6  0.660
-#>  7     7  1.66 
-#>  8     8 -0.120
-#>  9     9  2.43 
-#> 10    10  1.53
+#>        x        y
+#>    <int>    <dbl>
+#>  1     1 -1.70   
+#>  2     2  0.764  
+#>  3     3 -0.00735
+#>  4     4 -0.192  
+#>  5     5  0.513  
+#>  6     6 -1.81   
+#>  7     7 -0.371  
+#>  8     8 -1.40   
+#>  9     9 -0.0599 
+#> 10    10 -0.359
 ```
 
-Developing
-----------
+## Developing
 
-The arrow package is using devtools for package related manipulations. To install:
+It is recommended to use the `devtools` package to help with some
+package-related manipulations. You’ll need a few more R packages, which
+you can install using
 
 ``` r
-library.packages("devtools")
+install.packages("devtools")
+devtools::install_dev_deps()
 ```
 
-Within a local clone, one can test changes by running the test suite.
-
-### Running the test suite
+### Useful functions
 
 ``` r
-library("devtools")
-devtools::install_dev_deps()
-devtools::test()
+devtools::load_all() # Load the dev package
+devtools::test(filter="^regexp$") # Run the test suite, optionally filtering file names
+devtools::document() # Update roxygen documentation
+rmarkdown::render("README.Rmd") # To rebuild README.md
+devtools::check() # All package checks; see also below
 ```
 
 ### Full package validation