Posted to commits@arrow.apache.org by ko...@apache.org on 2019/05/24 22:13:53 UTC
[arrow] branch master updated: ARROW-5332: [R] Update R package README with richer installation instructions
This is an automated email from the ASF dual-hosted git repository.
kou pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 184b8de ARROW-5332: [R] Update R package README with richer installation instructions
184b8de is described below
commit 184b8deb651c6f6308c0fa2a595f5a40f5da8ce8
Author: Neal Richardson <ne...@gmail.com>
AuthorDate: Sat May 25 07:13:37 2019 +0900
ARROW-5332: [R] Update R package README with richer installation instructions
These extra flags got dropped in https://github.com/apache/arrow/commit/a44d1f553a3d1716dfafa7bb02753b8c4ed13d3f#diff-2816240325a8e9e6df50fb7c1faa8113R21. This change resolves the issue for me locally on macOS, and I'll confirm that it fixes the Ubuntu build for sparklyr and that it doesn't break our Travis build.
Closes #4322 from nealrichardson/r-pkg-rpath and squashes the following commits:
132bd152 <Neal Richardson> https
baa16156 <Neal Richardson> Tweak C++ debug instructions
0cdd141b <Neal Richardson> Merge upstream/master
e21732ac <Neal Richardson> Revise README to include more thorough installation and other dev instructions
---
r/DESCRIPTION | 12 +++--
r/README.Rmd | 84 ++++++++++++++++++++++----------
r/README.md | 151 +++++++++++++++++++++++++++++++++++-----------------------
3 files changed, 158 insertions(+), 89 deletions(-)
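Note on the README change in this patch: the new troubleshooting text suggests setting `LD_LIBRARY_PATH` (or `DYLD_LIBRARY_PATH` on macOS) when the R package cannot find `libarrow` at load time. A minimal sketch of that workaround follows, assuming Arrow C++ was installed under `/usr/local/lib` (the README's example path). `ARROW_LIB_DIR` and `lib_var` are hypothetical names used only for illustration; they are not part of Arrow or R.

```shell
# Directory that `make install` put libarrow into.
# ARROW_LIB_DIR is a hypothetical override; /usr/local/lib is the README's example.
arrow_lib_dir="${ARROW_LIB_DIR:-/usr/local/lib}"

# macOS uses DYLD_LIBRARY_PATH; Linux and most other Unixes use LD_LIBRARY_PATH.
# Prepend the directory and keep any value that was already set.
if [ "$(uname -s)" = "Darwin" ]; then
  export DYLD_LIBRARY_PATH="${arrow_lib_dir}${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"
  lib_var="DYLD_LIBRARY_PATH"
else
  export LD_LIBRARY_PATH="${arrow_lib_dir}${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
  lib_var="LD_LIBRARY_PATH"
fi

echo "Set ${lib_var}; now retry from the r/ directory: R CMD INSTALL ."
```

Run these lines (or source a file containing them) in the same shell session that then runs `R CMD INSTALL .`, so that the exported variable is still in effect when R tries to load the shared object.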
diff --git a/r/DESCRIPTION b/r/DESCRIPTION
index c48a1f0..2f0adc6 100644
--- a/r/DESCRIPTION
+++ b/r/DESCRIPTION
@@ -1,5 +1,5 @@
Package: arrow
-Title: R Integration to 'Apache' 'Arrow'
+Title: Integration to 'Apache' 'Arrow'
Version: 0.13.0.9000
Authors@R: c(
person("Romain", "François", email = "romain@rstudio.com", role = c("aut", "cre")),
@@ -8,7 +8,11 @@ Authors@R: c(
person("Jeroen", "Ooms", email = "jeroen@berkeley.edu", role = c("aut")),
person("Apache Arrow", email = "dev@arrow.apache.org", role = c("aut", "cph"))
)
-Description: R Integration to 'Apache' 'Arrow'.
+Description: 'Apache' 'Arrow' <https://arrow.apache.org/> is a cross-language
+ development platform for in-memory data. It specifies a standardized
+ language-independent columnar memory format for flat and hierarchical data,
+ organized for efficient analytic operations on modern hardware. This
+ package provides an interface to the Arrow C++ library.
Depends: R (>= 3.1)
License: Apache License (>= 2.0)
Encoding: UTF-8
@@ -31,8 +35,10 @@ Imports:
Roxygen: list(markdown = TRUE)
RoxygenNote: 6.1.1
Suggests:
+ rmarkdown,
+ roxygen2,
testthat,
- lubridate,
+ lubridate,
hms
Collate:
'enums.R'
diff --git a/r/README.Rmd b/r/README.Rmd
index b5365c2..af20340 100644
--- a/r/README.Rmd
+++ b/r/README.Rmd
@@ -1,5 +1,7 @@
---
-output: github_document
+output:
+ github_document:
+ html_preview: false
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
@@ -16,48 +18,77 @@ knitr::opts_chunk$set(
[![conda-forge](https://img.shields.io/conda/vn/conda-forge/r-arrow.svg)](https://anaconda.org/conda-forge/r-arrow)
-R integration with Apache Arrow.
+[Apache Arrow](https://arrow.apache.org/) is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication.
+
+The `arrow` package exposes an interface to the Arrow C++ library.
## Installation
-You first need to install the C++ library:
+Installing `arrow` is a two-step process: first install the Arrow C++ library, then the R package.
-### macOS
+### Release version
-On macOS, you may install using homebrew:
+The current release of Apache Arrow is 0.13.
-```
+On macOS, you may install the C++ library using [Homebrew](https://brew.sh/):
+
+```shell
brew install apache-arrow
```
-### From source
+On Linux, see the [Arrow project installation page](https://arrow.apache.org/install/).
+
+Then, you can install the release version of the package from GitHub using the [`remotes`](https://remotes.r-lib.org/) package. From within an R session,
+
+```r
+# install.packages("remotes") # Or install "devtools", which includes remotes
+remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")
+```
-First install a release build of the C++ bindings to arrow.
+or if you prefer to stay at the command line,
```shell
-git clone https://github.com/apache/arrow.git
-cd arrow/cpp && mkdir release && cd release
+R -e 'remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")'
+```
+
+### Development version
+
+First, clone the repository and install a release build of the C++ library.
-# It is important to statically link to boost libraries
-cmake .. -DARROW_PARQUET=ON -DCMAKE_BUILD_TYPE=Release -DARROW_BOOST_USE_SHARED:BOOL=Off
+```shell
+git clone https://github.com/apache/arrow.git
+mkdir arrow/cpp/build && cd arrow/cpp/build
+cmake .. -DARROW_PARQUET=ON -DARROW_BOOST_USE_SHARED:BOOL=Off -DARROW_INSTALL_NAME_RPATH=OFF
make install
```
-## Then the R package
+Then, you can install the R package and its dependencies from the git checkout:
+
+```shell
+cd ../../r
+R -e 'install.packages("remotes"); remotes::install_deps()'
+R CMD INSTALL .
+```
-### From github (defaults to master branch)
+If the package fails to install/load with an error like this:
-```r
-library("devtools")
-devtools::install_github("apache/arrow/r")
```
+** testing if installed package can be loaded from temporary location
+Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
+unable to load shared object '/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so':
+dlopen(/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: @rpath/libarrow.14.dylib
+```
+
+try setting the environment variable `LD_LIBRARY_PATH` (or `DYLD_LIBRARY_PATH` on macOS) to wherever Arrow C++ was put in `make install`, e.g. `export LD_LIBRARY_PATH=/usr/local/lib`, and retry installing the R package.
+
+For any other build/configuration challenges, see the [C++ developer guide](https://arrow.apache.org/docs/developers/cpp.html#building).
## Example
```{r}
library(arrow)
-(tib <- tibble::tibble(x = 1:10, y = rnorm(10)))
+tib <- tibble::tibble(x = 1:10, y = rnorm(10))
tab <- table(tib)
tab$schema
tab
@@ -66,20 +97,21 @@ as_tibble(tab)
## Developing
-The arrow package is using devtools for package related manipulations. To install:
+It is recommended to use the `devtools` package to help with some package-related manipulations. You'll need a few more R packages, which you can install using
```r
-library.packages("devtools")
+install.packages("devtools")
+devtools::install_dev_deps()
```
-Within a local clone, one can test changes by running the test suite.
-
-### Running the test suite
+### Useful functions
```r
-library("devtools")
-devtools::install_dev_deps()
-devtools::test()
+devtools::load_all() # Load the dev package
+devtools::test(filter="^regexp$") # Run the test suite, optionally filtering file names
+devtools::document() # Update roxygen documentation
+rmarkdown::render("README.Rmd") # To rebuild README.md
+devtools::check() # All package checks; see also below
```
### Full package validation
diff --git a/r/README.md b/r/README.md
index 79fa301..2ed3feb 100644
--- a/r/README.md
+++ b/r/README.md
@@ -1,48 +1,90 @@
<!-- README.md is generated from README.Rmd. Please edit that file -->
-arrow
-=====
+
+# arrow
[![conda-forge](https://img.shields.io/conda/vn/conda-forge/r-arrow.svg)](https://anaconda.org/conda-forge/r-arrow)
-R integration with Apache Arrow.
+[Apache Arrow](https://arrow.apache.org/) is a cross-language
+development platform for in-memory data. It specifies a standardized
+language-independent columnar memory format for flat and hierarchical
+data, organized for efficient analytic operations on modern hardware. It
+also provides computational libraries and zero-copy streaming messaging
+and interprocess communication.
+
+The `arrow` package exposes an interface to the Arrow C++ library.
-Installation
-------------
+## Installation
-You first need to install the C++ library:
+Installing `arrow` is a two-step process: first install the Arrow C++
+library, then the R package.
-### macOS
+### Release version
-On macOS, you may install using homebrew:
+The current release of Apache Arrow is 0.13.
- brew install apache-arrow
+On macOS, you may install the C++ library using
+[Homebrew](https://brew.sh/):
-### From source
+``` shell
+brew install apache-arrow
+```
-First install a release build of the C++ bindings to arrow.
+On Linux, see the [Arrow project installation
+page](https://arrow.apache.org/install/).
+
+Then, you can install the release version of the package from GitHub
+using the [`remotes`](https://remotes.r-lib.org/) package. From within
+an R session,
+
+``` r
+# install.packages("remotes") # Or install "devtools", which includes remotes
+remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")
+```
+
+or if you prefer to stay at the command line,
``` shell
-git clone https://github.com/apache/arrow.git
-cd arrow/cpp && mkdir release && cd release
+R -e 'remotes::install_github("apache/arrow/r", ref="76e1bc5dfb9d08e31eddd5cbcc0b1bab934da2c7")'
+```
+
+### Development version
+
+First, clone the repository and install a release build of the C++
+library.
-# It is important to statically link to boost libraries
-cmake .. -DARROW_PARQUET=ON -DCMAKE_BUILD_TYPE=Release -DARROW_BOOST_USE_SHARED:BOOL=Off
+``` shell
+git clone https://github.com/apache/arrow.git
+mkdir arrow/cpp/build && cd arrow/cpp/build
+cmake .. -DARROW_PARQUET=ON -DARROW_BOOST_USE_SHARED:BOOL=Off -DARROW_INSTALL_NAME_RPATH=OFF
make install
```
-Then the R package
-------------------
+Then, you can install the R package and its dependencies from the git
+checkout:
-### From github (defaults to master branch)
-
-``` r
-library("devtools")
-devtools::install_github("apache/arrow/r")
+``` shell
+cd ../../r
+R -e 'install.packages("remotes"); remotes::install_deps()'
+R CMD INSTALL .
```
-Example
--------
+If the package fails to install/load with an error like this:
+
+ ** testing if installed package can be loaded from temporary location
+ Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
+ unable to load shared object '/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so':
+ dlopen(/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: @rpath/libarrow.14.dylib
+
+try setting the environment variable `LD_LIBRARY_PATH` (or
+`DYLD_LIBRARY_PATH` on macOS) to wherever Arrow C++ was put in `make
+install`, e.g. `export LD_LIBRARY_PATH=/usr/local/lib`, and retry
+installing the R package.
+
+For any other build/configuration challenges, see the [C++ developer
+guide](https://arrow.apache.org/docs/developers/cpp.html#building).
+
+## Example
``` r
library(arrow)
@@ -55,20 +97,7 @@ library(arrow)
#>
#> array, table
-(tib <- tibble::tibble(x = 1:10, y = rnorm(10)))
-#> # A tibble: 10 x 2
-#> x y
-#> <int> <dbl>
-#> 1 1 -1.59
-#> 2 2 1.31
-#> 3 3 -0.513
-#> 4 4 2.64
-#> 5 5 0.389
-#> 6 6 0.660
-#> 7 7 1.66
-#> 8 8 -0.120
-#> 9 9 2.43
-#> 10 10 1.53
+tib <- tibble::tibble(x = 1:10, y = rnorm(10))
tab <- table(tib)
tab$schema
#> arrow::Schema
@@ -78,37 +107,39 @@ tab
#> arrow::Table
as_tibble(tab)
#> # A tibble: 10 x 2
-#> x y
-#> <int> <dbl>
-#> 1 1 -1.59
-#> 2 2 1.31
-#> 3 3 -0.513
-#> 4 4 2.64
-#> 5 5 0.389
-#> 6 6 0.660
-#> 7 7 1.66
-#> 8 8 -0.120
-#> 9 9 2.43
-#> 10 10 1.53
+#> x y
+#> <int> <dbl>
+#> 1 1 -1.70
+#> 2 2 0.764
+#> 3 3 -0.00735
+#> 4 4 -0.192
+#> 5 5 0.513
+#> 6 6 -1.81
+#> 7 7 -0.371
+#> 8 8 -1.40
+#> 9 9 -0.0599
+#> 10 10 -0.359
```
-Developing
-----------
+## Developing
-The arrow package is using devtools for package related manipulations. To install:
+It is recommended to use the `devtools` package to help with some
+package-related manipulations. You’ll need a few more R packages, which
+you can install using
``` r
-library.packages("devtools")
+install.packages("devtools")
+devtools::install_dev_deps()
```
-Within a local clone, one can test changes by running the test suite.
-
-### Running the test suite
+### Useful functions
``` r
-library("devtools")
-devtools::install_dev_deps()
-devtools::test()
+devtools::load_all() # Load the dev package
+devtools::test(filter="^regexp$") # Run the test suite, optionally filtering file names
+devtools::document() # Update roxygen documentation
+rmarkdown::render("README.Rmd") # To rebuild README.md
+devtools::check() # All package checks; see also below
```
### Full package validation