Posted to user@arrow.apache.org by Colin McLean <cm...@staffmail.ed.ac.uk> on 2020/10/12 08:49:41 UTC
[R and C++] passing arrow::Arrow from R to C++ for reading and
writing?
Dear Arrow users,
I was wondering if anyone can help me understand how I can create an
arrow::Array object in R, then pass it into C++ (using the Rcpp
library) for both reading and writing, similar to what is done with
the R bigmemory
(https://privefl.github.io/blog/Tip-Optimize-your-Rcpp-loops/) or
bigstatsr packages.
Kindest Regards,
Colin Mclean.
R script:
library(Rcpp)
library(arrow)
## compile c++ code
Sys.setenv("PKG_CXXFLAGS" = "-larrow")
sourceCpp("utils.cpp")
N = 10
X = arrow::Array$create(rep(0,N*N))
test( X$pointer() )
utils.cpp:
// define headers
// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::depends(arrow)]]
#include <arrow/api.h>
#include <arrow/array.h>
#include <arrow/array/array_base.h>
#include <Rcpp.h>
#include <stdio.h>
#include <iostream>
#include <string>
using namespace Rcpp;
using namespace std;
// [[Rcpp::export]]
void test( XPtr<arrow::Array> aAPtr ){
  cout << "read & write arrow::Array in test " << endl;
  cout << aAPtr << endl;
}
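For illustration, here is a minimal stand-alone C++ sketch of what reading a float64 Arrow array involves. It assumes the standard Arrow columnar layout: a contiguous buffer of values plus an optional validity bitmap with one bit per slot, least-significant bit first. The `Float64View` struct and the `sum_valid` and `count_nulls` helpers are hypothetical names and are not part of the Arrow C++ API. With the real library, the corresponding step would roughly be casting the `arrow::Array` to `arrow::DoubleArray` and calling `Value(i)` and `IsNull(i)`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a float64 Arrow array (NOT the Arrow C++ API).
struct Float64View {
    const double* values;          // contiguous values buffer
    const std::uint8_t* validity;  // validity bitmap; nullptr means "no nulls"
    std::size_t length;            // number of slots

    // Bit i of the bitmap (least-significant bit first) is 1 when slot i is valid.
    bool is_valid(std::size_t i) const {
        assert(i < length);
        if (validity == nullptr) return true;
        return ((validity[i / 8] >> (i % 8)) & 1u) != 0;
    }
};

// Sum every non-null slot.
double sum_valid(const Float64View& a) {
    double total = 0.0;
    for (std::size_t i = 0; i < a.length; ++i) {
        if (a.is_valid(i)) total += a.values[i];
    }
    return total;
}

// Count the slots whose validity bit is 0.
std::size_t count_nulls(const Float64View& a) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < a.length; ++i) {
        if (!a.is_valid(i)) ++n;
    }
    return n;
}
```

The point of the sketch is that printing the pointer only shows an address; reading the data means walking the values buffer while consulting the validity bitmap for each slot.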
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Re: [R and C++] passing arrow::Arrow from R to C++ for reading and
writing?
Posted by Colin McLean <cm...@staffmail.ed.ac.uk>.
Hi Neal,
Thanks for your reply. I've attached below how I installed the arrow
R package on my system - first downloading and installing the C/C++
code, then installing the R package contained in this [arrow.git]
download.
refs/points:
1) https://github.com/apache/arrow/tree/master/r
2) https://arrow.apache.org/docs/developers/cpp/building.html
3) requires R version 3.6 or higher.
[1]:
git clone https://github.com/apache/arrow.git
cd arrow/cpp
mkdir release
cd release
[2]:
sudo cmake ../ -DARROW_COMPUTE=ON -DARROW_CSV=ON -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON -DARROW_JEMALLOC=ON -DARROW_JSON=ON \
  -DARROW_PARQUET=ON -DARROW_BUILD_TYPE=release -DARROW_WITH_BROTLI=ON \
  -DARROW_WITH_BZ2=ON -DARROW_WITH_LZ4=ON -DARROW_WITH_ZLIB=ON \
  -DARROW_WITH_ZSTD=ON -DARROW_EXTRA_ERROR_CONTEXT=ON -DARROW_PLASMA=ON
[3]:
sudo make install
[4]:
#then run this to pick-up the new libs in /usr/local/lib
sudo ldconfig
[5]:
# and make sure LD_LIBRARY_PATH and R_LD_LIBRARY_PATH contain /usr/local/lib
echo $LD_LIBRARY_PATH
echo $R_LD_LIBRARY_PATH
[6]:
#then try installing the R arrow package
cd ../../r
R -e 'install.packages(c("devtools", "roxygen2", "pkgdown", "covr"));
devtools::install_dev_deps()'
R CMD INSTALL .
#------------------------
I believe the code I wrote works... in the sense that the
value/address of the pointer passed is printed to screen - I also
tested this by modifying the code to include the libraries suggested
in your last email:
## compile c++ code
Sys.setenv("PKG_CXXFLAGS" = "-larrow_bundled_dependencies -larrow_dataset -lparquet -larrow")
Running the R code, 'arrowTest.R', then generates:
> source('arrowTest.R')
Attaching package: ‘arrow’
The following object is masked from ‘package:utils’:
timestamp
in test
0x5652319be670
>
I guess my question is: is what I'm doing correct? By this I mean,
from the code I have written, is it possible (and if so, how) to print
the values of the arrow::Array, called 'aAPtr', that I've passed to
the function 'test', and similarly to write values to aAPtr?
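On the writing side, one thing to keep in mind is that Arrow arrays are immutable, so "writing" normally means building a new array rather than changing slots in place. The stand-alone C++ sketch below models that pattern under the same layout assumption (values buffer plus validity bitmap, least-significant bit first). `Float64Column`, `Float64Builder`, and `scaled` are hypothetical names and are not part of the Arrow C++ API; in the real library the analogous class is `arrow::DoubleBuilder` with `Append`, `AppendNull`, and `Finish`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical finished column (NOT the Arrow C++ API); handed out as const.
struct Float64Column {
    std::vector<double> values;
    std::vector<std::uint8_t> validity;  // one bit per slot, LSB first
    std::size_t length = 0;
    std::size_t null_count = 0;
};

// Hypothetical append-only builder that produces an immutable column.
class Float64Builder {
public:
    void append(double v) { push(v, true); }
    void append_null() { push(0.0, false); }

    // Hand over the accumulated data and reset the builder for reuse.
    std::shared_ptr<const Float64Column> finish() {
        auto out = std::make_shared<const Float64Column>(std::move(col_));
        col_ = Float64Column{};
        return out;
    }

private:
    void push(double v, bool valid) {
        if (col_.length % 8 == 0) col_.validity.push_back(0);  // start a new bitmap byte
        if (valid) {
            col_.validity.back() |= static_cast<std::uint8_t>(1u << (col_.length % 8));
        } else {
            ++col_.null_count;
        }
        col_.values.push_back(v);
        ++col_.length;
    }

    Float64Column col_;
};

// "Modify" a column by building a new one; the input is left untouched.
std::shared_ptr<const Float64Column> scaled(const Float64Column& in, double factor) {
    assert(in.validity.size() * 8 >= in.length);
    Float64Builder b;
    for (std::size_t i = 0; i < in.length; ++i) {
        const bool valid = ((in.validity[i / 8] >> (i % 8)) & 1u) != 0;
        if (valid) b.append(in.values[i] * factor);
        else b.append_null();
    }
    return b.finish();
}
```

A caller would append values, call `finish()`, and pass the result to `scaled` to obtain a second column, which differs from the bigmemory style of updating a shared matrix in place.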
Thanks for your time to read my email, and for your reply.
All the Best,
Colin Mclean.
Quoting Neal Richardson <ne...@gmail.com> on Mon, 12 Oct
2020 08:09:44 -0700:
> Hi Colin,
> Does the code you shared run? If not, how does it fail?
>
> One guess is that you're probably getting undefined symbols errors because
> you need more than just -larrow. See
> https://github.com/apache/arrow/blob/master/r/configure#L35 for others you
> need, and depending on how you installed arrow, you likely also need
> -larrow_bundled_dependencies.
>
> Neal
Re: [R and C++] passing arrow::Arrow from R to C++ for reading and writing?
Posted by Neal Richardson <ne...@gmail.com>.
Hi Colin,
Does the code you shared run? If not, how does it fail?
One guess is that you're getting undefined symbol errors because
you need more than just -larrow. See
https://github.com/apache/arrow/blob/master/r/configure#L35 for the others you
need, and depending on how you installed arrow, you likely also need
-larrow_bundled_dependencies.
Neal