Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/02/27 13:38:00 UTC
[jira] [Comment Edited] (ARROW-11735) [R] Allow parquet to be an optional component like S3
[ https://issues.apache.org/jira/browse/ARROW-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291960#comment-17291960 ]
Ian Cook edited comment on ARROW-11735 at 2/27/21, 1:37 PM:
------------------------------------------------------------
One problem here: pkg-config reports that the Arrow Dataset module depends on Parquet:
{code:java}
$ pkg-config --libs arrow-dataset
-L/usr/local/lib -larrow_dataset -lparquet -larrow
{code}
Does it actually? From {{CMakeLists.txt}}, it doesn't look like it:
[https://github.com/apache/arrow/blob/master/cpp/CMakeLists.txt#L339-L342]
But there is a {{find_package(Parquet)}} call in {{FindArrowDataset.cmake}}:
[https://github.com/apache/arrow/blob/master/cpp/cmake_modules/FindArrowDataset.cmake#L47]
{{arrow-dataset.pc.in}} lists {{parquet}} in its {{Requires:}} field:
[https://github.com/apache/arrow/blob/master/cpp/src/arrow/dataset/arrow-dataset.pc.in#L24]
{{c_glib/configure.ac}} puts {{-lparquet}} in {{ARROW_DATASET_LIBS}}:
[https://github.com/apache/arrow/blob/master/c_glib/configure.ac#L225]
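To check a given installation for the same behaviour, something like the following might help. It is only a sketch, assuming a POSIX shell; {{links_parquet}} is a hypothetical helper (not part of Arrow or pkg-config), and the flags string is simply the pkg-config output quoted above.

```shell
# Hypothetical helper: succeeds if the given linker flags include -lparquet
# as a whole word (so e.g. -lparquet_extra would not match).
links_parquet() {
  case " $1 " in
    *" -lparquet "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Flags copied from the pkg-config output quoted above.
flags="-L/usr/local/lib -larrow_dataset -lparquet -larrow"

if links_parquet "$flags"; then
  echo "arrow-dataset links against parquet"
else
  echo "arrow-dataset does not link against parquet"
fi
```

On a machine with Arrow installed, the hard-coded string could be replaced with {{"$(pkg-config --libs arrow-dataset)"}}, and {{pkg-config --print-requires arrow-dataset}} should show what the {{Requires:}} field contributes.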
> [R] Allow parquet to be an optional component like S3
> -----------------------------------------------------
>
> Key: ARROW-11735
> URL: https://issues.apache.org/jira/browse/ARROW-11735
> Project: Apache Arrow
> Issue Type: Sub-task
> Components: R
> Reporter: Neal Richardson
> Assignee: Ian Cook
> Priority: Major
> Fix For: 4.0.0
>
>
> Parquet requires Thrift, and it seems that Thrift (at least as of version 0.12) does not compile on Solaris:
> {code}
> /export/home/X1svPYR/Rtemp/RtmptF1MlN/file75097d284891/thrift_ep-prefix/src/thrift_ep/lib/cpp/src/thrift/transport/THttpServer.cpp: In member function virtual void apache::thrift::transport::THttpServer::parseHeader(char*):
> /export/home/X1svPYR/Rtemp/RtmptF1MlN/file75097d284891/thrift_ep-prefix/src/thrift_ep/lib/cpp/src/thrift/transport/THttpServer.cpp:50:74: error: strcasestr was not declared in this scope
> #define THRIFT_strcasestr(haystack, needle) strcasestr(haystack, needle)
> ^
> /export/home/X1svPYR/Rtemp/RtmptF1MlN/file75097d284891/thrift_ep-prefix/src/thrift_ep/lib/cpp/src/thrift/transport/THttpServer.cpp:62:9: note: in expansion of macro THRIFT_strcasestr
> if (THRIFT_strcasestr(value, "chunked") != NULL) {
> {code}
> (along with some boost endian header deprecation warnings)
> We could debug/patch that, or we could make Parquet an optional feature in the R bindings. That might have some value anyway, since it would allow building a lighter, minimal R package if that were helpful.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)