Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/02/27 13:38:00 UTC

[jira] [Comment Edited] (ARROW-11735) [R] Allow parquet to be an optional component like S3

    [ https://issues.apache.org/jira/browse/ARROW-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291960#comment-17291960 ] 

Ian Cook edited comment on ARROW-11735 at 2/27/21, 1:37 PM:
------------------------------------------------------------

One problem here: pkg-config reports that the Arrow Dataset module depends on Parquet:
{code:java}
$ pkg-config --libs arrow-dataset
-L/usr/local/lib -larrow_dataset -lparquet -larrow
{code}
Does it actually? From {{CMakeLists.txt}}, it doesn't look like it:

[https://github.com/apache/arrow/blob/master/cpp/CMakeLists.txt#L339-L342]

But there is a {{find_package(Parquet)}} call in {{FindArrowDataset.cmake}}:

[https://github.com/apache/arrow/blob/master/cpp/cmake_modules/FindArrowDataset.cmake#L47]

{{arrow-dataset.pc.in}} lists {{parquet}} in its {{Requires:}} field:

[https://github.com/apache/arrow/blob/master/cpp/src/arrow/dataset/arrow-dataset.pc.in#L24]

{{c_glib/configure.ac}} puts {{-lparquet}} in {{ARROW_DATASET_LIBS}}:

[https://github.com/apache/arrow/blob/master/c_glib/configure.ac#L225]
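
To illustrate why a {{Requires:}} entry is enough to make {{-lparquet}} show up, here is a minimal sketch that reads the {{Requires:}} line out of a pkg-config file. The helper name {{pc_requires}} and the sample file contents are assumptions made for illustration; they are not copied from an installed {{arrow-dataset.pc}}.

```shell
#!/bin/sh
# Print the modules named on the "Requires:" line of a pkg-config file.
# (pc_requires is a hypothetical helper, not part of pkg-config.)
pc_requires() {
  sed -n 's/^Requires:[[:space:]]*//p' "$1"
}

# Illustrative stand-in for an installed arrow-dataset.pc (contents assumed).
pc_file=$(mktemp)
trap 'rm -f "$pc_file"' EXIT
cat > "$pc_file" <<'EOF'
Name: Apache Arrow Dataset
Requires: arrow parquet
Libs: -L${libdir} -larrow_dataset
EOF

# Report whether parquet is pulled in through Requires rather than Libs.
if pc_requires "$pc_file" | grep -qw parquet; then
  echo "parquet comes from Requires"
else
  echo "parquet not in Requires"
fi
```

On a machine with Arrow installed, {{pkg-config --print-requires arrow-dataset}} should report the same information without parsing the file by hand.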


> [R] Allow parquet to be an optional component like S3
> -----------------------------------------------------
>
>                 Key: ARROW-11735
>                 URL: https://issues.apache.org/jira/browse/ARROW-11735
>             Project: Apache Arrow
>          Issue Type: Sub-task
>          Components: R
>            Reporter: Neal Richardson
>            Assignee: Ian Cook
>            Priority: Major
>             Fix For: 4.0.0
>
>
> Parquet requires Thrift, and it seems that Thrift (at least as of version 0.12) does not compile on Solaris:
> {code}
> /export/home/X1svPYR/Rtemp/RtmptF1MlN/file75097d284891/thrift_ep-prefix/src/thrift_ep/lib/cpp/src/thrift/transport/THttpServer.cpp: In member function virtual void apache::thrift::transport::THttpServer::parseHeader(char*):
> /export/home/X1svPYR/Rtemp/RtmptF1MlN/file75097d284891/thrift_ep-prefix/src/thrift_ep/lib/cpp/src/thrift/transport/THttpServer.cpp:50:74: error: strcasestr was not declared in this scope
>    #define THRIFT_strcasestr(haystack, needle) strcasestr(haystack, needle)
>                                                                           ^
> /export/home/X1svPYR/Rtemp/RtmptF1MlN/file75097d284891/thrift_ep-prefix/src/thrift_ep/lib/cpp/src/thrift/transport/THttpServer.cpp:62:9: note: in expansion of macro THRIFT_strcasestr
>      if (THRIFT_strcasestr(value, "chunked") != NULL) {
> {code}
> (along with some boost endian header deprecation warnings)
> We could debug/patch that, or we could also make Parquet an optional feature in the R bindings. That might have some value anyway so that one could build a lighter/minimal R package, if that were helpful.
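
One way the optional-feature idea could look in a configure-style script is sketched below. It assumes the Arrow C++ build records enabled features in a CMake options file with lines such as {{set(ARROW_PARQUET "ON")}}; the helper {{arrow_has_parquet}}, the sample file contents, and the macro name {{ARROW_R_WITH_PARQUET}} are hypothetical names chosen for this sketch, not the R package's actual configure logic.

```shell
#!/bin/sh
# Succeed if the given options file records ARROW_PARQUET as ON.
# (arrow_has_parquet is a hypothetical helper for this sketch.)
arrow_has_parquet() {
  grep -q 'set(ARROW_PARQUET "ON")' "$1"
}

# Hypothetical stand-in for the options file written by an Arrow C++ build.
opts=$(mktemp)
trap 'rm -f "$opts"' EXIT
printf 'set(ARROW_S3 "OFF")\nset(ARROW_PARQUET "ON")\n' > "$opts"

PKG_CFLAGS=""
if arrow_has_parquet "$opts"; then
  # ARROW_R_WITH_PARQUET is an assumed macro name, used only for illustration.
  PKG_CFLAGS="$PKG_CFLAGS -DARROW_R_WITH_PARQUET"
fi
echo "PKG_CFLAGS:$PKG_CFLAGS"
```

With a check like this, the Parquet bindings would be compiled only when the define is present, so a build without Thrift could still produce a working, lighter package.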


