Posted to dev@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2016/08/01 18:35:20 UTC

[jira] [Commented] (ARROW-243) Use generic HDFS component instead of libhdfs

    [ https://issues.apache.org/jira/browse/ARROW-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402593#comment-15402593 ] 

Wes McKinney commented on ARROW-243:
------------------------------------

Yes -- part of the reason for using dlopen for libhdfs is that it typically ships inside a Hadoop distribution, so {{libhdfs.so}} will likely not be on LD_LIBRARY_PATH. Since we already have this code in place (https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/libhdfs_shim.cc#L500), also making libhdfs3.so a soft dependency (i.e. one that is not loaded when libarrow_io.so is loaded) isn't too much extra work.
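
To illustrate the soft-dependency idea, here is a minimal sketch of locating and loading a library at runtime with dlopen. It is not the code in libhdfs_shim.cc: the helper names {{CandidatePaths}} and {{TryLoadLibrary}} are hypothetical, and the {{$HADOOP_HOME/lib/native}} location is an assumption about a typical Hadoop layout.

```cpp
#include <dlfcn.h>

#include <cstdlib>
#include <string>
#include <vector>

// Build the list of places to look for a library (hypothetical helper).
// The HADOOP_HOME/lib/native layout is an assumption for illustration.
std::vector<std::string> CandidatePaths(const std::string& lib_name) {
  std::vector<std::string> paths;
  if (const char* hadoop_home = std::getenv("HADOOP_HOME")) {
    paths.push_back(std::string(hadoop_home) + "/lib/native/" + lib_name);
  }
  // Fall back to the dynamic loader's normal search (LD_LIBRARY_PATH etc.).
  paths.push_back(lib_name);
  return paths;
}

// Return a handle for the first candidate that loads, or nullptr if none do.
// Nothing is linked at build time, so a missing library is only a runtime
// condition that the caller can handle.
void* TryLoadLibrary(const std::vector<std::string>& candidates) {
  for (const std::string& path : candidates) {
    void* handle = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (handle != nullptr) return handle;
  }
  return nullptr;
}
```

With something like this, {{TryLoadLibrary(CandidatePaths("libhdfs3.so"))}} could be tried before or after the libhdfs.so lookup, and neither library has to be present when libarrow_io.so itself is loaded.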

This code would benefit from some refactoring to make switching between the libraries as seamless as possible. For example, there is currently only one possible set of function pointers available in the shim layer; these could be put in static shim structs.
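
One possible shape for such a refactoring is sketched below, under stated assumptions: the {{HdfsShim}} struct, {{LoadShim}} and {{SelectShim}} are hypothetical names, only two entry points are resolved, and the symbols are kept as raw {{void*}} addresses rather than the typed function pointers a real shim would store. It assumes both libraries export the same C API names ({{hdfsConnect}}, {{hdfsDisconnect}}).

```cpp
#include <dlfcn.h>

// Hypothetical shim: the library handle plus resolved entry points for one
// driver library.
struct HdfsShim {
  const char* library;    // file name passed to dlopen
  void* handle;           // nullptr until the library is loaded
  void* hdfs_connect;     // address of hdfsConnect
  void* hdfs_disconnect;  // address of hdfsDisconnect
};

// One static instance per supported library (assumed design).
static HdfsShim libhdfs_shim = {"libhdfs.so", nullptr, nullptr, nullptr};
static HdfsShim libhdfs3_shim = {"libhdfs3.so", nullptr, nullptr, nullptr};

// Load the library and resolve its symbols. The shim is left untouched
// unless everything succeeds.
bool LoadShim(HdfsShim* shim) {
  if (shim->handle != nullptr) return true;  // already loaded
  void* handle = dlopen(shim->library, RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr) return false;
  void* connect = dlsym(handle, "hdfsConnect");
  void* disconnect = dlsym(handle, "hdfsDisconnect");
  if (connect == nullptr || disconnect == nullptr) {
    dlclose(handle);
    return false;
  }
  shim->handle = handle;
  shim->hdfs_connect = connect;
  shim->hdfs_disconnect = disconnect;
  return true;
}

// Pick the preferred driver, falling back to the other one.
// Returns nullptr if neither library can be loaded.
HdfsShim* SelectShim(bool prefer_libhdfs3) {
  HdfsShim* first = prefer_libhdfs3 ? &libhdfs3_shim : &libhdfs_shim;
  HdfsShim* second = prefer_libhdfs3 ? &libhdfs_shim : &libhdfs3_shim;
  if (LoadShim(first)) return first;
  if (LoadShim(second)) return second;
  return nullptr;
}
```

Callers would then go through whichever shim {{SelectShim}} returns, so switching drivers would not touch the rest of the I/O code.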

> Use generic HDFS component instead of libhdfs
> ---------------------------------------------
>
>                 Key: ARROW-243
>                 URL: https://issues.apache.org/jira/browse/ARROW-243
>             Project: Apache Arrow
>          Issue Type: New Feature
>            Reporter: Ryan Lewis
>
> I would like to use, for example, libhdfs3 from Pivotal to read Apache Parquet files. Supporting this would be a small change to the HDFS layer of Apache Arrow.


