Posted to dev@hive.apache.org by Richards Peter <hb...@gmail.com> on 2014/07/31 10:32:19 UTC

Using apache commons-vfs to read files from hdfs

Hi,

I have a use case that requires reading files from HDFS or the local file
system, depending on a configuration parameter. I found that Apache
commons-vfs supports various file systems, and the latest developer release
also has an implementation for HDFS (though only read support is provided
currently). I find commons-vfs really useful for accessing text files on
HDFS and the local file system.

However, I am not able to access RCFiles, because the commons-vfs
FileContent interface exposes only java.io.InputStream and
java.io.OutputStream for creating readers and writers:
http://commons.apache.org/proper/commons-vfs/apidocs/index.html?org/apache/commons/vfs2/FileContent.html

Since the constructors of the RCFile.Reader and RCFile.Writer classes do
not accept an InputStream or OutputStream, I cannot access such files
using commons-vfs.

Would it be possible to add constructors to these classes that accept
FSDataInputStream and FSDataOutputStream as arguments?

Is there a better way to access files (of any format) on HDFS and the local
file system through a common API?

Thanks,
Richards.