You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@orc.apache.org by "Junegunn Choi (JIRA)" <ji...@apache.org> on 2016/11/21 02:39:58 UTC

[jira] [Created] (ORC-111) Make Java tool support DFS

Junegunn Choi created ORC-111:
---------------------------------

             Summary: Make Java tool support DFS
                 Key: ORC-111
                 URL: https://issues.apache.org/jira/browse/ORC-111
             Project: Orc
          Issue Type: Improvement
          Components: Java
            Reporter: Junegunn Choi


Java tool currently does not work with HDFS paths:

{noformat}
$ mvn clean package -DskipTests
$ export UBERJAR=tools/target/orc-tools-1.3.0-SNAPSHOT-uber.jar
$ java -jar $UBERJAR meta hdfs://nn-host/foobar.snappy.orc

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
        at org.apache.orc.tools.FileDump.getAllFilesInPath(FileDump.java:248)
        at org.apache.orc.tools.FileDump.main(FileDump.java:121)
        at org.apache.orc.tools.Driver.main(Driver.java:100)
{noformat}

This is because DistributedFileSystem is not registered as a subclass of FileSystem SPI.

{noformat}
$ unzip -p tools/target/orc-tools-1.3.0-SNAPSHOT-uber.jar META-INF/services/org.apache.hadoop.fs.FileSystem

# ...
org.apache.hadoop.fs.LocalFileSystem
org.apache.hadoop.fs.viewfs.ViewFileSystem
org.apache.hadoop.fs.ftp.FTPFileSystem
org.apache.hadoop.fs.HarFileSystem
{noformat}

We can fix this by simply adding containerDescriptorHandler to the assembly descriptor. Now the same command gives:

{noformat}
org.apache.hadoop.fs.LocalFileSystem
org.apache.hadoop.fs.viewfs.ViewFileSystem
org.apache.hadoop.fs.ftp.FTPFileSystem
org.apache.hadoop.fs.HarFileSystem
org.apache.hadoop.hdfs.DistributedFileSystem
org.apache.hadoop.hdfs.web.HftpFileSystem
org.apache.hadoop.hdfs.web.HsftpFileSystem
org.apache.hadoop.hdfs.web.WebHdfsFileSystem
org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
{noformat}

and we can use the uberjar with HDFS paths.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)