Posted to dev@crunch.apache.org by "Tom De Leu (JIRA)" <ji...@apache.org> on 2016/09/15 20:28:22 UTC
[jira] [Created] (CRUNCH-622) From.avroFile fails if path not on default filesystem
Tom De Leu created CRUNCH-622:
---------------------------------
Summary: From.avroFile fails if path not on default filesystem
Key: CRUNCH-622
URL: https://issues.apache.org/jira/browse/CRUNCH-622
Project: Crunch
Issue Type: Bug
Components: Core
Affects Versions: 0.14.0, 0.13.0
Reporter: Tom De Leu
Assignee: Josh Wills
{noformat}
MemPipeline.getInstance().read(From.avroFile(new Path("s3:///something")));
{noformat}
Fails with:
{noformat}
java.lang.IllegalArgumentException: Wrong FS: s3:/something, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:80)
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:519)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:737)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:514)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
at org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1424)
at org.apache.crunch.io.From.getSchemaFromPath(From.java:351)
at org.apache.crunch.io.From.avroFile(From.java:306)
at org.apache.crunch.io.From.avroFile(From.java:280)
{noformat}
I noticed this in the From class, method getSchemaFromPath:
{noformat}
FileSystem fs = FileSystem.get(conf);
{noformat}
Shouldn't that be changed to this?
{noformat}
FileSystem fs = path.getFileSystem(conf);
{noformat}
We ran into this in a use case where the file was at a valid path on S3 but the Configuration's default filesystem pointed to HDFS, which I believe should just work.
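To illustrate why the two calls behave differently, here is a minimal sketch using only java.net.URI. It is a simplified stand-in for the scheme check, not Hadoop or Crunch code: the class WrongFsSketch and its methods are hypothetical names, and the real FileSystem.checkPath also compares authorities and handles more cases.

```java
import java.net.URI;

// Simplified model of the scheme check behind "Wrong FS"; not Hadoop code.
// All names in this class are hypothetical and exist only for illustration.
class WrongFsSketch {

    // Models FileSystem.get(conf): the filesystem is chosen from the
    // configured default alone, and the path is never consulted.
    static String schemeFromDefault(URI defaultFs) {
        return defaultFs.getScheme();
    }

    // Models path.getFileSystem(conf): the path's own scheme wins, and the
    // default is used only when the path has no scheme.
    static String schemeFromPath(URI path, URI defaultFs) {
        String scheme = path.getScheme();
        return scheme != null ? scheme : defaultFs.getScheme();
    }

    // Models the check that rejects a path belonging to another filesystem.
    static void checkPath(String fsScheme, URI path) {
        String scheme = path.getScheme();
        if (scheme != null && !scheme.equalsIgnoreCase(fsScheme)) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsScheme);
        }
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("file:///");
        URI path = URI.create("s3:///something");

        // Resolving from the path yields a matching filesystem, so the check passes.
        checkPath(schemeFromPath(path, defaultFs), path);
        System.out.println("path-based lookup: ok");

        // Resolving from the default yields a mismatched filesystem, so the check fails.
        try {
            checkPath(schemeFromDefault(defaultFs), path);
        } catch (IllegalArgumentException e) {
            System.out.println("default-based lookup failed: " + e.getMessage());
        }
    }
}
```

In this model the default-based lookup fails for any path whose scheme differs from the default, which mirrors the situation in the stack trace above; the path-based lookup only falls back to the default when the path carries no scheme.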
After some searching, I also found CRUNCH-47, which seems related, but the patch there couldn't fix the From/At/To helpers because they were introduced later.