Posted to mapreduce-user@hadoop.apache.org by Geoffry Roberts <ge...@gmail.com> on 2009/09/15 17:05:31 UTC
Reading Files
All,
I have an issue with regard to common file access from within a map reduce job. I have
tried to do this two ways and wind up with either a FileNotFoundException or
an EOFException.
1. I copy the file into the hadoop hdfs using the -copyFromLocal utility.
I then attempt the following:
JobConf conf = new JobConf(...);
Path path = new Path("mypathisindeedcorrect");
FileSystem fs = FileSystem.get(conf);
FSDataInputStream input = fs.open(path);
String s = input.readUTF();
This throws a FileNotFoundException. The path is quite correct. The
following command:
$ bin/hadoop fs -lsr /
shows the file in its proper place, but:
fs.listStatus(path);
shows only the dfs/ and the mapred/ directories.
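Seeing only dfs/ and mapred/ makes me wonder (this is just a guess on my part) whether the JobConf is not picking up a site configuration, in which case FileSystem.get(conf) would silently fall back to the local filesystem and list the local Hadoop working directories instead of HDFS. A typical core-site.xml entry for a 2009-era release (hadoop-site.xml on older ones; the host and port here are hypothetical) would be:

```xml
<!-- core-site.xml: point FileSystem.get() at HDFS instead of file:// -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- hypothetical namenode host/port; substitute your own -->
    <value>hdfs://namenode-host:9000</value>
  </property>
</configuration>
```

With that loaded, FileSystem.get(conf).getUri() should report an hdfs:// scheme rather than file://.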
2. I copy the file to the same location (as per the path), but this time it's
in my local file system. I try the same thing, except I use
FileSystem.getLocal(...) as follows:
JobConf conf = new JobConf(...);
Path path = new Path("mypathisindeedcorrect");
FileSystem fs = FileSystem.getLocal(conf);
FSDataInputStream input = fs.open(path);
String s = input.readUTF();
This throws an EOFException.
It's as if the first character encountered is the EOF, but the
ContentSummary as in:
ContentSummary cs = fs.getContentSummary(path);
shows the file as having the proper length.
I have taken pains in both cases to ensure the file has been saved in UTF-8
format.
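One thing I'm starting to suspect (my own guess, not confirmed): readUTF() expects the framing written by writeUTF() - a two-byte length header followed by modified UTF-8 - not a plain UTF-8 text file, so even a correctly saved file may not read back this way. A minimal, stdlib-only sketch of the difference:

```java
import java.io.*;

public class ReadUtfDemo {
    public static void main(String[] args) throws IOException {
        // A plain UTF-8 text file has no length header.
        byte[] plain = "hello".getBytes("UTF-8");

        // readUTF() misinterprets the first two bytes ('h','e') as a
        // length of 0x6865 = 26725 bytes, far more than remain.
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(plain));
        try {
            in.readUTF();
        } catch (EOFException e) {
            System.out.println("EOFException: header claims more bytes than exist");
        }

        // Data written with writeUTF() round-trips correctly.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new DataOutputStream(bos).writeUTF("hello");
        DataInputStream in2 = new DataInputStream(new ByteArrayInputStream(bos.toByteArray()));
        System.out.println("round-trip: " + in2.readUTF());
    }
}
```

If that's the culprit, reading the file through a BufferedReader wrapped around an InputStreamReader with the UTF-8 charset would sidestep it.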
What gives?