Posted to mapreduce-user@hadoop.apache.org by Geoffry Roberts <ge...@gmail.com> on 2009/09/15 17:05:31 UTC

Reading Files

All,

I have an issue with common file access from within a map reduce job.  I have
tried to do this two ways and wind up with either a FileNotFoundException or
an EOFException.

1.  I copy the file into the hadoop hdfs using the -copyFromLocal utility.
I then attempt the following:

JobConf conf = new JobConf(...);
Path path = new Path("mypathisindeedcorrect");
FileSystem fs = FileSystem.get(conf);
FSDataInputStream input = fs.open(path);
String s = input.readUTF();

This throws a FileNotFoundException.  The path is quite correct.  The
following command:

$ bin/hadoop fs -lsr /

shows the file in its proper place, but:

fs.listStatus(path);

shows only the dfs/ and the mapred/ directories.
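For what it's worth, dfs/ and mapred/ look suspiciously like the contents of hadoop.tmp.dir on the local disk, which makes me wonder whether FileSystem.get(conf) is resolving to the local filesystem instead of HDFS. My hedged guess is that the job's configuration isn't picking up fs.default.name; in core-site.xml that setting would look something like this (the host and port below are placeholders, not my actual values):

```xml
<!-- core-site.xml: fs.default.name tells FileSystem.get() which
     filesystem is the default; without it, file:/// is used.
     Host and port here are placeholders. -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode-host:9000</value>
</property>
```

If that key is absent from the configuration the job sees, FileSystem.get(conf) would hand back a LocalFileSystem, which would explain both the FileNotFoundException and the odd listStatus() output.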

2. I copy the file to the same location (as per the path) but this time its
in my local file system.  I try the same thing except I use
FileSystem.getLocal(...  as follows:

JobConf conf = new JobConf(...);
Path path = new Path("mypathisindeedcorrect");
FileSystem fs = FileSystem.getLocal(conf);
FSDataInputStream input = fs.open(path);
String s = input.readUTF();

This throws an EOFException.

It's as if the first character encountered is the EOF, but the
ContentSummary as in:

ContentSummary cs = fs.getContentSummary(path);

shows the file as having the proper length.
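I suspect the EOFException has nothing to do with the file's length or encoding: DataInput.readUTF() expects the framing produced by writeUTF() -- a 2-byte length prefix followed by modified UTF-8 -- not a plain UTF-8 text file. On a plain file, the first two characters get interpreted as a length, readUTF() tries to read that many bytes, and runs off the end of the stream. A self-contained illustration (plain java.io, no Hadoop involved; the string is arbitrary):

```java
import java.io.*;

public class ReadUtfDemo {
    /** Feed raw UTF-8 bytes (no writeUTF framing) to readUTF();
        returns true if it throws EOFException. */
    static boolean readUtfFailsOnPlainText(String text) throws IOException {
        byte[] raw = text.getBytes("UTF-8");
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        try {
            in.readUTF();  // first two bytes 'h','e' are read as length 0x6865 = 26725
            return false;
        } catch (EOFException e) {
            return true;   // only 9 bytes remain, so readFully hits end of stream
        }
    }

    /** Round-trip through writeUTF/readUTF, which does carry the prefix. */
    static String roundTrip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).writeUTF(text);
        return new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())).readUTF();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readUtfFailsOnPlainText("hello world")); // true
        System.out.println(roundTrip("hello world"));               // hello world
    }
}
```

That would also explain why getContentSummary() reports the correct length: the file is fine, it just isn't in the format readUTF() wants.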

I have taken pains in both cases to ensure the file has been saved in UTF-8
format.
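If readUTF() is indeed the wrong tool here, the usual way to read a plain UTF-8 text file is an InputStreamReader with the charset named explicitly. The sketch below uses a local temp file so it is self-contained; against HDFS I assume one would wrap the FSDataInputStream from fs.open(path) in the same way:

```java
import java.io.*;

public class Utf8ReadSketch {
    /** Read the first line of a plain UTF-8 text file;
        no writeUTF framing assumed. */
    static String firstLine(File file) throws IOException {
        BufferedReader r = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), "UTF-8"));
        try {
            return r.readLine();
        } finally {
            r.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a throwaway UTF-8 file (contents are arbitrary), then read it back.
        File tmp = File.createTempFile("utf8demo", ".txt");
        tmp.deleteOnExit();
        Writer w = new OutputStreamWriter(new FileOutputStream(tmp), "UTF-8");
        w.write("héllo wörld\n");
        w.close();
        System.out.println(firstLine(tmp)); // héllo wörld
    }
}
```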

What gives?