Posted to common-user@hadoop.apache.org by Boyu Zhang <bo...@gmail.com> on 2009/09/11 23:17:45 UTC
Hadoop Input File Directory
Dear all,
I have an input file hierarchy of depth 3, something like
/data/user/dir_0/file0, /data/user/dir_1/file0, /data/user/dir_2/file0. I
want to run a MapReduce job that processes all the files at the deepest
level. One way of doing so is to specify each input path explicitly:
/data/user/dir_0, /data/user/dir_1, /data/user/dir_2, but this becomes
infeasible as the hierarchy grows.
I tried to specify the input path as /data/user instead, but I got errors
like "cannot open filename /data/user/dir_0".
My question is: is there any way to process all the files while specifying
only the top-level directory as the input path?
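(For context on what I have tried: my understanding is that Hadoop's
FileInputFormat accepts glob expressions in input paths, so a pattern like
/data/user/*/* should expand to exactly the depth-2 files and avoid listing
each dir_N by hand. The class and method names below are my own illustration,
not Hadoop API; it is a plain-JDK sketch of the same glob semantics, where
'*' matches a single path component and does not cross '/' boundaries.)

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobDepthExample {
    // Mirrors the pattern one would hand to FileInputFormat.setInputPaths:
    // each '*' matches exactly one path component, so this selects entries
    // that sit exactly two levels below /data/user.
    static final PathMatcher DEPTH2 =
            FileSystems.getDefault().getPathMatcher("glob:/data/user/*/*");

    static boolean matchesDepth2(String path) {
        return DEPTH2.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // A file inside one of the dir_N directories matches the glob...
        System.out.println(matchesDepth2("/data/user/dir_0/file0"));
        // ...but a file directly under /data/user is too shallow,
        // and anything one level deeper is too deep.
        System.out.println(matchesDepth2("/data/user/file0"));
        System.out.println(matchesDepth2("/data/user/dir_0/sub/file0"));
    }
}
```

If that is the right idea, would passing such a glob as the job's input path
be the recommended way to handle a growing hierarchy?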
Thanks a lot!
Boyu Zhang
University of Delaware