Posted to common-user@hadoop.apache.org by Raimon Bosch <ra...@gmail.com> on 2011/10/10 17:03:05 UTC
How to iterate over a hdfs folder with hadoop
Hi,
I'm wondering how I can browse an HDFS folder using the classes
in the org.apache.hadoop.fs package. The operation I'm looking for is
the equivalent of 'hadoop dfs -ls'.
The standard file system equivalent would be:
File f = new File(outputPath);
if (f.isDirectory()) {
    String[] files = f.list();
    for (String file : files) {
        // Do your logic
    }
}
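For reference, here is a local-filesystem sketch of the same listing using java.nio.file instead of java.io.File (class and method names are illustrative; DirectoryStream also reports errors as exceptions rather than the null return that File.list() can produce):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ListLocalDir {
    // Collects the entry names of a local directory, the java.nio.file way.
    static List<String> listNames(Path dir) throws IOException {
        List<String> names = new ArrayList<String>();
        if (Files.isDirectory(dir)) {
            // Throws IOException on failure instead of returning null like File.list()
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
                for (Path entry : stream) {
                    names.add(entry.getFileName().toString());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo");
        Files.createFile(dir.resolve("part-00000"));
        Files.createFile(dir.resolve("part-00001"));
        System.out.println(listNames(dir).size()); // 2
    }
}
```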
Thanks in advance,
Raimon Bosch.
Re: How to iterate over a hdfs folder with hadoop
Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
Yes, FileStatus would be the equivalent for list.
FileStatus has the methods isDir and getPath. These two should cover your further usage. :-)
I think one small difference is that listStatus will return the entries in sorted order.
Regards,
Uma
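By contrast, java.io.File.list() makes no ordering guarantee, so local code that wants the same deterministic order has to sort explicitly. A local-filesystem sketch (not Hadoop code; class and method names are illustrative):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class SortedListing {
    // Returns directory entry names in sorted order, since
    // java.io.File.list() does not guarantee any particular order.
    static String[] sortedNames(File dir) {
        String[] names = dir.list();
        if (names == null) {
            return new String[0]; // not a directory, or an I/O error occurred
        }
        Arrays.sort(names); // lexicographic order
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sorted");
        Files.createFile(dir.resolve("b"));
        Files.createFile(dir.resolve("a"));
        Files.createFile(dir.resolve("c"));
        System.out.println(String.join(",", sortedNames(dir.toFile()))); // a,b,c
    }
}
```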
Re: How to iterate over a hdfs folder with hadoop
Posted by Raimon Bosch <ra...@gmail.com>.
Thanks John!
Here is the complete solution:

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration jc = new Configuration();
FileSystem fs = FileSystem.get(jc);

List<String> files_in_hdfs = new ArrayList<String>();
FileStatus[] file_status = fs.listStatus(new Path(outputPath));
for (FileStatus fileStatus : file_status) {
    files_in_hdfs.add(fileStatus.getPath().getName());
}
String[] files = files_in_hdfs.toArray(new String[0]);
Re: How to iterate over a hdfs folder with hadoop
Posted by John Conwell <jo...@iamjohn.me>.
FileStatus[] files = fs.listStatus(new Path(path));

for (FileStatus fileStatus : files) {
    // ...do stuff here
}
--
Thanks,
John C