Posted to common-user@hadoop.apache.org by Raimon Bosch <ra...@gmail.com> on 2011/10/10 17:03:05 UTC

How to iterate over an HDFS folder with Hadoop

Hi,

I'm wondering how I can browse an HDFS folder using the classes
in the org.apache.hadoop.fs package. The operation I'm looking for is the
equivalent of 'hadoop dfs -ls'.

The standard file system equivalent would be:

File f = new File(outputPath);
if (f.isDirectory()) {
  String[] files = f.list();
  for (String file : files) {
    // Do your logic
  }
}

Thanks in advance,
Raimon Bosch.

Re: How to iterate over an HDFS folder with Hadoop

Posted by Uma Maheswara Rao G 72686 <ma...@huawei.com>.
Yes, FileSystem's listStatus, which returns FileStatus objects, would be the
equivalent of list. FileStatus has the APIs isDir and getPath; these two should
cover your further usage. :-)

I think one small difference is that listStatus returns the entries in sorted
order, while File.list() makes no ordering guarantee.
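
For example, here is a minimal, untested sketch of a recursive listing built
on those two APIs. It assumes fs is an already-opened FileSystem and root is
an existing Path (both are placeholder names):

void walk(FileSystem fs, Path root) throws IOException {
  for (FileStatus status : fs.listStatus(root)) {
    if (status.isDir()) {
      walk(fs, status.getPath()); // descend into sub-folders
    } else {
      System.out.println(status.getPath()); // a file: print it, like -ls
    }
  }
}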

Regards,
Uma
----- Original Message -----
From: John Conwell <jo...@iamjohn.me>
Date: Monday, October 10, 2011 8:40 pm
Subject: Re: How to iterate over an HDFS folder with Hadoop
To: common-user@hadoop.apache.org

> FileStatus[] files = fs.listStatus(new Path(path));
>
> for (FileStatus fileStatus : files) {
>   // ...do stuff here
> }

Re: How to iterate over an HDFS folder with Hadoop

Posted by Raimon Bosch <ra...@gmail.com>.
Thanks John!

Here is the complete solution:


import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration jc = new Configuration();
List<String> files_in_hdfs = new ArrayList<String>();

// FileSystem.get() returns the file system named in the configuration
FileSystem fs = FileSystem.get(jc);

// listStatus() is the HDFS counterpart of File.list()
FileStatus[] file_status = fs.listStatus(new Path(outputPath));
for (FileStatus fileStatus : file_status) {
  files_in_hdfs.add(fileStatus.getPath().getName());
}

String[] files = files_in_hdfs.toArray(new String[0]);
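
If you also need shell-style patterns (the way 'hadoop dfs -ls' takes a glob),
FileSystem#globStatus should work the same way. An untested sketch, reusing fs
and outputPath from above ("part-*" is just an example pattern):

FileStatus[] parts = fs.globStatus(new Path(outputPath, "part-*"));
for (FileStatus part : parts) {
  System.out.println(part.getPath());
}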

2011/10/10 John Conwell <jo...@iamjohn.me>

> FileStatus[] files = fs.listStatus(new Path(path));
>
> for (FileStatus fileStatus : files) {
>   // ...do stuff here
> }

Re: How to iterate over an HDFS folder with Hadoop

Posted by John Conwell <jo...@iamjohn.me>.
FileStatus[] files = fs.listStatus(new Path(path));

for (FileStatus fileStatus : files) {
  // ...do stuff here
}
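
(fs above is assumed to have been opened beforehand; a minimal setup sketch:)

Configuration conf = new Configuration(); // reads core-site.xml, etc.
FileSystem fs = FileSystem.get(conf);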

-- 

Thanks,
John C