Posted to common-user@hadoop.apache.org by Nick Cen <ce...@gmail.com> on 2009/06/16 07:18:54 UTC

Re: How to use DFS API to travel across the directory tree and retrieve content of a DFS file?

I think you can take a look at the following classes: FileSystem, Path, and
FileStatus, and the listStatus(Path path) method in FileSystem.
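
A minimal sketch of such a recursive listing, assuming the client can reach
the cluster; the class name and starting path below are just illustrative:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DfsWalker {

    // Recursively print every file and directory under the given path.
    static void walk(FileSystem fs, Path path) throws IOException {
        for (FileStatus status : fs.listStatus(path)) {
            System.out.println(status.getPath());
            if (status.isDir()) {            // isDirectory() in newer APIs
                walk(fs, status.getPath());  // descend into subdirectories
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration(); // reads *-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);
        walk(fs, new Path("/"));
        fs.close();
    }
}

FileStatus.isDir() tells you whether to descend further; for a plain file you
can stop and look at its path, length, and so on.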



2009/6/16 Wenrui Guo <we...@ericsson.com>

> Hi, all
>
>
> As I know, hadoop fs -ls / can list the files and directories under the
> root directory, so I am wondering how I could write a Java program to
> traverse the whole DFS directory structure.
>
> That is, suppose the directory structure at the moment looks like this:
>
> /
>  |
>  |
>  +----home
>          |
>          |
>         + anderson
>                 |
>                 |
>                + samples.dat
>
>
> Is it possible to write a Java program that starts from the / directory,
> lists subdirectories, and detects when it reaches a .dat file?
>
> Afterwards, how could I obtain the content of samples.dat? So far, I
> know the starting point is constructing a Configuration object; however,
> what information needs to be included in the Configuration object?
> Should I specify hadoop-default.xml and hadoop-site.xml inside it?
>
> I'd appreciate it if a simple sample program could be provided.
>
> BR/anderson
>
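
For the question above about obtaining the content of samples.dat: the same
FileSystem object can open the file and return a stream. A minimal sketch,
using the path from the example tree above; error handling and binary data
are not considered:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DfsCat {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Open the file and read it line by line; for binary data, read
        // the raw FSDataInputStream instead of wrapping it in a Reader.
        FSDataInputStream in = fs.open(new Path("/home/anderson/samples.dat"));
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }
        reader.close();
        fs.close();
    }
}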



-- 
http://daily.appspot.com/food/

Re: How to use DFS API to travel across the directory tree and retrieve content of a DFS file?

Posted by Nick Cen <ce...@gmail.com>.
I think you can take a look at the Configuration class.
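
A minimal sketch of configuring a standalone client, assuming the cluster's
site file has been copied to the client machine; the file path and NameNode
address below are placeholders, not values from this thread:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ClientConfigExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Option 1: load the cluster's site file copied to the client machine.
        conf.addResource(new Path("/path/to/hadoop-site.xml"));

        // Option 2: set the NameNode address directly (placeholder host/port).
        // conf.set("fs.default.name", "hdfs://namenode-host:9000");

        FileSystem fs = FileSystem.get(conf);
        System.out.println("Connected to: " + fs.getUri());
        fs.close();
    }
}

Either way, the client only needs to know how to reach the NameNode; it does
not have to be a node of the cluster itself.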

2009/6/17 Wenrui Guo <we...@ericsson.com>

> Hi, Nick
>
> I think the listStatus(Path) is really what I want.
>
> Meanwhile, I also asked how to set up the Configuration object when
> constructing the FileSystem object. As I know, in order to make Hadoop
> client programs run (like the ./hadoop fs -ls / command), the Hadoop
> configuration files, e.g. hadoop-default.xml and hadoop-site.xml, must be
> parsed to obtain information about the NameNode and DataNodes.
>
> So, if I'd like to run the directory traversal class as a standalone
> Java application on a machine other than the nodes within the Hadoop
> cluster, do I need to copy the Hadoop configuration files to the client
> side and load them at runtime?
>
> BR/anderson
>
> -----Original Message-----
> From: Nick Cen [mailto:cenyongh@gmail.com]
> Sent: Tuesday, June 16, 2009 1:19 PM
> To: core-user@hadoop.apache.org
> Subject: Re: How to use DFS API to travel across the directory tree and
> retrieve content of a DFS file?
>
> I think you can take a look at the following classes: FileSystem, Path, and
> FileStatus, and the listStatus(Path path) method in FileSystem.
>
>
>
> 2009/6/16 Wenrui Guo <we...@ericsson.com>
>
> > Hi, all
> >
> >
> > As I know, hadoop fs -ls / can list the files and directories under
> > the root directory, so I am wondering how I could write a Java program
> > to traverse the whole DFS directory structure.
> >
> > That is, suppose the directory structure at the moment looks like this:
> >
> > /
> >  |
> >  |
> >  +----home
> >          |
> >          |
> >         + anderson
> >                 |
> >                 |
> >                + samples.dat
> >
> >
> > Is it possible to write a Java program that starts from the /
> > directory, lists subdirectories, and detects when it reaches a .dat file?
> >
> > Afterwards, how could I obtain the content of samples.dat? So far, I
> > know the starting point is constructing a Configuration object; however,
> > what information needs to be included in the Configuration object?
> > Should I specify hadoop-default.xml and hadoop-site.xml inside it?
> >
> > I'd appreciate it if a simple sample program could be provided.
> >
> > BR/anderson
> >
>
>
>
> --
> http://daily.appspot.com/food/
>



-- 
http://daily.appspot.com/food/

RE: How to use DFS API to travel across the directory tree and retrieve content of a DFS file?

Posted by Wenrui Guo <we...@ericsson.com>.
Hi, Nick

I think the listStatus(Path) is really what I want.

Meanwhile, I also asked how to set up the Configuration object when
constructing the FileSystem object. As I know, in order to make Hadoop
client programs run (like the ./hadoop fs -ls / command), the Hadoop
configuration files, e.g. hadoop-default.xml and hadoop-site.xml, must be
parsed to obtain information about the NameNode and DataNodes.

So, if I'd like to run the directory traversal class as a standalone
Java application on a machine other than the nodes within the Hadoop
cluster, do I need to copy the Hadoop configuration files to the client
side and load them at runtime?

BR/anderson

-----Original Message-----
From: Nick Cen [mailto:cenyongh@gmail.com] 
Sent: Tuesday, June 16, 2009 1:19 PM
To: core-user@hadoop.apache.org
Subject: Re: How to use DFS API to travel across the directory tree and
retrieve content of a DFS file?

I think you can take a look at the following classes: FileSystem, Path, and
FileStatus, and the listStatus(Path path) method in FileSystem.



2009/6/16 Wenrui Guo <we...@ericsson.com>

> Hi, all
>
>
> As I know, hadoop fs -ls / can list the files and directories under the
> root directory, so I am wondering how I could write a Java program to
> traverse the whole DFS directory structure.
>
> That is, suppose the directory structure at the moment looks like this:
>
> /
>  |
>  |
>  +----home
>          |
>          |
>         + anderson
>                 |
>                 |
>                + samples.dat
>
>
> Is it possible to write a Java program that starts from the / directory,
> lists subdirectories, and detects when it reaches a .dat file?
>
> Afterwards, how could I obtain the content of samples.dat? So far, I
> know the starting point is constructing a Configuration object; however,
> what information needs to be included in the Configuration object?
> Should I specify hadoop-default.xml and hadoop-site.xml inside it?
>
> I'd appreciate it if a simple sample program could be provided.
>
> BR/anderson
>



--
http://daily.appspot.com/food/