Posted to common-user@hadoop.apache.org by Foss User <fo...@gmail.com> on 2009/05/19 09:13:37 UTC

Finding where the file blocks are

I know that if a file is very large, it will be split into blocks, and
the blocks will be spread across various data nodes. I want to know
whether I can find out, through a GUI or the logs, exactly which data
nodes hold which blocks of a particular huge text file.

Re: Finding where the file blocks are

Posted by Arun C Murthy <ac...@yahoo-inc.com>.
On May 19, 2009, at 12:13 AM, Foss User wrote:

> I know that if a file is very large, it will be split into blocks, and
> the blocks will be spread across various data nodes. I want to know
> whether I can find out, through a GUI or the logs, exactly which data
> nodes hold which blocks of a particular huge text file.

http://hadoop.apache.org/core/docs/r0.20.0/api/org/apache/hadoop/fs/FileSystem.html#listStatus(org.apache.hadoop.fs.Path)
followed by
http://hadoop.apache.org/core/docs/r0.20.0/api/org/apache/hadoop/fs/FileSystem.html#getFileBlockLocations(org.apache.hadoop.fs.FileStatus,%20long,%20long)

Arun
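
[A minimal sketch of chaining those two calls, assuming a Hadoop
0.20-era classpath and a running cluster; the directory /user/hadoop is
the one from this thread, but any path would do:]

```java
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocator {
    public static void main(String[] args) throws Exception {
        // Connect using the cluster settings found on the classpath.
        FileSystem fs = FileSystem.get(new Configuration());

        // listStatus gives one FileStatus per entry in the directory.
        for (FileStatus status : fs.listStatus(new Path("/user/hadoop"))) {
            if (status.isDir()) {
                continue; // only files have block locations
            }
            // Ask the namenode which hosts hold each block of this file.
            BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
            System.out.println(status.getPath());
            for (BlockLocation block : blocks) {
                System.out.println("  offset=" + block.getOffset()
                    + " length=" + block.getLength()
                    + " hosts=" + Arrays.toString(block.getHosts()));
            }
        }
    }
}
```

[Each BlockLocation maps one block-sized byte range of the file to the
set of data node hostnames that store a replica of it.]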

Re: Finding where the file blocks are

Posted by Philip Zeyliger <ph...@cloudera.com>.
On Tue, May 19, 2009 at 1:00 AM, Foss User <fo...@gmail.com> wrote:

> On Tue, May 19, 2009 at 12:53 PM, Ravi Phulari <rp...@yahoo-inc.com>
> wrote:
> > If you have Hadoop superuser/administrative permissions, you can use fsck
> > with the correct options to view the block report and the locations of every block.
> >
> > For further information, please refer to:
> > http://hadoop.apache.org/core/docs/r0.20.0/commands_manual.html#fsck
> >
>
> Thanks for the response. What about the Hadoop namenode web interface:
>
> http://hadoop-slave:50075/browseDirectory.jsp?dir=%2Fuser%2Fhadoop&namenodeInfoPort=50070
>
> Note that this URL is that of a slave machine called hadoop-slave, and
> I am trying to view the details of /user/hadoop. If I visit this URL
> and click some files, will it show me the complete file in the cluster
> or only those portions (blocks) of the file that are on this
> particular slave data node?


It'll send you to a data node that has at least one block of the file, and
that data node will proxy the other blocks from the nodes that hold them.
So, you'll see the complete file.

-- Philip

Re: Finding where the file blocks are

Posted by Foss User <fo...@gmail.com>.
On Tue, May 19, 2009 at 12:53 PM, Ravi Phulari <rp...@yahoo-inc.com> wrote:
> If you have Hadoop superuser/administrative permissions, you can use fsck
> with the correct options to view the block report and the locations of every block.
>
> For further information, please refer to:
> http://hadoop.apache.org/core/docs/r0.20.0/commands_manual.html#fsck
>

Thanks for the response. What about the Hadoop namenode web interface:
http://hadoop-slave:50075/browseDirectory.jsp?dir=%2Fuser%2Fhadoop&namenodeInfoPort=50070

Note that this URL is that of a slave machine called hadoop-slave, and
I am trying to view the details of /user/hadoop. If I visit this URL
and click some files, will it show me the complete file in the cluster
or only those portions (blocks) of the file that are on this particular
slave data node?

Re: Finding where the file blocks are

Posted by Ravi Phulari <rp...@yahoo-inc.com>.
If you have Hadoop superuser/administrative permissions, you can use fsck with the correct options to view the block report and the locations of every block.

For further information, please refer to:
http://hadoop.apache.org/core/docs/r0.20.0/commands_manual.html#fsck
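
[As a concrete sketch, assuming superuser permissions on a running
cluster and a hypothetical file path, the invocation might look like:]

```shell
# Print the file, its blocks, and the data nodes holding each block.
# Requires HDFS superuser permissions; the path is hypothetical.
hadoop fsck /user/hadoop/big.txt -files -blocks -locations
```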



On 5/19/09 12:13 AM, "Foss User" <fo...@gmail.com> wrote:

I know that if a file is very large, it will be split into blocks, and
the blocks will be spread across various data nodes. I want to know
whether I can find out, through a GUI or the logs, exactly which data
nodes hold which blocks of a particular huge text file.


-
Ravi