Posted to common-user@hadoop.apache.org by mo...@richmondinformatics.com on 2006/03/23 20:46:53 UTC

DFSck - fsck for hadoop

Thanks Andrzej,

Works a treat, which is more than I can say for my thoroughly broken DFS :(

I used:

# bin/hadoop org.apache.hadoop.dfs.DFSck /user/root/crawl

and got:

Status: CORRUPT
 Total size:    35381601016 B
 Total blocks:  2719 (avg. block size 13012725 B)
 Total dirs:    459
 Total files:   1751
  ********************************
  CORRUPT FILES:        1082
  MISSING BLOCKS:       1579
  MISSING SIZE: 20189790423 B
  ********************************
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Target replication factor:     3
 Real replication factor:       3.0

This of course is true.
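
For anyone curious what a check like this does conceptually, here is a very
rough sketch (not the actual DFSck.java attached to HADOOP-101; the
NamespaceWalker and BlockLocator interfaces are made-up stand-ins for the
real DFS client calls): walk the namespace, ask where each block of each
file lives, and count any block with no reported location as missing.

  // Hypothetical sketch of an fsck-style health check -- not the real DFSck.java.
  // NamespaceWalker and BlockLocator stand in for the actual DFS client API.
  import java.util.List;

  interface NamespaceWalker {
      List<String> listFilesUnder(String path);     // all file paths below 'path'
  }

  interface BlockLocator {
      List<String> blockIdsOf(String file);         // block ids making up a file
      List<String> locationsOf(String blockId);     // datanodes holding the block
  }

  public class DfsHealthCheckSketch {
      public static void report(NamespaceWalker ns, BlockLocator loc, String root) {
          long totalBlocks = 0, missingBlocks = 0, corruptFiles = 0;
          for (String file : ns.listFilesUnder(root)) {
              boolean corrupt = false;
              for (String blockId : loc.blockIdsOf(file)) {
                  totalBlocks++;
                  if (loc.locationsOf(blockId).isEmpty()) {   // no datanode has it
                      missingBlocks++;
                      corrupt = true;
                  }
              }
              if (corrupt) corruptFiles++;
          }
          System.out.println(missingBlocks == 0 ? "Status: HEALTHY" : "Status: CORRUPT");
          System.out.println(" Total blocks:   " + totalBlocks);
          System.out.println(" CORRUPT FILES:  " + corruptFiles);
          System.out.println(" MISSING BLOCKS: " + missingBlocks);
      }
  }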

For other users wanting to know how to break DFS (or not), one way is to add a
new master node to the cluster, and misconfigure "fs.default.name" in
nutch/hadoop-site.xml:

  <name>fs.default.name</name>
  <value>nutch0:50000</value>
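
(For reference, entries in hadoop-site.xml are normally wrapped in a
<property> element; the host and port below are placeholders, not a
recommendation for any particular cluster:)

  <property>
    <name>fs.default.name</name>
    <!-- every node in the cluster must point at the same namenode host:port -->
    <value>namenode-host:50000</value>
  </property>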

My error was that I intended to run nutch0 as job.tracker, but not as a
datanode.  So, when I ran bin/start-all.sh to start the cluster, it seemed to
replicate the non-existent filesystem on nutch0; thereby starting to delete all
my precious data.

One way to learn!

Thanks,

Monu Ogbe

----- Original Message -----
From: "Andrzej Bialecki (JIRA)" <ji...@apache.org>
To: <ha...@lucene.apache.org>
Sent: Thursday, March 23, 2006 6:52 PM
Subject: [jira] Updated: (HADOOP-101) DFSck - fsck-like utility for checking DFS
volumes


>     [ http://issues.apache.org/jira/browse/HADOOP-101?page=all ]
>
> Andrzej Bialecki  updated HADOOP-101:
> -------------------------------------
>
>    Attachment: DFSck.java
>
>> DFSck - fsck-like utility for checking DFS volumes
>> --------------------------------------------------
>>
>>          Key: HADOOP-101
>>          URL: http://issues.apache.org/jira/browse/HADOOP-101
>>      Project: Hadoop
>>         Type: New Feature
>>   Components: dfs
>>     Versions: 0.2
>>     Reporter: Andrzej Bialecki
>>     Assignee: Andrzej Bialecki
>>  Attachments: DFSck.java
>>
>> This is a utility to check health status of a DFS volume, and collect some
>> additional statistics.
>
> --
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the administrators:
>   http://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see:
>   http://www.atlassian.com/software/jira
>
>


Re: Hadoop File Capacity

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
yes

On Mar 26, 2006, at 12:11 PM, Dennis Kubes wrote:

> For the Hadoop filesystem, I know that it is basically unlimited in terms
> of storage because one can always add new hardware, but is it unlimited in
> terms of a single file?
>
> What I mean by this is: if I store a file /user/dir/a.index and this file
> has, say, 100 blocks in it, where there is only enough space on any one
> server for 10 blocks, will the Hadoop filesystem store and replicate
> different blocks on different servers and give the client a single-file
> view, or does a whole file have to be stored and replicated across
> machines?
>
> Dennis
>


RE: Hadoop File Capacity

Posted by Yoram Arnon <ya...@yahoo-inc.com>.
Each block of each file is placed on (currently) three random data nodes,
independent of where the previous block was placed. So no, there are no
limits on file size until you reach the FS limits, which are reasonably high
and growing (probably a couple of hundred TB currently).
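
(A toy illustration of the independence point, nothing to do with the real
placement code: below, each block independently picks three distinct nodes,
so a 100-block file spreads across the whole cluster and no single machine
ever needs room for the whole file.)

  // Toy sketch of per-block placement -- illustrative only, not Hadoop's policy.
  import java.util.*;

  public class BlockPlacementSketch {
      public static void main(String[] args) {
          int nodes = 20, blocks = 100, replication = 3;
          Random rnd = new Random();
          int[] replicasPerNode = new int[nodes];
          for (int b = 0; b < blocks; b++) {
              // each block independently picks 'replication' distinct nodes
              Set<Integer> chosen = new HashSet<Integer>();
              while (chosen.size() < replication) chosen.add(rnd.nextInt(nodes));
              for (int n : chosen) replicasPerNode[n]++;
          }
          // every node ends up holding only a small share of the 300 replicas
          System.out.println("replicas per node: " + Arrays.toString(replicasPerNode));
      }
  }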

-----Original Message-----
From: Dennis Kubes [mailto:nutch-dev@dragonflymc.com] 
Sent: Sunday, March 26, 2006 12:12 PM
To: hadoop-user@lucene.apache.org
Subject: Hadoop File Capacity

For the Hadoop filesystem, I know that it is basically unlimited in terms of
storage because one can always add new hardware, but is it unlimited in
terms of a single file?

What I mean by this is: if I store a file /user/dir/a.index and this file
has, say, 100 blocks in it, where there is only enough space on any one
server for 10 blocks, will the Hadoop filesystem store and replicate
different blocks on different servers and give the client a single-file
view, or does a whole file have to be stored and replicated across machines?

Dennis



Hadoop File Capacity

Posted by Dennis Kubes <nu...@dragonflymc.com>.
For the Hadoop filesystem, I know that it is basically unlimited in terms of
storage because one can always add new hardware, but is it unlimited in
terms of a single file?

What I mean by this is: if I store a file /user/dir/a.index and this file
has, say, 100 blocks in it, where there is only enough space on any one
server for 10 blocks, will the Hadoop filesystem store and replicate
different blocks on different servers and give the client a single-file
view, or does a whole file have to be stored and replicated across machines?

Dennis


Re: DFSck - fsck for hadoop

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
+1 again 8-)

On Mar 23, 2006, at 2:26 PM, Yoram Arnon wrote:

> Another idea, in addition to an explicit format command, is to configure
> the name node with the cluster's data nodes, rather than allowing any node
> to connect ad hoc. A name node would then ignore an unexpected data node.
> It would also be able to report when a data node is missing and could make
> operational decisions based on the number and identity of nodes that are
> up vs. down.
>
> -----Original Message-----
> From: Doug Cutting [mailto:cutting@apache.org]
> Sent: Thursday, March 23, 2006 12:27 PM
> To: hadoop-user@lucene.apache.org
> Subject: Re: DFSck - fsck for hadoop
>
> monu.ogbe@richmondinformatics.com wrote:
>> My error was that I intended to run nutch0 as job.tracker, but not as
>> a datanode.  So, when I ran bin/start-all.sh to start the cluster, it
>> seemed to replicate the non-existent filesystem on nutch0; thereby
>> starting to delete all my precious data.
>
> It would be nice if this were harder to do.  A simple solution I proposed
> would be to make it so that a new filesystem is not created automatically
> when a namenode is started in an empty directory.  Rather, a 'format'
> command could be required.  A more complex solution might be to have a
> filesystem id.  For example, some bits from each block id issued could be
> the filesystem id.  When datanodes report blocks from a different
> filesystem, the namenode would ignore them rather than delete them.
>
> Doug
>
>


RE: DFSck - fsck for hadoop

Posted by Yoram Arnon <ya...@yahoo-inc.com>.
Another idea, in addition to an explicit format command, is to configure the
name node with the cluster's data nodes, rather than allowing any node to
connect ad hoc. A name node would then ignore an unexpected data node. It
would also be able to report when a data node is missing and could make
operational decisions based on the number and identity of nodes that are up
vs. down.
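
(A sketch of that idea only -- the names below are made up, this is not how
the current namenode behaves: the name node keeps the configured list of
data nodes, refuses heartbeats from anyone else, and can report which
configured nodes have never checked in.)

  // Sketch of a namenode that only accepts datanodes named in its configuration.
  // All names are hypothetical; this is not existing Hadoop code.
  import java.util.HashSet;
  import java.util.Set;

  public class RegisteredDatanodesSketch {
      private final Set<String> expected = new HashSet<String>();
      private final Set<String> reportedIn = new HashSet<String>();

      public RegisteredDatanodesSketch(Set<String> configuredDatanodes) {
          expected.addAll(configuredDatanodes);
      }

      /** Accept a heartbeat only from a configured data node; ignore strangers. */
      public boolean acceptHeartbeat(String datanodeHost) {
          if (!expected.contains(datanodeHost)) {
              System.err.println("Ignoring unexpected datanode: " + datanodeHost);
              return false;
          }
          reportedIn.add(datanodeHost);
          return true;
      }

      /** Configured data nodes that have never checked in (down or missing). */
      public Set<String> missingDatanodes() {
          Set<String> missing = new HashSet<String>(expected);
          missing.removeAll(reportedIn);
          return missing;
      }
  }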

-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org] 
Sent: Thursday, March 23, 2006 12:27 PM
To: hadoop-user@lucene.apache.org
Subject: Re: DFSck - fsck for hadoop

monu.ogbe@richmondinformatics.com wrote:
> My error was that I intended to run nutch0 as job.tracker, but not as 
> a datanode.  So, when I ran bin/start-all.sh to start the cluster, it 
> seemed to replicate the non-existent filesystem on nutch0; thereby 
> starting to delete all my precious data.

It would be nice if this were harder to do.  A simple solution I proposed
would be to make it so that a new filesystem is not created automatically
when a namenode is started in an empty directory.  Rather, a 'format' command
could be required.  A more complex solution might be to have a filesystem
id.  For example, some bits from each block id issued could be the
filesystem id.  When datanodes report blocks from a different filesystem,
the namenode would ignore them rather than delete them.

Doug



Re: DFSck - fsck for hadoop

Posted by Doug Cutting <cu...@apache.org>.
monu.ogbe@richmondinformatics.com wrote:
> My error was that I intended to run nutch0 as job.tracker, but not as a
> datanode.  So, when I ran bin/start-all.sh to start the cluster, it seemed to
> replicate the non-existent filesystem on nutch0; thereby starting to delete all
> my precious data.

It would be nice if this were harder to do.  A simple solution I 
proposed would be to make it so that a new filesystem is not created 
automatically when a namenode is started in an empty directory.  Rather,
a 'format' command could be required.  A more complex solution might be
to have a filesystem id.  For example, some bits from each block id 
issued could be the filesystem id.  When datanodes report blocks from a 
different filesystem, the namenode would ignore them rather than delete 
them.

Doug
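
(To illustrate Doug's second suggestion only -- the bit layout and names
below are invented, not an actual Hadoop design: reserve some bits of every
block id for a filesystem id, and have the namenode ignore, rather than
delete, reported blocks whose id carries a different filesystem id.)

  // Sketch of the "filesystem id embedded in the block id" idea.
  // The 16-bit layout is arbitrary; this is not real Hadoop code.
  public class FilesystemIdSketch {
      private static final int FSID_BITS = 16;   // top 16 bits hold the filesystem id
      private static final int SEQ_BITS  = 64 - FSID_BITS;

      /** Stamp the filesystem id into the high bits of a new block id. */
      public static long makeBlockId(int fsId, long sequence) {
          return ((long) fsId << SEQ_BITS) | (sequence & ((1L << SEQ_BITS) - 1));
      }

      /** Recover the filesystem id from a block id. */
      public static int fsIdOf(long blockId) {
          return (int) (blockId >>> SEQ_BITS);
      }

      /** A namenode with id 'myFsId' ignores (does not delete) foreign blocks. */
      public static boolean shouldIgnore(int myFsId, long reportedBlockId) {
          return fsIdOf(reportedBlockId) != myFsId;
      }
  }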