Posted to common-dev@hadoop.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2006/03/23 19:52:18 UTC

[jira] Created: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

DFSck - fsck-like utility for checking DFS volumes
--------------------------------------------------

         Key: HADOOP-101
         URL: http://issues.apache.org/jira/browse/HADOOP-101
     Project: Hadoop
        Type: New Feature
  Components: dfs  
    Versions: 0.2    
    Reporter: Andrzej Bialecki 
 Assigned to: Andrzej Bialecki  
 Attachments: DFSck.java

This is a utility to check health status of a DFS volume, and collect some additional statistics.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


[jira] Updated: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-101?page=all ]

Andrzej Bialecki  updated HADOOP-101:
-------------------------------------

    Attachment: DFSck.java

Updated version. Added options to treat inconsistencies: ignore, move to /lost+found, delete.



[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371689 ] 

Andrzej Bialecki  commented on HADOOP-101:
------------------------------------------

Wow, lots of comments, let me address some of them:

* re: locking. I also see this as an advantage, fsck can run in parallel with normal operations. If someone else deletes a file, no big deal - the name is removed from the namesystem, so if we suddenly detect missing blocks we could always check if a file with this name still exists in the namesystem.

* re: performance. Sure, we could parallelize this, which should speed things up (currently it's rather slow, checking ~1TB takes > 2 hours), but then it would put a higher load on the namenode. Perhaps we could make this an option, e.g. start a configurable pool of fsck threads in parallel.

* re: blocks not in use by any file. I think this is already handled internally by namenode<->datanode protocol (for good and for bad), i.e. namenode detects orphaned blocks and tells datanodes to remove them. See FSNamesystem:924 .

* handling the reverse situation (missing blocks in existing files) should be straightforward, with the use of /lost+found directory: for each corrupted file a directory would be created there, and remaining chains of consecutive blocks would be stored in that directory.

* re: checking blocks through streaming: +1, I like the concept, could you perhaps implement it? ;) Also, what happens if a mapred task tries to retrieve a missing/corrupted block? I think currently this hangs the task, due to a missing break in the while loop in DFSClient:354
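
The configurable pool of fsck threads mentioned under the performance point could be sketched roughly as follows. This is only an illustration, not code from DFSck.java: the class name, the countCorrupt method and the isHealthy callback (which stands in for the per-file namenode lookup) are all hypothetical.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical sketch: none of these names come from DFSck.java.
final class ParallelFsckSketch {

    // Checks every path with a fixed-size pool and returns how many failed.
    // `isHealthy` is a stand-in for the per-file lookup against the namenode.
    static int countCorrupt(List<String> paths, int threads, Predicate<String> isHealthy) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger corrupt = new AtomicInteger();
        for (String path : paths) {
            pool.submit(() -> {
                if (!isHealthy.test(path)) {
                    corrupt.incrementAndGet();
                }
            });
        }
        pool.shutdown();                               // accept no new tasks
        try {
            pool.awaitTermination(1, TimeUnit.HOURS);  // wait for queued checks
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();        // preserve the interrupt
        }
        return corrupt.get();
    }

    public static void main(String[] args) {
        List<String> paths = List.of("/a/part-0", "/a/part-1", "/b/part-0");
        // Pretend that anything under /b has missing blocks.
        int bad = countCorrupt(paths, 2, p -> !p.startsWith("/b/"));
        System.out.println("corrupt files: " + bad);
    }
}
```

The pool size is the knob that trades scan time against namenode load: with one thread this degenerates to the current sequential behaviour.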



[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371808 ] 

Andrzej Bialecki  commented on HADOOP-101:
------------------------------------------

There were around 40,000 directories, and 200,000 files.
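
Taken together with the "~1TB takes > 2 hours" figure, these counts give a rough upper bound on the scan rate. The small calculation below is only a back-of-the-envelope estimate; it assumes the run took exactly two hours (it took longer, so the real rate was lower) and that every directory and file costs the same.

```java
// Back-of-the-envelope estimate; the class and method names are illustrative only.
final class FsckRateEstimate {

    // Entries (directories plus files) processed per second.
    static double entriesPerSecond(long dirs, long files, double hours) {
        return (dirs + files) / (hours * 3600.0);
    }

    public static void main(String[] args) {
        double rate = entriesPerSecond(40_000, 200_000, 2.0);
        System.out.printf("at most %.1f entries/s, at least %.0f ms per entry%n",
                rate, 1000.0 / rate);
    }
}
```

That is 240,000 entries in 7,200 seconds, i.e. roughly 33 entries per second or about 30 ms per entry at best.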



[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371625 ] 

Konstantin Shvachko commented on HADOOP-101:
--------------------------------------------

Looks like a good utility for collecting meta-data statistics.

It does not detect missing blocks, though.
The missing block problem is when the namenode "thinks" a block
is stored on one or several datanodes, while it is not.
In order to detect this the utility needs to actually read blocks from
each of the specified locations. Is there an API call for that?
And it does not have an option to convert the system into a consistent state,
which is usually provided by fscks.
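
To make the point concrete, here is a minimal sketch of what "read blocks from each of the specified locations" would mean. It does not use any real DFS API: the location map stands in for what the namenode reports, and canRead is a hypothetical probe that would try to fetch a block from one specific datanode.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical sketch: no real DFS client calls are used here.
final class ReplicaProbeSketch {

    // Returns "blockId@datanode" for every location that the namenode lists
    // but that fails the read probe.
    static List<String> missingReplicas(Map<String, List<String>> locationsByBlock,
                                        BiPredicate<String, String> canRead) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : locationsByBlock.entrySet()) {
            for (String datanode : e.getValue()) {
                if (!canRead.test(e.getKey(), datanode)) {
                    missing.add(e.getKey() + "@" + datanode);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, List<String>> locations = new LinkedHashMap<>();
        locations.put("blk_1", List.of("dn1", "dn2"));
        locations.put("blk_2", List.of("dn2"));
        // Pretend that dn2 has lost all of its blocks.
        System.out.println(missingReplicas(locations, (blk, dn) -> !dn.equals("dn2")));
    }
}
```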

It would be very useful to integrate this code with the DFSShell.report(),
providing more reporting options for the system.



[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371806 ] 

Hairong Kuang commented on HADOOP-101:
--------------------------------------

Andrzej, for your benchmark experiment (~1TB takes > 2 hours), how many directories and files were there in your DFS? For the performance of your dfsck, I think the number of files and directories matters more than the size of the data in DFS.



[jira] Closed: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-101?page=all ]
     
Andrzej Bialecki  closed HADOOP-101:
------------------------------------

    Resolution: Fixed

Added as 'bin/hadoop fsck'.



[jira] Updated: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-101?page=all ]

Andrzej Bialecki  updated HADOOP-101:
-------------------------------------

    Attachment: DFSck.java



[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371705 ] 

Konstantin Shvachko commented on HADOOP-101:
--------------------------------------------

Just posted a map-reduce test that checks all blocks of all files.
See HADOOP-95.
The infinite loop in DFSClient is fixed now, so it works.

With respect to some of the previous comments:
Restarting the cluster (a big one) just to check its consistency is not an exciting option.
This means that we will have to wait up to 55 minutes before missing blocks will
be detected by examining just the namenode data.

A drawback of the map-reduce test is that we cannot force the system to check all replicas
of the block. So a corrupted block is reported only if all of its replicas are bad.
But yes, this is better than nothing.
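
The masking effect can be illustrated with a toy model. It assumes, as described above, that a read through the client succeeds as soon as any one replica is good; the class and method names are made up for the illustration.

```java
import java.util.List;

// Toy model of replica masking; not taken from the HADOOP-95 test.
final class ReplicaMaskingSketch {

    // A read succeeds if at least one replica of the block is good.
    static boolean readSucceeds(List<Boolean> replicaIsGood) {
        return replicaIsGood.contains(Boolean.TRUE);
    }

    // Number of bad replicas that a read-based test cannot see.
    // If the read fails, the block is reported, so nothing stays hidden.
    static long hiddenBadReplicas(List<Boolean> replicaIsGood) {
        if (!readSucceeds(replicaIsGood)) {
            return 0;
        }
        return replicaIsGood.stream().filter(ok -> !ok).count();
    }

    public static void main(String[] args) {
        List<Boolean> replicas = List.of(false, true, false);
        System.out.println("read succeeds: " + readSucceeds(replicas));
        System.out.println("bad replicas not reported: " + hiddenBadReplicas(replicas));
    }
}
```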



[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371707 ] 

Andrzej Bialecki  commented on HADOOP-101:
------------------------------------------

Regarding the restart time: yes, but I couldn't find any other way to force datanodes to update their block reports with the namenode. Perhaps we should extend BlockCommand to support this (the namenode can't call datanodes; it can only return BlockCommands when datanodes call in).
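
One way to picture such an extension: since the reply to a datanode's call is the only channel available, a request for a fresh block report would have to be queued and handed back the next time that datanode calls in. The sketch below is purely hypothetical; the Command enum and both methods are invented for illustration and do not reflect the actual BlockCommand class.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of piggybacking commands on datanode check-ins.
final class HeartbeatCommandSketch {

    enum Command { NONE, SEND_BLOCK_REPORT }

    // Commands waiting for each datanode to call in.
    private final Map<String, Queue<Command>> pending = new HashMap<>();

    // Called on behalf of an fsck-like tool: queue a request for one datanode.
    void requestBlockReport(String datanode) {
        pending.computeIfAbsent(datanode, k -> new ArrayDeque<>())
               .add(Command.SEND_BLOCK_REPORT);
    }

    // Called when a datanode checks in; the return value is the reply.
    Command onHeartbeat(String datanode) {
        Queue<Command> queue = pending.get(datanode);
        Command next = (queue == null) ? null : queue.poll();
        return (next == null) ? Command.NONE : next;
    }

    public static void main(String[] args) {
        HeartbeatCommandSketch namenode = new HeartbeatCommandSketch();
        namenode.requestBlockReport("dn1");
        System.out.println(namenode.onHeartbeat("dn1"));
        System.out.println(namenode.onHeartbeat("dn1"));
    }
}
```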



[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Yoram Arnon (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371658 ] 

Yoram Arnon commented on HADOOP-101:
------------------------------------

This is a good and long-awaited feature, filed previously as bug HADOOP-95. I vote to check it in, because as you say, it's much better than anything we have, and of critical importance.

Regarding performance, clearly the nameserver will not be overwhelmed, but the operation may take a very long time to execute. It's one thing to traverse a million entries in memory (for a modest 32TB FS), but another matter to execute a hundred thousand RPC calls from a single client. Also, when we change the open command to not return the entire list of blocks, in the interest of shortening the time of opening a file, especially when reading just a few blocks from a very large file, the implementation will need to change.

Lastly, there's extensibility. We'll want to test for things that are available only on the name server, like blocks that are not used by any file.

Wouldn't it be better to request the server to execute this code internally, and report results either to the client or to a local file?



[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371657 ] 

Doug Cutting commented on HADOOP-101:
-------------------------------------

I think not locking the fs is a feature.  If the tool gets a file-not-found error from the namenode, it should probably back up and recheck to see if the file still exists and, if it doesn't, check whether its parent still exists, etc., attempting to ignore errors that are only the result of changes to the FS while checking it.  In the short-term, this tool is superior to anything else we currently have, and I'd vote for adding it as a 'bin/hadoop dfs -check' command.  As for performance, as a single-threaded client process it shouldn't be able to overwhelm the namenode.
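
The back-up-and-recheck idea could look something like the sketch below. It is an illustration only: a Set of absolute, normalized paths stands in for the namenode's namespace, and the class and method names are hypothetical.

```java
import java.util.Set;

// Hypothetical sketch; a Set<String> stands in for the live namespace.
final class VanishedPathSketch {

    // Walks up from `path` and returns the deepest path that still exists,
    // or "/" if every ancestor has been removed as well.
    static String deepestExisting(String path, Set<String> namespace) {
        String current = path;
        while (!current.equals("/") && !namespace.contains(current)) {
            int slash = current.lastIndexOf('/');
            current = (slash <= 0) ? "/" : current.substring(0, slash);
        }
        return current;
    }

    public static void main(String[] args) {
        Set<String> namespace = Set.of("/user", "/user/ab");
        // Someone deleted /user/ab/crawl while the scan was inside it.
        String resumeFrom = deepestExisting("/user/ab/crawl/part-0", namespace);
        System.out.println("ignore the error and resume from " + resumeFrom);
    }
}
```

A file-not-found error for a path that is no longer in the namespace would then be treated as a concurrent delete rather than as corruption, and the scan would continue from the returned ancestor.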




[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371628 ] 

Andrzej Bialecki  commented on HADOOP-101:
------------------------------------------

It does detect missing blocks, that's the whole point. However, you need to restart the namenode first to be sure you get the latest block reports from datanodes.

Current failure modes for DFS involve blocks that are completely missing. The only way to "fix" them would be to recover chains of blocks and put them into lost+found - and here we could do better than fsck, because we know the file name they belong to.
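
As a rough sketch of what "recovering chains of blocks" could mean: given which block indexes of a file are still locatable, find every maximal run of consecutive surviving blocks. The representation (a boolean per block index) and the naming of the recovered pieces under /lost+found are assumptions made for this illustration, not what DFSck.java does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of splitting a damaged file into surviving chains.
final class BlockChainSketch {

    // present[i] tells whether block i of the file can still be located.
    // Returns each maximal run of consecutive present blocks as {first, last}.
    static List<int[]> survivingChains(boolean[] present) {
        List<int[]> chains = new ArrayList<>();
        int start = -1;                              // -1 means "not inside a chain"
        for (int i = 0; i < present.length; i++) {
            if (present[i] && start < 0) {
                start = i;                           // a new chain begins
            } else if (!present[i] && start >= 0) {
                chains.add(new int[] {start, i - 1}); // a missing block ends it
                start = -1;
            }
        }
        if (start >= 0) {
            chains.add(new int[] {start, present.length - 1});
        }
        return chains;
    }

    public static void main(String[] args) {
        String file = "/data/part-00000";
        boolean[] present = {true, true, false, true, false, false, true, true, true};
        int n = 0;
        for (int[] chain : survivingChains(present)) {
            // The target naming scheme is an assumption for this example.
            System.out.println("/lost+found" + file + "/" + (n++)
                    + " <- blocks " + chain[0] + ".." + chain[1]);
        }
    }
}
```

Because the original file name is known, the recovered chains can be grouped under a directory named after it, which is the advantage over a traditional fsck mentioned above.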



[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371653 ] 

Hairong Kuang commented on HADOOP-101:
--------------------------------------

The dfsck is done at the client side and hence does not lock the file system. It has a potential inconsistency problem. What if a client deletes a directory while the tool is checking it?

Another concern is the performance. The tool needs to issue an RPC to the namenode for each dir/file in the system. Any benchmark?



[jira] Updated: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-101?page=all ]

Andrzej Bialecki  updated HADOOP-101:
-------------------------------------

    Attachment:     (was: DFSck.java)



Re: [jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
Perhaps we should implement a system by which removed blocks linger
on the data nodes for a long period of time (a week?), or until the
space is needed (FIFO)?

Ditto with the meta data?  That would make archeology around failures  
easier.
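
A minimal sketch of that policy, assuming a per-datanode queue of removed blocks: entries are dropped once they are older than the retention period, and the oldest ones are dropped early when space is needed. All names are hypothetical and time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a "lingering" area for removed blocks on a datanode.
final class BlockTrashSketch {

    private static final class Entry {
        final String blockId;
        final long size;
        final long deletedAtMillis;

        Entry(String blockId, long size, long deletedAtMillis) {
            this.blockId = blockId;
            this.size = size;
            this.deletedAtMillis = deletedAtMillis;
        }
    }

    private final Deque<Entry> trash = new ArrayDeque<>();  // oldest first
    private final long retentionMillis;

    BlockTrashSketch(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    // Instead of deleting the block right away, remember it.
    void remove(String blockId, long size, long nowMillis) {
        trash.addLast(new Entry(blockId, size, nowMillis));
    }

    // Purges expired blocks, then the oldest ones until `bytesNeeded`
    // bytes have been freed or the trash is empty. Returns the bytes freed.
    long reclaim(long bytesNeeded, long nowMillis) {
        long freed = 0;
        while (!trash.isEmpty()) {
            Entry oldest = trash.peekFirst();
            boolean expired = nowMillis - oldest.deletedAtMillis >= retentionMillis;
            if (!expired && freed >= bytesNeeded) {
                break;
            }
            trash.removeFirst();
            freed += oldest.size;
        }
        return freed;
    }

    int lingering() {
        return trash.size();
    }

    public static void main(String[] args) {
        long week = 7L * 24 * 3600 * 1000;
        BlockTrashSketch trash = new BlockTrashSketch(week);
        trash.remove("blk_1", 100, 0);
        trash.remove("blk_2", 100, 1000);
        System.out.println("freed " + trash.reclaim(50, 2000)
                + " bytes, still lingering: " + trash.lingering());
    }
}
```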

On Mar 23, 2006, at 4:56 PM, Doug Cutting (JIRA) wrote:

>     [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371659 ]
>
> Doug Cutting commented on HADOOP-101:
> -------------------------------------
>
> I like that this does not use anything more than the client API to  
> check the server.  That keeps the server core lean and mean.  The  
> use of RPC's effectively restricts the impact of the scan on the FS.
>
> A datanode operation that streams through a block without  
> transferring it over the wire won't correctly check checksums using  
> our existing mechanism.  To check file content we could instead  
> simply implement a map-reduce job that streams through all the  
> files in the fs.  This would not take much code: nothing additional  
> in the core.  MapReduce should handle the locality, so that most  
> data shouldn't go over the wire.
>
> BTW, blocks not used by any file are not known to the name node,  
> are they?  When they're reported by a datanode the datanode is told  
> to remove them.


[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371659 ] 

Doug Cutting commented on HADOOP-101:
-------------------------------------

I like that this does not use anything more than the client API to check the server.  That keeps the server core lean and mean.  The use of RPC's effectively restricts the impact of the scan on the FS.

A datanode operation that streams through a block without transferring it over the wire won't correctly check checksums using our existing mechanism.  To check file content we could instead simply implement a map-reduce job that streams through all the files in the fs.  This would not take much code: nothing additional in the core.  MapReduce should handle the locality, so that most data shouldn't go over the wire.

BTW, blocks not used by any file are not known to the name node, are they?  When they're reported by a datanode the datanode is told to remove them.




[jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371631 ] 

Andrzej Bialecki  commented on HADOOP-101:
------------------------------------------

Oops, forgot to comment on another point you make: yes, it would be great if we could add an API call to request a datanode to check the status of a given block without sending it to the client, e.g. read it physically off the disk and report success/failure. We can do this now only by actually retrieving the block, which is too costly.
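
On the datanode side, such a call might boil down to something like the sketch below: read the block file from local disk, compute a checksum, and return only success or failure. This is an illustration under assumptions, not an existing DFS call; in particular the use of CRC32 and the availability of an expected checksum on the datanode are assumed here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical sketch of a datanode-local block check.
final class LocalBlockVerifySketch {

    // Reads the block file from local disk and compares its CRC32 with
    // the expected value. Only the boolean result would go over the wire.
    static boolean verify(Path blockFile, long expectedCrc) {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[64 * 1024];
        try (InputStream in = Files.newInputStream(blockFile)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                crc.update(buffer, 0, n);
            }
        } catch (IOException e) {
            return false;   // a missing or unreadable block counts as a failure
        }
        return crc.getValue() == expectedCrc;
    }

    public static void main(String[] args) {
        // Usage: java LocalBlockVerifySketch <block file> <expected crc32>
        if (args.length == 2) {
            System.out.println(verify(Path.of(args[0]), Long.parseLong(args[1])));
        }
    }
}
```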

This is the first version of the tool, any contributions are of course welcome.
