Posted to common-dev@hadoop.apache.org by "zhangwei (JIRA)" <ji...@apache.org> on 2009/01/13 10:35:00 UTC

[jira] Created: (HADOOP-5019) add querying block's info in the fsck facility

add querying block's info in the fsck facility
----------------------------------------------

                 Key: HADOOP-5019
                 URL: https://issues.apache.org/jira/browse/HADOOP-5019
             Project: Hadoop Core
          Issue Type: New Feature
          Components: dfs
            Reporter: zhangwei
            Priority: Minor


fsck currently works quite well, but when a developer comes across a log message such as "Block blk_28622148 is not valid", we wish to know which file the block belongs to and which datanodes hold it. This can be answered by running "bin/hadoop fsck -files -blocks -locations / | grep <blockid>", but as mentioned earlier in HADOOP-4945, that is not an efficient approach on a big production cluster.
So maybe we could do something to make fsck more convenient for this case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "zhangwei (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663650#action_12663650 ] 

zhangwei commented on HADOOP-5019:
----------------------------------

I think an explicit option to fsck would be better; I'll modify it now.

The permission part displays the file's permissions, like this:
$ ./hadoop fsck blk_293_1316
"/tmp/usertestfile.txt":nobody:root:rw-r--r--
Location of block:
 jx-hadoop-data04.jx.baidu.com


[jira] Updated: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "zhangwei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangwei updated HADOOP-5019:
-----------------------------

    Status: Open  (was: Patch Available)


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663905#action_12663905 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5019:
------------------------------------------------

> I agree with Steve. If a user has write permission to the file, then he/she should be able to find information about this block.

I disagree. Write permission should only mean permission to write to the file, nothing else.

The permission requirement for reading file metadata such as file size is execute (x) on the parent directory. It seems this block query operation should do the same.

However, considering that block details like block IDs, generation stamps, etc. should generally be invisible to users, I think it makes sense to require the superuser.


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663728#action_12663728 ] 

Steve Loughran commented on HADOOP-5019:
----------------------------------------

Security: what about letting whoever has write access to the file do it? This would let the owner of a file check it.

Similarly, you may want to give people the right to fsck a specific file, which would look at every block in it.


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663500#action_12663500 ] 

dhruba borthakur commented on HADOOP-5019:
------------------------------------------

I like an explicit option to the "fsck" command. 


[jira] Updated: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "zhangwei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangwei updated HADOOP-5019:
-----------------------------

    Attachment: HADOOP-5019.patch

If the path argument starts with "blk_", fsck examines the block ID that follows and ignores the generation stamp. It then gets the block's inode through blocksMap.getINode(b), walks up through the inode's parent names recursively to build the full path string, and finally prints the full path, the datanode locations, and the permission status.

If the path does not start with "blk_", fsck checks it as normal.
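
For illustration only, here is a minimal, self-contained sketch of that parent-walking step. The INode type below is a simplified stand-in used for this example, not Hadoop's actual class, and the names are hypothetical:
{noformat}
import java.util.ArrayDeque;
import java.util.Deque;

class BlockPathSketch {
  // Simplified stand-in for an inode: a local name plus a parent link.
  static class INode {
    final String name;   // local name of this file or directory
    final INode parent;  // null for the root directory
    INode(String name, INode parent) { this.name = name; this.parent = parent; }
  }

  // Build "/dir/.../file" by walking from the block's inode up to the root.
  static String fullPath(INode inode) {
    Deque<String> parts = new ArrayDeque<String>();
    for (INode cur = inode; cur != null && !cur.name.isEmpty(); cur = cur.parent) {
      parts.addFirst(cur.name);
    }
    StringBuilder sb = new StringBuilder();
    for (String p : parts) {
      sb.append('/').append(p);
    }
    return sb.length() == 0 ? "/" : sb.toString();
  }

  public static void main(String[] args) {
    INode root = new INode("", null);
    INode tmp  = new INode("tmp", root);
    INode file = new INode("usertestfile.txt", tmp);
    System.out.println(fullPath(file));  // prints /tmp/usertestfile.txt
  }
}
{noformat}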


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663543#action_12663543 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5019:
------------------------------------------------

What kind of permissions should be required for this new operation?  I am thinking it should require superuser.


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "zhangwei (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665068#action_12665068 ] 

zhangwei commented on HADOOP-5019:
----------------------------------

> Security: what about letting whoever has write access to the file do it? This would let the owner of a file check it.
I don't think so. Given only the block number, how do you know who owns the file before running the check?
The superuser may be the more suitable one to get the block's info.


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "zhangwei (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663971#action_12663971 ] 

zhangwei commented on HADOOP-5019:
----------------------------------

In the current fsck implementation, a guest can use fsck to check any path without regard to the file/dir's permissions or owner. That is a real problem.
Maybe fsck should only be usable by the superuser, like Linux's e2fsck, and we could check at the beginning whether the user is the superuser.
If that is done, I think it is OK to show permissions when running "./hadoop fsck -getblockinfo blk_293_1314".


[jira] Updated: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-5019:
----------------------------------

    Assignee: zhangwei
      Status: Open  (was: Patch Available)

* The patch contains commented-out code; would you mind removing it?
* Ignoring the AccessControlException here is probably not correct:
{noformat}
+    } catch (AccessControlException e) {
+      e.printStackTrace();
+      // something went wrong getting this ugi...
+      LOG.warn(" - could not get ugi ");
+      
+    }
{noformat}
Since fsck already enforces permissions after HADOOP-4268, is this the correct way to do it?
* Is it necessary to make BlocksMap::getINode(Block) public?

I don't think Raghu's question about why this belongs in fsck was ever answered...


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663830#action_12663830 ] 

dhruba borthakur commented on HADOOP-5019:
------------------------------------------

I agree with Steve. If a user has write permission to the file, then he/she should be able to find information about this block. Can this be done?

Also, it would be nice to have the command work as follows:

./hadoop fsck -getblockinfo blk_293_1314
"/tmp/usertestfile.txt":nobody:root:   rw-r--r--      jx-hadoop-data04.jx.baidu.com:50010


[jira] Updated: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "zhangwei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangwei updated HADOOP-5019:
-----------------------------

    Attachment: HADOOP-5019-2.patch

Hi all,
I modified it based on the latest trunk. The implementation is as follows:

ONLY the superuser, or a user one of whose groups is the supergroup, can run "./hadoop fsck <blockname> -getblockinfo".
If the block is found, the output shows STATUS: BLOCK FOUND;
otherwise it shows BLOCK NOT FOUND, BLOCK INVALID, or PERMISSION DENIED.

Also added a new test case: testFsckWithBlockInfo.
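
For clarity, a small hypothetical sketch of the check and status decision described above (the class and method names are illustrative, not the patch's actual identifiers):
{noformat}
import java.util.Set;

class GetBlockInfoGate {
  enum Status { BLOCK_FOUND, BLOCK_NOT_FOUND, BLOCK_INVALID, PERMISSION_DENIED }

  private final String superUser;
  private final String superGroup;

  GetBlockInfoGate(String superUser, String superGroup) {
    this.superUser = superUser;
    this.superGroup = superGroup;
  }

  // Decide which STATUS a -getblockinfo request should report: only the
  // superuser, or a member of the supergroup, is allowed to see block info.
  Status check(String user, Set<String> groups, boolean blockExists, boolean blockValid) {
    boolean allowed = user.equals(superUser) || groups.contains(superGroup);
    if (!allowed) {
      return Status.PERMISSION_DENIED;
    }
    if (!blockExists) {
      return Status.BLOCK_NOT_FOUND;
    }
    if (!blockValid) {
      return Status.BLOCK_INVALID;
    }
    return Status.BLOCK_FOUND;
  }
}
{noformat}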


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663428#action_12663428 ] 

Raghu Angadi commented on HADOOP-5019:
--------------------------------------

"blk_" to indicate a different option fsck would work. But I think other options might be cleaner than this hack :
 
- first preference : it is not fsck. It could just be another command ("-blockInfo" or some such).
- second : add an explicit option to fsck.


[jira] Updated: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "zhangwei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangwei updated HADOOP-5019:
-----------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663840#action_12663840 ] 

Raghu Angadi commented on HADOOP-5019:
--------------------------------------

Any reason why this is part of fsck?


[jira] Updated: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "zhangwei (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangwei updated HADOOP-5019:
-----------------------------

    Status: Patch Available  (was: Open)


[jira] Commented: (HADOOP-5019) add querying block's info in the fsck facility

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663977#action_12663977 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5019:
------------------------------------------------

> In the current fsck implementation, a guest can use fsck to check any path without regard to the file/dir's permissions or owner.

This is no longer true after HADOOP-4268.
