Posted to common-dev@hadoop.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2009/01/24 02:22:02 UTC
[jira] Issue Comment Edited: (HADOOP-4103) Alert for missing blocks
[ https://issues.apache.org/jira/browse/HADOOP-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666819#action_12666819 ]
rangadi edited comment on HADOOP-4103 at 1/23/09 5:21 PM:
---------------------------------------------------------------
(Edit : formatting only)
The scope of the fix is narrowed to the following :
* The NameNode web UI shows an indicator (probably in red) if there are any missing blocks.
** will most likely add simon stats for this number.
* 'dfsadmin -metasave' can be used to find all the missing blocks
** a later jira will enhance -metasave or add a different command that is more user friendly. Currently -metasave is mainly meant for developers.
For this to be a straightforward fix, I need to make one policy change: currently, if a block does not have any good replicas left, it is not included in the "neededReplications" list. I think this was done mainly as an "optimization". But a cluster should not have any blocks in this state. Even the name 'neededReplications' implies such blocks should be included. It would be better if I don't need to add another list that needs to be maintained.
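As a rough illustration of the policy change described above, here is a minimal Java sketch. It is not the actual NameNode code: the class NeededReplicationsSketch, its methods, and the fixed expected replica count are all hypothetical, made up for this example. The only point is that once blocks with zero good replicas stay in the same list as other under-replicated blocks, the missing blocks can be reported from that list without maintaining a second one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the "neededReplications" list; not Hadoop's real class. */
class NeededReplicationsSketch {
    // Block id -> number of good replicas currently known for that block.
    private final Map<Long, Integer> goodReplicas = new LinkedHashMap<>();
    // Assumption for the sketch: every block wants the same number of replicas.
    private final int expectedReplicas;

    NeededReplicationsSketch(int expectedReplicas) {
        this.expectedReplicas = expectedReplicas;
    }

    /** Records the latest good-replica count and tracks the block if it is under-replicated. */
    void update(long blockId, int good) {
        if (good < expectedReplicas) {
            // Proposed policy: keep the block even when good == 0,
            // instead of leaving it out as an "optimization".
            goodReplicas.put(blockId, good);
        } else {
            // Fully replicated blocks do not belong in the list.
            goodReplicas.remove(blockId);
        }
    }

    /** Returns the tracked blocks that have no good replica left, i.e. the missing blocks. */
    List<Long> missingBlocks() {
        List<Long> missing = new ArrayList<>();
        for (Map.Entry<Long, Integer> entry : goodReplicas.entrySet()) {
            if (entry.getValue() == 0) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }

    /** Number of blocks currently tracked as needing replication. */
    int size() {
        return goodReplicas.size();
    }
}
```

With something like this, the web UI count would simply be missingBlocks().size(), and a dump command could print the same list.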
was (Author: rangadi):
The scope of the fix is narrowed to the following :
# The NameNode web UI shows an indicator (probably in red) if there are any missing blocks.
#will most likely add simon stats for this number.
# 'dfsadmin -metasave' can be used to find all the missing blocks
## later jira will enhance -metasave or add a different command that is more user friendly. Currently -metasave is mainly meant for developers.
For this to be a straightforward fix, I need to make one policy change: currently, if a block does not have any good replicas left, it is not included in the "neededReplications" list. I think this was done mainly as an "optimization". But a cluster should not have any blocks in this state. Even the name 'neededReplications' implies such blocks should be included. It would be better if I don't need to add another list that needs to be maintained.
> Alert for missing blocks
> ------------------------
>
> Key: HADOOP-4103
> URL: https://issues.apache.org/jira/browse/HADOOP-4103
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Affects Versions: 0.17.2
> Reporter: Christian Kunz
> Assignee: Raghu Angadi
>
> A whole bunch of datanodes were marked dead because of network problems that caused heartbeat timeouts, although the datanodes themselves were fine.
> Many processes started to fail because of the corrupted filesystem.
> In order to catch and diagnose such problems faster, the namenode should detect the corruption automatically and provide a way to alert operations. At a minimum, it should show the corruption on the GUI.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.