Posted to hdfs-dev@hadoop.apache.org by "Rushabh S Shah (JIRA)" <ji...@apache.org> on 2015/08/06 23:28:06 UTC
[jira] [Created] (HDFS-8869) Don't mark storages as failed before first block report
Rushabh S Shah created HDFS-8869:
------------------------------------
Summary: Don't mark storages as failed before first block report
Key: HDFS-8869
URL: https://issues.apache.org/jira/browse/HDFS-8869
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Rushabh S Shah
Assignee: Daryn Sharp
Creating this ticket on behalf of [~daryn].
Heartbeat processing performs the failed-storage check. When the DN reports its storages, any previously known storages missing from the report (e.g. after a unique storage ID upgrade) are marked failed. The heartbeat monitor then removes all blocks associated with the failed storages, and a replication storm ensues for all blocks on the node.
Eventually the DN sends block reports for the new storages, up to 15 minutes later on large clusters. The NN now has many excess blocks to invalidate. If the cluster has failed over in the past 24 hours (e.g. during a rolling upgrade), the standby that became active will queue the block invalidations, which triggers the severe performance degradation described in HDFS-8674. That issue has been greatly lessened but is still a problem.
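
To illustrate the behavior the summary asks for, here is a minimal sketch in Java. It is not the HDFS code and not a proposed patch: StorageFailureSketch, DatanodeModel, State and all method names are hypothetical, and the model assumes the NN can simply skip the missing-storage check until the first block report from that DN has been processed.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model. None of these names are actual HDFS classes.
public class StorageFailureSketch {

    enum State { NORMAL, FAILED }

    /** Illustrative NameNode-side view of a single DataNode. */
    static class DatanodeModel {
        private final Map<String, State> storages = new HashMap<>();
        private boolean firstBlockReportReceived = false;

        /** Registers a storage the NN already knew about (e.g. from before an upgrade). */
        void addKnownStorage(String storageId) {
            storages.put(storageId, State.NORMAL);
        }

        /** Heartbeat: the DN lists the storage IDs it currently has. */
        void processHeartbeat(Set<String> reportedIds) {
            // Learn about any storage the DN reports that the NN has not seen yet.
            for (String id : reportedIds) {
                storages.putIfAbsent(id, State.NORMAL);
            }
            // Assumed guard: before the first block report, a storage missing from
            // the heartbeat is not treated as failed, so its blocks are not removed.
            if (!firstBlockReportReceived) {
                return;
            }
            // After the first block report, known storages absent from the
            // heartbeat are marked failed.
            for (Map.Entry<String, State> e : storages.entrySet()) {
                if (!reportedIds.contains(e.getKey())) {
                    e.setValue(State.FAILED);
                }
            }
        }

        /** Records that the DN has sent a block report. */
        void processBlockReport() {
            firstBlockReportReceived = true;
        }

        State stateOf(String storageId) {
            return storages.get(storageId);
        }
    }

    public static void main(String[] args) {
        DatanodeModel dn = new DatanodeModel();
        dn.addKnownStorage("old-id");
        Set<String> reported = new HashSet<>(Arrays.asList("new-id"));

        dn.processHeartbeat(reported);              // heartbeat before any block report
        System.out.println(dn.stateOf("old-id"));   // prints NORMAL

        dn.processBlockReport();                    // first block report arrives
        dn.processHeartbeat(reported);
        System.out.println(dn.stateOf("old-id"));   // prints FAILED
    }
}
```

In this sketch the old storage stays NORMAL across heartbeats until the block report arrives, so nothing would trigger block removal or re-replication during that window. How the real heartbeat path should track the first block report is left open here.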
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)