Posted to hdfs-dev@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2014/02/13 17:13:20 UTC
[jira] [Created] (HDFS-5947) Improve dead node detection and handling
Daryn Sharp created HDFS-5947:
---------------------------------
Summary: Improve dead node detection and handling
Key: HDFS-5947
URL: https://issues.apache.org/jira/browse/HDFS-5947
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0
Reporter: Daryn Sharp
When {{HeartbeatManager.heartbeatCheck}} runs:
# It scans all DNs to count dead nodes
# It processes the first dead node
# If there was a dead node, it loops to re-scan all DNs
Processing the dead node holds the namesystem write lock while removing the node from the blockmap. It also appears to do a lot of work to immediately re-adjust the replication queues. All of this processing might be too expensive to perform while holding the write lock, e.g. if a rack or two is lost.
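To make the cost concrete, here is a minimal, self-contained model of the scan/process/re-scan pattern described above. It is not the actual Hadoop code: the class {{HeartbeatCheckSketch}}, the {{Node}} type, the counters, and the use of a {{ReentrantReadWriteLock}} as a stand-in for the namesystem lock are all assumptions made for illustration, and the real blockmap and replication queue work is reduced to a single list removal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Simplified model of the loop described in this issue; not Hadoop code. */
public class HeartbeatCheckSketch {

    /** Hypothetical stand-in for a datanode descriptor. */
    static final class Node {
        final String name;
        final boolean dead;
        Node(String name, boolean dead) { this.name = name; this.dead = dead; }
    }

    // Assumed stand-in for the namesystem lock.
    private final ReentrantReadWriteLock namesystemLock = new ReentrantReadWriteLock();
    final List<Node> nodes = new ArrayList<>();

    // Counters added only to make the cost of the pattern visible.
    int fullScans = 0;
    int writeLockAcquisitions = 0;

    HeartbeatCheckSketch(List<Node> initial) { nodes.addAll(initial); }

    void heartbeatCheck() {
        boolean foundDead = true;
        while (foundDead) {
            foundDead = false;
            Node firstDead = null;
            int numDead = 0; // counted as in step 1; unused in this model

            // Step 1: scan every node to count the dead ones.
            fullScans++;
            for (Node n : nodes) {
                if (n.dead) {
                    numDead++;
                    if (firstDead == null) {
                        firstDead = n;
                    }
                }
            }

            // Step 2: process only the first dead node, under the write lock.
            if (firstDead != null) {
                foundDead = true; // Step 3: forces another full scan
                namesystemLock.writeLock().lock();
                try {
                    writeLockAcquisitions++;
                    // Placeholder for blockmap removal and replication queue updates.
                    nodes.remove(firstDead);
                } finally {
                    namesystemLock.writeLock().unlock();
                }
            }
        }
    }

    public static void main(String[] args) {
        List<Node> initial = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            initial.add(new Node("dn" + i, i < 3)); // dn0..dn2 are dead
        }
        HeartbeatCheckSketch sketch = new HeartbeatCheckSketch(initial);
        sketch.heartbeatCheck();
        System.out.println("scans=" + sketch.fullScans
            + " writeLocks=" + sketch.writeLockAcquisitions);
    }
}
```

In this model, D dead nodes cause D + 1 full scans and D separate write lock acquisitions, so losing a rack multiplies both the scanning and the time spent under the write lock.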
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)