Posted to hdfs-dev@hadoop.apache.org by "wudeyu (Jira)" <ji...@apache.org> on 2021/06/30 06:39:00 UTC

[jira] [Created] (HDFS-16100) HA: Improve performance of Standby node transition to Active

wudeyu created HDFS-16100:
-----------------------------

Summary: HA: Improve performance of Standby node transition to Active
Key: HDFS-16100
URL: https://issues.apache.org/jira/browse/HDFS-16100
Project: Hadoop HDFS
Issue Type: Wish
Components: namenode
Reporter: wudeyu

The Standby NameNode uses pendingDNMessages to hold postponed block reports. Reports in pendingDNMessages are handled as follows:
# If the GS (generation stamp) of the replica is in the future, the Standby node processes the report when the corresponding edit log entry (e.g. add_block) is loaded.
# If the replica is corrupted, the Standby node processes the report when it transitions to Active.
# If the DataNode is removed, its block reports are removed from pendingDNMessages.
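
The three cases above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: PendingReportsSketch, Report, and every method name here are hypothetical and are not the HDFS classes or their API. The sketch keeps postponed reports per block, releases those whose GS has been caught up by a loaded edit, drains whatever is left on transition, and drops reports from a removed DataNode.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical, simplified model of a Standby's postponed-report queue (not HDFS code). */
final class PendingReportsSketch {

    /** One postponed report: sender, block, and the generation stamp it reported. */
    static final class Report {
        final String dataNodeId;
        final long blockId;
        final long reportedGs;

        Report(String dataNodeId, long blockId, long reportedGs) {
            this.dataNodeId = dataNodeId;
            this.blockId = blockId;
            this.reportedGs = reportedGs;
        }
    }

    private final Map<Long, Queue<Report>> byBlock = new HashMap<>();

    void enqueue(Report r) {
        byBlock.computeIfAbsent(r.blockId, k -> new ArrayDeque<>()).add(r);
    }

    /** Case 1: an edit for blockId was loaded; release reports whose GS is no longer in the future. */
    List<Report> onEditLoaded(long blockId, long storedGs) {
        List<Report> ready = new ArrayList<>();
        Queue<Report> q = byBlock.get(blockId);
        if (q == null) {
            return ready;
        }
        for (Iterator<Report> it = q.iterator(); it.hasNext();) {
            Report r = it.next();
            if (r.reportedGs <= storedGs) {
                ready.add(r);
                it.remove();
            }
        }
        if (q.isEmpty()) {
            byBlock.remove(blockId);
        }
        return ready;
    }

    /** Case 2: on transition to Active, everything still queued has to be processed. */
    List<Report> drainAll() {
        List<Report> all = new ArrayList<>();
        for (Queue<Report> q : byBlock.values()) {
            all.addAll(q);
        }
        byBlock.clear();
        return all;
    }

    /** Case 3: a DataNode was removed; discard its postponed reports. */
    void removeDataNode(String dataNodeId) {
        for (Iterator<Queue<Report>> it = byBlock.values().iterator(); it.hasNext();) {
            Queue<Report> q = it.next();
            q.removeIf(r -> r.dataNodeId.equals(dataNodeId));
            if (q.isEmpty()) {
                it.remove();
            }
        }
    }

    /** Number of reports still waiting. */
    int size() {
        int n = 0;
        for (Queue<Report> q : byBlock.values()) {
            n += q.size();
        }
        return n;
    }
}
```

In this model the cost of drainAll grows with the number of reports that were never released by case 1 or case 3, which is the work that lands on the transition path.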

As the number of corrupted replicas grows, the transition takes longer. In our case, there were 60 million block reports in pendingDNMessages before the transition. Processing them took almost 7 minutes, and the node was killed by zkfc. Most of these block reports had replica state RBW with a wrong GS (less than that of the stored block in the Standby node).

In my opinion, the Standby node could ignore block reports whose replica state is RBW with a wrong GS, because the Active node or the DataNode will remove those replicas later.
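
A minimal sketch of the proposed check, assuming "wrong GS" means the reported generation stamp is lower than the stored block's, as described above. StaleRbwFilterSketch and canSkip are hypothetical names, and the enum is a local stand-in for the HDFS replica states rather than the real class; where such a check would actually be wired into the NameNode is not specified here.

```java
/** Hypothetical predicate for the proposal; not part of HDFS. */
final class StaleRbwFilterSketch {

    /** Local stand-in for the replica states a DataNode can report. */
    enum ReplicaState { FINALIZED, RBW, RWR, RUR, TEMPORARY }

    private StaleRbwFilterSketch() {}

    /**
     * Returns true when a postponed report could be dropped instead of queued:
     * the replica is RBW and its GS is older than the stored block's GS.
     */
    static boolean canSkip(ReplicaState state, long reportedGs, long storedGs) {
        return state == ReplicaState.RBW && reportedGs < storedGs;
    }
}
```

For example, canSkip(ReplicaState.RBW, 4, 5) is true, while a FINALIZED replica with the same stamps, or an RBW replica whose GS is equal to or ahead of the stored one, is not skipped and would still be queued.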

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org