Posted to common-user@hadoop.apache.org by 麦树荣 <sh...@qunar.com> on 2015/04/08 04:40:03 UTC

Re: Why replication of Under-Replicated blocks in decommissioned datanodes is so slow

Could someone explain this? Thanks.

From: 麦树荣 [mailto:shurong.mai@qunar.com]
Sent: April 7, 2015 11:19
To: user@hadoop.apache.org
Cc: 山瑞峰
Subject: Why replication of Under-Replicated blocks in decommissioned datanodes is so slow

version: hadoop-2.2.0

There were 13 nodes in our HDFS cluster. We wanted to decommission 7 of them. We tried two methods, as follows:

Method 1:
At first, we set the dfs.hosts.exclude parameter and successfully decommissioned 7 nodes, which left many Under-Replicated blocks that needed to be replicated. However, after about 20 hours the replication still had not finished. We observed that the replication speed was very slow.
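
For readers unfamiliar with Method 1, decommissioning through dfs.hosts.exclude is typically configured along the following lines. This is only a minimal sketch, not the exact configuration used on this cluster: the exclude-file path and the hostnames in the comments are hypothetical placeholders.

```xml
<!-- hdfs-site.xml on the NameNode (sketch; the path below is an assumed placeholder) -->
<configuration>
  <property>
    <name>dfs.hosts.exclude</name>
    <!-- Hypothetical location of the exclude file -->
    <value>/etc/hadoop/conf/dfs.exclude</value>
  </property>
</configuration>

<!--
  The exclude file lists one datanode per line, for example
  (hypothetical hostnames):

    datanode07.example.com
    datanode08.example.com

  After editing the file, the NameNode is told to re-read it with:

    hdfs dfsadmin -refreshNodes
-->
```

Nodes listed in the exclude file then move through the decommissioning state while the NameNode schedules re-replication of their blocks.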

Method 2:
Later, we abandoned that method and instead stopped the datanodes one at a time. We stopped one datanode, waited until replication of that node's Under-Replicated blocks had finished, and then stopped the next one, until all 7 nodes were stopped. This took about 12 hours, and replication was clearly much faster than with method 1.

We expected method 1 to be faster than method 2, but in fact method 2 was much faster than method 1. Why?