Posted to common-dev@hadoop.apache.org by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org> on 2008/11/19 02:23:44 UTC

[jira] Commented: (HADOOP-4061) Large number of decommission freezes the Namenode

    [ https://issues.apache.org/jira/browse/HADOOP-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648866#action_12648866 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-4061:
------------------------------------------------

Instead of checking all blocks on all decommissioning nodes every 5 minutes, we should check a limited number of blocks at shorter intervals.  This would throttle the decommission work done in the Namenode.  A drawback is that decommission status updates may be delayed.
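
A rough sketch of the idea, not the actual FSNamesystem code: each round checks at most a fixed number of blocks and remembers where it stopped, so the next round resumes from there. The class name, the per-node block counts, and maxBlocksPerRound are all made up for illustration, and the real per-block replication check is only a comment here.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of a throttled decommission check.
 * Each call scans at most maxBlocksPerRound blocks and resumes
 * where the previous call stopped.
 */
public class ThrottledDecommissionCheck {
    // Block count of each decommissioning node (assumed stand-in for real node state).
    private final List<Integer> blocksPerNode;
    private final int maxBlocksPerRound;
    private int nodeIndex = 0;   // node to resume from
    private int blockOffset = 0; // next unchecked block within that node

    public ThrottledDecommissionCheck(List<Integer> blocksPerNode, int maxBlocksPerRound) {
        if (maxBlocksPerRound <= 0) {
            throw new IllegalArgumentException("maxBlocksPerRound must be positive");
        }
        for (int count : blocksPerNode) {
            if (count < 0) {
                throw new IllegalArgumentException("block counts must be non-negative");
            }
        }
        this.blocksPerNode = new ArrayList<>(blocksPerNode);
        this.maxBlocksPerRound = maxBlocksPerRound;
    }

    /** Checks up to maxBlocksPerRound blocks and returns how many were checked. */
    public synchronized int checkNextBatch() {
        int checked = 0;
        while (checked < maxBlocksPerRound && nodeIndex < blocksPerNode.size()) {
            int remainingInNode = blocksPerNode.get(nodeIndex) - blockOffset;
            int take = Math.min(remainingInNode, maxBlocksPerRound - checked);
            // A real implementation would examine the replication of each block here.
            checked += take;
            blockOffset += take;
            if (blockOffset >= blocksPerNode.get(nodeIndex)) {
                // Finished this node; move on to the next one.
                nodeIndex++;
                blockOffset = 0;
            }
        }
        return checked;
    }

    /** True once every node has been fully scanned in the current pass. */
    public synchronized boolean passComplete() {
        return nodeIndex >= blocksPerNode.size();
    }

    /** Starts a new pass from the first node. */
    public synchronized void restart() {
        nodeIndex = 0;
        blockOffset = 0;
    }
}
```

The monitor thread would call checkNextBatch() at a short interval instead of walking everything every 5 minutes, so the lock is held only for a bounded amount of work per round. The cost is the one mentioned above: a node's status is updated only after the scan has worked its way through all of its blocks.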

> Deleting A from Dx is correct because that is how NN handles excess replicas in general.

This is probably a good idea for improving performance.

> Large number of decommission freezes the Namenode
> -------------------------------------------------
>
>                 Key: HADOOP-4061
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4061
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.17.2
>            Reporter: Koji Noguchi
>
> On a 1900-node cluster, we tried decommissioning 400 nodes with 30k blocks each. The other 1500 nodes were almost empty.
> When decommissioning started, the namenode's queue overflowed every 6 minutes.
> CPU usage showed that every 5 minutes the org.apache.hadoop.dfs.FSNamesystem$DecommissionedMonitor thread took 100% of the CPU for 1 minute, causing the queue to overflow.
> {noformat}
>   public synchronized void decommissionedDatanodeCheck() {
>     for (Iterator<DatanodeDescriptor> it = datanodeMap.values().iterator();
>          it.hasNext();) {
>       DatanodeDescriptor node = it.next();
>       checkDecommissionStateInternal(node);
>     }
>   }
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.