Posted to common-dev@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2007/10/08 23:06:50 UTC
[jira] Commented: (HADOOP-1999) DataNodes can become dead nodes
when running 'dfsadmin finalizeUpgrade'
[ https://issues.apache.org/jira/browse/HADOOP-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533201 ]
Konstantin Shvachko commented on HADOOP-1999:
---------------------------------------------
Finalize removes the hard links previously created by upgrade. The removal is done in a separate thread, but if there are a lot of blocks,
then data-nodes are likely to be blocked on I/O, that is, data transmission will be slow. This is what you observed here.
A solution would be to remove the links lazily, e.g. remove 100 files per second or so. Then finalizing will go slower, but
the data-nodes will be able to proceed with normal activities.
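A minimal sketch of the lazy removal suggested above, in plain Java. This is not the data-node code: the class ThrottledDeleter, the method deleteThrottled and the parameter maxPerSecond are hypothetical names used only for illustration, and the one-second window is just one simple way to cap the rate.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of Hadoop: deletes files at a bounded rate
// so that other disk activity is not starved while the links are removed.
final class ThrottledDeleter {

    /**
     * Deletes the given files, at most maxPerSecond files per one-second window.
     * Returns the number of files that existed and were deleted.
     */
    static int deleteThrottled(List<Path> files, int maxPerSecond)
            throws IOException, InterruptedException {
        if (maxPerSecond <= 0) {
            throw new IllegalArgumentException("maxPerSecond must be positive");
        }
        int deleted = 0;
        int inWindow = 0;                      // deletes attempted in the current window
        long windowStart = System.nanoTime();
        for (Path file : files) {
            if (inWindow == maxPerSecond) {
                // Budget for this window is used up: wait out the rest of the second.
                long elapsedMs = (System.nanoTime() - windowStart) / 1_000_000L;
                if (elapsedMs < 1000L) {
                    Thread.sleep(1000L - elapsedMs);
                }
                windowStart = System.nanoTime();
                inWindow = 0;
            }
            if (Files.deleteIfExists(file)) {
                deleted++;
            }
            inWindow++;
        }
        return deleted;
    }
}
```

Calling ThrottledDeleter.deleteThrottled(links, 100) from the finalize thread would correspond to the "100 files per second or so" figure; the right value would have to be found by experiment.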
The jstack you attached: I do not see the data-node doing any file deletes. Are you sure this thread dump was taken
during finalize? I see that one of the threads is doing DU though. Could the slowdown be related to HADOOP-1946?
Before that was fixed I saw drastic slowdowns of data-nodes; some of them would become dead even under insignificant load.
Finalize would make things even worse.
Missing blocks: I suspect that you get these because many I/O operations did not complete. Some blocks were not replicated,
some files were not closed.
> DataNodes can become dead nodes when running 'dfsadmin finalizeUpgrade'
> -----------------------------------------------------------------------
>
> Key: HADOOP-1999
> URL: https://issues.apache.org/jira/browse/HADOOP-1999
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.15.0
> Environment: Sep 14 nightly build
> Reporter: Christian Kunz
> Priority: Critical
> Attachments: jstack.datanode
>
>
> I restarted namenode with -upgrade option, started a few scripts running hadoop command line utility to upload a few files into dfs, and ran at some time
> hadoop dfsadmin -finalizeUpgrade.
> At this time all the dfs clients I started before got stuck during block transmission.