Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/09/09 19:20:25 UTC
[Hadoop Wiki] Update of "FAQ" by SomeOtherAccount
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "FAQ" page has been changed by SomeOtherAccount.
http://wiki.apache.org/hadoop/FAQ?action=diff&rev1=72&rev2=73
--------------------------------------------------
<<BR>> <<Anchor(3)>> '''3. [[#A3|How well does Hadoop scale?]]'''
- Hadoop has been demonstrated on clusters of up to 2000 nodes. Sort performance on 900 nodes is good (sorting 9TB of data on 900 nodes takes around 1.8 hours) and [[attachment:sort900-20080115.png|improving]] using these non-default configuration values:
+ Hadoop has been demonstrated on clusters of up to 4000 nodes. Sort performance on 900 nodes is good (sorting 9TB of data takes around 1.8 hours) and [[attachment:sort900-20080115.png|improving]] using these non-default configuration values:
* `dfs.block.size = 134217728`
* `dfs.namenode.handler.count = 40`
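As a sketch, values like these would go into the cluster's site configuration file. The fragment below is illustrative (the file name and XML layout are assumptions for a Hadoop of this era; only the two property/value pairs above come from the FAQ):

```xml
<!-- Illustrative hadoop-site.xml fragment; property names and values
     are taken from the FAQ list above. -->
<property>
  <name>dfs.block.size</name>
  <value>134217728</value> <!-- 128 MB HDFS block size -->
</property>
<property>
  <name>dfs.namenode.handler.count</name>
  <value>40</value> <!-- more NameNode RPC handler threads -->
</property>
```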
@@ -276, +276 @@
It appears that DatanodeID.getHost() is the standard place to retrieve this name, and the machineName variable, populated in DataNode.java#startDataNode, is where the name is first set. The code first tries to read "slave.host.name" from the configuration; if that is not set, DNS.getDefaultHost is used instead.
+ <<BR>> <<Anchor(31)>> '''31. [[#A31|On an individual data node, how do you balance the blocks on the disk?]]'''
+
+ Hadoop currently has no way to do this automatically. To do it manually:
+
+ 1. Shut down HDFS
+ 2. Use the UNIX mv command to move each block file together with its paired .meta file from one directory to another on the same host
+ 3. Restart HDFS
+
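The steps above can be sketched as a shell session. The directory layout and block names below are purely illustrative (real data-directory paths and block IDs vary per cluster); the point is that a block file and its .meta file must move as a pair, with HDFS stopped:

```shell
# Hypothetical data directories on one datanode; real paths come from
# the dfs.data.dir configuration.
SRC=/tmp/dfs/data1/current
DST=/tmp/dfs/data2/current

mkdir -p "$SRC" "$DST"

# Simulate one block and its checksum metadata file (names are made up).
touch "$SRC/blk_1234567890" "$SRC/blk_1234567890_1001.meta"

# With HDFS stopped, move the block and its .meta pair together.
mv "$SRC/blk_1234567890" "$SRC/blk_1234567890_1001.meta" "$DST/"
```

After restarting HDFS, the datanode rescans its data directories and reports the blocks from their new locations.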