Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2010/09/09 19:20:25 UTC

[Hadoop Wiki] Update of "FAQ" by SomeOtherAccount

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "FAQ" page has been changed by SomeOtherAccount.
http://wiki.apache.org/hadoop/FAQ?action=diff&rev1=72&rev2=73

--------------------------------------------------

  
  <<BR>> <<Anchor(3)>> '''3. [[#A3|How well does Hadoop scale?]]'''
  
- Hadoop has been demonstrated on clusters of up to 2000 nodes.  Sort performance on 900 nodes is good (sorting 9TB of data on 900 nodes takes around 1.8 hours) and [[attachment:sort900-20080115.png|improving]] using these non-default configuration values:
+ Hadoop has been demonstrated on clusters of up to 4000 nodes.  Sort performance on 900 nodes is good (sorting 9TB of data on 900 nodes takes around 1.8 hours) and [[attachment:sort900-20080115.png|improving]] using these non-default configuration values:
  
   * `dfs.block.size = 134217728`
   * `dfs.namenode.handler.count = 40`
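  
  The values above would ordinarily be set in the cluster configuration file (e.g. hadoop-site.xml). Purely as a hypothetical illustration (the class and method names here are invented), the same keys can also be applied programmatically through the Configuration API:
  
  {{{
  import org.apache.hadoop.conf.Configuration;
  
  public class SortTuningSketch {
    /** Build a Configuration carrying the non-default values listed above. */
    public static Configuration tunedConf() {
      Configuration conf = new Configuration();
      // 128 MB HDFS block size (134217728 bytes).
      conf.setLong("dfs.block.size", 134217728L);
      // 40 NameNode handler threads.
      conf.setInt("dfs.namenode.handler.count", 40);
      return conf;
    }
  }
  }}}
  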
@@ -276, +276 @@

  
  It appears that DatanodeID.getHost() is the standard place to retrieve this name, and the machineName variable, populated in DataNode.java#startDataNode, is where the name is first set. The first method attempted is to get "slave.host.name" from the configuration; if that is not available, DNS.getDefaultHost is used instead.
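  
  Purely as an illustration of the lookup order described above (a simplified sketch, not the actual DataNode source; the class name and the "default" interface argument are assumptions):
  
  {{{
  import java.net.UnknownHostException;
  
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.net.DNS;
  
  public class HostNameSketch {
    /** Resolve the name a data node would report for itself. */
    static String resolveMachineName(Configuration conf) throws UnknownHostException {
      // First preference: an explicitly configured slave.host.name.
      String name = conf.get("slave.host.name");
      if (name == null) {
        // Otherwise fall back to a reverse DNS lookup on the default interface.
        name = DNS.getDefaultHost("default");
      }
      return name;
    }
  }
  }}}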
  
+ <<BR>> <<Anchor(31)>> '''31. [[#A31|On an individual data node, how do you balance the blocks on the disk?]]'''
+ 
+ Hadoop currently has no automated way to do this. To do it manually:
+ 
+  1. Shut down HDFS
+  2. Use the UNIX mv command to move the individual block files and their meta file pairs from one data directory to another on each host (see the sketch after this list for one way to script it)
+  3. Restart HDFS
+
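  
  A hypothetical Java sketch of step 2, for illustration only: the two data directory paths are invented, HDFS must already be stopped, only the top level of the source directory is scanned, and the plain UNIX mv command does the same job.
  
  {{{
  import java.io.IOException;
  import java.nio.file.DirectoryStream;
  import java.nio.file.Files;
  import java.nio.file.Path;
  import java.nio.file.Paths;
  
  public class MoveBlocksSketch {
    public static void main(String[] args) throws IOException {
      Path from = Paths.get("/data1/dfs/data/current");  // hypothetical source data directory
      Path to = Paths.get("/data2/dfs/data/current");    // hypothetical target data directory
      // The blk_* glob matches both the block files and their blk_*.meta companions.
      try (DirectoryStream<Path> blocks = Files.newDirectoryStream(from, "blk_*")) {
        for (Path block : blocks) {
          Files.move(block, to.resolve(block.getFileName()));
        }
      }
    }
  }
  }}}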