Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2016/01/20 07:15:24 UTC

[Hadoop Wiki] Update of "LargeClusterTips" by ArpitAgarwal

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "LargeClusterTips" page has been changed by ArpitAgarwal:
https://wiki.apache.org/hadoop/LargeClusterTips?action=diff&rev1=11&rev2=12

Comment:
Link to NameNode HA

   * Once you are on the private LAN, turn off all firewalls on the machines, as they only create connectivity problems.
   * Use LDAP or similar to manage user accounts.
   * Only put the slaves file on your namenode and secondary namenode to prevent confusion.
- 
   * Use RPMs to install the Hadoop binaries. [[Cloudera]] provide some RPMs for this, and a web site to generate configuration RPM files.
   * Use kickstart or similar to bring up the machines. 
   * If you are trying to configure the machines one by one, step away from the keyboard. That is not the way to manage a cluster.
@@ -34, +33 @@

  == NameNode Health ==
  
  The NameNode is a SPOF. When it goes offline, the cluster goes down. If it loses its data, the filesystem is gone. Value it.
-  * Have a secondary name node! When the BackupNode replaces this, have a BackupNode!
+  * [[Configure NameNode High-Availability|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html]]
   * Never let its disks fill up.
   * Consider RAID storage here. If not, set it to save its data to two independent drives, ideally on separate controllers (just in case a controller decides to play up).
   * Set the NN up to save one copy of all its data to a remote machine (NFS?), so even if the NN goes down, you can bring up a new machine with the same hostname for everything else to bind to.
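
As a rough illustration of the last two tips, the fragment below sketches how the NameNode might be told to write its metadata to two local drives plus one remote mount. It assumes a Hadoop 2.x hdfs-site.xml, where the property is named dfs.namenode.name.dir (older releases call it dfs.name.dir). The three paths are hypothetical placeholders, not values from the wiki page; substitute whatever mount points your own machines use.

```xml
<!-- hdfs-site.xml fragment: a sketch only; every path here is a hypothetical example -->
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <!-- Comma-separated list; the NameNode writes a full copy of its metadata to each entry.
         /data/1 and /data/2 stand in for two independent local drives,
         /mnt/nfs for a directory mounted from a remote machine. -->
    <value>/data/1/dfs/nn,/data/2/dfs/nn,/mnt/nfs/dfs/nn</value>
  </property>
</configuration>
```

Because each listed directory receives the same data, losing one drive or controller should still leave a usable copy elsewhere, and the remote copy is what lets you bring up a replacement machine under the same hostname.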