Posted to common-commits@hadoop.apache.org by cm...@apache.org on 2015/02/25 01:34:39 UTC

[09/11] hadoop git commit: HDFS-7668. Backport "Convert site documentation from apt to markdown" to branch-2 (Masatake Iwasaki via Colin P. McCabe)

http://git-wip-us.apache.org/repos/asf/hadoop/blob/5c0073e7/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm b/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm
deleted file mode 100644
index ff6a42c..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm
+++ /dev/null
@@ -1,816 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Distributed File System-${project.version} - High Availability
-  ---
-  ---
-  ${maven.build.timestamp}
-
-HDFS High Availability Using the Quorum Journal Manager
-
-%{toc|section=1|fromDepth=0}
-
-* {Purpose}
-
-  This guide provides an overview of the HDFS High Availability (HA) feature
-  and how to configure and manage an HA HDFS cluster, using the Quorum Journal
-  Manager (QJM) feature.
- 
-  This document assumes that the reader has a general understanding of the
-  components and node types in an HDFS cluster. Please refer to the HDFS
-  Architecture guide for details.
-
-* {Note: Using the Quorum Journal Manager or Conventional Shared Storage}
-
-  This guide discusses how to configure and use HDFS HA using the Quorum
-  Journal Manager (QJM) to share edit logs between the Active and Standby
-  NameNodes. For information on how to configure HDFS HA using NFS for shared
-  storage instead of the QJM, please see
-  {{{./HDFSHighAvailabilityWithNFS.html}this alternative guide.}}
-
-* {Background}
-
-  Prior to Hadoop 2.0.0, the NameNode was a single point of failure (SPOF) in
-  an HDFS cluster. Each cluster had a single NameNode, and if that machine or
-  process became unavailable, the cluster as a whole would be unavailable
-  until the NameNode was either restarted or brought up on a separate machine.
-  
-  This impacted the total availability of the HDFS cluster in two major ways:
-
-    * In the case of an unplanned event such as a machine crash, the cluster would
-      be unavailable until an operator restarted the NameNode.
-
-    * Planned maintenance events such as software or hardware upgrades on the
-      NameNode machine would result in windows of cluster downtime.
-  
-  The HDFS High Availability feature addresses the above problems by providing
-  the option of running two redundant NameNodes in the same cluster in an
-  Active/Passive configuration with a hot standby. This allows a fast failover to
-  a new NameNode in the case that a machine crashes, or a graceful
-  administrator-initiated failover for the purpose of planned maintenance.
-
-* {Architecture}
-
-  In a typical HA cluster, two separate machines are configured as NameNodes.
-  At any point in time, exactly one of the NameNodes is in an <Active> state,
-  and the other is in a <Standby> state. The Active NameNode is responsible
-  for all client operations in the cluster, while the Standby is simply acting
-  as a slave, maintaining enough state to provide a fast failover if
-  necessary.
-  
-  In order for the Standby node to keep its state synchronized with the Active
-  node, both nodes communicate with a group of separate daemons called
-  "JournalNodes" (JNs). When any namespace modification is performed by the
-  Active node, it durably logs a record of the modification to a majority of
-  these JNs. The Standby node is capable of reading the edits from the JNs, and
-  is constantly watching them for changes to the edit log. As the Standby Node
-  sees the edits, it applies them to its own namespace. In the event of a
-  failover, the Standby will ensure that it has read all of the edits from the
-  JournalNodes before promoting itself to the Active state. This ensures that the
-  namespace state is fully synchronized before a failover occurs.
-  
-  In order to provide a fast failover, it is also necessary that the Standby node
-  have up-to-date information regarding the location of blocks in the cluster.
-  In order to achieve this, the DataNodes are configured with the location of
-  both NameNodes, and send block location information and heartbeats to both.
-  
-  It is vital for the correct operation of an HA cluster that only one of the
-  NameNodes be Active at a time. Otherwise, the namespace state would quickly
-  diverge between the two, risking data loss or other incorrect results.  In
-  order to ensure this property and prevent the so-called "split-brain scenario,"
-  the JournalNodes will only ever allow a single NameNode to be a writer at a
-  time. During a failover, the NameNode which is to become active will simply
-  take over the role of writing to the JournalNodes, which will effectively
-  prevent the other NameNode from continuing in the Active state, allowing the
-  new Active to safely proceed with failover.
-
-* {Hardware resources}
-
-  In order to deploy an HA cluster, you should prepare the following:
-
-    * <<NameNode machines>> - the machines on which you run the Active and
-    Standby NameNodes should have equivalent hardware to each other, and
-    equivalent hardware to what would be used in a non-HA cluster.
-
-    * <<JournalNode machines>> - the machines on which you run the JournalNodes.
-    The JournalNode daemon is relatively lightweight, so these daemons may
-    reasonably be collocated on machines with other Hadoop daemons, for example
-    NameNodes, the JobTracker, or the YARN ResourceManager. <<Note:>> There
-    must be at least 3 JournalNode daemons, since edit log modifications must be
-    written to a majority of JNs. This will allow the system to tolerate the
-    failure of a single machine. You may also run more than 3 JournalNodes, but
-    in order to actually increase the number of failures the system can tolerate,
-    you should run an odd number of JNs (i.e. 3, 5, 7, etc.). Note that when
-    running with N JournalNodes, the system can tolerate at most (N - 1) / 2
-    failures and continue to function normally.
-  
-  Note that, in an HA cluster, the Standby NameNode also performs checkpoints of
-  the namespace state, and thus it is not necessary to run a Secondary NameNode,
-  CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an
-  error. This also allows one who is reconfiguring a non-HA-enabled HDFS cluster
-  to be HA-enabled to reuse the hardware which they had previously dedicated to
-  the Secondary NameNode.
-
-* {Deployment}
-
-** Configuration overview
-
-  Similar to Federation configuration, HA configuration is backward compatible
-  and allows existing single NameNode configurations to work without change.
-  The new configuration is designed such that all the nodes in the cluster may
-  have the same configuration without the need for deploying different
-  configuration files to different machines based on the type of the node.
- 
-  Like HDFS Federation, HA clusters reuse the <<<nameservice ID>>> to identify a
-  single HDFS instance that may in fact consist of multiple HA NameNodes. In
-  addition, a new abstraction called <<<NameNode ID>>> is added with HA. Each
-  distinct NameNode in the cluster has a different NameNode ID to distinguish it.
-  To support a single configuration file for all of the NameNodes, the relevant
-  configuration parameters are suffixed with the <<nameservice ID>> as well as
-  the <<NameNode ID>>.
-
-** Configuration details
-
-  To configure HA NameNodes, you must add several configuration options to your
-  <<hdfs-site.xml>> configuration file.
-
-  The order in which you set these configurations is unimportant, but the values
-  you choose for <<dfs.nameservices>> and
-  <<dfs.ha.namenodes.[nameservice ID]>> will determine the keys of those that
-  follow. Thus, you should decide on these values before setting the rest of the
-  configuration options.
-
-  * <<dfs.nameservices>> - the logical name for this new nameservice
-
-    Choose a logical name for this nameservice, for example "mycluster", and use
-    this logical name for the value of this config option. The name you choose is
-    arbitrary. It will be used both for configuration and as the authority
-    component of absolute HDFS paths in the cluster.
-
-    <<Note:>> If you are also using HDFS Federation, this configuration setting
-    should also include the list of other nameservices, HA or otherwise, as a
-    comma-separated list.
-
-----
-<property>
-  <name>dfs.nameservices</name>
-  <value>mycluster</value>
-</property>
-----
-
-  * <<dfs.ha.namenodes.[nameservice ID]>> - unique identifiers for each NameNode in the nameservice
-
-    Configure with a list of comma-separated NameNode IDs. This will be used by
-    DataNodes to determine all the NameNodes in the cluster. For example, if you
-    used "mycluster" as the nameservice ID previously, and you wanted to use "nn1"
-    and "nn2" as the individual IDs of the NameNodes, you would configure this as
-    such:
-
-----
-<property>
-  <name>dfs.ha.namenodes.mycluster</name>
-  <value>nn1,nn2</value>
-</property>
-----
-
-    <<Note:>> Currently, only a maximum of two NameNodes may be configured per
-    nameservice.
-
-  * <<dfs.namenode.rpc-address.[nameservice ID].[name node ID]>> - the fully-qualified RPC address for each NameNode to listen on
-
-    For both of the previously-configured NameNode IDs, set the full address and
-    IPC port of the NameNode process. Note that this results in two separate
-    configuration options. For example:
-
-----
-<property>
-  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
-  <value>machine1.example.com:8020</value>
-</property>
-<property>
-  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
-  <value>machine2.example.com:8020</value>
-</property>
-----
-
-    <<Note:>> You may similarly configure the "<<servicerpc-address>>" setting if
-    you so desire.
-
-  * <<dfs.namenode.http-address.[nameservice ID].[name node ID]>> - the fully-qualified HTTP address for each NameNode to listen on
-
-    Similarly to <rpc-address> above, set the addresses for both NameNodes' HTTP
-    servers to listen on. For example:
-
-----
-<property>
-  <name>dfs.namenode.http-address.mycluster.nn1</name>
-  <value>machine1.example.com:50070</value>
-</property>
-<property>
-  <name>dfs.namenode.http-address.mycluster.nn2</name>
-  <value>machine2.example.com:50070</value>
-</property>
-----
-
-    <<Note:>> If you have Hadoop's security features enabled, you should also set
-    the <https-address> similarly for each NameNode.
-
-  * <<dfs.namenode.shared.edits.dir>> - the URI which identifies the group of JNs where the NameNodes will write/read edits
-
-    This is where one configures the addresses of the JournalNodes which provide
-    the shared edits storage, written to by the Active NameNode and read by the
-    Standby NameNode to stay up-to-date with all the file system changes the Active
-    NameNode makes. Though you must specify several JournalNode addresses,
-    <<you should only configure one of these URIs.>> The URI should be of the form:
-    "qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId>". The Journal
-    ID is a unique identifier for this nameservice, which allows a single set of
-    JournalNodes to provide storage for multiple federated namesystems. Though not
-    a requirement, it's a good idea to reuse the nameservice ID for the journal
-    identifier.
-
-    For example, if the JournalNodes for this cluster were running on the
-    machines "node1.example.com", "node2.example.com", and "node3.example.com" and
-    the nameservice ID were "mycluster", you would use the following as the value
-    for this setting (the default port for the JournalNode is 8485):
-
-----
-<property>
-  <name>dfs.namenode.shared.edits.dir</name>
-  <value>qjournal://node1.example.com:8485;node2.example.com:8485;node3.example.com:8485/mycluster</value>
-</property>
-----
-
-  * <<dfs.client.failover.proxy.provider.[nameservice ID]>> - the Java class that HDFS clients use to contact the Active NameNode
-
-    Configure the name of the Java class which will be used by the DFS Client to
-    determine which NameNode is the current Active, and therefore which NameNode is
-    currently serving client requests. The only implementation which currently
-    ships with Hadoop is the <<ConfiguredFailoverProxyProvider>>, so use this
-    unless you are using a custom one. For example:
-
-----
-<property>
-  <name>dfs.client.failover.proxy.provider.mycluster</name>
-  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
-</property>
-----
-
-  * <<dfs.ha.fencing.methods>> - a list of scripts or Java classes which will be used to fence the Active NameNode during a failover
-
-    It is desirable for correctness of the system that only one NameNode be in
-    the Active state at any given time. <<Importantly, when using the Quorum
-    Journal Manager, only one NameNode will ever be allowed to write to the
-    JournalNodes, so there is no potential for corrupting the file system metadata
-    from a split-brain scenario.>> However, when a failover occurs, it is still
-    possible that the previous Active NameNode could serve stale read requests
-    to clients until it shuts down when it tries to write to the JournalNodes.
-    For this reason, it is still desirable to configure
-    some fencing methods even when using the Quorum Journal Manager. However, to
-    improve the availability of the system in the event the fencing mechanisms
-    fail, it is advisable to configure a fencing method which is guaranteed to
-    return success as the last fencing method in the list. Note that if you choose
-    to use no actual fencing methods, you still must configure something for this
-    setting, for example "<<<shell(/bin/true)>>>".
-
-    The fencing methods used during a failover are configured as a
-    carriage-return-separated list, which will be attempted in order until one
-    indicates that fencing has succeeded. There are two methods which ship with
-    Hadoop: <shell> and <sshfence>. For information on implementing your own custom
-    fencing method, see the <org.apache.hadoop.ha.NodeFencer> class.
-
-    * <<sshfence>> - SSH to the Active NameNode and kill the process
-
-      The <sshfence> option SSHes to the target node and uses <fuser> to kill the
-      process listening on the service's TCP port. In order for this fencing option
-      to work, it must be able to SSH to the target node without providing a
-      passphrase. Thus, one must also configure the
-      <<dfs.ha.fencing.ssh.private-key-files>> option, which is a
-      comma-separated list of SSH private key files. For example:
-
----
-<property>
-  <name>dfs.ha.fencing.methods</name>
-  <value>sshfence</value>
-</property>
-
-<property>
-  <name>dfs.ha.fencing.ssh.private-key-files</name>
-  <value>/home/exampleuser/.ssh/id_rsa</value>
-</property>
----
-
-      Optionally, one may configure a non-standard username or port to perform the
-      SSH. One may also configure a timeout, in milliseconds, for the SSH, after
-      which this fencing method will be considered to have failed. It may be
-      configured like so:
-
----
-<property>
-  <name>dfs.ha.fencing.methods</name>
-  <value>sshfence([[username][:port]])</value>
-</property>
-<property>
-  <name>dfs.ha.fencing.ssh.connect-timeout</name>
-  <value>30000</value>
-</property>
----
-
-    * <<shell>> - run an arbitrary shell command to fence the Active NameNode
-
-      The <shell> fencing method runs an arbitrary shell command. It may be
-      configured like so:
-
----
-<property>
-  <name>dfs.ha.fencing.methods</name>
-  <value>shell(/path/to/my/script.sh arg1 arg2 ...)</value>
-</property>
----
-
-      The string between '(' and ')' is passed directly to a bash shell and may not
-      include any closing parentheses.
-
-      The shell command will be run with an environment set up to contain all of the
-      current Hadoop configuration variables, with the '_' character replacing any
-      '.' characters in the configuration keys. The configuration used has already had
-      any namenode-specific configurations promoted to their generic forms -- for example
-      <<dfs_namenode_rpc-address>> will contain the RPC address of the target node, even
-      though the configuration may specify that variable as
-      <<dfs.namenode.rpc-address.ns1.nn1>>.
-      
-      Additionally, the following variables referring to the target node to be fenced
-      are also available:
-
-*-----------------------:-----------------------------------+
-| $target_host          | hostname of the node to be fenced |
-*-----------------------:-----------------------------------+
-| $target_port          | IPC port of the node to be fenced |
-*-----------------------:-----------------------------------+
-| $target_address       | the above two, combined as host:port |
-*-----------------------:-----------------------------------+
-| $target_nameserviceid | the nameservice ID of the NN to be fenced |
-*-----------------------:-----------------------------------+
-| $target_namenodeid    | the namenode ID of the NN to be fenced |
-*-----------------------:-----------------------------------+
-      
-      These environment variables may also be used as substitutions in the shell
-      command itself. For example:
-
----
-<property>
-  <name>dfs.ha.fencing.methods</name>
-  <value>shell(/path/to/my/script.sh --nameservice=$target_nameserviceid $target_host:$target_port)</value>
-</property>
----
-      
-      If the shell command returns an exit
-      code of 0, the fencing is determined to be successful. If it returns any other
-      exit code, the fencing was not successful and the next fencing method in the
-      list will be attempted.
-
-      <<Note:>> This fencing method does not implement any timeout. If timeouts are
-      necessary, they should be implemented in the shell script itself (e.g. by forking
-      a subshell to kill its parent in some number of seconds).
-
-  * <<fs.defaultFS>> - the default path prefix used by the Hadoop FS client when none is given
-
-    Optionally, you may now configure the default path for Hadoop clients to use
-    the new HA-enabled logical URI. If you used "mycluster" as the nameservice ID
-    earlier, this will be the value of the authority portion of all of your HDFS
-    paths. This may be configured like so, in your <<core-site.xml>> file:
-
----
-<property>
-  <name>fs.defaultFS</name>
-  <value>hdfs://mycluster</value>
-</property>
----
-
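-    With this in place, client paths resolve against the HA nameservice rather
-    than a single NameNode host. For example (a sketch, assuming the cluster
-    is running), the following two commands are equivalent:
-
------
-$ hdfs dfs -ls /
-$ hdfs dfs -ls hdfs://mycluster/
------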
-
-  * <<dfs.journalnode.edits.dir>> - the path where the JournalNode daemon will store its local state
-
-    This is the absolute path on the JournalNode machines where the edits and
-    other local state used by the JNs will be stored. You may only use a single
-    path for this configuration. Redundancy for this data is provided by running
-    multiple separate JournalNodes, or by configuring this directory on a
-    locally-attached RAID array. For example:
-
----
-<property>
-  <name>dfs.journalnode.edits.dir</name>
-  <value>/path/to/journal/node/local/data</value>
-</property>
----
-
-** Deployment details
-
-  After all of the necessary configuration options have been set, you must
-  start the JournalNode daemons on the set of machines where they will run. This
-  can be done by running the command "<hadoop-daemon.sh start journalnode>" and
-  waiting for the daemon to start on each of the relevant machines.
-
-  Once the JournalNodes have been started, one must initially synchronize the
-  two HA NameNodes' on-disk metadata.
-
-    * If you are setting up a fresh HDFS cluster, you should first run the format
-    command (<hdfs namenode -format>) on one of NameNodes.
-  
-    * If you have already formatted the NameNode, or are converting a
-    non-HA-enabled cluster to be HA-enabled, you should now copy over the
-    contents of your NameNode metadata directories to the other, unformatted
-    NameNode by running the command "<hdfs namenode -bootstrapStandby>" on the
-    unformatted NameNode. Running this command will also ensure that the
-    JournalNodes (as configured by <<dfs.namenode.shared.edits.dir>>) contain
-    sufficient edits transactions to be able to start both NameNodes.
-  
-    * If you are converting a non-HA NameNode to be HA, you should run the
-    command "<hdfs -initializeSharedEdits>", which will initialize the
-    JournalNodes with the edits data from the local NameNode edits directories.
-
-  At this point you may start both of your HA NameNodes as you normally would
-  start a NameNode.
-
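-  As a rough sketch, and assuming the example hosts and NameNode IDs used
-  above (machine1.example.com as "nn1" and machine2.example.com as "nn2"),
-  a first-time bring-up of a fresh HA cluster might look like the following;
-  exact script locations may vary with how Hadoop was installed:
-
------
-# on each JournalNode machine
-$ hadoop-daemon.sh start journalnode
-
-# on machine1.example.com (nn1), fresh cluster only
-$ hdfs namenode -format
-$ hadoop-daemon.sh start namenode
-
-# on machine2.example.com (nn2)
-$ hdfs namenode -bootstrapStandby
-$ hadoop-daemon.sh start namenode
------
-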
-  You can visit each of the NameNodes' web pages separately by browsing to their
-  configured HTTP addresses. You should notice that next to the configured
-  address will be the HA state of the NameNode (either "standby" or "active").
-  Whenever an HA NameNode starts, it is initially in the Standby state.
-
-** Administrative commands
-
-  Now that your HA NameNodes are configured and started, you will have access
-  to some additional commands to administer your HA HDFS cluster. Specifically,
-  you should familiarize yourself with all of the subcommands of the "<hdfs
-  haadmin>" command. Running this command without any additional arguments will
-  display the following usage information:
-
----
-Usage: DFSHAAdmin [-ns <nameserviceId>]
-    [-transitionToActive <serviceId>]
-    [-transitionToStandby <serviceId>]
-    [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
-    [-getServiceState <serviceId>]
-    [-checkHealth <serviceId>]
-    [-help <command>]
----
-
-  This guide describes high-level uses of each of these subcommands. For
-  specific usage information of each subcommand, you should run "<hdfs haadmin
-  -help <command>>".
-
-  * <<transitionToActive>> and <<transitionToStandby>> - transition the state of the given NameNode to Active or Standby
-
-    These subcommands cause a given NameNode to transition to the Active or Standby
-    state, respectively. <<These commands do not attempt to perform any fencing,
-    and thus should rarely be used.>> Instead, one should almost always prefer to
-    use the "<hdfs haadmin -failover>" subcommand.
-
-  * <<failover>> - initiate a failover between two NameNodes
-
-    This subcommand causes a failover from the first provided NameNode to the
-    second. If the first NameNode is in the Standby state, this command simply
-    transitions the second to the Active state without error. If the first NameNode
-    is in the Active state, an attempt will be made to gracefully transition it to
-    the Standby state. If this fails, the fencing methods (as configured by
-    <<dfs.ha.fencing.methods>>) will be attempted in order until one
-    succeeds. Only after this process will the second NameNode be transitioned to
-    the Active state. If no fencing method succeeds, the second NameNode will not
-    be transitioned to the Active state, and an error will be returned.
-
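-    For example, assuming the NameNode IDs "nn1" and "nn2" configured earlier,
-    the following would attempt a graceful failover from nn1 to nn2, falling
-    back to the configured fencing methods if the graceful transition fails:
-
------
-$ hdfs haadmin -failover nn1 nn2
------
-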
-  * <<getServiceState>> - determine whether the given NameNode is Active or Standby
-
-    Connect to the provided NameNode to determine its current state, printing
-    either "standby" or "active" to STDOUT appropriately. This subcommand might be
-    used by cron jobs or monitoring scripts which need to behave differently based
-    on whether the NameNode is currently Active or Standby.
-
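-    For example, a monitoring script might check the state of "nn1" (the
-    NameNode ID from the examples above) like so; the command prints the
-    state to standard output:
-
------
-$ hdfs haadmin -getServiceState nn1
-active
------
-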
-  * <<checkHealth>> - check the health of the given NameNode
-
-    Connect to the provided NameNode to check its health. The NameNode is capable
-    of performing some diagnostics on itself, including checking if internal
-    services are running as expected. This command will return 0 if the NameNode is
-    healthy, non-zero otherwise. One might use this command for monitoring
-    purposes.
-
-    <<Note:>> This is not yet implemented, and at present will always return
-    success, unless the given NameNode is completely down.
-
-* {Automatic Failover}
-
-** Introduction
-
-  The above sections describe how to configure manual failover. In that mode,
-  the system will not automatically trigger a failover from the active to the
-  standby NameNode, even if the active node has failed. This section describes
-  how to configure and deploy automatic failover.
-
-** Components
-
-  Automatic failover adds two new components to an HDFS deployment: a ZooKeeper
-  quorum, and the ZKFailoverController process (abbreviated as ZKFC).
-
-  Apache ZooKeeper is a highly available service for maintaining small amounts
-  of coordination data, notifying clients of changes in that data, and
-  monitoring clients for failures. The implementation of automatic HDFS failover
-  relies on ZooKeeper for the following things:
-  
-    * <<Failure detection>> - each of the NameNode machines in the cluster
-    maintains a persistent session in ZooKeeper. If the machine crashes, the
-    ZooKeeper session will expire, notifying the other NameNode that a failover
-    should be triggered.
-
-    * <<Active NameNode election>> - ZooKeeper provides a simple mechanism to
-    exclusively elect a node as active. If the current active NameNode crashes,
-    another node may take a special exclusive lock in ZooKeeper indicating that
-    it should become the next active.
-
-  The ZKFailoverController (ZKFC) is a new component: a ZooKeeper client that
-  also monitors and manages the state of the NameNode. Each of the
-  machines which runs a NameNode also runs a ZKFC, and that ZKFC is responsible
-  for:
-
-    * <<Health monitoring>> - the ZKFC pings its local NameNode on a periodic
-    basis with a health-check command. So long as the NameNode responds in a
-    timely fashion with a healthy status, the ZKFC considers the node
-    healthy. If the node has crashed, frozen, or otherwise entered an unhealthy
-    state, the health monitor will mark it as unhealthy.
-
-    * <<ZooKeeper session management>> - when the local NameNode is healthy, the
-    ZKFC holds a session open in ZooKeeper. If the local NameNode is active, it
-    also holds a special "lock" znode. This lock uses ZooKeeper's support for
-    "ephemeral" nodes; if the session expires, the lock node will be
-    automatically deleted.
-
-    * <<ZooKeeper-based election>> - if the local NameNode is healthy, and the
-    ZKFC sees that no other node currently holds the lock znode, it will itself
-    try to acquire the lock. If it succeeds, then it has "won the election", and
-    is responsible for running a failover to make its local NameNode active. The
-    failover process is similar to the manual failover described above: first,
-    the previous active is fenced if necessary, and then the local NameNode
-    transitions to active state.
-
-  For more details on the design of automatic failover, refer to the design
-  document attached to HDFS-2185 on the Apache HDFS JIRA.
-
-** Deploying ZooKeeper
-
-  In a typical deployment, ZooKeeper daemons are configured to run on three or
-  five nodes. Since ZooKeeper itself has light resource requirements, it is
-  acceptable to collocate the ZooKeeper nodes on the same hardware as the HDFS
-  NameNode and Standby Node. Many operators choose to deploy the third ZooKeeper
-  process on the same node as the YARN ResourceManager. It is advisable to
-  configure the ZooKeeper nodes to store their data on separate disk drives from
-  the HDFS metadata for best performance and isolation.
-
-  The setup of ZooKeeper is out of scope for this document. We will assume that
-  you have set up a ZooKeeper cluster running on three or more nodes, and have
-  verified its correct operation by connecting using the ZK CLI.
-
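-  For example, a quick sanity check (assuming a standard ZooKeeper
-  installation providing the <<<zkCli.sh>>> script, and the example quorum
-  hosts used later in this document) might look like:
-
------
-$ zkCli.sh -server zk1.example.com:2181
-[zk: zk1.example.com:2181(CONNECTED) 0] ls /
------
-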
-** Before you begin
-
-  Before you begin configuring automatic failover, you should shut down your
-  cluster. It is not currently possible to transition from a manual failover
-  setup to an automatic failover setup while the cluster is running.
-
-** Configuring automatic failover
-
-  The configuration of automatic failover requires the addition of two new
-  parameters to your configuration. In your <<<hdfs-site.xml>>> file, add:
-
-----
- <property>
-   <name>dfs.ha.automatic-failover.enabled</name>
-   <value>true</value>
- </property>
-----
-
-  This specifies that the cluster should be set up for automatic failover.
-  In your <<<core-site.xml>>> file, add:
-
-----
- <property>
-   <name>ha.zookeeper.quorum</name>
-   <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
- </property>
-----
-
-  This lists the host-port pairs running the ZooKeeper service.
-
-  As with the parameters described earlier in the document, these settings may
-  be configured on a per-nameservice basis by suffixing the configuration key
-  with the nameservice ID. For example, in a cluster with federation enabled,
-  you can explicitly enable automatic failover for only one of the nameservices
-  by setting <<<dfs.ha.automatic-failover.enabled.my-nameservice-id>>>.
-
-  There are also several other configuration parameters which may be set to
-  control the behavior of automatic failover; however, they are not necessary
-  for most installations. Please refer to the configuration key specific
-  documentation for details.
-
-** Initializing HA state in ZooKeeper
-
-  After the configuration keys have been added, the next step is to initialize
-  required state in ZooKeeper. You can do so by running the following command
-  from one of the NameNode hosts.
-
-----
-$ hdfs zkfc -formatZK
-----
-
-  This will create a znode in ZooKeeper inside of which the automatic failover
-  system stores its data.
-
-** Starting the cluster with <<<start-dfs.sh>>>
-
-  Since automatic failover has been enabled in the configuration, the
-  <<<start-dfs.sh>>> script will now automatically start a ZKFC daemon on any
-  machine that runs a NameNode. When the ZKFCs start, they will automatically
-  select one of the NameNodes to become active.
-
-** Starting the cluster manually
-
-  If you manually manage the services on your cluster, you will need to manually
-  start the <<<zkfc>>> daemon on each of the machines that runs a NameNode. You
-  can start the daemon by running:
-
-----
-$ hadoop-daemon.sh start zkfc
-----
-
-** Securing access to ZooKeeper
-
-  If you are running a secure cluster, you will likely want to ensure that the
-  information stored in ZooKeeper is also secured. This prevents malicious
-  clients from modifying the metadata in ZooKeeper or potentially triggering a
-  false failover.
-
-  In order to secure the information in ZooKeeper, first add the following to
-  your <<<core-site.xml>>> file:
-
-----
- <property>
-   <name>ha.zookeeper.auth</name>
-   <value>@/path/to/zk-auth.txt</value>
- </property>
- <property>
-   <name>ha.zookeeper.acl</name>
-   <value>@/path/to/zk-acl.txt</value>
- </property>
-----
-
-  Please note the '@' character in these values -- this specifies that the
-  configurations are not inline, but rather point to a file on disk.
-
-  The first configured file specifies a list of ZooKeeper authentications, in
-  the same format as used by the ZK CLI. For example, you may specify something
-  like:
-
-----
-digest:hdfs-zkfcs:mypassword
-----
-  ...where <<<hdfs-zkfcs>>> is a unique username for ZooKeeper, and
-  <<<mypassword>>> is some unique string used as a password.
-
-  Next, generate a ZooKeeper ACL that corresponds to this authentication, using
-  a command like the following:
-
-----
-$ java -cp $ZK_HOME/lib/*:$ZK_HOME/zookeeper-3.4.2.jar org.apache.zookeeper.server.auth.DigestAuthenticationProvider hdfs-zkfcs:mypassword
-output: hdfs-zkfcs:mypassword->hdfs-zkfcs:P/OQvnYyU/nF/mGYvB/xurX8dYs=
-----
-
-  Copy and paste the section of this output after the '->' string into the file
-  <<<zk-acl.txt>>>, prefixed by the string "<<<digest:>>>". For example:
-
-----
-digest:hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=:rwcda
-----
-
-  In order for these ACLs to take effect, you should then rerun the
-  <<<zkfc -formatZK>>> command as described above.
-
-  After doing so, you may verify the ACLs from the ZK CLI as follows:
-
-----
-[zk: localhost:2181(CONNECTED) 1] getAcl /hadoop-ha
-'digest,'hdfs-zkfcs:vlUvLnd8MlacsE80rDuu6ONESbM=
-: cdrwa
-----
-
-** Verifying automatic failover
-
-  Once automatic failover has been set up, you should test its operation. To do
-  so, first locate the active NameNode. You can tell which node is active by
-  visiting the NameNode web interfaces -- each node reports its HA state at the
-  top of the page.
-
-  Once you have located your active NameNode, you may cause a failure on that
-  node.  For example, you can use <<<kill -9 <pid of NN>>>> to simulate a JVM
-  crash. Or, you could power cycle the machine or unplug its network interface
-  to simulate a different kind of outage.  After triggering the outage you wish
-  to test, the other NameNode should automatically become active within several
-  seconds. The amount of time required to detect a failure and trigger a
-  fail-over depends on the configuration of
-  <<<ha.zookeeper.session-timeout.ms>>>, but defaults to 5 seconds.
-
-  If the test does not succeed, you may have a misconfiguration. Check the logs
-  for the <<<zkfc>>> daemons as well as the NameNode daemons in order to further
-  diagnose the issue.
-
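-  For example, a minimal command-line check (using the NameNode IDs "nn1" and
-  "nn2" from the examples above) might look like:
-
------
-$ hdfs haadmin -getServiceState nn1   # reports "active"
-$ hdfs haadmin -getServiceState nn2   # reports "standby"
-# ... kill the NameNode process on the nn1 machine, wait a few seconds ...
-$ hdfs haadmin -getServiceState nn2   # should now report "active"
------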
-
-* Automatic Failover FAQ
-
-  * <<Is it important that I start the ZKFC and NameNode daemons in any
-    particular order?>>
-
-  No. On any given node you may start the ZKFC before or after its corresponding
-  NameNode.
-
-  * <<What additional monitoring should I put in place?>>
-
-  You should add monitoring on each host that runs a NameNode to ensure that the
-  ZKFC remains running. In some types of ZooKeeper failures, for example, the
-  ZKFC may unexpectedly exit, and should be restarted to ensure that the system
-  is ready for automatic failover.
-
-  Additionally, you should monitor each of the servers in the ZooKeeper
-  quorum. If ZooKeeper crashes, then automatic failover will not function.
-
-  * <<What happens if ZooKeeper goes down?>>
-
-  If the ZooKeeper cluster crashes, no automatic failovers will be triggered.
-  However, HDFS will continue to run without any impact. When ZooKeeper is
-  restarted, HDFS will reconnect with no issues.
-
-  * <<Can I designate one of my NameNodes as primary/preferred?>>
-
-  No. Currently, this is not supported. Whichever NameNode is started first will
-  become active. You may choose to start the cluster in a specific order such
-  that your preferred node starts first.
-
-  * <<How can I initiate a manual failover when automatic failover is
-    configured?>>
-
-  Even if automatic failover is configured, you may initiate a manual failover
-  using the same <<<hdfs haadmin>>> command. It will perform a coordinated
-  failover.
-
-* HDFS Upgrade/Finalization/Rollback with HA Enabled
-
-  When moving between versions of HDFS, sometimes the newer software can simply
-  be installed and the cluster restarted. Sometimes, however, upgrading the
-  version of HDFS you're running may require changing on-disk data. In this case,
-  one must use the HDFS Upgrade/Finalize/Rollback facility after installing the
-  new software. This process is made more complex in an HA environment, since the
-  on-disk metadata that the NN relies upon is by definition distributed, both on
-  the two HA NNs in the pair, and on the JournalNodes in the case that QJM is
-  being used for the shared edits storage. This documentation section describes
-  the procedure to use the HDFS Upgrade/Finalize/Rollback facility in an HA setup.
-
-  <<To perform an HA upgrade>>, the operator must do the following:
-
-    [[1]] Shut down all of the NNs as normal, and install the newer software.
-
-    [[2]] Start up all of the JNs. Note that it is <<critical>> that all the
-    JNs be running when performing the upgrade, rollback, or finalization
-    operations. If any of the JNs are down at the time of running any of these
-    operations, the operation will fail.
-
-    [[3]] Start one of the NNs with the <<<'-upgrade'>>> flag.
-  
-    [[4]] On start, this NN will not enter the standby state as usual in an HA
-    setup. Rather, this NN will immediately enter the active state, perform an
-    upgrade of its local storage dirs, and also perform an upgrade of the shared
-    edit log.
-  
-    [[5]] At this point the other NN in the HA pair will be out of sync with
-    the upgraded NN. In order to bring it back in sync and once again have a highly
-    available setup, you should re-bootstrap this NameNode by running the NN with
-    the <<<'-bootstrapStandby'>>> flag. It is an error to start this second NN with
-    the <<<'-upgrade'>>> flag.
-  
-  Note that if at any time you want to restart the NameNodes before finalizing
-  or rolling back the upgrade, you should start the NNs as normal, i.e. without
-  any special startup flag.
-
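-  As a sketch of the upgrade steps above (assuming "nn1" is the NN chosen to
-  drive the upgrade and that the sbin scripts are used to manage the daemons):
-
------
-# on each JournalNode machine
-$ hadoop-daemon.sh start journalnode
-
-# on the nn1 machine
-$ hadoop-daemon.sh start namenode -upgrade
-
-# on the nn2 machine, once nn1 has upgraded the shared edit log
-$ hdfs namenode -bootstrapStandby
-$ hadoop-daemon.sh start namenode
------
-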
-  <<To finalize an HA upgrade>>, the operator will use the <<<'hdfs
-  dfsadmin -finalizeUpgrade'>>> command while the NNs are running and one of them
-  is active. The active NN at the time this happens will perform the finalization
-  of the shared log, and the NN whose local storage directories contain the
-  previous FS state will delete its local state.
-
-  <<To perform a rollback>> of an upgrade, both NNs should first be shut down.
-  The operator should run the rollback command on the NN where they initiated
-  the upgrade procedure, which will perform the rollback on the local dirs
-  there, as well as on the shared log, whether it resides on NFS or on the
-  JNs. Afterward, this NN should be started and the operator should run
-  <<<'-bootstrapStandby'>>> on the other NN to bring the two NNs in sync with
-  this rolled-back file system state.

http://git-wip-us.apache.org/repos/asf/hadoop/blob/5c0073e7/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsDesign.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsDesign.apt.vm b/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsDesign.apt.vm
deleted file mode 100644
index 9cd95fa..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsDesign.apt.vm
+++ /dev/null
@@ -1,512 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  HDFS Architecture
-  ---
-  Dhruba Borthakur
-  ---
-  ${maven.build.timestamp}
-
-HDFS Architecture
-
-%{toc|section=1|fromDepth=0}
-
-* Introduction
-
-   The Hadoop Distributed File System (HDFS) is a distributed file system
-   designed to run on commodity hardware. It has many similarities with
-   existing distributed file systems. However, the differences from other
-   distributed file systems are significant. HDFS is highly fault-tolerant
-   and is designed to be deployed on low-cost hardware. HDFS provides high
-   throughput access to application data and is suitable for applications
-   that have large data sets. HDFS relaxes a few POSIX requirements to
-   enable streaming access to file system data. HDFS was originally built
-   as infrastructure for the Apache Nutch web search engine project. HDFS
-   is part of the Apache Hadoop Core project. The project URL is
-   {{http://hadoop.apache.org/}}.
-
-* Assumptions and Goals
-
-** Hardware Failure
-
-   Hardware failure is the norm rather than the exception. An HDFS
-   instance may consist of hundreds or thousands of server machines, each
-   storing part of the file system’s data. The fact that there are a huge
-   number of components and that each component has a non-trivial
-   probability of failure means that some component of HDFS is always
-   non-functional. Therefore, detection of faults and quick, automatic
-   recovery from them is a core architectural goal of HDFS.
-
-** Streaming Data Access
-
-   Applications that run on HDFS need streaming access to their data sets.
-   They are not general purpose applications that typically run on general
-   purpose file systems. HDFS is designed more for batch processing rather
-   than interactive use by users. The emphasis is on high throughput of
-   data access rather than low latency of data access. POSIX imposes many
-   hard requirements that are not needed for applications that are
-   targeted for HDFS. POSIX semantics in a few key areas has been traded
-   to increase data throughput rates.
-
-** Large Data Sets
-
-   Applications that run on HDFS have large data sets. A typical file in
-   HDFS is gigabytes to terabytes in size. Thus, HDFS is tuned to support
-   large files. It should provide high aggregate data bandwidth and scale
-   to hundreds of nodes in a single cluster. It should support tens of
-   millions of files in a single instance.
-
-** Simple Coherency Model
-
-   HDFS applications need a write-once-read-many access model for files. A
-   file once created, written, and closed need not be changed. This
-   assumption simplifies data coherency issues and enables high throughput
-   data access. A Map/Reduce application or a web crawler application fits
-   perfectly with this model. There is a plan to support appending-writes
-   to files in the future.
-
-** “Moving Computation is Cheaper than Moving Data”
-
-   A computation requested by an application is much more efficient if it
-   is executed near the data it operates on. This is especially true when
-   the size of the data set is huge. This minimizes network congestion and
-   increases the overall throughput of the system. The assumption is that
-   it is often better to migrate the computation closer to where the data
-   is located rather than moving the data to where the application is
-   running. HDFS provides interfaces for applications to move themselves
-   closer to where the data is located.
-
-** Portability Across Heterogeneous Hardware and Software Platforms
-
-   HDFS has been designed to be easily portable from one platform to
-   another. This facilitates widespread adoption of HDFS as a platform of
-   choice for a large set of applications.
-
-* NameNode and DataNodes
-
-   HDFS has a master/slave architecture. An HDFS cluster consists of a
-   single NameNode, a master server that manages the file system namespace
-   and regulates access to files by clients. In addition, there are a
-   number of DataNodes, usually one per node in the cluster, which manage
-   storage attached to the nodes that they run on. HDFS exposes a file
-   system namespace and allows user data to be stored in files.
-   Internally, a file is split into one or more blocks and these blocks
-   are stored in a set of DataNodes. The NameNode executes file system
-   namespace operations like opening, closing, and renaming files and
-   directories. It also determines the mapping of blocks to DataNodes. The
-   DataNodes are responsible for serving read and write requests from the
-   file system’s clients. The DataNodes also perform block creation,
-   deletion, and replication upon instruction from the NameNode.
-
-
-[images/hdfsarchitecture.png] HDFS Architecture
-
-   The NameNode and DataNode are pieces of software designed to run on
-   commodity machines. These machines typically run a GNU/Linux operating
-   system (OS). HDFS is built using the Java language; any machine that
-   supports Java can run the NameNode or the DataNode software. Usage of
-   the highly portable Java language means that HDFS can be deployed on a
-   wide range of machines. A typical deployment has a dedicated machine
-   that runs only the NameNode software. Each of the other machines in the
-   cluster runs one instance of the DataNode software. The architecture
-   does not preclude running multiple DataNodes on the same machine but in
-   a real deployment that is rarely the case.
-
-   The existence of a single NameNode in a cluster greatly simplifies the
-   architecture of the system. The NameNode is the arbitrator and
-   repository for all HDFS metadata. The system is designed in such a way
-   that user data never flows through the NameNode.
-
-* The File System Namespace
-
-   HDFS supports a traditional hierarchical file organization. A user or
-   an application can create directories and store files inside these
-   directories. The file system namespace hierarchy is similar to most
-   other existing file systems; one can create and remove files, move a
-   file from one directory to another, or rename a file. HDFS does not yet
-   implement user quotas or access permissions. HDFS does not support hard
-   links or soft links. However, the HDFS architecture does not preclude
-   implementing these features.
-
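-   For illustration, these namespace operations are exposed to users through
-   the FS shell; a brief sketch (the paths below are arbitrary examples):
-
------
-$ hdfs dfs -mkdir -p /user/hadoop/dir1
-$ hdfs dfs -put localfile.txt /user/hadoop/dir1/
-$ hdfs dfs -mv /user/hadoop/dir1/localfile.txt /user/hadoop/renamed.txt
-$ hdfs dfs -ls /user/hadoop
------
-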
-   The NameNode maintains the file system namespace. Any change to the
-   file system namespace or its properties is recorded by the NameNode. An
-   application can specify the number of replicas of a file that should be
-   maintained by HDFS. The number of copies of a file is called the
-   replication factor of that file. This information is stored by the
-   NameNode.
-
-* Data Replication
-
-   HDFS is designed to reliably store very large files across machines in
-   a large cluster. It stores each file as a sequence of blocks; all
-   blocks in a file except the last block are the same size. The blocks of
-   a file are replicated for fault tolerance. The block size and
-   replication factor are configurable per file. An application can
-   specify the number of replicas of a file. The replication factor can be
-   specified at file creation time and can be changed later. Files in HDFS
-   are write-once and have strictly one writer at any time.
-
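-   For example, the replication factor of an existing file can be changed
-   through the FS shell (the path below is an arbitrary example; <<<-w>>>
-   waits for the re-replication to complete):
-
------
-$ hdfs dfs -setrep -w 3 /user/hadoop/file1
------
-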
-   The NameNode makes all decisions regarding replication of blocks. It
-   periodically receives a Heartbeat and a Blockreport from each of the
-   DataNodes in the cluster. Receipt of a Heartbeat implies that the
-   DataNode is functioning properly. A Blockreport contains a list of all
-   blocks on a DataNode.
-
-[images/hdfsdatanodes.png] HDFS DataNodes
-
-** Replica Placement: The First Baby Steps
-
-   The placement of replicas is critical to HDFS reliability and
-   performance. Optimizing replica placement distinguishes HDFS from most
-   other distributed file systems. This is a feature that needs lots of
-   tuning and experience. The purpose of a rack-aware replica placement
-   policy is to improve data reliability, availability, and network
-   bandwidth utilization. The current implementation for the replica
-   placement policy is a first effort in this direction. The short-term
-   goals of implementing this policy are to validate it on production
-   systems, learn more about its behavior, and build a foundation to test
-   and research more sophisticated policies.
-
-   Large HDFS instances run on a cluster of computers that commonly spread
-   across many racks. Communication between two nodes in different racks
-   has to go through switches. In most cases, network bandwidth between
-   machines in the same rack is greater than network bandwidth between
-   machines in different racks.
-
-   The NameNode determines the rack id each DataNode belongs to via the
-   process outlined in {{{../hadoop-common/ClusterSetup.html#Hadoop+Rack+Awareness}Hadoop Rack Awareness}}. A simple but non-optimal policy
-   is to place replicas on unique racks. This prevents losing data when an
-   entire rack fails and allows use of bandwidth from multiple racks when
-   reading data. This policy evenly distributes replicas in the cluster
-   which makes it easy to balance load on component failure. However, this
-   policy increases the cost of writes because a write needs to transfer
-   blocks to multiple racks.
-
-   For the common case, when the replication factor is three, HDFS’s
-   placement policy is to put one replica on one node in the local rack,
-   another on a different node in the local rack, and the last on a
-   different node in a different rack. This policy cuts the inter-rack
-   write traffic which generally improves write performance. The chance of
-   rack failure is far less than that of node failure; this policy does
-   not impact data reliability and availability guarantees. However, it
-   does reduce the aggregate network bandwidth used when reading data
-   since a block is placed in only two unique racks rather than three.
-   With this policy, the replicas of a file do not evenly distribute
-   across the racks. One third of replicas are on one node, two thirds of
-   replicas are on one rack, and the other third are evenly distributed
-   across the remaining racks. This policy improves write performance
-   without compromising data reliability or read performance.
-
-   The current, default replica placement policy described here is a work
-   in progress.
-
-** Replica Selection
-
-   To minimize global bandwidth consumption and read latency, HDFS tries
-   to satisfy a read request from a replica that is closest to the reader.
-   If there exists a replica on the same rack as the reader node, then
-   that replica is preferred to satisfy the read request. If an HDFS
-   cluster spans multiple data centers, then a replica that is resident in
-   the local data center is preferred over any remote replica.
-
-** Safemode
-
-   On startup, the NameNode enters a special state called Safemode.
-   Replication of data blocks does not occur when the NameNode is in the
-   Safemode state. The NameNode receives Heartbeat and Blockreport
-   messages from the DataNodes. A Blockreport contains the list of data
-   blocks that a DataNode is hosting. Each block has a specified minimum
-   number of replicas. A block is considered safely replicated when the
-   minimum number of replicas of that data block has checked in with the
-   NameNode. After a configurable percentage of safely replicated data
-   blocks checks in with the NameNode (plus an additional 30 seconds), the
-   NameNode exits the Safemode state. It then determines the list of data
-   blocks (if any) that still have fewer than the specified number of
-   replicas. The NameNode then replicates these blocks to other DataNodes.
-
-* The Persistence of File System Metadata
-
-   The HDFS namespace is stored by the NameNode. The NameNode uses a
-   transaction log called the EditLog to persistently record every change
-   that occurs to file system metadata. For example, creating a new file
-   in HDFS causes the NameNode to insert a record into the EditLog
-   indicating this. Similarly, changing the replication factor of a file
-   causes a new record to be inserted into the EditLog. The NameNode uses
-   a file in its local host OS file system to store the EditLog. The
-   entire file system namespace, including the mapping of blocks to files
-   and file system properties, is stored in a file called the FsImage. The
-   FsImage is stored as a file in the NameNode’s local file system too.
-
-   The NameNode keeps an image of the entire file system namespace and
-   file Blockmap in memory. This key metadata item is designed to be
-   compact, such that a NameNode with 4 GB of RAM is plenty to support a
-   huge number of files and directories. When the NameNode starts up, it
-   reads the FsImage and EditLog from disk, applies all the transactions
-   from the EditLog to the in-memory representation of the FsImage, and
-   flushes out this new version into a new FsImage on disk. It can then
-   truncate the old EditLog because its transactions have been applied to
-   the persistent FsImage. This process is called a checkpoint. In the
-   current implementation, a checkpoint only occurs when the NameNode
-   starts up. Work is in progress to support periodic checkpointing in the
-   near future.
-
-   The DataNode stores HDFS data in files in its local file system. The
-   DataNode has no knowledge about HDFS files. It stores each block of
-   HDFS data in a separate file in its local file system. The DataNode
-   does not create all files in the same directory. Instead, it uses a
-   heuristic to determine the optimal number of files per directory and
-   creates subdirectories appropriately. It is not optimal to create all
-   local files in the same directory because the local file system might
-   not be able to efficiently support a huge number of files in a single
-   directory. When a DataNode starts up, it scans through its local file
-   system, generates a list of all HDFS data blocks that correspond to
-   each of these local files and sends this report to the NameNode: this
-   is the Blockreport.
-
-* The Communication Protocols
-
-   All HDFS communication protocols are layered on top of the TCP/IP
-   protocol. A client establishes a connection to a configurable TCP port
-   on the NameNode machine. It talks the ClientProtocol with the NameNode.
-   The DataNodes talk to the NameNode using the DataNode Protocol. A
-   Remote Procedure Call (RPC) abstraction wraps both the Client Protocol
-   and the DataNode Protocol. By design, the NameNode never initiates any
-   RPCs. Instead, it only responds to RPC requests issued by DataNodes or
-   clients.
-
-* Robustness
-
-   The primary objective of HDFS is to store data reliably even in the
-   presence of failures. The three common types of failures are NameNode
-   failures, DataNode failures and network partitions.
-
-** Data Disk Failure, Heartbeats and Re-Replication
-
-   Each DataNode sends a Heartbeat message to the NameNode periodically. A
-   network partition can cause a subset of DataNodes to lose connectivity
-   with the NameNode. The NameNode detects this condition by the absence
-   of a Heartbeat message. The NameNode marks DataNodes without recent
-   Heartbeats as dead and does not forward any new IO requests to them.
-   Any data that was registered to a dead DataNode is not available to
-   HDFS any more. DataNode death may cause the replication factor of some
-   blocks to fall below their specified value. The NameNode constantly
-   tracks which blocks need to be replicated and initiates replication
-   whenever necessary. The necessity for re-replication may arise due to
-   many reasons: a DataNode may become unavailable, a replica may become
-   corrupted, a hard disk on a DataNode may fail, or the replication
-   factor of a file may be increased.
-
-** Cluster Rebalancing
-
-   The HDFS architecture is compatible with data rebalancing schemes. A
-   scheme might automatically move data from one DataNode to another if
-   the free space on a DataNode falls below a certain threshold. In the
-   event of a sudden high demand for a particular file, a scheme might
-   dynamically create additional replicas and rebalance other data in the
-   cluster. These types of data rebalancing schemes are not yet
-   implemented.
-
-** Data Integrity
-
-   It is possible that a block of data fetched from a DataNode arrives
-   corrupted. This corruption can occur because of faults in a storage
-   device, network faults, or buggy software. The HDFS client software
-   implements checksum checking on the contents of HDFS files. When a
-   client creates an HDFS file, it computes a checksum of each block of
-   the file and stores these checksums in a separate hidden file in the
-   same HDFS namespace. When a client retrieves file contents it verifies
-   that the data it received from each DataNode matches the checksum
-   stored in the associated checksum file. If not, then the client can opt
-   to retrieve that block from another DataNode that has a replica of that
-   block.
-
-** Metadata Disk Failure
-
-   The FsImage and the EditLog are central data structures of HDFS. A
-   corruption of these files can cause the HDFS instance to be
-   non-functional. For this reason, the NameNode can be configured to
-   support maintaining multiple copies of the FsImage and EditLog. Any
-   update to either the FsImage or EditLog causes each of the FsImages and
-   EditLogs to get updated synchronously. This synchronous updating of
-   multiple copies of the FsImage and EditLog may degrade the rate of
-   namespace transactions per second that a NameNode can support. However,
-   this degradation is acceptable because even though HDFS applications
-   are very data intensive in nature, they are not metadata intensive.
-   When a NameNode restarts, it selects the latest consistent FsImage and
-   EditLog to use.
-
-   The NameNode machine is a single point of failure for an HDFS cluster.
-   If the NameNode machine fails, manual intervention is necessary.
-   Currently, automatic restart and failover of the NameNode software to
-   another machine is not supported.
-
-** Snapshots
-
-   Snapshots support storing a copy of data at a particular instant of
-   time. One usage of the snapshot feature may be to roll back a corrupted
-   HDFS instance to a previously known good point in time. HDFS does not
-   currently support snapshots but will in a future release.
-
-* Data Organization
-
-** Data Blocks
-
-   HDFS is designed to support very large files. Applications that are
-   compatible with HDFS are those that deal with large data sets. These
-   applications write their data only once but they read it one or more
-   times and require these reads to be satisfied at streaming speeds. HDFS
-   supports write-once-read-many semantics on files. A typical block size
-   used by HDFS is 64 MB. Thus, an HDFS file is chopped up into 64 MB
-   chunks, and if possible, each chunk will reside on a different
-   DataNode.
-
-** Staging
-
-   A client request to create a file does not reach the NameNode
-   immediately. In fact, initially the HDFS client caches the file data
-   into a temporary local file. Application writes are transparently
-   redirected to this temporary local file. When the local file
-   accumulates data worth over one HDFS block size, the client contacts
-   the NameNode. The NameNode inserts the file name into the file system
-   hierarchy and allocates a data block for it. The NameNode responds to
-   the client request with the identity of the DataNode and the
-   destination data block. Then the client flushes the block of data from
-   the local temporary file to the specified DataNode. When a file is
-   closed, the remaining un-flushed data in the temporary local file is
-   transferred to the DataNode. The client then tells the NameNode that
-   the file is closed. At this point, the NameNode commits the file
-   creation operation into a persistent store. If the NameNode dies before
-   the file is closed, the file is lost.
-
-   The above approach has been adopted after careful consideration of
-   target applications that run on HDFS. These applications need streaming
-   writes to files. If a client writes to a remote file directly without
-   any client side buffering, the network speed and the congestion in the
-   network impacts throughput considerably. This approach is not without
-   precedent. Earlier distributed file systems, e.g. AFS, have used client
-   side caching to improve performance. A POSIX requirement has been
-   relaxed to achieve higher performance of data uploads.
-
-** Replication Pipelining
-
-   When a client is writing data to an HDFS file, its data is first
-   written to a local file as explained in the previous section. Suppose
-   the HDFS file has a replication factor of three. When the local file
-   accumulates a full block of user data, the client retrieves a list of
-   DataNodes from the NameNode. This list contains the DataNodes that will
-   host a replica of that block. The client then flushes the data block to
-   the first DataNode. The first DataNode starts receiving the data in
-   small portions, writes each portion to its local repository and
-   transfers that portion to the second DataNode in the list. The second
-   DataNode, in turn, starts receiving each portion of the data block,
-   writes that portion to its repository and then flushes that portion to
-   the third DataNode. Finally, the third DataNode writes the data to its
-   local repository. Thus, a DataNode can be receiving data from the
-   previous one in the pipeline and at the same time forwarding data to
-   the next one in the pipeline. In this way, the data is pipelined from one
-   DataNode to the next.
-
-* Accessibility
-
-   HDFS can be accessed from applications in many different ways.
-   Natively, HDFS provides a
-   {{{http://hadoop.apache.org/docs/current/api/}FileSystem Java API}}
-   for applications to use. A C language wrapper for this Java API is also
-   available. In addition, an HTTP browser can also be used to browse the files
-   of an HDFS instance. Work is in progress to expose HDFS through the WebDAV
-   protocol.
-
-** FS Shell
-
-   HDFS allows user data to be organized in the form of files and
-   directories. It provides a command-line interface called FS shell that
-   lets a user interact with the data in HDFS. The syntax of this command
-   set is similar to other shells (e.g. bash, csh) that users are already
-   familiar with. Here are some sample action/command pairs:
-
-*---------+---------+
-|| Action | Command
-*---------+---------+
-| Create a directory named <<</foodir>>> | <<<bin/hadoop dfs -mkdir /foodir>>>
-*---------+---------+
-| Remove a directory named <<</foodir>>> | <<<bin/hadoop dfs -rmr /foodir>>>
-*---------+---------+
-| View the contents of a file named <<</foodir/myfile.txt>>> | <<<bin/hadoop dfs -cat /foodir/myfile.txt>>>
-*---------+---------+
-
-   FS shell is targeted for applications that need a scripting language to
-   interact with the stored data.
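-
-   For example, a short session combining such commands might look like the
-   following (the paths and file names are illustrative only):
-
-----
-   bash$ bin/hadoop dfs -mkdir /foodir                          # create a directory
-   bash$ bin/hadoop dfs -put localfile.txt /foodir/myfile.txt   # copy a local file into HDFS
-   bash$ bin/hadoop dfs -cat /foodir/myfile.txt                 # print the file contents
-----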
-
-** DFSAdmin
-
-   The DFSAdmin command set is used for administering an HDFS cluster.
-   These are commands that are used only by an HDFS administrator. Here
-   are some sample action/command pairs:
-
-*---------+---------+
-|| Action | Command
-*---------+---------+
-|Put the cluster in Safemode              | <<<bin/hadoop dfsadmin -safemode enter>>>
-*---------+---------+
-|Generate a list of DataNodes             | <<<bin/hadoop dfsadmin -report>>>
-*---------+---------+
-|Recommission or decommission DataNode(s) | <<<bin/hadoop dfsadmin -refreshNodes>>>
-*---------+---------+
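-
-   As a sketch, an administrator performing planned maintenance might bracket
-   the work with Safemode commands (the sequence below is illustrative; only
-   the commands listed above are assumed):
-
-----
-   bash$ bin/hadoop dfsadmin -safemode enter   # stop changes to the namespace
-   bash$ bin/hadoop dfsadmin -report           # inspect DataNode status
-   bash$ bin/hadoop dfsadmin -safemode leave   # return to normal operation
-----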
-
-** Browser Interface
-
-   A typical HDFS install configures a web server to expose the HDFS
-   namespace through a configurable TCP port. This allows a user to
-   navigate the HDFS namespace and view the contents of its files using a
-   web browser.
-
-* Space Reclamation
-
-** File Deletes and Undeletes
-
-   When a file is deleted by a user or an application, it is not
-   immediately removed from HDFS. Instead, HDFS first renames it to a file
-   in the <<</trash>>> directory. The file can be restored quickly as long as it
-   remains in <<</trash>>>. A file remains in <<</trash>>> for a configurable amount
-   of time. After the expiry of its life in <<</trash>>>, the NameNode deletes
-   the file from the HDFS namespace. The deletion of a file causes the
-   blocks associated with the file to be freed. Note that there could be
-   an appreciable time delay between the time a file is deleted by a user
-   and the time of the corresponding increase in free space in HDFS.
-
-   A user can Undelete a file after deleting it as long as it remains in
-   the <<</trash>>> directory. If a user wants to undelete a file that he/she
-   has deleted, he/she can navigate the <<</trash>>> directory and retrieve the
-   file. The <<</trash>>> directory contains only the latest copy of the file
-   that was deleted. The <<</trash>>> directory is just like any other directory
-   with one special feature: HDFS applies specified policies to
-   automatically delete files from this directory. The default trash
-   interval is currently set to 0, which deletes files immediately without
-   storing them in trash. This value is a configurable parameter,
-   <<<fs.trash.interval>>>, set in core-site.xml.
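-
-   As a sketch, trash could be enabled with a 24-hour retention by setting the
-   property below in core-site.xml (the interval value of 1440 minutes is only
-   an example):
-
-----
-<property>
-  <name>fs.trash.interval</name>
-  <!-- minutes to keep deleted files in trash; 1440 (24 hours) is only an
-       example, and 0 disables the trash feature -->
-  <value>1440</value>
-</property>
-----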
-
-** Decrease Replication Factor
-
-   When the replication factor of a file is reduced, the NameNode selects
-   excess replicas that can be deleted. The next Heartbeat transfers this
-   information to the DataNode. The DataNode then removes the
-   corresponding blocks and the corresponding free space appears in the
-   cluster. Once again, there might be a time delay between the completion
-   of the setReplication API call and the appearance of free space in the
-   cluster.
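-
-   For example, the replication factor of a file can be lowered from the FS
-   shell with the setrep command (the path and target factor below are
-   illustrative):
-
-----
-   bash$ bin/hadoop dfs -setrep 2 /foodir/myfile.txt   # reduce replication to 2
-----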
-
-* References
-
-   Hadoop {{{http://hadoop.apache.org/docs/current/api/}JavaDoc API}}.
-
-   HDFS source code: {{http://hadoop.apache.org/version_control.html}}

http://git-wip-us.apache.org/repos/asf/hadoop/blob/5c0073e7/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsEditsViewer.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsEditsViewer.apt.vm b/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsEditsViewer.apt.vm
deleted file mode 100644
index 8c2db1b..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsEditsViewer.apt.vm
+++ /dev/null
@@ -1,104 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-
-  ---
-  Offline Edits Viewer Guide
-  ---
-  Erik Steffl
-  ---
-  ${maven.build.timestamp}
-
-Offline Edits Viewer Guide
-
-%{toc|section=1|fromDepth=0}
-
-* Overview
-
-   Offline Edits Viewer is a tool to parse the edits log file. The current
-   processors are mostly useful for conversion between different formats,
-   including XML, which is human readable and easier to edit than the
-   native binary format.
-
-   The tool can parse edits formats -18 (roughly Hadoop 0.19) and later.
-   The tool operates on files only; it does not need a Hadoop cluster to
-   be running.
-
-   Input formats supported:
-
-     [[1]] <<binary>>: native binary format that Hadoop uses internally
-
-     [[2]] <<xml>>: XML format, as produced by xml processor, used if filename
-     has <<<.xml>>> (case insensitive) extension
-
-   The Offline Edits Viewer provides several output processors (unless
-   stated otherwise, the output of a processor can be converted back to
-   the original edits file):
-
-     [[1]] <<binary>>: native binary format that Hadoop uses internally
-
-     [[2]] <<xml>>: XML format
-
-     [[3]] <<stats>>: prints out statistics; this output cannot be converted
-     back to an edits file
-
-* Usage
-
-----
-   bash$ bin/hdfs oev -i edits -o edits.xml
-----
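-
-   The XML output can in turn be converted back to the binary format; a
-   minimal sketch (file names are illustrative):
-
-----
-   bash$ bin/hdfs oev -i edits.xml -o edits.new -p binary   # .xml input, binary output
-----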
-
-*-----------------------:-----------------------------------+
-| Flag                  | Description                       |
-*-----------------------:-----------------------------------+
-|[<<<-i>>> ; <<<--inputFile>>>] <input file> | Specify the input edits log file to
-|                       | process. An <<<.xml>>> (case insensitive) extension means XML format;
-|                       | otherwise the binary format is assumed. Required.
-*-----------------------:-----------------------------------+
-|[<<<-o>>> ; <<<--outputFile>>>] <output file> | Specify the output filename, if the
-|                       | specified output processor generates one. If the specified file already
-|                       | exists, it is silently overwritten. Required.
-*-----------------------:-----------------------------------+
-|[<<<-p>>> ; <<<--processor>>>] <processor> | Specify the processor to apply
-|                       | against the edits file. Currently valid options are
-|                       | <<<binary>>>, <<<xml>>> (default) and <<<stats>>>.
-*-----------------------:-----------------------------------+
-|[<<<-v>>> ; <<<--verbose>>>] | Print the input and output filenames and pipe output of
-|                       | the processor to the console as well as the specified file. On extremely
-|                       | large files, this may increase processing time by an order of magnitude.
-*-----------------------:-----------------------------------+
-|[<<<-h>>> ; <<<--help>>>] | Display the tool usage and help information and exit.
-*-----------------------:-----------------------------------+
-
-* Case study: Hadoop cluster recovery
-
-   If there is some problem with the Hadoop cluster and the edits file is
-   corrupted, it is possible to save at least the part of the edits file
-   that is correct. This can be done by converting the binary edits to
-   XML, editing it manually, and then converting it back to binary. The
-   most common problem is that the edits file is missing the closing
-   record (the record that has opCode -1). This should be recognized by
-   the tool, and the XML output should be properly closed.
-
-   If there is no closing record in the XML file, you can add one after
-   the last correct record. Anything after the record with opCode -1 is
-   ignored.
-
-   Example of a closing record (with opCode -1):
-
-+----
-  <RECORD>
-    <OPCODE>-1</OPCODE>
-    <DATA>
-    </DATA>
-  </RECORD>
-+----
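-
-   Putting the case study together, a minimal sketch of the recovery workflow
-   (file names and the choice of editor are illustrative):
-
-----
-   bash$ bin/hdfs oev -i edits -o edits.xml                       # binary to XML
-   bash$ vi edits.xml                                             # fix or add the closing record
-   bash$ bin/hdfs oev -i edits.xml -o edits.repaired -p binary    # back to binary
-----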

http://git-wip-us.apache.org/repos/asf/hadoop/blob/5c0073e7/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm b/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm
deleted file mode 100644
index 3b84226..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsImageViewer.apt.vm
+++ /dev/null
@@ -1,247 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Offline Image Viewer Guide
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Offline Image Viewer Guide
-
-%{toc|section=1|fromDepth=0}
-
-* Overview
-
-   The Offline Image Viewer is a tool to dump the contents of hdfs fsimage
-   files to a human-readable format and provide read-only WebHDFS API
-   in order to allow offline analysis and examination of a Hadoop cluster's
-   namespace. The tool is able to process very large image files relatively
-   quickly. The tool handles the layout formats that were included with Hadoop
-   versions 2.4 and up. If you want to handle older layout formats, you can
-   use the Offline Image Viewer of Hadoop 2.3 or {{oiv_legacy Command}}.
-   If the tool is not able to process an image file, it will exit cleanly.
-   The Offline Image Viewer does not require a Hadoop cluster to be running;
-   it is entirely offline in its operation.
-
-   The Offline Image Viewer provides several output processors:
-
-   [[1]] Web is the default output processor. It launches an HTTP server
-      that exposes a read-only WebHDFS API. Users can investigate the
-      namespace interactively by using the HTTP REST API.
-
-   [[2]] XML creates an XML document of the fsimage and includes all of the
-      information within the fsimage, similar to the lsr processor. The
-      output of this processor is amenable to automated processing and
-      analysis with XML tools. Due to the verbosity of the XML syntax,
-      this processor will also generate the largest amount of output.
-
-   [[3]] FileDistribution is the tool for analyzing file sizes in the
-      namespace image. In order to run the tool one should define a range
-      of integers [0, maxSize] by specifying maxSize and a step. The
-      range of integers is divided into segments of size step: [0, s[1],
-      ..., s[n-1], maxSize], and the processor calculates how many files
-      in the system fall into each segment [s[i-1], s[i]). Note that
-      files larger than maxSize always fall into the very last segment.
-      The output file is formatted as a tab-separated two-column table:
-      Size and NumFiles, where Size represents the start of the segment
-      and NumFiles is the number of files from the image whose size falls
-      in this segment. A sample invocation is sketched after this list.
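-
-   As a sketch, the FileDistribution processor might be invoked as follows,
-   using the -maxSize and -step options described in the Options section
-   (the values and file names are illustrative):
-
-----
-   bash$ bin/hdfs oiv -p FileDistribution -maxSize 1073741824 -step 1048576 -i fsimage -o fsimage_dist.txt
-----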
-
-* Usage
-
-** Web Processor
-
-   The Web processor launches an HTTP server that exposes a read-only
-   WebHDFS API. Users can specify the address to listen on with the
-   -addr option (localhost:5978 by default).
-
-----
-   bash$ bin/hdfs oiv -i fsimage
-   14/04/07 13:25:14 INFO offlineImageViewer.WebImageViewer: WebImageViewer
-   started. Listening on /127.0.0.1:5978. Press Ctrl+C to stop the viewer.
-----
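-
-   If the viewer needs to be reachable from other hosts, the listening address
-   can be changed with the -addr option; a sketch (the address is illustrative):
-
-----
-   bash$ bin/hdfs oiv -i fsimage -addr 0.0.0.0:5978
-----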
-
-   Users can access the viewer and get information from the fsimage with
-   the following shell command:
-
-----
-   bash$ bin/hdfs dfs -ls webhdfs://127.0.0.1:5978/
-   Found 2 items
-   drwxrwx---   - root supergroup          0 2014-03-26 20:16 webhdfs://127.0.0.1:5978/tmp
-   drwxr-xr-x   - root supergroup          0 2014-03-31 14:08 webhdfs://127.0.0.1:5978/user
-----
-
-   To get information about all the files and directories, you can simply use
-   the following command:
-
-----
-   bash$ bin/hdfs dfs -ls -R webhdfs://127.0.0.1:5978/
-----
-
-   Users can also get JSON formatted FileStatuses via HTTP REST API.
-
-----
-   bash$ curl -i http://127.0.0.1:5978/webhdfs/v1/?op=liststatus
-   HTTP/1.1 200 OK
-   Content-Type: application/json
-   Content-Length: 252
-
-   {"FileStatuses":{"FileStatus":[
-   {"fileId":16386,"accessTime":0,"replication":0,"owner":"theuser","length":0,"permission":"755","blockSize":0,"modificationTime":1392772497282,"type":"DIRECTORY","group":"supergroup","childrenNum":1,"pathSuffix":"user"}
-   ]}}
-----
-
-   The Web processor now supports the following operations:
-
-   * {{{./WebHDFS.html#List_a_Directory}LISTSTATUS}}
-
-   * {{{./WebHDFS.html#Status_of_a_FileDirectory}GETFILESTATUS}}
-
-   * {{{./WebHDFS.html#Get_ACL_Status}GETACLSTATUS}}
-
-** XML Processor
-
-   The XML processor is used to dump all the contents of the fsimage. Users
-   can specify the input and output files via the -i and -o command-line
-   options.
-
-----
-   bash$ bin/hdfs oiv -p XML -i fsimage -o fsimage.xml
-----
-
-   This will create a file named fsimage.xml containing all the information
-   in the fsimage. For very large image files, this process may take
-   several minutes.
-
-   Applying the Offline Image Viewer with the XML processor results in
-   output like the following:
-
-----
-   <?xml version="1.0"?>
-   <fsimage>
-   <NameSection>
-     <genstampV1>1000</genstampV1>
-     <genstampV2>1002</genstampV2>
-     <genstampV1Limit>0</genstampV1Limit>
-     <lastAllocatedBlockId>1073741826</lastAllocatedBlockId>
-     <txid>37</txid>
-   </NameSection>
-   <INodeSection>
-     <lastInodeId>16400</lastInodeId>
-     <inode>
-       <id>16385</id>
-       <type>DIRECTORY</type>
-       <name></name>
-       <mtime>1392772497282</mtime>
-       <permission>theuser:supergroup:rwxr-xr-x</permission>
-       <nsquota>9223372036854775807</nsquota>
-       <dsquota>-1</dsquota>
-     </inode>
-   ...remaining output omitted...
-----
-
-* Options
-
-*-----------------------:-----------------------------------+
-| <<Flag>>              | <<Description>>                   |
-*-----------------------:-----------------------------------+
-| <<<-i>>>\|<<<--inputFile>>> <input file> | Specify the input fsimage file
-|                       | to process. Required.
-*-----------------------:-----------------------------------+
-| <<<-o>>>\|<<<--outputFile>>> <output file> | Specify the output filename,
-|                       | if the specified output processor generates one. If
-|                       | the specified file already exists, it is silently
-|                       | overwritten. (output to stdout by default)
-*-----------------------:-----------------------------------+
-| <<<-p>>>\|<<<--processor>>> <processor> | Specify the image processor to
-|                       | apply against the image file. Currently valid options
-|                       | are Web (default), XML and FileDistribution.
-*-----------------------:-----------------------------------+
-| <<<-addr>>> <address> | Specify the address (host:port) to listen on
-|                       | (localhost:5978 by default). This option is used with
-|                       | the Web processor.
-*-----------------------:-----------------------------------+
-| <<<-maxSize>>> <size> | Specify the range [0, maxSize] of file sizes to be
-|                       | analyzed in bytes (128GB by default). This option is
-|                       | used with FileDistribution processor.
-*-----------------------:-----------------------------------+
-| <<<-step>>> <size>    | Specify the granularity of the distribution in bytes
-|                       | (2MB by default). This option is used with
-|                       | FileDistribution processor.
-*-----------------------:-----------------------------------+
-| <<<-h>>>\|<<<--help>>>| Display the tool usage and help information and
-|                       | exit.
-*-----------------------:-----------------------------------+
-
-* Analyzing Results
-
-   The Offline Image Viewer makes it easy to gather large amounts of data
-   about the hdfs namespace. This information can then be used to explore
-   file system usage patterns or find specific files that match arbitrary
-   criteria, along with other types of namespace analysis.
-
-* oiv_legacy Command
-
-   Due to the internal layout changes introduced by the ProtocolBuffer-based
-   fsimage ({{{https://issues.apache.org/jira/browse/HDFS-5698}HDFS-5698}}),
-   OfflineImageViewer consumes an excessive amount of memory and loses some
-   functions such as the Indented and Delimited processors. If you want to
-   process images without a large amount of memory or use these processors,
-   you can use the <<<oiv_legacy>>> command (same as <<<oiv>>> in Hadoop 2.3).
-
-** Usage
-
-   1. Set <<<dfs.namenode.legacy-oiv-image.dir>>> to an appropriate directory
-      to make the standby NameNode or SecondaryNameNode save its namespace in
-      the old fsimage format during checkpointing (a property sketch follows
-      the example below).
-
-   2. Use the <<<oiv_legacy>>> command to process the old format fsimage.
-
-----
-   bash$ bin/hdfs oiv_legacy -i fsimage_old -o output
-----
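-
-   For step 1 above, this amounts to adding a property along the following
-   lines on the checkpointing node, typically in hdfs-site.xml (the directory
-   path is illustrative):
-
-----
-   <property>
-     <name>dfs.namenode.legacy-oiv-image.dir</name>
-     <!-- the path below is only an example -->
-     <value>/data/legacy-oiv-images</value>
-   </property>
-----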
-
-** Options
-
-*-----------------------:-----------------------------------+
-| <<Flag>>              | <<Description>>                   |
-*-----------------------:-----------------------------------+
-| <<<-i>>>\|<<<--inputFile>>> <input file> | Specify the input fsimage file to
-|                       | process. Required.
-*-----------------------:-----------------------------------+
-| <<<-o>>>\|<<<--outputFile>>> <output file> | Specify the output filename, if
-|                       | the specified output processor generates one. If the
-|                       | specified file already exists, it is silently
-|                       | overwritten. Required.
-*-----------------------:-----------------------------------+
-| <<<-p>>>\|<<<--processor>>> <processor> | Specify the image processor to
-|                       | apply against the image file. Valid options are
-|                       | Ls (default), XML, Delimited, Indented, and
-|                       | FileDistribution.
-*-----------------------:-----------------------------------+
-| <<<-skipBlocks>>>     | Do not enumerate individual blocks within files. This
-|                       | may save processing time and outfile file space on
-|                       | namespaces with very large files. The Ls processor
-|                       | reads the blocks to correctly determine file sizes
-|                       | and ignores this option.
-*-----------------------:-----------------------------------+
-| <<<-printToScreen>>>  | Pipe output of processor to console as well as
-|                       | specified file. On extremely large namespaces, this
-|                       | may increase processing time by an order of
-|                       | magnitude.
-*-----------------------:-----------------------------------+
-| <<<-delimiter>>> <arg>| When used in conjunction with the Delimited
-|                       | processor, replaces the default tab delimiter with
-|                       | the string specified by <arg>.
-*-----------------------:-----------------------------------+
-| <<<-h>>>\|<<<--help>>>| Display the tool usage and help information and exit.
-*-----------------------:-----------------------------------+

http://git-wip-us.apache.org/repos/asf/hadoop/blob/5c0073e7/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm b/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm
deleted file mode 100644
index 2be4567..0000000
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HdfsMultihoming.apt.vm
+++ /dev/null
@@ -1,145 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Distributed File System-${project.version} - Support for Multi-Homed Networks
-  ---
-  ---
-  ${maven.build.timestamp}
-
-HDFS Support for Multihomed Networks
-
-  This document is targeted at cluster administrators deploying <<<HDFS>>> in
-  multihomed networks. Similar support for <<<YARN>>>/<<<MapReduce>>> is
-  work in progress and will be documented when available.
-
-%{toc|section=1|fromDepth=0}
-
-* Multihoming Background
-
-  In multihomed networks the cluster nodes have more than one network
-  interface. There could be multiple reasons for doing so.
-
-  [[1]] <<Security>>: Security requirements may dictate that intra-cluster
-  traffic be confined to a different network than the network used to
-  transfer data in and out of the cluster.
-
-  [[2]] <<Performance>>: Intra-cluster traffic may use one or more high bandwidth
-  interconnects like Fibre Channel, InfiniBand or 10GbE.
-
-  [[3]] <<Failover/Redundancy>>: The nodes may have multiple network adapters
-  connected to a single network to handle network adapter failure.
-
-
-  Note that NIC Bonding (also known as NIC Teaming or Link
-  Aggregation) is a related but separate topic. The following settings
-  are usually not applicable to a NIC bonding configuration which handles
-  multiplexing and failover transparently while presenting a single 'logical
-  network' to applications.
-
-* Fixing Hadoop Issues In Multihomed Environments
-
-** Ensuring HDFS Daemons Bind All Interfaces
-
-  By default <<<HDFS>>> endpoints are specified as either hostnames or IP addresses.
-  In either case <<<HDFS>>> daemons will bind to a single IP address making
-  the daemons unreachable from other networks.
-
-  The solution is to have a separate setting for server endpoints that forces
-  binding to the wildcard IP address <<<INADDR_ANY>>>, i.e. <<<0.0.0.0>>>. Do NOT
-  supply a port number with any of these settings.
-
-----
-<property>
-  <name>dfs.namenode.rpc-bind-host</name>
-  <value>0.0.0.0</value>
-  <description>
-    The actual address the RPC server will bind to. If this optional address is
-    set, it overrides only the hostname portion of dfs.namenode.rpc-address.
-    It can also be specified per name node or name service for HA/Federation.
-    This is useful for making the name node listen on all interfaces by
-    setting it to 0.0.0.0.
-  </description>
-</property>
-
-<property>
-  <name>dfs.namenode.servicerpc-bind-host</name>
-  <value>0.0.0.0</value>
-  <description>
-    The actual address the service RPC server will bind to. If this optional address is
-    set, it overrides only the hostname portion of dfs.namenode.servicerpc-address.
-    It can also be specified per name node or name service for HA/Federation.
-    This is useful for making the name node listen on all interfaces by
-    setting it to 0.0.0.0.
-  </description>
-</property>
-
-<property>
-  <name>dfs.namenode.http-bind-host</name>
-  <value>0.0.0.0</value>
-  <description>
-    The actual address the HTTP server will bind to. If this optional address
-    is set, it overrides only the hostname portion of dfs.namenode.http-address.
-    It can also be specified per name node or name service for HA/Federation.
-    This is useful for making the name node HTTP server listen on all
-    interfaces by setting it to 0.0.0.0.
-  </description>
-</property>
-
-<property>
-  <name>dfs.namenode.https-bind-host</name>
-  <value>0.0.0.0</value>
-  <description>
-    The actual address the HTTPS server will bind to. If this optional address
-    is set, it overrides only the hostname portion of dfs.namenode.https-address.
-    It can also be specified per name node or name service for HA/Federation.
-    This is useful for making the name node HTTPS server listen on all
-    interfaces by setting it to 0.0.0.0.
-  </description>
-</property>
-----
-
-** Clients use Hostnames when connecting to DataNodes
-
-  By default <<<HDFS>>> clients connect to DataNodes using the IP address
-  provided by the NameNode. Depending on the network configuration this
-  IP address may be unreachable by the clients. The fix is to let clients perform
-  their own DNS resolution of the DataNode hostname. The following setting
-  enables this behavior.
-
-----
-<property>
-  <name>dfs.client.use.datanode.hostname</name>
-  <value>true</value>
-  <description>Whether clients should use datanode hostnames when
-    connecting to datanodes.
-  </description>
-</property>
-----
-
-** DataNodes use HostNames when connecting to other DataNodes
-
-  Rarely, the NameNode-resolved IP address for a DataNode may be unreachable
-  from other DataNodes. The fix is to force DataNodes to perform their own
-  DNS resolution for inter-DataNode connections. The following setting enables
-  this behavior.
-
-----
-<property>
-  <name>dfs.datanode.use.datanode.hostname</name>
-  <value>true</value>
-  <description>Whether datanodes should use datanode hostnames when
-    connecting to other datanodes for data transfer.
-  </description>
-</property>
-----
-