Posted to common-commits@hadoop.apache.org by zj...@apache.org on 2015/03/03 20:31:50 UTC

[12/43] hadoop git commit: YARN-3168. Convert site documentation from apt to markdown (Gururaj Shetty via aw)

YARN-3168. Convert site documentation from apt to markdown (Gururaj Shetty via aw)


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/2e44b75f
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/2e44b75f
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/2e44b75f

Branch: refs/heads/YARN-2928
Commit: 2e44b75f729009d33e309d1366bf86746443db81
Parents: edceced
Author: Allen Wittenauer <aw...@apache.org>
Authored: Fri Feb 27 20:39:44 2015 -0800
Committer: Allen Wittenauer <aw...@apache.org>
Committed: Fri Feb 27 20:39:44 2015 -0800

----------------------------------------------------------------------
 hadoop-yarn-project/CHANGES.txt                 |    3 +
 .../src/site/apt/CapacityScheduler.apt.vm       |  368 ---
 .../src/site/apt/DockerContainerExecutor.apt.vm |  204 --
 .../src/site/apt/FairScheduler.apt.vm           |  483 ---
 .../src/site/apt/NodeManager.apt.vm             |   64 -
 .../src/site/apt/NodeManagerCgroups.apt.vm      |   77 -
 .../src/site/apt/NodeManagerRest.apt.vm         |  645 ----
 .../src/site/apt/NodeManagerRestart.apt.vm      |   86 -
 .../src/site/apt/ResourceManagerHA.apt.vm       |  233 --
 .../src/site/apt/ResourceManagerRest.apt.vm     | 3104 ------------------
 .../src/site/apt/ResourceManagerRestart.apt.vm  |  298 --
 .../src/site/apt/SecureContainer.apt.vm         |  176 -
 .../src/site/apt/TimelineServer.apt.vm          |  260 --
 .../src/site/apt/WebApplicationProxy.apt.vm     |   49 -
 .../src/site/apt/WebServicesIntro.apt.vm        |  593 ----
 .../src/site/apt/WritingYarnApplications.apt.vm |  757 -----
 .../hadoop-yarn-site/src/site/apt/YARN.apt.vm   |   77 -
 .../src/site/apt/YarnCommands.apt.vm            |  369 ---
 .../hadoop-yarn-site/src/site/apt/index.apt.vm  |   82 -
 .../src/site/markdown/CapacityScheduler.md      |  186 ++
 .../site/markdown/DockerContainerExecutor.md.vm |  154 +
 .../src/site/markdown/FairScheduler.md          |  233 ++
 .../src/site/markdown/NodeManager.md            |   57 +
 .../src/site/markdown/NodeManagerCgroups.md     |   57 +
 .../src/site/markdown/NodeManagerRest.md        |  543 +++
 .../src/site/markdown/NodeManagerRestart.md     |   53 +
 .../src/site/markdown/ResourceManagerHA.md      |  140 +
 .../src/site/markdown/ResourceManagerRest.md    | 2640 +++++++++++++++
 .../src/site/markdown/ResourceManagerRestart.md |  181 +
 .../src/site/markdown/SecureContainer.md        |  135 +
 .../src/site/markdown/TimelineServer.md         |  231 ++
 .../src/site/markdown/WebApplicationProxy.md    |   24 +
 .../src/site/markdown/WebServicesIntro.md       |  569 ++++
 .../site/markdown/WritingYarnApplications.md    |  591 ++++
 .../hadoop-yarn-site/src/site/markdown/YARN.md  |   42 +
 .../src/site/markdown/YarnCommands.md           |  272 ++
 .../hadoop-yarn-site/src/site/markdown/index.md |   75 +
 37 files changed, 6186 insertions(+), 7925 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/2e44b75f/hadoop-yarn-project/CHANGES.txt
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/CHANGES.txt b/hadoop-yarn-project/CHANGES.txt
index e7af84b..02b1831 100644
--- a/hadoop-yarn-project/CHANGES.txt
+++ b/hadoop-yarn-project/CHANGES.txt
@@ -20,6 +20,9 @@ Trunk - Unreleased
     YARN-2980. Move health check script related functionality to hadoop-common
     (Varun Saxena via aw)
 
+    YARN-3168. Convert site documentation from apt to markdown (Gururaj Shetty
+    via aw)
+
   OPTIMIZATIONS
 
   BUG FIXES

http://git-wip-us.apache.org/repos/asf/hadoop/blob/2e44b75f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
deleted file mode 100644
index 8528c1a..0000000
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm
+++ /dev/null
@@ -1,368 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Capacity Scheduler
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop MapReduce Next Generation - Capacity Scheduler
-
-%{toc|section=1|fromDepth=0}
-
-* {Purpose} 
-
-  This document describes the <<<CapacityScheduler>>>, a pluggable scheduler
-  for Hadoop that allows multiple tenants to securely share a large cluster
-  such that their applications are allocated resources in a timely manner under
-  constraints of allocated capacities.
-
-* {Overview}
-
-  The <<<CapacityScheduler>>> is designed to run Hadoop applications in a
-  shared, multi-tenant cluster in an operator-friendly manner while maximizing
-  the throughput and utilization of the cluster.
-   
-  Traditionally each organization has its own private set of compute resources
-  that have sufficient capacity to meet the organization's SLA under peak or
-  near-peak conditions. This generally leads to poor average utilization and
-  the overhead of managing multiple independent clusters, one per organization.
-  Sharing clusters between organizations is a cost-effective manner of running 
-  large Hadoop installations since this allows them to reap benefits of
-  economies of scale without creating private clusters. However, organizations 
-  are concerned about sharing a cluster because they are worried about others 
-  using the resources that are critical for their SLAs. 
-   
-  The <<<CapacityScheduler>>> is designed to allow sharing a large cluster while 
-  giving each organization capacity guarantees. The central idea is 
-  that the available resources in the Hadoop cluster are shared among multiple 
-  organizations who collectively fund the cluster based on their computing 
-  needs. There is an added benefit that an organization can access 
-  any excess capacity not being used by others. This provides elasticity for 
-  the organizations in a cost-effective manner.
-   
-  Sharing clusters across organizations necessitates strong support for
-  multi-tenancy since each organization must be guaranteed capacity and
-  safeguards to ensure the shared cluster is impervious to a single rogue
-  application or user, or sets thereof. The <<<CapacityScheduler>>> provides a
-  stringent set of limits to ensure that a single application, user, or queue
-  cannot consume a disproportionate amount of resources in the cluster. Also, the
-  <<<CapacityScheduler>>> provides limits on initialized/pending applications
-  from a single user and queue to ensure fairness and stability of the cluster.
-   
-  The primary abstraction provided by the <<<CapacityScheduler>>> is the concept
-  of <queues>. These queues are typically set up by administrators to reflect the
-  economics of the shared cluster.
-  
-  To provide further control and predictability on sharing of resources, the
-  <<<CapacityScheduler>>> supports <hierarchical queues> to ensure
-  resources are shared among the sub-queues of an organization before other
-  queues are allowed to use free resources, thereby providing <affinity>
-  for sharing free resources among applications of a given organization.
-   
-* {Features}
-
-  The <<<CapacityScheduler>>> supports the following features:
-  
-  * Hierarchical Queues - A hierarchy of queues is supported to ensure resources
-    are shared among the sub-queues of an organization before other
-    queues are allowed to use free resources, thereby providing more control
-    and predictability.
-    
-  * Capacity Guarantees - Queues are allocated a fraction of the capacity of the 
-    grid in the sense that a certain capacity of resources will be at their 
-    disposal. All applications submitted to a queue will have access to the 
-    capacity allocated to the queue. Administrators can configure soft limits and
-    optional hard limits on the capacity allocated to each queue.
-    
-  * Security - Each queue has strict ACLs which control which users can submit
-    applications to individual queues. Also, there are safe-guards to ensure 
-    that users cannot view and/or modify applications from other users.
-    Also, per-queue and system administrator roles are supported.
-    
-  * Elasticity - Free resources can be allocated to any queue beyond its
-    capacity. When there is demand for these resources from queues running below
-    capacity at a future point in time, as tasks scheduled on these resources
-    complete, they will be assigned to applications on queues running below the
-    capacity (pre-emption is not supported). This ensures that resources are available
-    in a predictable and elastic manner to queues, thus preventing artificial silos
-    of resources in the cluster, which helps utilization.
-    
-  * Multi-tenancy - A comprehensive set of limits is provided to prevent a
-    single application, user, or queue from monopolizing resources of the queue
-    or the cluster as a whole, ensuring that the cluster isn't overwhelmed.
-    
-  * Operability
-  
-    * Runtime Configuration - The queue definitions and properties such as 
-      capacity, ACLs can be changed, at runtime, by administrators in a secure 
-      manner to minimize disruption to users. Also, a console is provided for 
-      users and administrators to view current allocation of resources to 
-      various queues in the system. Administrators can <add additional queues> 
-      at runtime, but queues cannot be <deleted> at runtime.
-      
-    * Drain applications - Administrators can <stop> queues
-      at runtime to ensure that while existing applications run to completion,
-      no new applications can be submitted. If a queue is in <<<STOPPED>>> 
-      state, new applications cannot be submitted to <itself> or 
-      <any of its child queues>. Existing applications continue to completion,
-      thus the queue can be <drained> gracefully.  Administrators can also 
-      <start> the stopped queues. 
-    
-  * Resource-based Scheduling - Support for resource-intensive applications,
-    wherein an application can optionally specify higher resource requirements
-    than the default, thereby accommodating applications with differing resource
-    requirements. Currently, <memory> is the only resource requirement supported.
-  
-  []
-  
-* {Configuration}
-
-  * Setting up <<<ResourceManager>>> to use <<<CapacityScheduler>>>
-  
-    To configure the <<<ResourceManager>>> to use the <<<CapacityScheduler>>>, set
-    the following property in the <<conf/yarn-site.xml>>:
-  
-*--------------------------------------+--------------------------------------+
-|| Property                            || Value                                |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.resourcemanager.scheduler.class>>> | |
-| | <<<org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler>>> |
-*--------------------------------------+--------------------------------------+
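-
-    For illustration, the same setting expressed as an XML property element in
-    <<conf/yarn-site.xml>> (a minimal sketch) looks like:
-
-----
-<property>
-  <name>yarn.resourcemanager.scheduler.class</name>
-  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
-</property>
-----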
-
-  * Setting up <queues>
-   
-    <<conf/capacity-scheduler.xml>> is the configuration file for the
-    <<<CapacityScheduler>>>.  
-  
-    The <<<CapacityScheduler>>> has a pre-defined queue called <root>. All 
-    queues in the system are children of the root queue.
-
-    Further queues can be set up by configuring
-    <<<yarn.scheduler.capacity.root.queues>>> with a list of comma-separated
-    child queues.
-    
-    The configuration for <<<CapacityScheduler>>> uses a concept called
-    <queue path> to configure the hierarchy of queues. The <queue path> is the
-    full path of the queue's hierarchy, starting at <root>, with . (dot) as the 
-    delimiter.
-    
-    A given queue's children can be defined with the configuration knob:
-    <<<yarn.scheduler.capacity.<queue-path>.queues>>>. Children do not 
-    inherit properties directly from the parent unless otherwise noted.
-
-    Here is an example with three top-level child-queues <<<a>>>, <<<b>>> and 
-    <<<c>>> and some sub-queues for <<<a>>> and <<<b>>>:
-     
-----    
-<property>
-  <name>yarn.scheduler.capacity.root.queues</name>
-  <value>a,b,c</value>
-  <description>The queues at this level (root is the root queue).
-  </description>
-</property>
-
-<property>
-  <name>yarn.scheduler.capacity.root.a.queues</name>
-  <value>a1,a2</value>
-  <description>The queues at this level (root is the root queue).
-  </description>
-</property>
-
-<property>
-  <name>yarn.scheduler.capacity.root.b.queues</name>
-  <value>b1,b2,b3</value>
-  <description>The queues at this level (root is the root queue).
-  </description>
-</property>
-----    
-
-  * Queue Properties
-  
-    * Resource Allocation
-  
-*--------------------------------------+--------------------------------------+
-|| Property                            || Description                         |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.<queue-path>.capacity>>> | |
-| | Queue <capacity> in percentage (%) as a float (e.g. 12.5).| 
-| | The sum of capacities for all queues, at each level, must be equal |
-| | to 100. | 
-| | Applications in the queue may consume more resources than the queue's | 
-| | capacity if there are free resources, providing elasticity. |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.<queue-path>.maximum-capacity>>> |   | 
-| | Maximum queue capacity in percentage (%) as a float. |
-| | This limits the <elasticity> for applications in the queue. |
-| | Defaults to -1 which disables it. |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent>>> |   | 
-| | Each queue enforces a limit on the percentage of resources allocated to a | 
-| | user at any given time, if there is demand for resources. The user limit | 
-| | can vary between a minimum and maximum value. The former |
-| | (the minimum value) is set to this property value and the latter |
-| | (the maximum value) depends on the number of users who have submitted |
-| | applications. For e.g., suppose the value of this property is 25. | 
-| | If two users have submitted applications to a queue, no single user can |
-| | use more than 50% of the queue resources. If a third user submits an | 
-| | application, no single user can use more than 33% of the queue resources. |
-| | With 4 or more users, no user can use more than 25% of the queue's |
-| | resources. A value of 100 implies no user limits are imposed. The default |
-| | is 100. Value is specified as an integer.|
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.<queue-path>.user-limit-factor>>> |   | 
-| | The multiple of the queue capacity which can be configured to allow a | 
-| | single user to acquire more resources. By default this is set to 1 which | 
-| | ensures that a single user can never take more than the queue's configured | 
-| | capacity irrespective of how idle the cluster is. Value is specified as |
-| | a float.|
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.<queue-path>.maximum-allocation-mb>>> |   |
-| | The per queue maximum limit of memory to allocate to each container |
-| | request at the Resource Manager. This setting overrides the cluster |
-| | configuration <<<yarn.scheduler.maximum-allocation-mb>>>. This value |
-| | must be smaller than or equal to the cluster maximum. |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.<queue-path>.maximum-allocation-vcores>>> |   |
-| | The per queue maximum limit of virtual cores to allocate to each container |
-| | request at the Resource Manager. This setting overrides the cluster |
-| | configuration <<<yarn.scheduler.maximum-allocation-vcores>>>. This value |
-| | must be smaller than or equal to the cluster maximum. |
-*--------------------------------------+--------------------------------------+
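-
-    For example, capacities for the top-level queues <<<a>>>, <<<b>>> and <<<c>>>
-    from the earlier example could be assigned as follows (the percentages are
-    illustrative and must sum to 100 at each level):
-
-----
-<property>
-  <name>yarn.scheduler.capacity.root.a.capacity</name>
-  <value>40</value>
-</property>
-
-<property>
-  <name>yarn.scheduler.capacity.root.b.capacity</name>
-  <value>40</value>
-</property>
-
-<property>
-  <name>yarn.scheduler.capacity.root.c.capacity</name>
-  <value>20</value>
-</property>
-----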
-
-    * Running and Pending Application Limits
-    
-    
-    The <<<CapacityScheduler>>> supports the following parameters to control 
-    the running and pending applications:
-    
-
-*--------------------------------------+--------------------------------------+
-|| Property                            || Description                         |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.maximum-applications>>> /  |
-| <<<yarn.scheduler.capacity.<queue-path>.maximum-applications>>>  | |
-| | Maximum number of applications in the system which can be concurrently |
-| | active, both running and pending. Limits on each queue are directly |
-| | proportional to their queue capacities and user limits. This is a 
-| | hard limit and any applications submitted when this limit is reached will |
-| | be rejected. Default is 10000. This can be set for all queues with |
-| | <<<yarn.scheduler.capacity.maximum-applications>>> and can also be overridden on a  |
-| | per queue basis by setting <<<yarn.scheduler.capacity.<queue-path>.maximum-applications>>>. |
-| | Integer value expected.|
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.maximum-am-resource-percent>>> / |
-| <<<yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent>>> | |
-| | Maximum percent of resources in the cluster which can be used to run |
-| | application masters - controls the number of concurrent active applications. Limits on each |
-| | queue are directly proportional to their queue capacities and user limits. |
-| | Specified as a float - i.e. 0.5 = 50%. Default is 10%. This can be set for all queues with |
-| | <<<yarn.scheduler.capacity.maximum-am-resource-percent>>> and can also be overridden on a  |
-| | per queue basis by setting <<<yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent>>> |
-*--------------------------------------+--------------------------------------+
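-
-    As a sketch, the cluster-wide application limit and a per-queue override
-    (values are examples only) could be configured as:
-
-----
-<property>
-  <name>yarn.scheduler.capacity.maximum-applications</name>
-  <value>10000</value>
-</property>
-
-<property>
-  <name>yarn.scheduler.capacity.root.a.maximum-applications</name>
-  <value>2000</value>
-</property>
-----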
-
-    * Queue Administration & Permissions
-    
-    The <<<CapacityScheduler>>> supports the following parameters to
-    administer the queues:
-    
-    
-*--------------------------------------+--------------------------------------+
-|| Property                            || Description                         |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.<queue-path>.state>>> | |
-| | The <state> of the queue. Can be one of <<<RUNNING>>> or <<<STOPPED>>>. |
-| | If a queue is in <<<STOPPED>>> state, new applications cannot be |
-| | submitted to <itself> or <any of its child queues>. | 
-| | Thus, if the <root> queue is <<<STOPPED>>> no applications can be | 
-| | submitted to the entire cluster. |
-| | Existing applications continue to completion, thus the queue can be 
-| | <drained> gracefully. Value is specified as Enumeration. |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.root.<queue-path>.acl_submit_applications>>> | |
-| | The <ACL> which controls who can <submit> applications to the given queue. |
-| | If the given user/group has necessary ACLs on the given queue or |
-| | <one of the parent queues in the hierarchy> they can submit applications. |
-| | <ACLs> for this property <are> inherited from the parent queue |
-| | if not specified. |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue>>> | |
-| | The <ACL> which controls who can <administer> applications on the given queue. |
-| | If the given user/group has necessary ACLs on the given queue or |
-| | <one of the parent queues in the hierarchy> they can administer applications. |
-| | <ACLs> for this property <are> inherited from the parent queue |
-| | if not specified. |
-*--------------------------------------+--------------------------------------+
-    
-    <Note:> An <ACL> is of the form <user1>, <user2><space><group1>, <group2>.
-    The special value of <<*>> implies <anyone>. The special value of <space>
-    implies <no one>. The default is <<*>> for the root queue if not specified.
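-
-    For example, to let users <<<user1>>> and <<<user2>>> and the group
-    <<<group1>>> (placeholder names) submit applications to queue <<<a>>>:
-
-----
-<property>
-  <name>yarn.scheduler.capacity.root.a.acl_submit_applications</name>
-  <value>user1,user2 group1</value>
-</property>
-----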
-
-  * Other Properties
-
-    * Resource Calculator
-
-
-*--------------------------------------+--------------------------------------+
-|| Property                            || Description                         |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.resource-calculator>>> | |
-| | The ResourceCalculator implementation to be used to compare Resources in the |
-| | scheduler. The default, i.e. org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, |
-| | only uses Memory while DominantResourceCalculator uses Dominant-resource |
-| | to compare multi-dimensional resources such as Memory, CPU etc. A Java |
-| | ResourceCalculator class name is expected. |
-*--------------------------------------+--------------------------------------+
-
-
-    * Data Locality
-
-*--------------------------------------+--------------------------------------+
-|| Property                            || Description                         |
-*--------------------------------------+--------------------------------------+
-| <<<yarn.scheduler.capacity.node-locality-delay>>> | |
-| | Number of missed scheduling opportunities after which the CapacityScheduler |
-| | attempts to schedule rack-local containers. Typically, this should be set to |
-| | the number of nodes in the cluster. By default it is set to 40, which is |
-| | approximately the number of nodes in one rack. A positive integer value is expected.|
-*--------------------------------------+--------------------------------------+
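-
-    For instance, on a cluster of roughly 100 nodes (an assumed size), the delay
-    could be aligned with the cluster size:
-
-----
-<property>
-  <name>yarn.scheduler.capacity.node-locality-delay</name>
-  <value>100</value>
-</property>
-----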
-
-
-  * Reviewing the configuration of the CapacityScheduler
-
-      Once the installation and configuration is complete, you can review the
-      setup from the web UI after starting the YARN cluster.
-
-    * Start the YARN cluster in the normal manner.
-
-    * Open the <<<ResourceManager>>> web UI.
-
-    * The </scheduler> web-page should show the resource usages of individual 
-      queues.
-      
-      []
-      
-* {Changing Queue Configuration}
-
-  Changing queue properties and adding new queues is very simple. You need to
-  edit <<conf/capacity-scheduler.xml>> and run <yarn rmadmin -refreshQueues>.
-  
-----
-$ vi $HADOOP_CONF_DIR/capacity-scheduler.xml
-$ $HADOOP_YARN_HOME/bin/yarn rmadmin -refreshQueues
-----  
-
-  <Note:> Queues cannot be <deleted>; only the addition of new queues is supported.
-  The updated queue configuration must be valid, i.e. the queue capacities at
-  each <level> must sum to 100%.
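-
-  For example, a new top-level queue <<<d>>> (the name and capacity are
-  illustrative) could be added by extending the queue list and re-balancing the
-  capacities in <<conf/capacity-scheduler.xml>> before running the refresh:
-
-----
-<property>
-  <name>yarn.scheduler.capacity.root.queues</name>
-  <value>a,b,c,d</value>
-</property>
-
-<property>
-  <name>yarn.scheduler.capacity.root.d.capacity</name>
-  <value>10</value>
-</property>
-----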
-

http://git-wip-us.apache.org/repos/asf/hadoop/blob/2e44b75f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/DockerContainerExecutor.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/DockerContainerExecutor.apt.vm b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/DockerContainerExecutor.apt.vm
deleted file mode 100644
index db75de9..0000000
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/DockerContainerExecutor.apt.vm
+++ /dev/null
@@ -1,204 +0,0 @@
-
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Docker Container Executor
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Docker Container Executor
-
-%{toc|section=1|fromDepth=0}
-
-* {Overview}
-
-    Docker (https://www.docker.io/) combines an easy-to-use interface to
-Linux containers with easy-to-construct image files for those
-containers.  In short, Docker launches very lightweight virtual
-machines.
-
-    The Docker Container Executor (DCE) allows the YARN NodeManager to
-launch YARN containers into Docker containers.  Users can specify the
-Docker images they want for their YARN containers.  These containers
-provide a custom software environment in which the user's code runs,
-isolated from the software environment of the NodeManager.  These
-containers can include special libraries needed by the application,
-and they can have different versions of Perl, Python, and even Java
-than what is installed on the NodeManager.  Indeed, these containers
-can run a different flavor of Linux than what is running on the
-NodeManager -- although the YARN container must define all the environment
-settings and libraries needed to run the job, nothing will be shared with the NodeManager.
-
-   Docker for YARN provides both consistency (all YARN containers will
-have the same software environment) and isolation (no interference
-with whatever is installed on the physical machine).
-  
-* {Cluster Configuration}
-
-    Docker Container Executor runs in non-secure mode of HDFS and
-YARN. It will not run in secure mode, and will exit if it detects
-secure mode.
-
-    The DockerContainerExecutor requires the Docker daemon to be running on
-the NodeManagers, and the Docker client to be installed and able to start Docker
-containers.  To prevent timeouts while starting jobs, the Docker
-images to be used by a job should already be downloaded on the
-NodeManagers. Here's an example of how this can be done:
-
-----
-sudo docker pull sequenceiq/hadoop-docker:2.4.1
-----
-
-   This should be done as part of the NodeManager startup.
-
-   The following properties must be set in yarn-site.xml:
-
-----
-<property>
- <name>yarn.nodemanager.docker-container-executor.exec-name</name>
-  <value>/usr/bin/docker</value>
-  <description>
-     Name or path to the Docker client. This is a required parameter. If this is empty,
-     the user must pass an image name as part of the job invocation (see below).
-  </description>
-</property>
-
-<property>
-  <name>yarn.nodemanager.container-executor.class</name>
-  <value>org.apache.hadoop.yarn.server.nodemanager.DockerContainerExecutor</value>
-  <description>
-     This is the container executor setting that ensures that all
-jobs are started with the DockerContainerExecutor.
-  </description>
-</property>
-----
-
-   Administrators should be aware that DCE doesn't currently provide
-user name-space isolation.  This means, in particular, that software
-running as root in the YARN container will have root privileges in the
-underlying NodeManager.  Put differently, DCE currently provides no
-better security guarantees than YARN's Default Container Executor. In
-fact, DockerContainerExecutor will exit if it detects secure YARN.
-
-* {Tips for connecting to a secure docker repository}
-
-   By default, Docker images are pulled from the Docker public repository. The
-format of a Docker image URL is: <username>/<image_name>. For example,
-sequenceiq/hadoop-docker:2.4.1 is an image in the Docker public repository that contains Java and
-Hadoop.
-
-   If you want your own private repository, you provide the repository URL instead of
-your username. Therefore, the image URL becomes: <private_repo_url>/<image_name>.
-For example, if your repository is on localhost:8080, an image URL would look like:
- localhost:8080/hadoop-docker
-
-   To connect to a secure docker repository, you can use the following invocation:
-
-----
-docker login [OPTIONS] [SERVER]
-
-Register or log in to a Docker registry server, if no server is specified
-"https://index.docker.io/v1/" is the default.
-
--e, --email=""       Email
--p, --password=""    Password
--u, --username=""    Username
-----
-
-   If you want to login to a self-hosted registry you can specify this by adding
-the server name.
-
-----
-docker login <private_repo_url>
-----
-
-   This needs to be run as part of the NodeManager startup, or as a cron job if
-the login session expires periodically. You can log in to multiple Docker repositories
-from the same NodeManager, but all your users will have access to all your repositories,
-as at present the DockerContainerExecutor does not support per-job Docker login.
-
-* {Job Configuration}
-
-   Currently you cannot configure any of the Docker settings with the job configuration.
-You can provide Mapper, Reducer, and ApplicationMaster environment overrides for the
-Docker images, using the following 3 JVM properties respectively (only for MR jobs):
-
-  * mapreduce.map.env: You can override the mapper's image by passing
-    yarn.nodemanager.docker-container-executor.image-name=<your_image_name>
-    to this JVM property.
-
-  * mapreduce.reduce.env: You can override the reducer's image by passing
-    yarn.nodemanager.docker-container-executor.image-name=<your_image_name>
-    to this JVM property.
-
-  * yarn.app.mapreduce.am.env: You can override the ApplicationMaster's image
-    by passing yarn.nodemanager.docker-container-executor.image-name=<your_image_name>
-    to this JVM property.
-
-* {Docker Image requirements}
-
-   The Docker Images used for YARN containers must meet the following
-requirements:
-
-   The distro and version of Linux in your Docker Image can be quite different 
-from that of your NodeManager.  (Docker does have a few limitations in this 
-regard, but you're not likely to hit them.)  However, if you're using the 
-MapReduce framework, then your image will need to be configured for running 
-Hadoop. Java must be installed in the container, and the following environment variables
-must be defined in the image: JAVA_HOME, HADOOP_COMMON_PATH, HADOOP_HDFS_HOME,
-HADOOP_MAPRED_HOME, HADOOP_YARN_HOME, and HADOOP_CONF_DIR.
-
-
-* {Working example of YARN launched Docker containers}
-
-  The following example shows how to run teragen using DockerContainerExecutor.
-
-  * First ensure that YARN is properly configured with DockerContainerExecutor (see above).
-
-----
-<property>
- <name>yarn.nodemanager.docker-container-executor.exec-name</name>
-  <value>docker -H=tcp://0.0.0.0:4243</value>
-  <description>
-     Name or path to the Docker client. The tcp socket must be
-     where docker daemon is listening.
-  </description>
-</property>
-
-<property>
-  <name>yarn.nodemanager.container-executor.class</name>
-  <value>org.apache.hadoop.yarn.server.nodemanager.DockerContainerExecutor</value>
-  <description>
-     This is the container executor setting that ensures that all
-jobs are started with the DockerContainerExecutor.
-  </description>
-</property>
-----
-
-  * Pick a custom Docker image if you want. In this example, we'll use sequenceiq/hadoop-docker:2.4.1 from the
-Docker Hub repository. It has the JDK, Hadoop, and all the previously mentioned environment variables configured.
-
-  * Run:
-
-----
-hadoop jar $HADOOP_INSTALLATION_DIR/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
-teragen \
--Dmapreduce.map.env="yarn.nodemanager.docker-container-executor.image-name=sequenceiq/hadoop-docker:2.4.1" \
--Dyarn.app.mapreduce.am.env="yarn.nodemanager.docker-container-executor.image-name=sequenceiq/hadoop-docker:2.4.1" \
-1000 \
-teragen_out_dir
-----
-
-  Once it succeeds, you can check the YARN debug logs to verify that Docker has indeed launched containers.
-

http://git-wip-us.apache.org/repos/asf/hadoop/blob/2e44b75f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm
deleted file mode 100644
index 10de3e0..0000000
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm
+++ /dev/null
@@ -1,483 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Hadoop Map Reduce Next Generation-${project.version} - Fair Scheduler
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Hadoop MapReduce Next Generation - Fair Scheduler
-
-%{toc|section=1|fromDepth=0}
-
-* {Purpose} 
-
-  This document describes the <<<FairScheduler>>>, a pluggable scheduler for Hadoop 
-  that allows YARN applications to share resources in large clusters fairly.
-
-* {Introduction}
-
-  Fair scheduling is a method of assigning resources to applications such that 
-  all apps get, on average, an equal share of resources over time.
-  Hadoop NextGen is capable of scheduling multiple resource types. By default,
-  the Fair Scheduler bases scheduling fairness decisions only on memory. It
-  can be configured to schedule with both memory and CPU, using the notion
-  of Dominant Resource Fairness developed by Ghodsi et al. When there is a
-  single app running, that app uses the entire cluster. When other apps are
-  submitted, resources that free up are assigned to the new apps, so that each
-  app eventually gets roughly the same amount of resources. Unlike the default
-  Hadoop scheduler, which forms a queue of apps, this lets short apps finish in
-  reasonable time while not starving long-lived apps. It is also a reasonable way
-  to share a cluster between a number of users. Finally, fair sharing can also
-  work with app priorities - the priorities are used as weights to determine the 
-  fraction of total resources that each app should get.
-
-  The scheduler organizes apps further into "queues", and shares resources
-  fairly between these queues. By default, all users share a single queue,
-  named "default". If an app specifically lists a queue in a container resource
-  request, the request is submitted to that queue. It is also possible to assign
-  queues based on the user name included with the request through
-  configuration. Within each queue, a scheduling policy is used to share
-  resources between the running apps. The default is memory-based fair sharing,
-  but FIFO and multi-resource with Dominant Resource Fairness can also be
-  configured. Queues can be arranged in a hierarchy to divide resources and
-  configured with weights to share the cluster in specific proportions.
-
-  In addition to providing fair sharing, the Fair Scheduler allows assigning 
-  guaranteed minimum shares to queues, which is useful for ensuring that 
-  certain users, groups or production applications always get sufficient 
-  resources. When a queue contains apps, it gets at least its minimum share, 
-  but when the queue does not need its full guaranteed share, the excess is 
-  split between other running apps. This lets the scheduler guarantee capacity 
-  for queues while utilizing resources efficiently when these queues don't
-  contain applications.
-
-  The Fair Scheduler lets all apps run by default, but it is also possible to 
-  limit the number of running apps per user and per queue through the config 
-  file. This can be useful when a user must submit hundreds of apps at once, 
-  or in general to improve performance if running too many apps at once would 
-  cause too much intermediate data to be created or too much context-switching.
-  Limiting the apps does not cause any subsequently submitted apps to fail, 
-  only to wait in the scheduler's queue until some of the user's earlier apps 
-  finish. 
-
-* {Hierarchical queues with pluggable policies}
-
-  The fair scheduler supports hierarchical queues. All queues descend from a
-  queue named "root". Available resources are distributed among the children
-  of the root queue in the typical fair scheduling fashion. Then, the children
-  distribute the resources assigned to them to their children in the same
-  fashion.  Applications may only be scheduled on leaf queues. Queues can be
-  specified as children of other queues by placing them as sub-elements of 
-  their parents in the fair scheduler allocation file.
-  
-  A queue's name starts with the names of its parents, with periods as
-  separators. So a queue named "queue1" under the root queue, would be referred
-  to as "root.queue1", and a queue named "queue2" under a queue named "parent1"
-  would be referred to as "root.parent1.queue2". When referring to queues, the
-  root part of the name is optional, so queue1 could be referred to as just
-  "queue1", and a queue2 could be referred to as just "parent1.queue2".
-
-  Additionally, the fair scheduler allows setting a different custom policy for
-  each queue to allow sharing the queue's resources in any which way the user
-  wants. A custom policy can be built by extending
-  <<<org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.SchedulingPolicy>>>.
-  FifoPolicy, FairSharePolicy (default), and DominantResourceFairnessPolicy are
-  built-in and can be readily used.
-
-  Certain add-ons that existed in the original (MR1) Fair Scheduler are not yet
-  supported. Among them is the use of custom policies governing
-  priority "boosting" over certain apps.
-
-* {Automatically placing applications in queues}
-
-  The Fair Scheduler allows administrators to configure policies that
-  automatically place submitted applications into appropriate queues. Placement
-  can depend on the user and groups of the submitter and the requested queue
-  passed by the application. A policy consists of a set of rules that are applied
-  sequentially to classify an incoming application. Each rule either places the
-  app into a queue, rejects it, or continues on to the next rule. Refer to the
-  allocation file format below for how to configure these policies.
-
-* {Installation}
-
-  To use the Fair Scheduler first assign the appropriate scheduler class in 
-  yarn-site.xml:
-
-------
-<property>
-  <name>yarn.resourcemanager.scheduler.class</name>
-  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
-</property>
-------
-
-* {Configuration}
-
-  Customizing the Fair Scheduler typically involves altering two files. First, 
-  scheduler-wide options can be set by adding configuration properties in the 
-  yarn-site.xml file in your existing configuration directory. Second, in 
-  most cases users will want to create an allocation file listing which queues 
-  exist and their respective weights and capacities. The allocation file
-  is reloaded every 10 seconds, allowing changes to be made on the fly.
-
-Properties that can be placed in yarn-site.xml
-
- * <<<yarn.scheduler.fair.allocation.file>>>
-
-   * Path to allocation file. An allocation file is an XML manifest describing
-     queues and their properties, in addition to certain policy defaults. This file
-     must be in the XML format described in the next section. If a relative path is
-     given, the file is searched for on the classpath (which typically includes
-     the Hadoop conf directory).
-     Defaults to fair-scheduler.xml.
-
- * <<<yarn.scheduler.fair.user-as-default-queue>>>
-
-    * Whether to use the username associated with the allocation as the default 
-      queue name, in the event that a queue name is not specified. If this is set 
-      to "false" or unset, all jobs have a shared default queue, named "default".
-      Defaults to true.  If a queue placement policy is given in the allocations
-      file, this property is ignored.
-
- * <<<yarn.scheduler.fair.preemption>>>
-
-    * Whether to use preemption. Defaults to false.
-
- * <<<yarn.scheduler.fair.preemption.cluster-utilization-threshold>>>
-
-    * The utilization threshold after which preemption kicks in. The
-      utilization is computed as the maximum ratio of usage to capacity among
-      all resources. Defaults to 0.8f.
-
- * <<<yarn.scheduler.fair.sizebasedweight>>>
-  
-    * Whether to assign shares to individual apps based on their size, rather than
-      providing an equal share to all apps regardless of size. When set to true,
-      apps are weighted by the natural logarithm of one plus the app's total
-      requested memory, divided by the natural logarithm of 2. Defaults to false.
-
- * <<<yarn.scheduler.fair.assignmultiple>>>
-
-    * Whether to allow multiple container assignments in one heartbeat. Defaults
-      to false.
-
- * <<<yarn.scheduler.fair.max.assign>>>
-
-    * If assignmultiple is true, the maximum number of containers that can be
-      assigned in one heartbeat. Defaults to -1, which sets no limit.
-
- * <<<yarn.scheduler.fair.locality.threshold.node>>>
-
-    * For applications that request containers on particular nodes, the number of
-      scheduling opportunities since the last container assignment to wait before
-      accepting a placement on another node. Expressed as a float between 0 and 1,
-      which, as a fraction of the cluster size, is the number of scheduling
-      opportunities to pass up. The default value of -1.0 means don't pass up any
-      scheduling opportunities.
-
- * <<<yarn.scheduler.fair.locality.threshold.rack>>>
-
-    * For applications that request containers on particular racks, the number of
-      scheduling opportunities since the last container assignment to wait before
-      accepting a placement on another rack. Expressed as a float between 0 and 1,
-      which, as a fraction of the cluster size, is the number of scheduling
-      opportunities to pass up. The default value of -1.0 means don't pass up any
-      scheduling opportunities.
-
- * <<<yarn.scheduler.fair.allow-undeclared-pools>>>
-
-    * If this is true, new queues can be created at application submission time,
-      whether because they are specified as the application's queue by the
-      submitter or because they are placed there by the user-as-default-queue
-      property. If this is false, any time an app would be placed in a queue that
-      is not specified in the allocations file, it is placed in the "default" queue
-      instead. Defaults to true. If a queue placement policy is given in the
-      allocations file, this property is ignored.
-
- * <<<yarn.scheduler.fair.update-interval-ms>>>
- 
-    * The interval at which to lock the scheduler and recalculate fair shares,
-      recalculate demand, and check whether anything is due for preemption.
-      Defaults to 500 ms. 
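-
-  For illustration, a couple of the properties above might be set in
-  yarn-site.xml as follows (the file path and values are examples only):
-
-------
-<property>
-  <name>yarn.scheduler.fair.allocation.file</name>
-  <value>/etc/hadoop/fair-scheduler.xml</value>
-</property>
-
-<property>
-  <name>yarn.scheduler.fair.preemption</name>
-  <value>true</value>
-</property>
-------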
-
-Allocation file format
-
-  The allocation file must be in XML format. The format contains five types of
-  elements:
-
- * <<Queue elements>>, which represent queues. Queue elements can take an optional
-   attribute 'type', which when set to 'parent' makes it a parent queue. This is useful
-   when we want to create a parent queue without configuring any leaf queues.
-   Each queue element may contain the following properties:
-
-   * minResources: minimum resources the queue is entitled to, in the form
-     "X mb, Y vcores". For the single-resource fairness policy, the vcores
-     value is ignored. If a queue's minimum share is not satisfied, it will be
-     offered available resources before any other queue under the same parent.
-     Under the single-resource fairness policy, a queue
-     is considered unsatisfied if its memory usage is below its minimum memory
-     share. Under dominant resource fairness, a queue is considered unsatisfied
-     if its usage for its dominant resource with respect to the cluster capacity
-     is below its minimum share for that resource. If multiple queues are
-     unsatisfied in this situation, resources go to the queue with the smallest
-     ratio between relevant resource usage and minimum. Note that it is
-     possible that a queue that is below its minimum may not immediately get up
-     to its minimum when it submits an application, because already-running jobs
-     may be using those resources.
-
-   * maxResources: maximum resources a queue is allowed, in the form
-     "X mb, Y vcores". For the single-resource fairness policy, the vcores
-     value is ignored. A queue will never be assigned a container that would
-     put its aggregate usage over this limit.
-
-   * maxRunningApps: limit the number of apps from the queue to run at once
-
-   * maxAMShare: limit the fraction of the queue's fair share that can be used
-     to run application masters. This property can only be used for leaf queues.
-     For example, if set to 1.0f, then AMs in the leaf queue can take up to 100%
-     of both the memory and CPU fair share. The value of -1.0f will disable
-     this feature and the amShare will not be checked. The default value is 0.5f.
-
-   * weight: to share the cluster non-proportionally with other queues. Weights
-     default to 1, and a queue with weight 2 should receive approximately twice
-     as many resources as a queue with the default weight.
-
-   * schedulingPolicy: to set the scheduling policy of any queue. The allowed
-     values are "fifo"/"fair"/"drf" or any class that extends
-     <<<org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.SchedulingPolicy>>>. 
-     Defaults to "fair". If "fifo", apps with earlier submit times are given preference
-     for containers, but apps submitted later may run concurrently if there is
-     leftover space on the cluster after satisfying the earlier app's requests.
-
-   * aclSubmitApps: a list of users and/or groups that can submit apps to the
-     queue. Refer to the ACLs section below for more info on the format of this
-     list and how queue ACLs work.
-
-   * aclAdministerApps: a list of users and/or groups that can administer a
-     queue.  Currently the only administrative action is killing an application.
-     Refer to the ACLs section below for more info on the format of this list
-     and how queue ACLs work.
-
-   * minSharePreemptionTimeout: number of seconds the queue is under its minimum share
-     before it will try to preempt containers to take resources from other queues.
-     If not set, the queue will inherit the value from its parent queue.
-
-   * fairSharePreemptionTimeout: number of seconds the queue is under its fair share
-     threshold before it will try to preempt containers to take resources from other
-     queues. If not set, the queue will inherit the value from its parent queue.
-
-   * fairSharePreemptionThreshold: the fair share preemption threshold for the
-     queue. If the queue waits fairSharePreemptionTimeout without receiving
-     fairSharePreemptionThreshold*fairShare resources, it is allowed to preempt
-     containers to take resources from other queues. If not set, the queue will
-     inherit the value from its parent queue.
-
- * <<User elements>>, which represent settings governing the behavior of individual 
-     users. They can contain a single property: maxRunningApps, a limit on the 
-     number of running apps for a particular user.
-
- * <<A userMaxAppsDefault element>>, which sets the default running app limit 
-   for any users whose limit is not otherwise specified.
-
- * <<A defaultFairSharePreemptionTimeout element>>, which sets the fair share
-   preemption timeout for the root queue; overridden by fairSharePreemptionTimeout
-   element in root queue.
-
- * <<A defaultMinSharePreemptionTimeout element>>, which sets the min share
-   preemption timeout for the root queue; overridden by minSharePreemptionTimeout
-   element in root queue.
-
- * <<A defaultFairSharePreemptionThreshold element>>, which sets the fair share
-   preemption threshold for the root queue; overridden by fairSharePreemptionThreshold
-   element in root queue.
-
- * <<A queueMaxAppsDefault element>>, which sets the default running app limit
-   for queues; overridden by maxRunningApps element in each queue.
-
- * <<A queueMaxAMShareDefault element>>, which sets the default AM resource
-   limit for queues; overridden by maxAMShare element in each queue.
-
- * <<A defaultQueueSchedulingPolicy element>>, which sets the default scheduling
-   policy for queues; overridden by the schedulingPolicy element in each queue
-   if specified. Defaults to "fair".
-
- * <<A queuePlacementPolicy element>>, which contains a list of rule elements
-   that tell the scheduler how to place incoming apps into queues. Rules
-   are applied in the order that they are listed. Rules may take arguments. All
-   rules accept the "create" argument, which indicates whether the rule can create
-   a new queue. "Create" defaults to true; if set to false and the rule would
-   place the app in a queue that is not configured in the allocations file, we
-   continue on to the next rule. The last rule must be one that can never issue a
-   continue.  Valid rules are:
-
-     * specified: the app is placed into the queue it requested.  If the app
-       requested no queue, i.e. it specified "default", we continue. If the app
-       requested a queue name starting or ending with a period, i.e. a name like
-       ".q1" or "q1.", it will be rejected.
-
-     * user: the app is placed into a queue with the name of the user who
-       submitted it. Periods in the username will be replaced with "_dot_",
-       i.e. the queue name for user "first.last" is "first_dot_last".
-
-     * primaryGroup: the app is placed into a queue with the name of the
-       primary group of the user who submitted it. Periods in the group name
-       will be replaced with "_dot_", i.e. the queue name for group "one.two"
-       is "one_dot_two".
-
-     * secondaryGroupExistingQueue: the app is placed into a queue with a name
-       that matches a secondary group of the user who submitted it. The first
-       secondary group that matches a configured queue will be selected.
-       Periods in group names will be replaced with "_dot_", i.e. a user with
-       "one.two" as one of their secondary groups would be placed into the
-       "one_dot_two" queue, if such a queue exists.
-
-     * nestedUserQueue: the app is placed into a queue with the name of the user
-       under the queue suggested by the nested rule. This is similar to the 'user'
-       rule, the difference being that with the 'nestedUserQueue' rule, user queues can be created
-       under any parent queue, while the 'user' rule creates user queues only under the root queue.
-       Note that the nestedUserQueue rule would be applied only if the nested rule returns a
-       parent queue. One can configure a parent queue either by setting the 'type' attribute of the queue
-       to 'parent' or by configuring at least one leaf under that queue, which makes it a parent.
-       See the example allocation file for a sample use case.
-
-     * default: the app is placed into the queue specified in the 'queue' attribute of the 
-       default rule. If 'queue' attribute is not specified, the app is placed into 'root.default' queue.
-
-     * reject: the app is rejected.
-
-  An example allocation file is given here:
-
----
-<?xml version="1.0"?>
-<allocations>
-  <queue name="sample_queue">
-    <minResources>10000 mb,0vcores</minResources>
-    <maxResources>90000 mb,0vcores</maxResources>
-    <maxRunningApps>50</maxRunningApps>
-    <maxAMShare>0.1</maxAMShare>
-    <weight>2.0</weight>
-    <schedulingPolicy>fair</schedulingPolicy>
-    <queue name="sample_sub_queue">
-      <aclSubmitApps>charlie</aclSubmitApps>
-      <minResources>5000 mb,0vcores</minResources>
-    </queue>
-  </queue>
-
-  <queueMaxAMShareDefault>0.5</queueMaxAMShareDefault>
-
-  <!-- Queue 'secondary_group_queue' is a parent queue and may have
-       user queues under it -->
-  <queue name="secondary_group_queue" type="parent">
-  <weight>3.0</weight>
-  </queue>
-  
-  <user name="sample_user">
-    <maxRunningApps>30</maxRunningApps>
-  </user>
-  <userMaxAppsDefault>5</userMaxAppsDefault>
-  
-  <queuePlacementPolicy>
-    <rule name="specified" />
-    <rule name="primaryGroup" create="false" />
-    <rule name="nestedUserQueue">
-        <rule name="secondaryGroupExistingQueue" create="false" />
-    </rule>
-    <rule name="default" queue="sample_queue"/>
-  </queuePlacementPolicy>
-</allocations>
----
-
-  Note that for backwards compatibility with the original FairScheduler, "queue" elements can instead be named as "pool" elements.
-
-
-Queue Access Control Lists (ACLs)
-
-  Queue Access Control Lists (ACLs) allow administrators to control who may
-  take actions on particular queues. They are configured with the aclSubmitApps
-  and aclAdministerApps properties, which can be set per queue. Currently the
-  only supported administrative action is killing an application. Anybody who
-  may administer a queue may also submit applications to it. These properties
-  take values in a format like "user1,user2 group1,group2" or " group1,group2".
-  An action on a queue will be permitted if its user or group is in the ACL of
-  that queue or in the ACL of any of that queue's ancestors. So if queue2
-  is inside queue1, and user1 is in queue1's ACL, and user2 is in queue2's
-  ACL, then both users may submit to queue2.
-
-  <<Note:>> The delimiter is a space character. To specify only ACL groups, begin the 
-  value with a space character. 
-  
-  The root queue's ACLs are "*" by default which, because ACLs are passed down,
-  means that everybody may submit to and kill applications from every queue.
-  To start restricting access, change the root queue's ACLs to something other
-  than "*". 
-
-  
-* {Administration}
-
-  The fair scheduler provides support for administration at runtime through a few mechanisms:
-
-Modifying configuration at runtime
-
-  It is possible to modify minimum shares, limits, weights, preemption timeouts
-  and queue scheduling policies at runtime by editing the allocation file. The
-  scheduler will reload this file 10-15 seconds after it sees that it was
-  modified.
-
-Monitoring through web UI
-
-  Current applications, queues, and fair shares can be examined through the
-  ResourceManager's web interface, at
-  http://<ResourceManager URL>/cluster/scheduler.
-
-  The following fields can be seen for each queue on the web interface:
-  
- * Used Resources - The sum of resources allocated to containers within the queue. 
-
- * Num Active Applications - The number of applications in the queue that have
-   received at least one container.
- 
- * Num Pending Applications - The number of applications in the queue that have
-   not yet received any containers.
-
- * Min Resources - The configured minimum resources that are guaranteed to the queue.
-  	
- * Max Resources - The configured maximum resources that are allowed to the queue.
- 
- * Instantaneous Fair Share - The queue's instantaneous fair share of resources.
-   These shares consider only active queues (those with running applications),
-   and are used for scheduling decisions. Queues may be allocated resources
-   beyond their shares when other queues aren't using them. A queue whose
-   resource consumption lies at or below its instantaneous fair share will never
-   have its containers preempted.
-
- * Steady Fair Share - The queue's steady fair share of resources. These shares
-   consider all the queues irrespective of whether they are active (have
-   running applications) or not. These are computed less frequently and
-   change only when the configuration or capacity changes. They are meant to
-   provide visibility into resources the user can expect, and hence displayed
-   in the Web UI.
-
-Moving applications between queues
-
-  The Fair Scheduler supports moving a running application to a different queue.
-  This can be useful for moving an important application to a higher priority
-  queue, or for moving an unimportant application to a lower priority queue.
-  Apps can be moved by running "yarn application -movetoqueue appID -queue
-  targetQueueName".
-  
-  When an application is moved to a queue, its existing allocations become
-  counted with the new queue's allocations instead of the old for purposes
-  of determining fairness. An attempt to move an application to a queue will
-  fail if the addition of the app's resources to that queue would violate its
-  maxRunningApps or maxResources constraints.
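-
-  As an illustrative sketch (the queue name and limits below are hypothetical),
-  the constraints that such a move is checked against are the target queue's
-  maxRunningApps and maxResources settings in the allocation file:
-
----
-<queue name="target_queue">
-  <!-- a move into this queue is rejected if it would exceed either limit -->
-  <maxRunningApps>10</maxRunningApps>
-  <maxResources>20000 mb,10vcores</maxResources>
-</queue>
----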
-

http://git-wip-us.apache.org/repos/asf/hadoop/blob/2e44b75f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManager.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManager.apt.vm b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManager.apt.vm
deleted file mode 100644
index 9ee942f..0000000
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManager.apt.vm
+++ /dev/null
@@ -1,64 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  NodeManager Overview.
-  ---
-  ---
-  ${maven.build.timestamp}
-
-NodeManager Overview.
-
-%{toc|section=1|fromDepth=0|toDepth=2}
-
-* Overview
-
-  The NodeManager is responsible for launching and managing containers on a node. Containers execute tasks as specified by the AppMaster.
-  
-* Health checker service
-
-  The NodeManager runs services to determine the health of the node it is executing on. The services perform checks on the disk as well as any user-specified tests. If any health check fails, the NodeManager marks the node as unhealthy and communicates this to the ResourceManager, which then stops assigning containers to the node. Communication of the node status is done as part of the heartbeat between the NodeManager and the ResourceManager. The intervals at which the disk checker and health monitor (described below) run don't affect the heartbeat intervals. When the heartbeat takes place, the status of both checks is used to determine the health of the node.
-
-  ** Disk checker
-
-    The disk checker checks the state of the disks that the NodeManager is configured to use (local-dirs and log-dirs, configured using yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs respectively). The checks include permissions and free disk space. It also checks that the filesystem isn't in a read-only state. The checks are run at 2 minute intervals by default but can be configured to run as often as the user desires. If a disk fails the check, the NodeManager stops using that particular disk but still reports the node status as healthy. However, if a number of disks fail the check (the number can be configured, as explained below), then the node is reported as unhealthy to the ResourceManager and new containers will not be assigned to the node. In addition, once a disk is marked as unhealthy, the NodeManager stops checking it to see if it has recovered (e.g. the disk became full and was then cleaned up). The only way for the NodeManager to use that disk again is to restart the NodeManager software on the node. The following configuration parameters can be used to modify the disk checks:
-
-*------------------+----------------+------------------+
-|| Configuration name || Allowed Values || Description |
-*------------------+----------------+------------------+
-| yarn.nodemanager.disk-health-checker.enable | true, false | Enable or disable the disk health checker service |
-*------------------+----------------+------------------+
-| yarn.nodemanager.disk-health-checker.interval-ms | Positive integer | The interval, in milliseconds, at which the disk checker should run; the default value is 2 minutes |
-*------------------+----------------+------------------+
-| yarn.nodemanager.disk-health-checker.min-healthy-disks | Float between 0-1 | The minimum fraction of disks that must pass the check for the NodeManager to mark the node as healthy; the default is 0.25 |
-*------------------+----------------+------------------+
-| yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage | Float between 0-100 | The maximum percentage of disk space that may be utilized before a disk is marked as unhealthy by the disk checker service. This check is run for every disk used by the NodeManager. The default value is 100 i.e. the entire disk can be used. |
-*------------------+----------------+------------------+
-| yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb | Integer | The minimum amount of free space that must be available on the disk for the disk checker service to mark the disk as healthy. This check is run for every disk used by the NodeManager. The default value is 0 i.e. the entire disk can be used. |
-*------------------+----------------+------------------+
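-
-    For illustration only (the values below are examples, not recommendations),
-    the disk checker could be tuned in yarn-site.xml as follows:
-
----
-<property>
-  <name>yarn.nodemanager.disk-health-checker.enable</name>
-  <value>true</value>
-</property>
-<property>
-  <!-- run the disk checker every 60 seconds instead of the 2 minute default -->
-  <name>yarn.nodemanager.disk-health-checker.interval-ms</name>
-  <value>60000</value>
-</property>
-<property>
-  <!-- mark a disk as unhealthy once it is more than 90% full -->
-  <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
-  <value>90.0</value>
-</property>
----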
-
- ** External health script
-
-    Users may specify their own health checker script that will be invoked by the health checker service. Users may specify a timeout as well as options to be passed to the script. If the script exits with a non-zero exit code, times out, or results in an exception being thrown, the node is marked as unhealthy. Please note that if the script cannot be executed due to permissions, an incorrect path, etc., then it counts as a failure and the node will be reported as unhealthy. Also note that specifying a health check script is not mandatory. If no script is specified, only the disk checker status will be used to determine the health of the node. The following configuration parameters can be used to set the health script:
-
-*------------------+----------------+------------------+
-|| Configuration name || Allowed Values || Description |
-*------------------+----------------+------------------+
-| yarn.nodemanager.health-checker.interval-ms | Positive integer | The interval, in milliseconds, at which the health checker service runs; the default value is 10 minutes. |
-*------------------+----------------+------------------+
-| yarn.nodemanager.health-checker.script.timeout-ms | Positive integer | The timeout for the health script that's executed; the default value is 20 minutes. |
-*------------------+----------------+------------------+
-| yarn.nodemanager.health-checker.script.path | String | Absolute path to the health check script to be run. |
-*------------------+----------------+------------------+
-| yarn.nodemanager.health-checker.script.opts | String | Arguments to be passed to the script when the script is executed. |
-*------------------+----------------+------------------+
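-
-    The snippet below is a sketch only; the script path and arguments are
-    hypothetical and must point to an executable health check script that is
-    present on every node:
-
----
-<property>
-  <!-- hypothetical location; use the actual path on your nodes -->
-  <name>yarn.nodemanager.health-checker.script.path</name>
-  <value>/usr/local/bin/nm-health-check.sh</value>
-</property>
-<property>
-  <name>yarn.nodemanager.health-checker.script.opts</name>
-  <value>--check-network --check-dns</value>
-</property>
-<property>
-  <!-- time the script out after 5 minutes -->
-  <name>yarn.nodemanager.health-checker.script.timeout-ms</name>
-  <value>300000</value>
-</property>
----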
-

http://git-wip-us.apache.org/repos/asf/hadoop/blob/2e44b75f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm
deleted file mode 100644
index f228e3b..0000000
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerCgroups.apt.vm
+++ /dev/null
@@ -1,77 +0,0 @@
-~~ Licensed under the Apache License, Version 2.0 (the "License");
-~~ you may not use this file except in compliance with the License.
-~~ You may obtain a copy of the License at
-~~
-~~   http://www.apache.org/licenses/LICENSE-2.0
-~~
-~~ Unless required by applicable law or agreed to in writing, software
-~~ distributed under the License is distributed on an "AS IS" BASIS,
-~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-~~ See the License for the specific language governing permissions and
-~~ limitations under the License. See accompanying LICENSE file.
-
-  ---
-  Using CGroups with YARN
-  ---
-  ---
-  ${maven.build.timestamp}
-
-Using CGroups with YARN
-
-%{toc|section=1|fromDepth=0|toDepth=2}
-
- CGroups is a mechanism for aggregating/partitioning sets of tasks, and all their future children, into hierarchical groups with specialized behaviour. CGroups is a Linux kernel feature and was merged into kernel version 2.6.24. From a YARN perspective, this allows containers to be limited in their resource usage. A good example of this is CPU usage. Without CGroups, it becomes hard to limit container CPU usage. Currently, CGroups is only used for limiting CPU usage.
-
-* CGroups configuration
-
- The following settings are related to setting up CGroups; all of them need to be set in yarn-site.xml. A sample yarn-site.xml snippet tying these settings together is shown after the resource-limiting settings below.
-
-  [[1]] yarn.nodemanager.container-executor.class
-
-    This should be set to "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor". CGroups is a Linux kernel feature and is exposed via the LinuxContainerExecutor.
-
-  [[2]] yarn.nodemanager.linux-container-executor.resources-handler.class
-
-    This should be set to "org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler".Using the LinuxContainerExecutor doesn't force you to use CGroups. If you wish to use CGroups, the resource-handler-class must be set to CGroupsLCEResourceHandler.
-
-  [[3]] yarn.nodemanager.linux-container-executor.cgroups.hierarchy
-
-    The cgroups hierarchy under which to place YARN processes (the value cannot contain commas). If yarn.nodemanager.linux-container-executor.cgroups.mount is false (that is, if cgroups have been pre-configured), then this cgroups hierarchy must already exist.
-
-  [[4]] yarn.nodemanager.linux-container-executor.cgroups.mount
-
-    Whether the LCE should attempt to mount cgroups if they are not found. Can be true or false.
-
-  [[5]] yarn.nodemanager.linux-container-executor.cgroups.mount-path
-
-    Where the LCE should attempt to mount cgroups if not found. Common locations include /sys/fs/cgroup and /cgroup; the default location can vary depending on the Linux distribution in use. This path must exist before the NodeManager is launched. Only used when the LCE resources handler is set to the CgroupsLCEResourcesHandler, and yarn.nodemanager.linux-container-executor.cgroups.mount is true. A point to note here is that the container-executor binary will try to mount the path specified + "/" + the subsystem. In our case, since we are trying to limit CPU, the binary tries to mount the path specified + "/cpu" and that's the path it expects to exist.
-
-  [[6]] yarn.nodemanager.linux-container-executor.group
-
-    The Unix group of the NodeManager. It should match the setting in "container-executor.cfg". This configuration is required for validating the secure access of the container-executor binary.
-
- The following settings are related to limiting the resource usage of YARN containers:
-
-  [[1]] yarn.nodemanager.resource.percentage-physical-cpu-limit
-
-    This setting lets you limit the cpu usage of all YARN containers. It sets a hard upper limit on the cumulative CPU usage of the containers. For example, if set to 60, the combined CPU usage of all YARN containers will not exceed 60%.
-
-  [[2]] yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
-
-    CGroups allows cpu usage limits to be hard or soft. When this setting is true, containers cannot use more CPU usage than allocated even if spare CPU is available. This ensures that containers can only use CPU that they were allocated. When set to false, containers can use spare CPU if available. It should be noted that irrespective of whether set to true or false, at no time can the combined CPU usage of all containers exceed the value specified in "yarn.nodemanager.resource.percentage-physical-cpu-limit".
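-
- The following yarn-site.xml sketch ties the settings above together. The hierarchy name and the CPU percentage are illustrative values, and cgroups are assumed to be pre-mounted (so mounting by the LCE is disabled):
-
----
-<property>
-  <name>yarn.nodemanager.container-executor.class</name>
-  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
-</property>
-<property>
-  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
-  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
-</property>
-<property>
-  <!-- illustrative hierarchy name; it must already exist when mount is false -->
-  <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
-  <value>/hadoop-yarn</value>
-</property>
-<property>
-  <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
-  <value>false</value>
-</property>
-<property>
-  <!-- illustrative cap: all containers together may use at most 80% of the CPU -->
-  <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
-  <value>80</value>
-</property>
----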
-
-* CGroups and security
-
- CGroups itself has no requirements related to security. However, the LinuxContainerExecutor does have some requirements. If running in non-secure mode, by default, the LCE runs all jobs as user "nobody". This user can be changed by setting "yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user" to the desired user. However, it can also be configured to run jobs as the user submitting the job. In that case "yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users" should be set to false.
-
-*-----------+-----------+---------------------------+
-|| yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user || yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users || User running jobs |
-*-----------+-----------+---------------------------+
-| (default) | (default) | nobody                    |
-*-----------+-----------+---------------------------+
-| yarn      | (default) | yarn                      |
-*-----------+-----------+---------------------------+
-| yarn      | false     | (User submitting the job) |
-*-----------+-----------+---------------------------+
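-
- For example (a sketch only, matching the last row of the table above), to run jobs as the submitting user in non-secure mode:
-
----
-<property>
-  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user</name>
-  <value>yarn</value>
-</property>
-<property>
-  <!-- when false, jobs run as the user submitting the job (see table above) -->
-  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
-  <value>false</value>
-</property>
----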