Posted to common-commits@hadoop.apache.org by yh...@apache.org on 2009/05/05 10:46:10 UTC
svn commit: r771623 - in /hadoop/core/branches/branch-0.20: ./ conf/
src/docs/src/documentation/content/xdocs/
Author: yhemanth
Date: Tue May 5 08:46:09 2009
New Revision: 771623
URL: http://svn.apache.org/viewvc?rev=771623&view=rev
Log:
HADOOP-5736. Update the capacity scheduler documentation for features like memory based scheduling, job initialization and removal of pre-emption. Contributed by Sreekanth Ramakrishnan.
Modified:
hadoop/core/branches/branch-0.20/CHANGES.txt
hadoop/core/branches/branch-0.20/conf/capacity-scheduler.xml.template
hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml
hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/cluster_setup.xml
hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
Modified: hadoop/core/branches/branch-0.20/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.20/CHANGES.txt?rev=771623&r1=771622&r2=771623&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.20/CHANGES.txt (original)
+++ hadoop/core/branches/branch-0.20/CHANGES.txt Tue May 5 08:46:09 2009
@@ -13,6 +13,10 @@
HADOOP-5711. Change Namenode file close log to info. (szetszwo)
+ HADOOP-5736. Update the capacity scheduler documentation for features
+ like memory based scheduling, job initialization and removal of pre-emption.
+ (Sreekanth Ramakrishnan via yhemanth)
+
OPTIMIZATIONS
BUG FIXES
Modified: hadoop/core/branches/branch-0.20/conf/capacity-scheduler.xml.template
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.20/conf/capacity-scheduler.xml.template?rev=771623&r1=771622&r2=771623&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.20/conf/capacity-scheduler.xml.template (original)
+++ hadoop/core/branches/branch-0.20/conf/capacity-scheduler.xml.template Tue May 5 08:46:09 2009
@@ -60,16 +60,13 @@
<property>
<name>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</name>
<value>-1</value>
- <description>If mapred.task.maxpmem is set to -1, this configuration will
- be used to calculate job's physical memory requirements as a percentage of
- the job's virtual memory requirements set via mapred.task.maxvmem. This
- property thus provides default value of physical memory for job's that
- don't explicitly specify physical memory requirements.
-
- If not explicitly set to a valid value, scheduler will not consider
- physical memory for scheduling even if virtual memory based scheduling is
- enabled(by setting valid values for both mapred.task.default.maxvmem and
- mapred.task.limit.maxvmem).
+ <description>A percentage (float) of the default VMEM limit for jobs
+ (mapred.task.default.maxvmem). This is the default RAM task-limit
+ associated with a task. Unless overridden by a job's setting, this
+ number defines the RAM task-limit.
+
+ If this property is missing, or set to an invalid value, scheduling
+ based on physical memory, RAM, is disabled.
</description>
</property>
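To illustrate the template entry above, a hedged example of enabling RAM-based scheduling in conf/capacity-scheduler.xml might look like this. The value 33.3 is an assumption for illustration only; the shipped default of -1 leaves physical-memory scheduling disabled:

```xml
<!-- Illustrative only: treat each task's default RAM limit as
     one third of its default VMEM limit. -->
<property>
  <name>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</name>
  <value>33.3</value>
</property>
```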
Modified: hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml?rev=771623&r1=771622&r2=771623&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml (original)
+++ hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/capacity_scheduler.xml Tue May 5 08:46:09 2009
@@ -28,7 +28,9 @@
<section>
<title>Purpose</title>
- <p>This document describes the Capacity Scheduler, a pluggable Map/Reduce scheduler for Hadoop which provides a way to share large clusters.</p>
+ <p>This document describes the Capacity Scheduler, a pluggable
+ Map/Reduce scheduler for Hadoop which provides a way to share
+ large clusters.</p>
</section>
<section>
@@ -40,19 +42,17 @@
Support for multiple queues, where a job is submitted to a queue.
</li>
<li>
- Queues are guaranteed a fraction of the capacity of the grid (their
- 'guaranteed capacity') in the sense that a certain capacity of
- resources will be at their disposal. All jobs submitted to a
- queue will have access to the capacity guaranteed to the queue.
+ Queues are allocated a fraction of the capacity of the grid in the
+ sense that a certain capacity of resources will be at their
+ disposal. All jobs submitted to a queue will have access to the
+ capacity allocated to the queue.
</li>
<li>
- Free resources can be allocated to any queue beyond its guaranteed
- capacity. These excess allocated resources can be reclaimed and made
- available to another queue in order to meet its capacity guarantee.
- </li>
- <li>
- The scheduler guarantees that excess resources taken from a queue
- will be restored to it within N minutes of its need for them.
+ Free resources can be allocated to any queue beyond its capacity.
+ When there is demand for these resources from queues running below
+ capacity at a future point in time, as tasks scheduled on these
+ resources complete, they will be assigned to jobs on queues
+ running below their capacity.
</li>
<li>
Queues optionally support job priorities (disabled by default).
@@ -60,7 +60,9 @@
<li>
Within a queue, jobs with higher priority will have access to the
queue's resources before jobs with lower priority. However, once a
- job is running, it will not be preempted for a higher priority job.
+ job is running, it will not be preempted for a higher priority job,
+ though new tasks from the higher priority job will be
+ preferentially scheduled.
</li>
<li>
In order to prevent one or more users from monopolizing its
@@ -83,59 +85,34 @@
<p>Note that many of these steps can be, and will be, enhanced over time
to provide better algorithms.</p>
- <p>Whenever a TaskTracker is free, the Capacity Scheduler first picks a
- queue that needs to reclaim any resources the earliest (this is a queue
- whose resources were temporarily being used by some other queue and now
- needs access to those resources). If no such queue is found, it then picks
+ <p>Whenever a TaskTracker is free, the Capacity Scheduler picks
a queue which has the most free space (whose ratio of # of running slots to
- guaranteed capacity is the lowest).</p>
+ capacity is the lowest).</p>
- <p>Once a queue is selected, the scheduler picks a job in the queue. Jobs
+ <p>Once a queue is selected, the Scheduler picks a job in the queue. Jobs
are sorted based on when they're submitted and their priorities (if the
queue supports priorities). Jobs are considered in order, and a job is
selected if its user is within the user-quota for the queue, i.e., the
user is not already using queue resources above his/her limit. The
- scheduler also makes sure that there is enough free memory in the
+ Scheduler also makes sure that there is enough free memory in the
TaskTracker to run the job's task, in case the job has special memory
requirements.</p>
- <p>Once a job is selected, the scheduler picks a task to run. This logic
+ <p>Once a job is selected, the Scheduler picks a task to run. This logic
to pick a task remains unchanged from earlier versions.</p>
</section>
<section>
- <title>Reclaiming capacity</title>
-
- <p>Periodically, the scheduler determines:</p>
- <ul>
- <li>
- if a queue needs to reclaim capacity. This happens when a queue has
- at least one task pending and part of its guaranteed capacity is
- being used by some other queue. If this happens, the scheduler notes
- the amount of resources it needs to reclaim for this queue within a
- specified period of time (the reclaim time).
- </li>
- <li>
- if a queue has not received all the resources it needed to reclaim,
- and its reclaim time is about to expire. In this case, the scheduler
- needs to kill tasks from queues running over capacity. This it does
- by killing the tasks that started the latest.
- </li>
- </ul>
-
- </section>
-
- <section>
<title>Installation</title>
- <p>The capacity scheduler is available as a JAR file in the Hadoop
+ <p>The Capacity Scheduler is available as a JAR file in the Hadoop
tarball under the <em>contrib/capacity-scheduler</em> directory. The name of
the JAR file would be along the lines of hadoop-*-capacity-scheduler.jar.</p>
- <p>You can also build the scheduler from source by executing
+ <p>You can also build the Scheduler from source by executing
<em>ant package</em>, in which case it would be available under
<em>build/contrib/capacity-scheduler</em>.</p>
- <p>To run the capacity scheduler in your Hadoop installation, you need
+ <p>To run the Capacity Scheduler in your Hadoop installation, you need
to put it on the <em>CLASSPATH</em>. The easiest way is to copy the
<code>hadoop-*-capacity-scheduler.jar</code>
to <code>HADOOP_HOME/lib</code>. Alternatively, you can modify
@@ -147,9 +124,9 @@
<title>Configuration</title>
<section>
- <title>Using the capacity scheduler</title>
+ <title>Using the Capacity Scheduler</title>
<p>
- To make the Hadoop framework use the capacity scheduler, set up
+ To make the Hadoop framework use the Capacity Scheduler, set up
the following property in the site configuration:</p>
<table>
<tr>
@@ -167,7 +144,7 @@
<title>Setting up queues</title>
<p>
You can define multiple queues to which users can submit jobs with
- the capacity scheduler. To define multiple queues, you should edit
+ the Capacity Scheduler. To define multiple queues, you should edit
the site configuration for Hadoop and modify the
<em>mapred.queue.names</em> property.
</p>
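As a sketch of the site configuration described above, the queue list might look like the following (the queue name <em>research</em> is hypothetical; <em>default</em> is the stock queue):

```xml
<!-- Hypothetical example: two queues, "default" and "research". -->
<property>
  <name>mapred.queue.names</name>
  <value>default,research</value>
</property>
```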
@@ -185,8 +162,8 @@
<section>
<title>Configuring properties for queues</title>
- <p>The capacity scheduler can be configured with several properties
- for each queue that control the behavior of the scheduler. This
+ <p>The Capacity Scheduler can be configured with several properties
+ for each queue that control the behavior of the Scheduler. This
configuration is in the <em>conf/capacity-scheduler.xml</em>. By
default, the configuration is set up for one queue, named
<em>default</em>.</p>
@@ -194,10 +171,10 @@
configuration, you should use the property name as
<em>mapred.capacity-scheduler.queue.<queue-name>.<property-name></em>.
</p>
- <p>For example, to define the property <em>guaranteed-capacity</em>
+ <p>For example, to define the property <em>capacity</em>
for queue named <em>research</em>, you should specify the property
name as
- <em>mapred.capacity-scheduler.queue.research.guaranteed-capacity</em>.
+ <em>mapred.capacity-scheduler.queue.research.capacity</em>.
</p>
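Following that naming convention, a hedged example entry in conf/capacity-scheduler.xml could look like this (the <em>research</em> queue and the value 30 are illustrative assumptions, chosen so that queue capacities could still sum to at most 100 across all queues):

```xml
<!-- Illustrative only: give the hypothetical "research" queue
     30% of the cluster's slots. -->
<property>
  <name>mapred.capacity-scheduler.queue.research.capacity</name>
  <value>30</value>
</property>
```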
<p>The properties defined for queues and their descriptions are
@@ -205,15 +182,10 @@
<table>
<tr><th>Name</th><th>Description</th></tr>
- <tr><td>mapred.capacity-scheduler.queue.<queue-name>.guaranteed-capacity</td>
- <td>Percentage of the number of slots in the cluster that are
- guaranteed to be available for jobs in this queue.
- The sum of guaranteed capacities for all queues should be less
- than or equal 100.</td>
- </tr>
- <tr><td>mapred.capacity-scheduler.queue.<queue-name>.reclaim-time-limit</td>
- <td>The amount of time, in seconds, before which resources
- distributed to other queues will be reclaimed.</td>
+ <tr><td>mapred.capacity-scheduler.queue.<queue-name>.capacity</td>
+ <td>Percentage of the number of slots in the cluster that are made
+ to be available for jobs in this queue. The sum of capacities
+ for all queues should be less than or equal 100.</td>
</tr>
<tr><td>mapred.capacity-scheduler.queue.<queue-name>.supports-priority</td>
<td>If true, priorities of jobs will be taken into account in scheduling
@@ -236,27 +208,133 @@
</section>
<section>
- <title>Configuring the capacity scheduler</title>
- <p>The capacity scheduler's behavior can be controlled through the
- following properties.
+ <title>Memory management</title>
+
+ <p>The Capacity Scheduler supports scheduling of tasks on a
+ <code>TaskTracker</code>(TT) based on a job's memory requirements
+ and the availability of RAM and Virtual Memory (VMEM) on the TT node.
+ See the <a href="mapred_tutorial.html#Memory+monitoring">Hadoop
+ Map/Reduce tutorial</a> for details on how the TT monitors
+ memory usage.</p>
+ <p>Currently, memory-based scheduling is only supported
+ on the Linux platform.</p>
+ <p>Memory-based scheduling works as follows:</p>
+ <ol>
+ <li>If any one or more of the three config parameters
+ <code>mapred.tasktracker.vmem.reserved</code>,
+ <code>mapred.task.default.maxvmem</code>, or
+ <code>mapred.task.limit.maxvmem</code> is absent or set to -1,
+ memory-based scheduling is disabled, just as memory monitoring
+ for a TT is disabled. These
+ config parameters are described in the
+ <a href="mapred_tutorial.html#Memory+monitoring">Hadoop Map/Reduce
+ tutorial</a>. The value of
+ <code>mapred.tasktracker.vmem.reserved</code> is
+ obtained from the TT via its heartbeat.
+ </li>
+ <li>If all the three mandatory parameters are set, the Scheduler
+ enables VMEM-based scheduling. First, the Scheduler computes the free
+ VMEM on the TT. This is the difference between the available VMEM on the
+ TT (the node's total VMEM minus the offset, both of which are sent by
+ the TT on each heartbeat) and the sum of VMEM already allocated to
+ running tasks (i.e., sum of the VMEM task-limits). Next, the Scheduler
+ looks at the VMEM requirements for the job that's first in line to
+ run. If the job's VMEM requirements are less than the available VMEM on
+ the node, the job's task can be scheduled. If not, the Scheduler
+ ensures that the TT does not get a task to run (provided the job
+ has tasks to run). This way, the Scheduler ensures that jobs with
+ high memory requirements are not starved, as eventually, the TT
+ will have enough VMEM available. If the high-mem job does not have
+ any task to run, the Scheduler moves on to the next job.
+ </li>
+ <li>In addition to VMEM, the Capacity Scheduler can also consider
+ RAM on the TT node. RAM is considered the same way as VMEM. TTs report
+ the total RAM available on their node, and an offset. If both are
+ set, the Scheduler computes the available RAM on the node. Next,
+ the Scheduler figures out the RAM requirements of the job, if any.
+ As with VMEM, users can optionally specify a RAM limit for their job
+ (<code>mapred.task.maxpmem</code>, described in the Map/Reduce
+ tutorial). The Scheduler also maintains a limit for this value
+ (<code>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</code>,
+ described below). All these three values must be set for the
+ Scheduler to schedule tasks based on RAM constraints.
+ </li>
+ <li>The Scheduler ensures that jobs cannot ask for RAM or VMEM higher
+ than configured limits. If this happens, the job is failed when it
+ is submitted.
+ </li>
+ </ol>
+
+ <p>As described above, the additional scheduler-based config
+ parameters are as follows:</p>
+
+ <table>
+ <tr><th>Name</th><th>Description</th></tr>
+ <tr><td>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</td>
+ <td>A percentage of the default VMEM limit for jobs
+ (<code>mapred.task.default.maxvmem</code>). This is the default
+ RAM task-limit associated with a task. Unless overridden by a
+ job's setting, this number defines the RAM task-limit.</td>
+ </tr>
+ <tr><td>mapred.capacity-scheduler.task.limit.maxpmem</td>
+ <td>Configuration which provides an upper limit on the maximum
+ physical memory which can be specified by a job. If a job requires
+ more physical memory than this limit, the job
+ is rejected.</td>
+ </tr>
+ </table>
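To illustrate the two scheduler-level parameters above, a minimal capacity-scheduler.xml fragment could look like the following. The values are assumptions for illustration only (the shipped default of -1 for the percentage disables RAM-based scheduling), and the byte unit for the maxpmem limit is assumed by analogy with the other memory limits:

```xml
<!-- Illustrative values only, not recommended defaults. -->
<property>
  <name>mapred.capacity-scheduler.task.default-pmem-percentage-in-vmem</name>
  <!-- Default RAM task-limit = half of the default VMEM task-limit. -->
  <value>50.0</value>
</property>
<property>
  <name>mapred.capacity-scheduler.task.limit.maxpmem</name>
  <!-- Assumed to be in bytes: reject jobs asking for over 1 GB RAM per task. -->
  <value>1073741824</value>
</property>
```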
+ </section>
+ <section>
+ <title>Job Initialization Parameters</title>
+ <p>The Capacity Scheduler lazily initializes jobs before they are
+ scheduled, to reduce the memory footprint on the JobTracker. The
+ following parameters, which can be configured in
+ capacity-scheduler.xml, control the laziness of job
+ initialization.
+ </p>
+
<table>
+ <tr><th>Name</th><th>Description</th></tr>
+ <tr>
+ <td>
+ mapred.capacity-scheduler.queue.<queue-name>.maximum-initialized-jobs-per-user
+ </td>
+ <td>
Maximum number of jobs which are allowed to be pre-initialized for
a particular user in the queue. Once a job is scheduled, i.e.
it starts running, it is no longer counted when the Scheduler
computes the maximum number of jobs a user is allowed to
initialize.
+ </td>
+ </tr>
<tr>
- <th>Name</th><th>Description</th>
+ <td>
+ mapred.capacity-scheduler.init-poll-interval
+ </td>
+ <td>
The interval, in milliseconds, at which the Scheduler polls its
job queues looking for jobs to initialize.
+ </td>
</tr>
<tr>
- <td>mapred.capacity-scheduler.reclaimCapacity.interval</td>
- <td>The time interval, in seconds, between which the scheduler
- periodically determines whether capacity needs to be reclaimed for
- any queue. The default value is 5 seconds.
- </td>
+ <td>
+ mapred.capacity-scheduler.init-worker-threads
+ </td>
+ <td>
Number of worker threads used by the initialization poller to
initialize jobs in a set of queues. If this number equals the
number of job queues, each thread is assigned jobs from one queue.
If it is less than the number of queues, a thread can get jobs from
more than one queue, which it initializes in a round-robin fashion.
If it is greater than the number of queues, the number of threads
spawned equals the number of job queues.
+ </td>
</tr>
</table>
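Putting the three job-initialization parameters above together, a sketch of a capacity-scheduler.xml fragment could look like this (the <em>research</em> queue name and all values are illustrative assumptions, not shipped defaults):

```xml
<!-- Illustrative values only. -->
<property>
  <name>mapred.capacity-scheduler.queue.research.maximum-initialized-jobs-per-user</name>
  <!-- At most 2 pre-initialized (not yet running) jobs per user. -->
  <value>2</value>
</property>
<property>
  <name>mapred.capacity-scheduler.init-poll-interval</name>
  <!-- Poll the job queues for uninitialized jobs every 5 seconds. -->
  <value>5000</value>
</property>
<property>
  <name>mapred.capacity-scheduler.init-worker-threads</name>
  <value>5</value>
</property>
```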
-
- </section>
-
+ </section>
<section>
- <title>Reviewing the configuration of the capacity scheduler</title>
+ <title>Reviewing the configuration of the Capacity Scheduler</title>
<p>
Once the installation and configuration is completed, you can review
it after starting the Map/Reduce cluster from the admin UI.
@@ -270,7 +348,8 @@
Information</em> column against each queue.</li>
</ul>
</section>
- </section>
+
+ </section>
</body>
</document>
Modified: hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/cluster_setup.xml
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/cluster_setup.xml?rev=771623&r1=771622&r2=771623&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/cluster_setup.xml (original)
+++ hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/cluster_setup.xml Tue May 5 08:46:09 2009
@@ -463,6 +463,120 @@
</section>
</section>
+ <section>
+ <title> Memory monitoring</title>
+ <p>A <code>TaskTracker</code>(TT) can be configured to monitor memory
+ usage of tasks it spawns, so that badly-behaved jobs do not bring
+ down a machine due to excess memory consumption. With monitoring
+ enabled, every task is assigned a task-limit for virtual memory (VMEM).
+ In addition, every node is assigned a node-limit for VMEM usage.
+ A TT ensures that a task is killed if it, and
+ its descendants, use VMEM over the task's per-task limit. It also
+ ensures that one or more tasks are killed if the sum total of VMEM
+ usage by all tasks, and their descendants, crosses the node-limit.</p>
+
+ <p>Users can, optionally, specify the VMEM task-limit per job. If no
+ such limit is provided, a default limit is used. A node-limit can be
+ set per node.</p>
+ <p>Currently, memory monitoring and management is only supported
+ on the Linux platform.</p>
+ <p>To enable monitoring for a TT, the
+ following parameters all need to be set:</p>
+
+ <table>
+ <tr><th>Name</th><th>Type</th><th>Description</th></tr>
+ <tr><td>mapred.tasktracker.vmem.reserved</td><td>long</td>
+ <td>A number, in bytes, that represents an offset. The total VMEM on
+ the machine, minus this offset, is the VMEM node-limit for all
+ tasks, and their descendants, spawned by the TT.
+ </td></tr>
+ <tr><td>mapred.task.default.maxvmem</td><td>long</td>
+ <td>A number, in bytes, that represents the default VMEM task-limit
+ associated with a task. Unless overridden by a job's setting,
+ this number defines the VMEM task-limit.
+ </td></tr>
+ <tr><td>mapred.task.limit.maxvmem</td><td>long</td>
+ <td>A number, in bytes, that represents the upper VMEM task-limit
+ associated with a task. Users, when specifying a VMEM task-limit
+ for their tasks, should not specify a limit which exceeds this amount.
+ </td></tr>
+ </table>
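As a concrete sketch, the three mandatory monitoring parameters from the table above might be set in the site configuration as follows. All values are assumptions for illustration, not recommended defaults:

```xml
<!-- Illustrative values only. Setting all three enables TT memory
     monitoring (and allows VMEM-based scheduling). -->
<property>
  <name>mapred.tasktracker.vmem.reserved</name>
  <!-- Offset: reserve 1 GB of the node's VMEM for non-task use. -->
  <value>1073741824</value>
</property>
<property>
  <name>mapred.task.default.maxvmem</name>
  <!-- Default per-task VMEM limit: 512 MB. -->
  <value>536870912</value>
</property>
<property>
  <name>mapred.task.limit.maxvmem</name>
  <!-- Upper per-task VMEM limit: 2 GB. -->
  <value>2147483648</value>
</property>
```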
+
+ <p>In addition, the following parameters can also be configured.</p>
+
+ <table>
+ <tr><th>Name</th><th>Type</th><th>Description</th></tr>
+ <tr><td>mapred.tasktracker.taskmemorymanager.monitoring-interval</td>
+ <td>long</td>
+ <td>The time interval, in milliseconds, between which the TT
+ checks for any memory violation. The default value is 5000 msec
+ (5 seconds).
+ </td></tr>
+ </table>
+
+ <p>Here's how the memory monitoring works for a TT.</p>
+ <ol>
+ <li>If one or more of the configuration parameters described
+ above are missing or set to -1, memory monitoring is
+ disabled for the TT.
+ </li>
+ <li>In addition, monitoring is disabled if
+ <code>mapred.task.default.maxvmem</code> is greater than
+ <code>mapred.task.limit.maxvmem</code>.
+ </li>
+ <li>If a TT receives a task whose task-limit is set by the user
+ to a value larger than <code>mapred.task.limit.maxvmem</code>, it
+ logs a warning but executes the task.
+ </li>
+ <li>Periodically, the TT checks the following:
+ <ul>
+ <li>If any task's current VMEM usage is greater than that task's
+ VMEM task-limit, the task is killed and the reason for killing
+ the task is logged in the task diagnostics. Such a task is considered
+ failed, i.e., the killing counts towards the task's failure count.
+ </li>
+ <li>If the sum total of VMEM used by all tasks and descendants is
+ greater than the node-limit, the TT kills enough tasks, in the
+ order of least progress made, until the overall VMEM usage falls
+ below the node-limit. Such killed tasks are not considered failed
+ and their killing does not count towards the tasks' failure counts.
+ </li>
+ </ul>
+ </li>
+ </ol>
+
+ <p>Schedulers can choose to ease the monitoring pressure on the TT by
+ preventing too many tasks from running on a node and by scheduling
+ tasks only if the TT has enough VMEM free. In addition, Schedulers may
+ choose to consider the physical memory (RAM) available on the node
+ as well. To enable Scheduler support, TTs report their memory settings
+ to the JobTracker in every heartbeat. Before getting into details,
+ consider the following additional memory-related parameters that can be
+ configured to enable better scheduling:</p>
+
+ <table>
+ <tr><th>Name</th><th>Type</th><th>Description</th></tr>
+ <tr><td>mapred.tasktracker.pmem.reserved</td><td>int</td>
+ <td>A number, in bytes, that represents an offset. The total
+ physical memory (RAM) on the machine, minus this offset, is the
+ recommended RAM node-limit. The RAM node-limit is a hint to a
+ Scheduler to schedule only so many tasks that the sum
+ total of their RAM requirements does not exceed this limit.
+ RAM usage is not monitored by a TT.
+ </td></tr>
+ </table>
+
+ <p>A TT reports the following memory-related numbers in every
+ heartbeat:</p>
+ <ul>
+ <li>The total VMEM available on the node.</li>
+ <li>The value of <code>mapred.tasktracker.vmem.reserved</code>,
+ if set.</li>
+ <li>The total RAM available on the node.</li>
+ <li>The value of <code>mapred.tasktracker.pmem.reserved</code>,
+ if set.</li>
+ </ul>
+ </section>
<section>
<title>Slaves</title>
Modified: hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml
URL: http://svn.apache.org/viewvc/hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml?rev=771623&r1=771622&r2=771623&view=diff
==============================================================================
--- hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml (original)
+++ hadoop/core/branches/branch-0.20/src/docs/src/documentation/content/xdocs/mapred_tutorial.xml Tue May 5 08:46:09 2009
@@ -1104,8 +1104,26 @@
counters for a job- particularly relative to byte counts from the map
and into the reduce- is invaluable to the tuning of these
parameters.</p>
+
+ <p>Users can choose to override the default limits of Virtual Memory
+ and RAM enforced by the TaskTracker, if memory management is enabled.
+ Users can set the following parameters per job:</p>
+
+ <table>
+ <tr><th>Name</th><th>Type</th><th>Description</th></tr>
+ <tr><td><code>mapred.task.maxvmem</code></td><td>int</td>
+ <td>A number, in bytes, that represents the maximum Virtual Memory
+ task-limit for each task of the job. A task will be killed if
+ it consumes more Virtual Memory than this number.
+ </td></tr>
+ <tr><td>mapred.task.maxpmem</td><td>int</td>
+ <td>A number, in bytes, that represents the maximum RAM task-limit
+ for each task of the job. This number can be optionally used by
+ Schedulers to prevent over-scheduling of tasks on a node based
+ on RAM needs.
+ </td></tr>
+ </table>
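A sketch of how a job might override both limits in its job configuration (the values are illustrative assumptions, and must stay within the cluster's configured upper limits such as mapred.task.limit.maxvmem):

```xml
<!-- Illustrative per-job overrides, in bytes. -->
<property>
  <name>mapred.task.maxvmem</name>
  <!-- Each task of this job is killed if it exceeds 1 GB of VMEM. -->
  <value>1073741824</value>
</property>
<property>
  <name>mapred.task.maxpmem</name>
  <!-- Hint to Schedulers: each task needs about 512 MB of RAM. -->
  <value>536870912</value>
</property>
```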
</section>
-
<section>
<title>Map Parameters</title>