Posted to mapreduce-commits@hadoop.apache.org by vi...@apache.org on 2010/11/18 12:12:48 UTC
svn commit: r1036409 - in /hadoop/mapreduce/trunk: CHANGES.txt
src/docs/src/documentation/content/xdocs/gridmix.xml
Author: vinodkv
Date: Thu Nov 18 11:12:48 2010
New Revision: 1036409
URL: http://svn.apache.org/viewvc?rev=1036409&view=rev
Log:
MAPREDUCE-1931. Gridmix forrest documentation. Contributed by Ranjit Mathew.
Modified:
hadoop/mapreduce/trunk/CHANGES.txt
hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/gridmix.xml
Modified: hadoop/mapreduce/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/CHANGES.txt?rev=1036409&r1=1036408&r2=1036409&view=diff
==============================================================================
--- hadoop/mapreduce/trunk/CHANGES.txt (original)
+++ hadoop/mapreduce/trunk/CHANGES.txt Thu Nov 18 11:12:48 2010
@@ -186,6 +186,8 @@ Release 0.22.0 - Unreleased
MAPREDUCE-2167. Faster directory traversal for raid node. (Ramkumar Vadali
via schen)
+ MAPREDUCE-1931. Gridmix forrest documentation. (Ranjit Mathew via vinodkv).
+
OPTIMIZATIONS
MAPREDUCE-1354. Enhancements to JobTracker for better performance and
Modified: hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/gridmix.xml
URL: http://svn.apache.org/viewvc/hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/gridmix.xml?rev=1036409&r1=1036408&r2=1036409&view=diff
==============================================================================
--- hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/gridmix.xml (original)
+++ hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/gridmix.xml Thu Nov 18 11:12:48 2010
@@ -15,150 +15,542 @@
See the License for the specific language governing permissions and
limitations under the License.
-->
-
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
-
<document>
-
-<header>
- <title>Gridmix</title>
-</header>
-
-<body>
-
- <section>
- <title>Overview</title>
-
- <p>Gridmix is a benchmark for live clusters. It submits a mix of synthetic
- jobs, modeling a profile mined from production loads.</p>
-
- <p>There exist three versions of the Gridmix tool. This document discusses
- the third (checked into contrib), distinct from the two checked into the
- benchmarks subdirectory. While the first two versions of the tool included
- stripped-down versions of common jobs, both were principally saturation
- tools for stressing the framework at scale. In support of a broader range of
- deployments and finer-tuned job mixes, this version of the tool will attempt
- to model the resource profiles of production jobs to identify bottlenecks,
- guide development, and serve as a replacement for the existing gridmix
- benchmarks.</p>
-
- </section>
-
- <section id="usage">
-
- <title>Usage</title>
-
- <p>To run Gridmix, one requires a job trace describing the job mix for a
- given cluster. Such traces are typically genenerated by Rumen (see related
- documentation). Gridmix also requires input data from which the synthetic
- jobs will draw bytes. The input data need not be in any particular format,
- as the synthetic jobs are currently binary readers. If one is running on a
- new cluster, an optional step generating input data may precede the run.</p>
-
- <p>Basic command line usage:</p>
-<source>
-
-bin/mapred org.apache.hadoop.mapred.gridmix.Gridmix [-generate <MiB>] <iopath> <trace>
-</source>
-
- <p>The <code>-generate</code> parameter accepts standard units, e.g.
- <code>100g</code> will generate 100 * 2<sup>30</sup> bytes. The
- <iopath> parameter is the destination directory for generated and/or
- the directory from which input data will be read. The <trace>
- parameter is a path to a job trace. The following configuration parameters
- are also accepted in the standard idiom, before other Gridmix
- parameters.</p>
-
- <section>
- <title>Configuration parameters</title>
- <p></p>
- <table>
- <tr><th> Parameter </th><th> Description </th><th> Notes </th></tr>
- <tr><td><code>gridmix.output.directory</code></td>
- <td>The directory into which output will be written. If specified, the
- <code>iopath</code> will be relative to this parameter.</td>
- <td>The submitting user must have read/write access to this
- directory. The user should also be mindful of any quota issues that
- may arise during a run.</td></tr>
- <tr><td><code>gridmix.client.submit.threads</code></td>
- <td>The number of threads submitting jobs to the cluster. This also
- controls how many splits will be loaded into memory at a given time,
- pending the submit time in the trace.</td>
- <td>Splits are pregenerated to hit submission deadlines, so
- particularly dense traces may want more submitting threads. However,
- storing splits in memory is reasonably expensive, so one should raise
- this cautiously.</td></tr>
- <tr><td><code>gridmix.client.pending.queue.depth</code></td>
- <td>The depth of the queue of job descriptions awaiting split
- generation.</td>
- <td>The jobs read from the trace occupy a queue of this depth before
- being processed by the submission threads. It is unusual to configure
- this.</td></tr>
- <tr><td><code>gridmix.min.key.length</code></td>
- <td>The key size for jobs submitted to the cluster.</td>
- <td>While this is clearly a job-specific, even task-specific property,
- no data on key length is currently available. Since the intermediate
- data are random, memcomparable data, not even the sort is likely
- affected. It exists as a tunable as no default value is appropriate,
- but future versions will likely replace it with trace data.</td></tr>
- </table>
-
- </section>
-</section>
-
-<section id="assumptions">
-
- <title>Simplifying Assumptions</title>
-
- <p>Gridmix will be developed in stages, incorporating feedback and patches
- from the community. Currently, its intent is to evaluate Map/Reduce and HDFS
- performance and not the layers on top of them (i.e. the extensive lib and
- subproject space). Given these two limitations, the following
- characteristics of job load are not currently captured in job traces and
- cannot be accurately reproduced in Gridmix.</p>
-
- <table>
- <tr><th>Property</th><th>Notes</th></tr>
- <tr><td>CPU usage</td><td>We have no data for per-task CPU usage, so we
- cannot attempt even an approximation. Gridmix tasks are never CPU bound
- independent of I/O, though this surely happens in practice.</td></tr>
- <tr><td>Filesystem properties</td><td>No attempt is made to match block
- sizes, namespace hierarchies, or any property of input, intermediate, or
- output data other than the bytes/records consumed and emitted from a given
- task. This implies that some of the most heavily used parts of the system-
- the compression libraries, text processing, streaming, etc.- cannot be
- meaningfully tested with the current implementation.</td></tr>
- <tr><td>I/O rates</td><td>The rate at which records are consumed/emitted is
- assumed to be limited only by the speed of the reader/writer and constant
- throughout the task.</td></tr>
- <tr><td>Memory profile</td><td>No data on tasks' memory usage over time is
- available, though the max heap size is retained.</td></tr>
- <tr><td>Skew</td><td>The records consumed and emitted to/from a given task
- are assumed to follow observed averages, i.e. records will be more regular
- than may be seen in the wild. Each map also generates a proportional
- percentage of data for each reduce, so a job with unbalanced input will be
- flattened.</td></tr>
- <tr><td>Job failure</td><td>User code is assumed to be correct.</td></tr>
- <tr><td>Job independence</td><td>The output or outcome of one job does not
- affect when or whether a subsequent job will run.</td></tr>
- </table>
-
-</section>
-
-<section>
-
- <title>Appendix</title>
-
- <p>Issues tracking the implementations of <a
- href="https://issues.apache.org/jira/browse/HADOOP-2369">gridmix1</a>, <a
- href="https://issues.apache.org/jira/browse/HADOOP-3770">gridmix2</a>, and
- <a href="https://issues.apache.org/jira/browse/MAPREDUCE-776">gridmix3</a>.
- Other issues tracking the development of Gridmix can be found by searching
- the Map/Reduce <a
- href="https://issues.apache.org/jira/browse/MAPREDUCE">JIRA</a></p>
-
-</section>
-
-</body>
-
+ <header>
+ <title>GridMix</title>
+ </header>
+ <body>
+ <section id="overview">
+ <title>Overview</title>
+ <p>GridMix is a benchmark for Hadoop clusters. It submits a mix of
+ synthetic jobs, modeling a profile mined from production loads.</p>
+ <p>There exist three versions of the GridMix tool. This document
+ discusses the third (checked into <code>src/contrib</code>), distinct
+ from the two checked into the <code>src/benchmarks</code> sub-directory.
+ While the first two versions of the tool included stripped-down versions
+ of common jobs, both were principally saturation tools for stressing the
+ framework at scale. In support of a broader range of deployments and
+ finer-tuned job mixes, this version of the tool will attempt to model
+ the resource profiles of production jobs to identify bottlenecks, guide
+ development, and serve as a replacement for the existing GridMix
+ benchmarks.</p>
+ <p>To run GridMix, you need a MapReduce job trace describing the job mix
+ for a given cluster. Such traces are typically generated by Rumen (see
+ Rumen documentation). GridMix also requires input data from which the
+ synthetic jobs will be reading bytes. The input data need not be in any
+ particular format, as the synthetic jobs are currently binary readers.
+ If you are running on a new cluster, an optional step generating input
+ data may precede the run.</p>
+ <p>In order to emulate the load of production jobs from a given cluster
+ on the same or another cluster, follow these steps:</p>
+ <ol>
+ <li>Locate the job history files on the production cluster. This
+ location is specified by the
+ <code>mapreduce.jobtracker.jobhistory.completed.location</code>
+ configuration property of the cluster.</li>
+ <li>Run Rumen to build a job trace in JSON format for all or select
+ jobs.</li>
+ <li><em>(Optional)</em> Use Rumen to fold this job trace to scale
+ the load.</li>
+ <li>Use GridMix with the job trace on the benchmark cluster.</li>
+ </ol>
+ <p>Jobs submitted by GridMix have names of the form
+ "<code>GRIDMIXnnnnn</code>", where
+ "<code>nnnnn</code>" is a sequence number padded with leading
+ zeroes.</p>
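As a quick illustration of the naming scheme, assuming a five-digit zero-padded sequence number (the starting index used here is an assumption for illustration, not stated above):

```python
# Sketch of the GridMix job-naming scheme described above.
# Assumes a five-digit, zero-padded sequence number; the starting
# index (here 0) is an assumption for illustration.
def gridmix_job_name(seq: int) -> str:
    return "GRIDMIX%05d" % seq

# gridmix_job_name(0)  -> "GRIDMIX00000"
# gridmix_job_name(42) -> "GRIDMIX00042"
```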
+ </section>
+ <section id="usage">
+ <title>Usage</title>
+ <p>Basic command-line usage without configuration parameters:</p>
+ <source>
+org.apache.hadoop.mapred.gridmix.Gridmix [-generate <size>] [-users <users-list>] <iopath> <trace>
+ </source>
+ <p>Basic command-line usage with configuration parameters:</p>
+ <source>
+org.apache.hadoop.mapred.gridmix.Gridmix \
+ -Dgridmix.client.submit.threads=10 -Dgridmix.output.directory=foo \
+ [-generate <size>] [-users <users-list>] <iopath> <trace>
+ </source>
+ <note>
+ Configuration parameters like
+ <code>-Dgridmix.client.submit.threads=10</code> and
+ <code>-Dgridmix.output.directory=foo</code> as given above should
+ be used <em>before</em> other GridMix parameters.
+ </note>
+ <p>The <code>-generate</code> option is used to generate input data
+ for the synthetic jobs. It accepts size suffixes, e.g.
+ <code>100g</code> will generate 100 * 2<sup>30</sup> bytes.</p>
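The suffix convention can be sketched as follows; this is an illustrative reconstruction of the binary-unit semantics above, and the exact set of suffixes GridMix accepts is an assumption:

```python
# Illustrative sketch of the size-suffix convention ("100g" means
# 100 * 2**30 bytes). The set of accepted suffixes is an assumption.
_SUFFIXES = {"k": 2**10, "m": 2**20, "g": 2**30, "t": 2**40}

def parse_size(spec: str) -> int:
    spec = spec.strip().lower()
    if spec and spec[-1] in _SUFFIXES:
        return int(spec[:-1]) * _SUFFIXES[spec[-1]]
    return int(spec)

# parse_size("100g") -> 107374182400
```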
+ <p>The <code>-users</code> option is used to point to a users-list
+ file (see <a href="#usersqueues">Emulating Users and Queues</a>).</p>
+ <p>The <code><iopath></code> parameter is the destination
+ directory for generated output and/or the directory from which input
+ data will be read. Note that this can either be on the local file-system
+ or on HDFS, but it is highly recommended that it be the same as that for
+ the original job mix so that GridMix puts the same load on the local
+ file-system and HDFS respectively.</p>
+ <p>The <code><trace></code> parameter is a path to a job trace
+ generated by Rumen. This trace can be compressed (it must be readable
+ using one of the compression codecs supported by the cluster) or
+ uncompressed. Use "-" as the value of this parameter if you
+ want to pass an <em>uncompressed</em> trace via the standard
+ input-stream of GridMix.</p>
+ <p>The class <code>org.apache.hadoop.mapred.gridmix.Gridmix</code> can
+ be found in the JAR
+ <code>contrib/gridmix/hadoop-$VERSION-gridmix.jar</code> inside your
+ Hadoop installation, where <code>$VERSION</code> corresponds to the
+ version of Hadoop installed. A simple way of ensuring that this class
+ and all its dependencies are loaded correctly is to use the
+ <code>hadoop</code> wrapper script in Hadoop:</p>
+ <source>
+hadoop jar <gridmix-jar> org.apache.hadoop.mapred.gridmix.Gridmix \
+ [-generate <size>] [-users <users-list>] <iopath> <trace>
+ </source>
+ <p>The supported configuration parameters are explained in the
+ following sections.</p>
+ </section>
+ <section id="cfgparams">
+ <title>General Configuration Parameters</title>
+ <p/>
+ <table>
+ <tr>
+ <th>Parameter</th>
+ <th>Description</th>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.output.directory</code>
+ </td>
+ <td>The directory into which output will be written. If specified,
+ <code>iopath</code> will be relative to this parameter. The
+ submitting user must have read/write access to this directory. The
+ user should also be mindful of any quota issues that may arise
+ during a run. The default is "<code>gridmix</code>".</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.client.submit.threads</code>
+ </td>
+ <td>The number of threads submitting jobs to the cluster. This
+ also controls how many splits will be loaded into memory at a given
+ time, pending the submit time in the trace. Splits are pre-generated
+ to hit submission deadlines, so particularly dense traces may want
+ more submitting threads. However, storing splits in memory is
+ reasonably expensive, so you should raise this cautiously. The
+ default is 1 for the SERIAL job-submission policy (see
+ <a href="#policies">Job Submission Policies</a>) and one more than
+ the number of processors on the client machine for the other
+ policies.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.submit.multiplier</code>
+ </td>
+ <td>The multiplier to accelerate or decelerate the submission of
+ jobs. The time separating two jobs is multiplied by this factor.
+ The default value is 1.0. This is a crude mechanism to size
+ a job trace to a cluster.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.client.pending.queue.depth</code>
+ </td>
+ <td>The depth of the queue of job descriptions awaiting split
+ generation. The jobs read from the trace occupy a queue of this
+ depth before being processed by the submission threads. It is
+ unusual to configure this. The default is 5.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.gen.blocksize</code>
+ </td>
+ <td>The block-size of generated data. The default value is 256
+ MiB.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.gen.bytes.per.file</code>
+ </td>
+ <td>The maximum bytes written per file. The default value is 1
+ GiB.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.min.file.size</code>
+ </td>
+ <td>The minimum size of the input files. The default limit is 128
+ MiB. Tweak this parameter if you see an error-message like
+ "Found no satisfactory file" while testing GridMix with
+ a relatively-small input data-set.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.max.total.scan</code>
+ </td>
+ <td>The maximum size of the input files. The default limit is 100
+ TiB.</td>
+ </tr>
+ </table>
+ </section>
+ <section id="jobtypes">
+ <title>Job Types</title>
+ <p>GridMix takes as input a job trace, essentially a stream of
+ JSON-encoded job descriptions. For each job description, the submission
+ client obtains the original job submission time and for each task in
+ that job, the byte and record counts read and written. Given this data,
+ it constructs a synthetic job with the same byte and record patterns as
+ recorded in the trace. It constructs jobs of two types:</p>
+ <table>
+ <tr>
+ <th>Job Type</th>
+ <th>Description</th>
+ </tr>
+ <tr>
+ <td>
+ <code>LOADJOB</code>
+ </td>
+        <td>A synthetic job that emulates the workload recorded in the
+        Rumen trace. The current version emulates only the I/O workload,
+        reproducing it on the benchmark cluster by embedding the detailed
+        I/O information for every map and reduce task, such as the number
+        of bytes and records read and written, into each job's input
+        splits. The map tasks further relay the I/O patterns of reduce
+        tasks through the intermediate map output data.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>SLEEPJOB</code>
+ </td>
+ <td>A synthetic job where each task does <em>nothing</em> but sleep
+ for a certain duration as observed in the production trace. The
+ scalability of the Job Tracker is often limited by how many
+ heartbeats it can handle every second. (Heartbeats are periodic
+ messages sent from Task Trackers to update their status and grab new
+        tasks from the Job Tracker.) Since a benchmark cluster is typically
+        a fraction of the size of a production cluster, the heartbeat traffic
+        generated by its slave nodes is well below the level of the
+ production cluster. One possible solution is to run multiple Task
+ Trackers on each slave node. This leads to the obvious problem that
+ the I/O workload generated by the synthetic jobs would thrash the
+ slave nodes. Hence the need for such a job.</td>
+ </tr>
+ </table>
+ <p>The following configuration parameters affect the job type:</p>
+ <table>
+ <tr>
+ <th>Parameter</th>
+ <th>Description</th>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.job.type</code>
+ </td>
+ <td>The value for this key can be one of LOADJOB or SLEEPJOB. The
+ default value is LOADJOB.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.key.fraction</code>
+ </td>
+ <td>For a LOADJOB type of job, the fraction of a record used for
+ the data for the key. The default value is 0.1.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.sleep.maptask-only</code>
+ </td>
+ <td>For a SLEEPJOB type of job, whether to ignore the reduce
+ tasks for the job. The default is <code>false</code>.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.sleep.fake-locations</code>
+ </td>
+ <td>For a SLEEPJOB type of job, the number of fake locations
+ for map tasks for the job. The default is 0.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.sleep.max-map-time</code>
+ </td>
+ <td>For a SLEEPJOB type of job, the maximum runtime for map
+ tasks for the job in milliseconds. The default is unlimited.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.sleep.max-reduce-time</code>
+ </td>
+ <td>For a SLEEPJOB type of job, the maximum runtime for reduce
+ tasks for the job in milliseconds. The default is unlimited.</td>
+ </tr>
+ </table>
+ </section>
+ <section id="policies">
+ <title>Job Submission Policies</title>
+      <p>GridMix controls the rate of job submission. This control can be
+      based on the trace information or on statistics gathered from the
+      Job Tracker. Depending on the submission policy the user selects,
+      GridMix uses the corresponding algorithm to control job submission.
+      There are currently three types of policies:</p>
+ <table>
+ <tr>
+ <th>Job Submission Policy</th>
+ <th>Description</th>
+ </tr>
+ <tr>
+ <td>
+ <code>STRESS</code>
+ </td>
+ <td>Keep submitting jobs so that the cluster remains under stress.
+ In this mode we control the rate of job submission by monitoring
+ the real-time load of the cluster so that we can maintain a stable
+        stress level of workload on the cluster. Based on the statistics we
+        gather, we determine whether a cluster is <em>underloaded</em> or
+        <em>overloaded</em>. We consider a cluster <em>underloaded</em> if
+ and only if the following three conditions are true:
+ <ol>
+ <li>the number of pending and running jobs are under a threshold
+ TJ</li>
+ <li>the number of pending and running maps are under threshold
+ TM</li>
+ <li>the number of pending and running reduces are under threshold
+ TR</li>
+ </ol>
+        The thresholds TJ, TM and TR are proportional to the size of the
+        cluster and to the map and reduce slot capacities respectively.
+        When the cluster is <em>overloaded</em>, job submission is
+        throttled. In the actual calculation we also weigh each running
+        task by its remaining work - that is, a task that is 90% complete
+        is counted as only 0.1 of a task. Finally, to avoid a very large
+        job blocking other jobs, we limit the number of pending/waiting
+        tasks each job can contribute.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>REPLAY</code>
+ </td>
+ <td>In this mode we replay the job traces faithfully. This mode
+ exactly follows the time-intervals given in the actual job
+ trace.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>SERIAL</code>
+ </td>
+ <td>In this mode we submit the next job only once the job submitted
+ earlier is completed.</td>
+ </tr>
+ </table>
+ <p>The following configuration parameters affect the job submission
+ policy:</p>
+ <table>
+ <tr>
+ <th>Parameter</th>
+ <th>Description</th>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.job-submission.policy</code>
+ </td>
+        <td>The value for this key can be one of the three: STRESS, REPLAY
+        or SERIAL. In most cases the value will be STRESS or REPLAY. The
+        default value is STRESS.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.throttle.jobs-to-tracker-ratio</code>
+ </td>
+ <td>In STRESS mode, the minimum ratio of running jobs to Task
+ Trackers in a cluster for the cluster to be considered
+ <em>overloaded</em>. This is the threshold TJ referred to earlier.
+ The default is 1.0.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.throttle.maps.task-to-slot-ratio</code>
+ </td>
+ <td>In STRESS mode, the minimum ratio of pending and running map
+ tasks (i.e. incomplete map tasks) to the number of map slots for
+ a cluster for the cluster to be considered <em>overloaded</em>.
+ This is the threshold TM referred to earlier. Running map tasks are
+ counted partially. For example, a 40% complete map task is counted
+ as 0.6 map tasks. The default is 2.0.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.throttle.reduces.task-to-slot-ratio</code>
+ </td>
+ <td>In STRESS mode, the minimum ratio of pending and running reduce
+ tasks (i.e. incomplete reduce tasks) to the number of reduce slots
+ for a cluster for the cluster to be considered <em>overloaded</em>.
+ This is the threshold TR referred to earlier. Running reduce tasks
+ are counted partially. For example, a 30% complete reduce task is
+ counted as 0.7 reduce tasks. The default is 2.5.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.throttle.maps.max-slot-share-per-job</code>
+ </td>
+ <td>In STRESS mode, the maximum share of a cluster's map-slots
+ capacity that can be counted toward a job's incomplete map tasks in
+ overload calculation. The default is 0.1.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.throttle.reducess.max-slot-share-per-job</code>
+ </td>
+ <td>In STRESS mode, the maximum share of a cluster's reduce-slots
+ capacity that can be counted toward a job's incomplete reduce tasks
+ in overload calculation. The default is 0.1.</td>
+ </tr>
+ </table>
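Putting the thresholds together, the underload test can be sketched as below. This is an illustrative reconstruction, not the actual GridMix code; the function and parameter names are assumptions, and the per-job slot-share caps are omitted for brevity.

```python
# Illustrative sketch of the STRESS-mode underload test described above.
# Names are assumptions; the real implementation reads live Job Tracker
# statistics, and the per-job slot-share caps are omitted here.

def weighted_incomplete(task_progress):
    """Weigh each task by its remaining work: a task that is 90%
    complete counts as only 0.1 of a task; pending tasks count as 1.0."""
    return sum(1.0 - p for p in task_progress)

def is_underloaded(num_jobs, map_progress, reduce_progress,
                   num_trackers, map_slots, reduce_slots,
                   jobs_ratio=1.0,      # gridmix.throttle.jobs-to-tracker-ratio
                   maps_ratio=2.0,      # gridmix.throttle.maps.task-to-slot-ratio
                   reduces_ratio=2.5):  # gridmix.throttle.reduces.task-to-slot-ratio
    tj = jobs_ratio * num_trackers      # threshold TJ
    tm = maps_ratio * map_slots         # threshold TM
    tr = reduces_ratio * reduce_slots   # threshold TR
    return (num_jobs < tj
            and weighted_incomplete(map_progress) < tm
            and weighted_incomplete(reduce_progress) < tr)
```

In STRESS mode GridMix keeps submitting jobs while a test of this kind reports the cluster as underloaded, and throttles submission otherwise.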
+ </section>
+ <section id="usersqueues">
+ <title>Emulating Users and Queues</title>
+      <p>Typical production clusters are often shared by different users,
+      and the cluster capacity is divided among different departments
+      through job queues. Ensuring fairness among jobs from all users,
+      honoring queue capacity allocation policies and preventing an
+      ill-behaved job from taking over the cluster add significant
+      complexity to the Hadoop software. To test these areas sufficiently
+      and discover bugs in them, GridMix must emulate the contention among
+      jobs from different users and/or submitted to different queues.</p>
+ <p>Emulating multiple queues is easy - we simply set up the benchmark
+ cluster with the same queue configuration as the production cluster and
+ we configure synthetic jobs so that they get submitted to the same queue
+ as recorded in the trace. However, not all users shown in the trace have
+ accounts on the benchmark cluster. Instead, we set up a number of testing
+      user accounts and associate each unique user in the trace with a
+      testing user in round-robin fashion.</p>
+ <p>The following configuration parameters affect the emulation of users
+ and queues:</p>
+ <table>
+ <tr>
+ <th>Parameter</th>
+ <th>Description</th>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.job-submission.use-queue-in-trace</code>
+ </td>
+ <td>When set to <code>true</code> it uses exactly the same set of
+ queues as those mentioned in the trace. The default value is
+ <code>false</code>.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.job-submission.default-queue</code>
+ </td>
+ <td>Specifies the default queue to which all the jobs would be
+ submitted. If this parameter is not specified, GridMix uses the
+ default queue defined for the submitting user on the cluster.</td>
+ </tr>
+ <tr>
+ <td>
+ <code>gridmix.user.resolve.class</code>
+ </td>
+ <td>Specifies which <code>UserResolver</code> implementation to use.
+ We currently have three implementations:
+ <ol>
+ <li><code>org.apache.hadoop.mapred.gridmix.EchoUserResolver</code>
+ - submits a job as the user who submitted the original job. All
+ the users of the production cluster identified in the job trace
+ must also have accounts on the benchmark cluster in this case.</li>
+ <li><code>org.apache.hadoop.mapred.gridmix.SubmitterUserResolver</code>
+          - submits all the jobs as the current GridMix user. In this case
+          all the users in the trace are simply mapped to the current
+          GridMix user before the jobs are submitted.</li>
+ <li><code>org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver</code>
+ - maps trace users to test users in a round-robin fashion. In
+ this case we set up a number of testing user accounts and
+ associate each unique user in the trace to testing users in a
+ round-robin fashion.</li>
+ </ol>
+ The default is
+ <code>org.apache.hadoop.mapred.gridmix.SubmitterUserResolver</code>.</td>
+ </tr>
+ </table>
+ <p>If the parameter <code>gridmix.user.resolve.class</code> is set to
+ <code>org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver</code>,
+ we need to define a users-list file with a list of test users and groups.
+ This is specified using the <code>-users</code> option to GridMix.</p>
+ <note>
+ Specifying a users-list file using the <code>-users</code> option is
+ mandatory when using the round-robin user-resolver. Other user-resolvers
+ ignore this option.
+ </note>
+ <p>A users-list file has one user-group-information (UGI) per line, each
+ UGI of the format:</p>
+ <source>
+ <username>,<group>[,group]*
+ </source>
+ <p>For example:</p>
+ <source>
+ user1,group1
+ user2,group2,group3
+ user3,group3,group4
+ </source>
+ <p>In the above example we have defined three users <code>user1</code>,
+ <code>user2</code> and <code>user3</code> with their respective groups.
+      Each unique user in the trace is then associated with one of the
+      above users in round-robin fashion. For example, if the trace's users are
+ <code>tuser1</code>, <code>tuser2</code>, <code>tuser3</code>,
+ <code>tuser4</code> and <code>tuser5</code>, then the mappings would
+ be:</p>
+ <source>
+ tuser1 -> user1
+ tuser2 -> user2
+ tuser3 -> user3
+ tuser4 -> user1
+ tuser5 -> user2
+ </source>
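The mapping above can be reproduced with a few lines of round-robin assignment. This is an illustrative sketch of the behavior, not the actual RoundRobinUserResolver code:

```python
# Illustrative sketch of round-robin user resolution as described above;
# not the actual RoundRobinUserResolver implementation.

def build_user_mapping(trace_users, test_users):
    """Associate each unique trace user with a test user, round-robin."""
    unique = list(dict.fromkeys(trace_users))  # dedupe, keep first-seen order
    return {u: test_users[i % len(test_users)] for i, u in enumerate(unique)}

mapping = build_user_mapping(
    ["tuser1", "tuser2", "tuser3", "tuser4", "tuser5"],
    ["user1", "user2", "user3"])
# mapping["tuser4"] -> "user1", mapping["tuser5"] -> "user2"
```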
+ </section>
+ <section id="assumptions">
+ <title>Simplifying Assumptions</title>
+ <p>GridMix will be developed in stages, incorporating feedback and
+ patches from the community. Currently its intent is to evaluate
+ MapReduce and HDFS performance and not the layers on top of them (i.e.
+ the extensive lib and sub-project space). Given these two limitations,
+ the following characteristics of job load are not currently captured in
+ job traces and cannot be accurately reproduced in GridMix:</p>
+ <ul>
+ <li><em>CPU Usage</em> - We have no data for per-task CPU usage, so we
+ cannot even attempt an approximation. GridMix tasks are never
+ CPU-bound independent of I/O, though this surely happens in
+ practice.</li>
+ <li><em>Filesystem Properties</em> - No attempt is made to match block
+ sizes, namespace hierarchies, or any property of input, intermediate
+ or output data other than the bytes/records consumed and emitted from
+ a given task. This implies that some of the most heavily-used parts of
+ the system - the compression libraries, text processing, streaming,
+ etc. - cannot be meaningfully tested with the current
+ implementation.</li>
+ <li><em>I/O Rates</em> - The rate at which records are
+ consumed/emitted is assumed to be limited only by the speed of the
+ reader/writer and constant throughout the task.</li>
+ <li><em>Memory Profile</em> - No data on tasks' memory usage over time
+ is available, though the max heap-size is retained.</li>
+ <li><em>Skew</em> - The records consumed and emitted to/from a given
+ task are assumed to follow observed averages, i.e. records will be
+ more regular than may be seen in the wild. Each map also generates
+ a proportional percentage of data for each reduce, so a job with
+ unbalanced input will be flattened.</li>
+ <li><em>Job Failure</em> - User code is assumed to be correct.</li>
+ <li><em>Job Independence</em> - The output or outcome of one job does
+ not affect when or whether a subsequent job will run.</li>
+ </ul>
+ </section>
+ <section id="appendix">
+ <title>Appendix</title>
+ <p>Issues tracking the original implementations of <a
+ href="https://issues.apache.org/jira/browse/HADOOP-2369">GridMix1</a>,
+ <a href="https://issues.apache.org/jira/browse/HADOOP-3770">GridMix2</a>,
+ and <a
+ href="https://issues.apache.org/jira/browse/MAPREDUCE-776">GridMix3</a>
+ can be found on the Apache Hadoop MapReduce JIRA. Other issues tracking
+ the current development of GridMix can be found by searching <a
+ href="https://issues.apache.org/jira/browse/MAPREDUCE/component/12313086">the
+      Apache Hadoop MapReduce JIRA</a>.</p>
+ </section>
+ </body>
</document>