Posted to mapreduce-issues@hadoop.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2011/04/12 05:44:06 UTC

[jira] [Commented] (MAPREDUCE-2153) Bring in more job configuration properties in to the trace file

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018689#comment-13018689 ] 

Hadoop QA commented on MAPREDUCE-2153:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12475671/mr-2153-test-patch-results.txt
  against trunk revision 1090390.

    -1 @author.  The patch appears to contain 3 @author tags which the Hadoop community has agreed to not allow in code contributions.

    +1 tests included.  The patch appears to include 503 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/164//console

This message is automatically generated.

> Bring in more job configuration properties in to the trace file
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-2153
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2153
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tools/rumen
>    Affects Versions: 0.23.0
>            Reporter: Ravi Gummadi
>         Attachments: MapReduce-2153-trunk.patch, mr-2153-test-patch-results.txt
>
>
> To emulate distributed cache usage in gridmix jobs, the following 9 configuration properties need to be available in the trace file:
> (1) mapreduce.job.cache.files
> (2) mapreduce.job.cache.files.visibilities
> (3) mapreduce.job.cache.files.filesizes
> (4) mapreduce.job.cache.files.timestamps
> (5) mapreduce.job.cache.archives
> (6) mapreduce.job.cache.archives.visibilities
> (7) mapreduce.job.cache.archives.filesizes
> (8) mapreduce.job.cache.archives.timestamps
> (9) mapreduce.job.cache.symlink.create
> To emulate data compression in gridmix jobs, the trace file should also contain the following configuration properties:
> (1) mapreduce.map.output.compress
> (2) mapreduce.map.output.compress.codec
> (3) mapreduce.output.fileoutputformat.compress
> (4) mapreduce.output.fileoutputformat.compress.codec
> (5) mapreduce.output.fileoutputformat.compress.type
> Ideally, gridmix should also set job-specific configuration properties such as io.sort.mb, io.sort.factor, etc. when running simulated jobs, so that they reproduce the behavior of the original job in terms of spilled records, number of merges, etc.
> TraceBuilder should bring all of these properties into the generated trace file.
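
The selection described above can be illustrated with a small sketch. This is not TraceBuilder's code and does not use any Rumen API; it only assumes that a job's configuration is available as a plain key/value mapping. The names TRACE_KEYS and select_trace_properties are hypothetical. Only the 14 explicitly listed properties are included; job-specific ones such as io.sort.mb and io.sort.factor would have to be added to the set in the same way.

```python
import json

# The 9 distributed cache properties listed in the issue description.
DISTRIBUTED_CACHE_KEYS = [
    "mapreduce.job.cache.files",
    "mapreduce.job.cache.files.visibilities",
    "mapreduce.job.cache.files.filesizes",
    "mapreduce.job.cache.files.timestamps",
    "mapreduce.job.cache.archives",
    "mapreduce.job.cache.archives.visibilities",
    "mapreduce.job.cache.archives.filesizes",
    "mapreduce.job.cache.archives.timestamps",
    "mapreduce.job.cache.symlink.create",
]

# The 5 compression properties listed in the issue description.
COMPRESSION_KEYS = [
    "mapreduce.map.output.compress",
    "mapreduce.map.output.compress.codec",
    "mapreduce.output.fileoutputformat.compress",
    "mapreduce.output.fileoutputformat.compress.codec",
    "mapreduce.output.fileoutputformat.compress.type",
]

# Hypothetical allowlist: the union of both groups.
TRACE_KEYS = frozenset(DISTRIBUTED_CACHE_KEYS + COMPRESSION_KEYS)


def select_trace_properties(job_conf, wanted=TRACE_KEYS):
    """Return only the entries of job_conf whose key is in wanted."""
    return {key: value for key, value in job_conf.items() if key in wanted}


if __name__ == "__main__":
    # Made-up configuration values, for illustration only.
    sample_conf = {
        "mapreduce.job.cache.files": "hdfs:///tmp/lookup.dat",
        "mapreduce.map.output.compress": "true",
        "mapreduce.job.name": "example",
    }
    selected = select_trace_properties(sample_conf)
    print(json.dumps(selected, indent=2, sort_keys=True))
```

Keeping the property names in one explicit set means that extending the trace file to carry further properties is a one-line change to the list rather than a change to the selection logic.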

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira