Posted to mapreduce-issues@hadoop.apache.org by "Ravi Gummadi (JIRA)" <ji...@apache.org> on 2011/03/29 11:27:05 UTC

[jira] [Created] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Make Gridmix emulate usage of Distributed Cache files
-----------------------------------------------------

                 Key: MAPREDUCE-2407
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2407
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
          Components: contrib/gridmix
            Reporter: Ravi Gummadi
            Assignee: Ravi Gummadi


Currently Gridmix emulates disk IO load only. This JIRA is to make Gridmix emulate Distributed Cache load as defined by the job-trace.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034752#comment-13034752 ] 

Ravi Gummadi commented on MAPREDUCE-2407:
-----------------------------------------

Amar, would you please review the patch? Thanks.



[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2407:
------------------------------------

      Resolution: Fixed
    Release Note: Makes Gridmix emulate HDFS based distributed cache files and local file system based distributed cache files.
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this to trunk.



[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2407:
------------------------------------

    Status: Open  (was: Patch Available)



[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038638#comment-13038638 ] 

Hudson commented on MAPREDUCE-2407:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk #689 (See [https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk/689/])
    MAPREDUCE-2407. Make GridMix emulate usage of distributed cache files in simulated jobs.

ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1126499
Files : 
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/PseudoLocalFs.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/DistributedCacheEmulator.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestPseudoLocalFs.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/Gridmix.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateData.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/JobCreator.java
* /hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/gridmix.xml
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/DebugJobProducer.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestDistCacheEmulation.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestGridmixSubmission.java




[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2407:
------------------------------------

    Attachment: 2407.patch

Attaching a patch that adds emulation of distributed cache load in gridmix simulated jobs.

High-level details of what this patch does:

(1) A new gridmix configuration property "gridmix.distributed-cache-emulation.enable" is added; its default value is true. Setting it to false disables emulation of distributed cache load. Irrespective of this property's setting, gridmix generates the distributed cache files on HDFS when the -generate option is given.
Distributed cache emulation is disabled when '-' is given as the input trace (i.e. the trace is read from the stdin stream instead of a file).
Distributed cache emulation is also disabled when <iopath> is on the local file system.
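
The conditions in (1) can be sketched roughly as below. This is only an illustration, not code from the patch: the class and the method emulationEnabled() are hypothetical, a JVM system property stands in for the gridmix configuration, and treating only an explicit "file" URI scheme as "local file system" is an assumption.

```java
import java.net.URI;

final class DistCacheEmulationSwitch {
    // Property name as given in the comment above.
    static final String ENABLE_KEY = "gridmix.distributed-cache-emulation.enable";

    /**
     * Hypothetical helper: returns true only if emulation is switched on,
     * the trace is a real file (not "-", i.e. stdin), and iopath is not local.
     */
    static boolean emulationEnabled(boolean enableProperty, String tracePath, URI ioPath) {
        if (!enableProperty) {
            return false;                       // explicitly disabled by config
        }
        if ("-".equals(tracePath)) {
            return false;                       // trace comes from stdin
        }
        // Assumption: an explicit "file" scheme means the local file system.
        if ("file".equalsIgnoreCase(ioPath.getScheme())) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for reading the gridmix configuration; the default is true.
        boolean enable = Boolean.parseBoolean(System.getProperty(ENABLE_KEY, "true"));
        URI ioPath = URI.create("hdfs://namenode:8020/gridmix"); // example value only
        System.out.println(emulationEnabled(enable, "trace.json", ioPath));
    }
}
```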

(2) The behavior of the -generate option is changed. -generate now means (a) generate input data in the directory
<iopath>/input/ and (b) generate the distributed cache data needed to emulate the distributed cache load of this
trace file in the directory <iopath>/distributedCache/.
For (a), the existing GenerateData MR job is used.
For (b), a new MR job, GenerateDistCacheData, is added; it runs after GenerateData and before the simulated jobs are submitted.

With the -generate option, (a) an existing <iopath>/input/ directory is an error, as in the current behavior, and
(b) an existing <iopath>/distributedCache/ directory is not an error; only the missing distributed cache files for the specific trace file are generated under <iopath>/distributedCache/. If all the needed distributed cache files are already
there, submission of the GenerateDistCacheData job is skipped.

Without the -generate option, if emulation of distributed cache load is enabled, gridmix checks whether all the needed distributed cache files are available under <iopath>/distributedCache/ and emits an error if any of the expected files are missing.
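
The decision described above (generate only what is missing, skip the job if nothing is missing, fail without -generate if something is missing) could look roughly like this. It is a sketch under assumptions: the class, the enum and both method names are made up for illustration, and plain string sets stand in for file system listings.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

final class DistCachePlanner {
    // Hypothetical outcome names, not taken from the patch.
    enum Action { RUN_GENERATE_JOB, NOTHING_TO_GENERATE, FAIL_MISSING_FILES }

    /** Files the trace needs that are not yet present in the dist cache directory. */
    static Set<String> missing(Collection<String> needed, Set<String> existing) {
        Set<String> result = new TreeSet<>(needed);
        result.removeAll(existing);
        return result;
    }

    /** Chooses what to do, given the -generate flag and the set of missing files. */
    static Action plan(boolean generate, Set<String> missingFiles) {
        if (missingFiles.isEmpty()) {
            return Action.NOTHING_TO_GENERATE;   // job submission is skipped
        }
        return generate ? Action.RUN_GENERATE_JOB : Action.FAIL_MISSING_FILES;
    }

    public static void main(String[] args) {
        Set<String> existing = new HashSet<>(Arrays.asList("f1"));
        Set<String> toCreate = missing(Arrays.asList("f1", "f2"), existing);
        System.out.println(toCreate + " -> " + plan(true, toCreate));
    }
}
```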

(3) setupDistCacheEmulation : Read the trace file and build a list of distributed cache file paths and their file sizes. The
file paths are the mapped paths on the simulated cluster (mapped from the original cluster's paths to the simulated cluster's
paths using
{code}MD5Hash(filePath+timestamp){code} for public distributed cache files
and
{code}MD5Hash(filePath+timestamp+username){code} for private distributed cache files).

This list of mapped file paths, along with the file sizes, is written to a special file
<iopath>/distributedCache/_distCacheFiles.txt; the file name can be configured using
"gridmix.distcache.file.list".

This means that all distributed cache files in the gridmix simulated jobs are public distributed cache files. However, because the user name is part of the hash, each private distributed cache file of a user of the original cluster (i.e. from the trace file) maps to a distinct public distributed cache file on the gridmix simulated cluster.
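
To illustrate the mapping in (3), here is a minimal sketch that uses the JDK's MessageDigest rather than Hadoop's own MD5Hash class. The class and method names are hypothetical, and the exact concatenation format, the UTF-8 encoding and the use of the hex digest as the mapped name are assumptions, not details confirmed by the patch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DistCachePathMapper {
    /** Hex-encoded MD5 digest of the given string (UTF-8 bytes assumed). */
    static String md5Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every standard JVM is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Public file: the name depends on path and timestamp only. */
    static String mapPublic(String filePath, long timestamp) {
        return md5Hex(filePath + timestamp);
    }

    /** Private file: the user name is mixed in, so each user gets a distinct name. */
    static String mapPrivate(String filePath, long timestamp, String user) {
        return md5Hex(filePath + timestamp + user);
    }

    public static void main(String[] args) {
        // Example values only.
        System.out.println(mapPublic("/user/alice/lib.jar", 1300000000000L));
        System.out.println(mapPrivate("/user/alice/lib.jar", 1300000000000L, "alice"));
        System.out.println(mapPrivate("/user/alice/lib.jar", 1300000000000L, "bob"));
    }
}
```

With this scheme the same private file referenced by two different users yields two different mapped names, which is why every file can be treated as public on the simulated cluster.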

(4) GenerateDistCacheData : The MR job (launched by gridmix if the -generate option is given) that generates the distributed cache data files on HDFS. The input to this job is the special file _distCacheFiles.txt, which contains the distributed cache file paths and their sizes.
Each map() call generates one distributed cache file.

(5) configureDistCacheFiles : The mapped distributed cache file paths are set in the simulated jobs' configurations so that the MapReduce framework itself adds distributed cache load equivalent to the original cluster's distributed cache load.



[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2407:
------------------------------------

    Status: Open  (was: Patch Available)



[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036713#comment-13036713 ] 

Hadoop QA commented on MAPREDUCE-2407:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12478943/2407.patch
  against trunk revision 1125223.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 10 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    -1 release audit.  The applied patch generated 3 release audit warnings (more than the trunk's current 2 warnings).

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/283//testReport/
Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/283//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/283//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/283//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037792#comment-13037792 ] 

Amar Kamat commented on MAPREDUCE-2407:
---------------------------------------

Patch looks good to me. +1



[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Santosh Kumar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034674#comment-13034674 ] 

Santosh Kumar commented on MAPREDUCE-2407:
------------------------------------------

I will take it up from here. Please grant me the commit access. 



[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037978#comment-13037978 ] 

Hudson commented on MAPREDUCE-2407:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #695 (See [https://builds.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/695/])
    MAPREDUCE-2407. Make GridMix emulate usage of distributed cache files in simulated jobs.

ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1126499
Files : 
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/PseudoLocalFs.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/DistributedCacheEmulator.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestPseudoLocalFs.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/Gridmix.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateData.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/JobCreator.java
* /hadoop/mapreduce/trunk/src/docs/src/documentation/content/xdocs/gridmix.xml
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/GenerateDistCacheData.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/DebugJobProducer.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestDistCacheEmulation.java
* /hadoop/mapreduce/trunk/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestGridmixSubmission.java




[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037892#comment-13037892 ] 

Hadoop QA commented on MAPREDUCE-2407:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12480093/2407.v1.1.patch
  against trunk revision 1125599.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 10 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/291//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/291//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/291//console

This message is automatically generated.



[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2407:
------------------------------------

        Fix Version/s: 0.23.0
    Affects Version/s: 0.23.0
               Status: Patch Available  (was: Open)



[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2407:
------------------------------------

    Attachment: 2407.v1.1.patch

Attaching a new patch that addresses Amar's minor offline comments.



[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037191#comment-13037191 ] 

Amar Kamat commented on MAPREDUCE-2407:
---------------------------------------

The latest patch looks good to me. I have some minor comments (mostly alignment, refactoring and parameter naming) which I have discussed with Ravi offline. I don't want to block the patch just for some minor comments. +1.



[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2407:
------------------------------------

    Status: Patch Available  (was: Open)



[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2407:
------------------------------------

    Attachment: 2407.v1.patch

Attaching new patch fixing the release audit warning.



[jira] [Commented] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036777#comment-13036777 ] 

Hadoop QA commented on MAPREDUCE-2407:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479884/2407.v1.patch
  against trunk revision 1125223.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 10 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/284//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/284//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/284//console

This message is automatically generated.



[jira] [Updated] (MAPREDUCE-2407) Make Gridmix emulate usage of Distributed Cache files

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2407:
------------------------------------

    Status: Patch Available  (was: Open)

