
Posted to mapreduce-issues@hadoop.apache.org by "Ravi Gummadi (JIRA)" <ji...@apache.org> on 2011/07/22 13:26:57 UTC

[jira] [Created] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used
------------------------------------------------------------------------------------------

                 Key: MAPREDUCE-2722
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2722
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: contrib/gridmix
            Reporter: Ravi Gummadi
            Assignee: Ravi Gummadi


When the original job's map task read compressed input, the simulated job's map task reports a wrong hdfsBytesRead counter if compression emulation is enabled. This happens because Gridmix treats the hdfsBytesRead value of the original job's map task as the uncompressed map input size.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Ravi Gummadi (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2722:
------------------------------------

    Release Note: Makes Gridmix use the uncompressed input data size while simulating map tasks when the original job used compressed input data.
          Status: Patch Available  (was: Open)
    

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222355#comment-13222355 ] 

Hudson commented on MAPREDUCE-2722:
-----------------------------------

Integrated in Hadoop-Common-trunk-Commit #1833 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1833/])
    MAPREDUCE-2722. [Gridmix] Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used.(ravigummadi) (Revision 1297052)

     Result = SUCCESS
ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297052
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/CompressionEmulationUtil.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestCompressionEmulationUtils.java

                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222369#comment-13222369 ] 

Hudson commented on MAPREDUCE-2722:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #1840 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1840/])
    MAPREDUCE-2722. [Gridmix] Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used.(ravigummadi) (Revision 1297052)

     Result = ABORTED
ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297052
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/CompressionEmulationUtil.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestCompressionEmulationUtils.java

                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Ravi Gummadi (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222346#comment-13222346 ] 

Ravi Gummadi commented on MAPREDUCE-2722:
-----------------------------------------

Gridmix unit tests and test-patch passed on my local machine.

I just committed this patch to trunk. Thanks Amar for the review.
                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223217#comment-13223217 ] 

Hudson commented on MAPREDUCE-2722:
-----------------------------------

Integrated in Hadoop-Hdfs-trunk #976 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/976/])
    MAPREDUCE-2722. [Gridmix] Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used.(ravigummadi) (Revision 1297052)

     Result = SUCCESS
ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297052
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/CompressionEmulationUtil.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestCompressionEmulationUtils.java

                

[jira] [Updated] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Ravi Gummadi (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2722:
------------------------------------

    Attachment: 2722.v2.1.patch

Attaching patch incorporating review comments.
                

[jira] [Updated] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Ravi Gummadi (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2722:
------------------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)
    

[jira] [Updated] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Ravi Gummadi (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2722:
------------------------------------

    Attachment: 2722.v1.patch

Attaching new patch with unit test.
                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069513#comment-13069513 ] 

Ravi Gummadi commented on MAPREDUCE-2722:
-----------------------------------------

To explain the problem further: with trunk, here is an example table of counters for one map task of a job (assume the compression ratio Gridmix uses to generate input data is 0.5):

Counter          Original job's map task    Gridmix simulated job's map task
HdfsBytesRead    100MB                      50MB
MapInputBytes    1000MB                     100MB

Since emulating disk I/O correctly is the more important goal for Gridmix, the hdfsBytesRead counter needs to be emulated accurately.
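
The simulated-job column of the table above can be reproduced with a few lines of arithmetic. The sketch below is only one reading of the behavior described in this comment, not the Gridmix source: the class and method names are hypothetical, and the 0.5 ratio is the example value quoted above, taken to mean compressed size divided by uncompressed size.

```java
// Illustrative sketch only; hypothetical names, not Gridmix code.
final class BuggyInputSizeSketch {
  static final long MB = 1024L * 1024L;

  /** Uncompressed map input when the trace's hdfsBytesRead is misread as an uncompressed size. */
  static long simulatedMapInputBytes(long originalHdfsBytesRead) {
    // The trace value is taken as-is, although it is really a compressed size.
    return originalHdfsBytesRead;
  }

  /** Bytes the simulated map task ends up reading from HDFS under that misreading. */
  static long simulatedHdfsBytesRead(long originalHdfsBytesRead, double compressionRatio) {
    // The already-compressed size is shrunk a second time by the assumed ratio.
    return Math.round(originalHdfsBytesRead * compressionRatio);
  }

  public static void main(String[] args) {
    long original = 100 * MB; // hdfsBytesRead of the original map task
    System.out.println(simulatedHdfsBytesRead(original, 0.5) / MB + "MB hdfsBytesRead"); // 50MB
    System.out.println(simulatedMapInputBytes(original) / MB + "MB MapInputBytes");      // 100MB
  }
}
```

The simulated task therefore reads half as much from HDFS as the original task did, which is the disk I/O mismatch this issue is about.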


[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223250#comment-13223250 ] 

Hudson commented on MAPREDUCE-2722:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk #1011 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1011/])
    MAPREDUCE-2722. [Gridmix] Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used.(ravigummadi) (Revision 1297052)

     Result = FAILURE
ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297052
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/CompressionEmulationUtil.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestCompressionEmulationUtils.java

                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Amar Kamat (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220701#comment-13220701 ] 

Amar Kamat commented on MAPREDUCE-2722:
---------------------------------------

Ravi, compression emulation is a feature with three parts:
# Input compression emulation
# Intermediate compression emulation
# Output compression emulation

Intermediate and output compression emulation happen only when the compression-emulation feature is turned on and the job's configuration has the corresponding parameters set.
For input compression, Gridmix relies on 'mapred.input.dir': input compression emulation is attempted only if there are compressed input files.

Scale the input-data-size field only if input compression emulation is desired.
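
The conditions in this comment can be written down as simple predicates. This is a hedged sketch, not the Gridmix implementation: every name below is hypothetical, and it assumes that input compression emulation also requires the overall feature to be turned on, which the comment does not state explicitly.

```java
// Illustrative sketch only; hypothetical names, not Gridmix code.
final class CompressionEmulationDecisionSketch {

  /** Intermediate (map output) emulation: feature on and the job's config asks for it. */
  static boolean emulateIntermediate(boolean featureEnabled, boolean jobCompressesMapOutput) {
    return featureEnabled && jobCompressesMapOutput;
  }

  /** Output emulation: feature on and the job's config asks for it. */
  static boolean emulateOutput(boolean featureEnabled, boolean jobCompressesOutput) {
    return featureEnabled && jobCompressesOutput;
  }

  /** Input emulation: attempted only if compressed input files are present (assumed: feature on as well). */
  static boolean emulateInput(boolean featureEnabled, boolean hasCompressedInputFiles) {
    return featureEnabled && hasCompressedInputFiles;
  }

  /** Use the scaled input size only when input compression emulation is desired. */
  static long inputDataSize(long sizeFromTrace, long scaledSize, boolean emulateInput) {
    return emulateInput ? scaledSize : sizeFromTrace;
  }
}
```

Keeping the guard separate from the scaling arithmetic means a job with uncompressed input keeps the size recorded in the trace.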
                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219935#comment-13219935 ] 

Hadoop QA commented on MAPREDUCE-2722:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12516656/2722.v1.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1975//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1975//console

This message is automatically generated.
                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Amar Kamat (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222200#comment-13222200 ] 

Amar Kamat commented on MAPREDUCE-2722:
---------------------------------------

+1. Looks good to me.
                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222353#comment-13222353 ] 

Hudson commented on MAPREDUCE-2722:
-----------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #1907 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1907/])
    MAPREDUCE-2722. [Gridmix] Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used.(ravigummadi) (Revision 1297052)

     Result = SUCCESS
ravigummadi : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297052
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/CompressionEmulationUtil.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/LoadJob.java
* /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/gridmix/src/test/org/apache/hadoop/mapred/gridmix/TestCompressionEmulationUtils.java

                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Amar Kamat (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214532#comment-13214532 ] 

Amar Kamat commented on MAPREDUCE-2722:
---------------------------------------

Changes look good to me. +1. Is it possible to add a JUnit test?
                

[jira] [Commented] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222204#comment-13222204 ] 

Hadoop QA commented on MAPREDUCE-2722:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517049/2722.v2.1.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1996//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1996//console

This message is automatically generated.
                

[jira] [Updated] (MAPREDUCE-2722) Gridmix simulated job's map's hdfsBytesRead counter is wrong when compressed input is used

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated MAPREDUCE-2722:
------------------------------------

    Attachment: MR2722.patch

Attaching a patch that fixes the bug.
With this patch, the counter values for the Gridmix simulated job (for the example in the previous comment) are:

100MB as hdfsBytesRead and 200MB as MapInputBytes.

Please review and provide your comments.
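
The corrected values follow from treating the trace's hdfsBytesRead as the compressed size and scaling it up to get the uncompressed size. The sketch below is a minimal illustration under that assumption, not the patch itself: the names are hypothetical, and the 0.5 ratio (compressed size divided by uncompressed size) is the example value from the earlier comment.

```java
// Illustrative sketch only; hypothetical names, not the code in MR2722.patch.
final class FixedInputSizeSketch {
  static final long MB = 1024L * 1024L;

  /** The simulated task should read as many compressed bytes as the original task did. */
  static long simulatedHdfsBytesRead(long originalHdfsBytesRead) {
    return originalHdfsBytesRead;
  }

  /** Uncompressed map input obtained by scaling the compressed size up by the assumed ratio. */
  static long simulatedMapInputBytes(long originalHdfsBytesRead, double compressionRatio) {
    if (compressionRatio <= 0.0 || compressionRatio > 1.0) {
      throw new IllegalArgumentException("ratio must be in (0, 1]: " + compressionRatio);
    }
    return Math.round(originalHdfsBytesRead / compressionRatio);
  }

  public static void main(String[] args) {
    long original = 100 * MB; // hdfsBytesRead of the original map task
    System.out.println(simulatedHdfsBytesRead(original) / MB + "MB hdfsBytesRead");      // 100MB
    System.out.println(simulatedMapInputBytes(original, 0.5) / MB + "MB MapInputBytes"); // 200MB
  }
}
```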
