Posted to mapreduce-issues@hadoop.apache.org by "Matt Foley (JIRA)" <ji...@apache.org> on 2013/05/14 06:25:16 UTC

[jira] [Updated] (MAPREDUCE-5125) TestDFSIO should write less compressible data

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Foley updated MAPREDUCE-5125:
----------------------------------

    Target Version/s: 3.0.0, 2.0.5-beta, 1.3.0  (was: 1.2.0, 3.0.0, 2.0.5-beta)
    
> TestDFSIO should write less compressible data
> ---------------------------------------------
>
>                 Key: MAPREDUCE-5125
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5125
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 2.0.3-alpha, 1.1.2
>            Reporter: Todd Lipcon
>            Priority: Minor
>
> Currently, TestDFSIO writes a short repeating sequence of bytes, (byte)0 through (byte)50. This makes its output very compressible (I measured 250:1 by LZOing the resulting file). As a result, TestDFSIO results are very hard to compare between HDFS and other file systems that may compress data on the network, on disk, or both: what is ostensibly a benchmark of IO throughput yields results heavily skewed toward the system with compression.
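
The compressibility gap described above can be reproduced with a rough sketch. This is not TestDFSIO code and not the fix proposed for this issue. It assumes the pattern cycles through the 51 values 0 through 50 (one reading of the description), uses zlib from the Python standard library as a stand-in for LZO, and picks an arbitrary buffer size, so the measured ratio will not match the 250:1 figure. The names repeating_buffer, random_buffer and compression_ratio are hypothetical helpers introduced only for this illustration.

```python
import os
import zlib

# Hypothetical buffer size, chosen only for this sketch.
BUFFER_SIZE = 1_000_000


def repeating_buffer(size: int) -> bytes:
    """Bytes cycling through 0..50 (assumed reading of the issue text)."""
    return bytes(i % 51 for i in range(size))


def random_buffer(size: int) -> bytes:
    """Bytes from the OS random source, which compress very poorly."""
    return os.urandom(size)


def compression_ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size (zlib stands in for LZO)."""
    return len(data) / len(zlib.compress(data))


if __name__ == "__main__":
    print("repeating:", compression_ratio(repeating_buffer(BUFFER_SIZE)))
    print("random:   ", compression_ratio(random_buffer(BUFFER_SIZE)))
```

Running the script prints one ratio per buffer. The repeating buffer should compress by a large factor, while the random buffer should stay close to 1:1, which is the property a throughput benchmark would want from its test data.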
