Posted to mapreduce-issues@hadoop.apache.org by "Konstantin Shvachko (JIRA)" <ji...@apache.org> on 2017/09/01 18:05:00 UTC

[jira] [Commented] (MAPREDUCE-6931) Remove TestDFSIO "Total Throughput" calculation

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150945#comment-16150945 ] 

Konstantin Shvachko commented on MAPREDUCE-6931:
------------------------------------------------

Hey [~djp], I got confused with the jira versions, as 2.8.3 was not available. Now it is, thanks.
But I had hoped that the confusing field, which this jira is removing, would not sneak into any release at all, to avoid questions later on about what it means and why it was removed.
I would strongly recommend merging this into 2.8.2. The final decision is, of course, up to the release manager.

> Remove TestDFSIO "Total Throughput" calculation
> -----------------------------------------------
>
>                 Key: MAPREDUCE-6931
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6931
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: benchmarks, test
>    Affects Versions: 2.8.0
>            Reporter: Dennis Huo
>            Assignee: Dennis Huo
>            Priority: Trivial
>             Fix For: 2.9.0, 3.0.0-beta1, 2.7.5, 2.8.3
>
>         Attachments: MAPREDUCE-6931-001.patch
>
>
> The new "Total Throughput" line added in https://issues.apache.org/jira/browse/HDFS-9153 is currently calculated as {{toMB(size) / ((float)execTime)}} and claims to be in units of "MB/s", but {{execTime}} is in milliseconds; thus, the reported number is 1/1000x the actual value:
> {code:java}
>     String resultLines[] = {
>         "----- TestDFSIO ----- : " + testType,
>         "            Date & time: " + new Date(System.currentTimeMillis()),
>         "        Number of files: " + tasks,
>         " Total MBytes processed: " + df.format(toMB(size)),
>         "      Throughput mb/sec: " + df.format(size * 1000.0 / (time * MEGA)),
>         "Total Throughput mb/sec: " + df.format(toMB(size) / ((float)execTime)),
>         " Average IO rate mb/sec: " + df.format(med),
>         "  IO rate std deviation: " + df.format(stdDev),
>         "     Test exec time sec: " + df.format((float)execTime / 1000),
>         "" };
> {code}
> The different calculated fields can also use toMB and a shared milliseconds-to-seconds conversion to make it easier to keep units consistent.
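
A minimal sketch of the unit problem and of the shared-conversion idea described above. This is not the attached patch: the class name ThroughputUnits and the helpers msToSecs and mbPerSec are hypothetical, MEGA is assumed to be 1024 * 1024 bytes, and toMB here returns a double for simplicity.

```java
public class ThroughputUnits {
    // Assumption: bytes per megabyte, taken to be 1024 * 1024.
    static final long MEGA = 0x100000L;

    // Bytes to megabytes.
    static double toMB(long bytes) {
        return (double) bytes / MEGA;
    }

    // Hypothetical shared helper: milliseconds to seconds.
    static double msToSecs(long millis) {
        return millis / 1000.0;
    }

    // Throughput in MB/s, with both conversions done in one place.
    static double mbPerSec(long bytes, long millis) {
        return toMB(bytes) / msToSecs(millis);
    }

    public static void main(String[] args) {
        long size = 1000L * MEGA;   // 1000 MB processed
        long execTime = 10_000L;    // 10 seconds, expressed in milliseconds

        // The expression quoted in the description: MB divided by milliseconds.
        double reported = toMB(size) / ((float) execTime);
        // The same quantity with the time converted to seconds first.
        double actual = mbPerSec(size, execTime);

        System.out.println("MB per ms (mislabeled as MB/s): " + reported);
        System.out.println("MB per s:                       " + actual);
    }
}
```

With these illustrative inputs the two values differ by exactly the factor of 1000 that the description points out. Routing every field through the same two helpers would make that kind of mismatch harder to introduce.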


