Posted to mapreduce-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2016/07/19 14:17:20 UTC

[jira] [Commented] (MAPREDUCE-6735) Performance degradation caused by MAPREDUCE-5465 and HADOOP-12107

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384200#comment-15384200 ] 

Jason Lowe commented on MAPREDUCE-6735:
---------------------------------------

Thanks for the report, Alexandr!

MAPREDUCE-5465 would be expected to make things slightly slower, since the whole point of that JIRA is to let tasks finish on their own rather than proactively killing them.  Before that change, the AM tried to kill tasks as soon as they reported they were done, which caused problems when a task still had work to do afterwards, such as dumping JVM stats for performance analysis or profiling.  Letting tasks complete on their own means a container will sometimes take a little longer to complete, and until it does, no other task can use the cluster resources associated with that container.  Essentially the change was a tradeoff of performance for correctness -- we were running "too fast" sometimes. ;-)
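
To illustrate why a small per-task delay can add up, here is a toy model of the tradeoff described above. It is only a sketch under simplifying assumptions: tasks run in equal-length waves, and every container is held for the task time plus a fixed "linger" time before the next task can start. The function name and all numbers are hypothetical and are not taken from Hadoop or from this report.

```python
import math


def makespan_seconds(num_tasks, num_containers, task_seconds, linger_seconds):
    """Toy estimate of total run time when tasks execute in waves.

    Each container stays occupied for the task time plus the linger time
    (the period between the task reporting done and the JVM exiting).
    """
    waves = math.ceil(num_tasks / num_containers)
    return waves * (task_seconds + linger_seconds)


# Hypothetical workload: 1000 tasks, 100 containers, 60 s per task.
killed_early = makespan_seconds(1000, 100, 60.0, 0.0)  # AM kills tasks immediately
exit_on_own = makespan_seconds(1000, 100, 60.0, 2.0)   # tasks linger 2 s before exiting

print(killed_early, exit_on_own)
```

In this model the extra linger time is paid once per wave, so the relative cost grows as tasks get shorter or waves get more numerous. A real cluster schedules containers far less uniformly, so this should be read as intuition only.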

Do you have any details on how HADOOP-12107 is impacting things?  If you apply just that one on its own, do you see a similar impact?  Also note that HADOOP-12107 has a very important followup fix in HADOOP-12706.  If HADOOP-12107 really does have an impact on its own, it would be interesting to know whether adding HADOOP-12706 changes that in any way.

> Performance degradation caused by MAPREDUCE-5465 and HADOOP-12107
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-6735
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6735
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Alexandr Balitsky
>
> Two commits, MAPREDUCE-5465 and HADOOP-12107, are making Terasort on YARN 10% slower.
> Reduce phase with those commits: ~5 mins
> Reduce phase without: ~3.5 mins
> The average reduce takes 4 mins 16 sec with those commits, compared to 3 mins 48 sec without.
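
As a rough check on the timings quoted above, the relative slowdown can be computed directly from them. This is only a sketch: the helper names are made up for illustration, and "~3.5 mins" is assumed to mean 3 minutes 30 seconds.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


def slowdown_percent(with_commits, without_commits):
    """Relative increase in run time, as a percentage of the baseline."""
    return (with_commits - without_commits) / without_commits * 100.0


# Average reduce: 4 mins 16 sec with the commits vs. 3 mins 48 sec without.
avg_slowdown = slowdown_percent(to_seconds(4, 16), to_seconds(3, 48))

# Reduce phase: ~5 mins with the commits vs. ~3.5 mins without (approximate).
phase_slowdown = slowdown_percent(to_seconds(5), to_seconds(3, 30))

print(round(avg_slowdown, 1))    # 12.3
print(round(phase_slowdown, 1))  # 42.9
```

The average reduce time is therefore about 12% longer, while the reduce phase as a whole is roughly 43% longer based on the approximate figures. The 10% figure in the summary line does not say which measurement it refers to.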


