Posted to mapreduce-issues@hadoop.apache.org by "Ravi Prakash (JIRA)" <ji...@apache.org> on 2017/08/02 21:33:00 UTC

[jira] [Commented] (MAPREDUCE-6923) YARN Shuffle I/O for small partitions

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111759#comment-16111759 ] 

Ravi Prakash commented on MAPREDUCE-6923:
-----------------------------------------

Hi Robert! Thanks for filing the JIRA and for your contribution! I'm adding you as a contributor and assigning the JIRA to you. Could you please post a patch file to this JIRA? You can name the patch file MAPREDUCE-6923.00.patch. One minor nit: we limit lines to 80 characters. Could you please fix that in the patch file?
Also since {{trans}} is [guaranteed to be positive|https://github.com/apache/hadoop/blob/12e44e7bdaf53d3720a89d32f0cc2717241bd6b2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java#L103] and {{shuffleBufferSize}} is an integer, maybe we don't really need the ternary operator condition? Up to you to keep it though. 
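
Just to illustrate (a rough sketch on my part, assuming {{trans}} is the positive {{long}} byte count and {{this.shuffleBufferSize}} a positive {{int}}, as in FadvisedFileRegion), the allocation could then simply be:
{code:java}
// trans > 0 and shuffleBufferSize <= Integer.MAX_VALUE, so the result of
// Math.min(shuffleBufferSize, trans) always fits into an int and the explicit
// Integer.MAX_VALUE guard becomes redundant.
ByteBuffer byteBuffer =
    ByteBuffer.allocate((int) Math.min(this.shuffleBufferSize, trans));
{code}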

I'm not surprised that there isn't an improvement in job performance but the read overhead improvement is great. Do you know where the 18% overhead is going?

The patch sounds reasonable to me. [~jlowe], [~nikola.vujic], [~cnauroth], do you have any comments? The diff he's proposing is behind the "[here|https://github.com/apache/hadoop/compare/branch-2.7.3...robert-schmidtke:adaptive-shuffle-buffer]" link in the description below.

> YARN Shuffle I/O for small partitions
> -------------------------------------
>
>                 Key: MAPREDUCE-6923
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6923
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>         Environment: Observed in Hadoop 2.7.3 (and, judging from the source code, in later versions as well), on Ubuntu 16.04.
>            Reporter: Robert Schmidtke
>
> When a job configuration results in small partitions read by each reducer from each mapper (e.g. 65 kilobytes as in my setup: a [TeraSort|https://github.com/apache/hadoop/blob/branch-2.7.3/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/terasort/TeraSort.java] of 256 gigabytes using 2048 mappers and reducers each), and setting
> {code:xml}
> <property>
>   <name>mapreduce.shuffle.transferTo.allowed</name>
>   <value>false</value>
> </property>
> {code}
> then the default setting of
> {code:xml}
> <property>
>   <name>mapreduce.shuffle.transfer.buffer.size</name>
>   <value>131072</value>
> </property>
> {code}
> results in almost 100% overhead in reads during shuffle in YARN, because for every 65K actually needed, 128K are read from disk.
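> A rough sketch of what the copy path does when {{transferTo}} is disabled (simplified and illustrative only, not the actual FadvisedFileRegion code): the buffer is always allocated at the full configured size, so each read pulls up to a whole buffer's worth of data out of the map output file, even when the partition being served is smaller.
> {code:java}
> import java.io.IOException;
> import java.nio.ByteBuffer;
> import java.nio.channels.FileChannel;
> import java.nio.channels.WritableByteChannel;
>
> class ShuffleCopySketch {
>   // Copy `count` bytes starting at `position` through a fixed-size heap
>   // buffer, the way the shuffle handler does when transferTo is disallowed.
>   static void copyWithBuffer(FileChannel in, WritableByteChannel out,
>       long position, long count, int shuffleBufferSize) throws IOException {
>     ByteBuffer buf = ByteBuffer.allocate(shuffleBufferSize); // always 128K
>     long remaining = count; // bytes of this partition still to send, ~65K
>     while (remaining > 0) {
>       // read() fills up to the buffer's capacity, so it can read well past
>       // the end of this partition -- those extra bytes are the overhead.
>       int nRead = in.read(buf, position + (count - remaining));
>       if (nRead <= 0) {
>         break;
>       }
>       buf.flip();
>       if (nRead > remaining) {
>         buf.limit((int) remaining); // drop bytes beyond this partition
>       }
>       while (buf.hasRemaining()) {
>         out.write(buf);
>       }
>       remaining -= Math.min(nRead, remaining);
>       buf.clear();
>     }
>   }
> }
> {code}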
> I propose a fix in [FadvisedFileRegion.java|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java#L114] as follows:
> {code:java}
> ByteBuffer byteBuffer = ByteBuffer.allocate(Math.min(this.shuffleBufferSize,
>     trans > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) trans));
> {code}
> e.g. [here|https://github.com/apache/hadoop/compare/branch-2.7.3...robert-schmidtke:adaptive-shuffle-buffer]. This sets the shuffle buffer size to the minimum of the buffer size specified in the configuration (128K by default) and the actual partition size (65K on average in my setup). In my benchmarks this reduced the read overhead in YARN from about 100% (255 additional gigabytes, as described above) down to about 18% (an additional 45 gigabytes). The runtime of the job remained the same in my setup.
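> As a rough back-of-the-envelope check of these numbers (assuming one ~64K partition per mapper/reducer pair and one full 128K buffer read per partition; the figures are only approximate):
> {code:java}
> // 2048 x 2048 mapper/reducer pairs = ~4.2M partitions of ~64K each.
> // One 128K read per ~64K partition wastes ~64K per partition, i.e. roughly
> // 256 GB of extra reads in total, in line with the ~255 GB quoted above.
> long partitions     = 2048L * 2048L;                 // = 4,194,304
> long partitionBytes = (256L << 30) / partitions;     // = 65,536 bytes each
> long extraPerRead   = (128L << 10) - partitionBytes; // = 65,536 bytes wasted
> long extraTotal     = partitions * extraPerRead;     // = ~256 GiB extra reads
> {code}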



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org