Posted to user@hadoop.apache.org by Robert Schmidtke <ro...@gmail.com> on 2017/07/31 13:36:40 UTC

Shuffle buffer size in presence of small partitions

Hi all,

I just ran into an issue that likely resulted from a not-very-clever
configuration on my part, but I'd like to share it with the community
nonetheless. This is all on Hadoop 2.7.3.

In my setup, each reducer fetched roughly 65 KB from each mapper's spill
file. I disabled transferTo during shuffle because I wanted to look at the
file system statistics, which miss mmap calls, and transferTo sometimes
falls back to mmap. I left the shuffle buffer size at its 128 KB default
(I believe the parameter is mapreduce.shuffle.transfer.buffer.size; I did
not know about it at the time). The effect was that I observed roughly
100% more data being read during shuffle, since a full 128 KB was read
from disk for every 65 KB actually needed (128/65 ~ 1.97x).
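
To illustrate where the extra reads come from, here is my reconstruction
of the buffered transfer as a minimal sketch (made-up names, not the
actual Hadoop code): the spill file holds all partitions back to back, so
filling a fixed-size buffer for a small partition drags in a neighbour's
data that is then thrown away.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.channels.WritableByteChannel;

    class BufferedShuffleSketch {
        // With bufferSize = 128 KB and count = 65 KB, the single read
        // below pulls 128 KB off disk although only 65 KB is sent on.
        static long transfer(FileChannel src, WritableByteChannel dst,
                             long position, long count, int bufferSize)
                throws IOException {
            ByteBuffer buf = ByteBuffer.allocate(bufferSize);
            long transferred = 0;
            while (transferred < count) {
                // Fills the buffer from the file, possibly reading past
                // the end of this reducer's partition into the next one.
                int read = src.read(buf, position + transferred);
                if (read < 0) {
                    break;
                }
                buf.flip();
                // Only count - transferred bytes belong to this reducer;
                // any surplus in the buffer was read from disk for nothing.
                buf.limit((int) Math.min(buf.limit(), count - transferred));
                while (buf.hasRemaining()) {
                    transferred += dst.write(buf);
                }
                buf.clear();
            }
            return transferred;
        }
    }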

I added a quick fix to Hadoop that chooses the minimum of the partition
size and the shuffle buffer size:
https://github.com/apache/hadoop/compare/branch-2.7.3...robert-schmidtke:adaptive-shuffle-buffer
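
In essence the change boils down to the following (a minimal sketch with
names of my own choosing; the branch above is the authoritative version):

    // Cap the read buffer at the number of bytes this reducer actually
    // needs from the spill file, instead of always allocating the full
    // configured shuffle buffer size.
    static int adaptiveBufferSize(int configuredBufferSize, long partitionBytes) {
        return (int) Math.min((long) configuredBufferSize, partitionBytes);
    }

For a 65 KB partition this allocates a 65 KB buffer, so the read can no
longer drag in a neighbour's data.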
Benchmarking this version against transferTo.allowed=true yields the same
runtime and roughly 10% more reads in YARN during the shuffle phase (down
from the previous 100%).
Maybe this is something that should be added to Hadoop? Or do users simply
have to be more careful with their job configurations? I'd be happy to
open a PR if this is deemed useful.

Anyway, thanks for your attention!

Cheers
Robert

-- 
My GPG Key ID: 336E2680

Re: Shuffle buffer size in presence of small partitions

Posted by Robert Schmidtke <ro...@gmail.com>.
Hi all,

FYI, this is the ticket I opened:
https://issues.apache.org/jira/browse/MAPREDUCE-6923
Thanks in advance!

Robert



-- 
My GPG Key ID: 336E2680

Re: Shuffle buffer size in presence of small partitions

Posted by Ravi Prakash <ra...@gmail.com>.
Hi Robert!

I'm sorry, I do not have a Windows box and probably don't understand the
shuffle process well enough. Could you please create a JIRA in the
MAPREDUCE project if you would like this fixed upstream?
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=116&projectKey=MAPREDUCE

Thanks
Ravi
