Posted to user@spark.apache.org by Aaron Davidson <il...@gmail.com> on 2013/11/05 17:18:12 UTC

Re: oome from blockmanager

As a follow-up on this, the memory footprint of all shuffle metadata has
been greatly reduced. For your original workload with 7k mappers, 7k
reducers, and 5 machines, the total metadata size should have decreased
from ~3.3 GB to ~80 MB.
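
To sanity-check those figures: shuffle metadata scales with mappers x
reducers, so this workload tracks 7,000 x 7,000 = 49 million shuffle
blocks. A minimal Scala sketch of the arithmetic (the per-block byte
costs are derived from the totals above, not measured constants):

    // Rough per-block cost implied by the totals in this thread.
    val mappers  = 7000L
    val reducers = 7000L
    val blocks   = mappers * reducers        // 49,000,000 shuffle blocks

    val before = 3.3 * 1024 * 1024 * 1024    // ~3.3 GB of metadata
    val after  = 80.0 * 1024 * 1024          // ~80 MB of metadata

    println(f"bytes/block before: ${before / blocks}%.1f")  // ~72.3
    println(f"bytes/block after:  ${after / blocks}%.1f")   // ~1.7

So the fix brings the per-block bookkeeping cost down from roughly 72
bytes to under 2 bytes.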


On Tue, Oct 29, 2013 at 9:07 AM, Aaron Davidson <il...@gmail.com> wrote:

> Great! Glad to hear it worked out. Spark definitely has a pain point
> around choosing the right number of partitions, and I think we're going
> to be spending a lot of time trying to reduce that issue.
>
> I'm currently working on the patch to reduce the shuffle file block
> overheads, but in the meantime you can set
> "spark.shuffle.consolidateFiles=false" to trade OOMEs caused by too many
> partitions for worse performance (probably an acceptable tradeoff; a
> sketch of applying this setting follows the quoted thread below).
>
>
>
> On Mon, Oct 28, 2013 at 2:31 PM, Stephen Haberman <
> stephen.haberman@gmail.com> wrote:
>
>> Hey guys,
>>
>> As a follow-up, I raised our target partition size to 600 MB (up from
>> 64 MB), which split this report's 500 GB of tiny S3 files into ~700
>> partitions, and everything ran much more smoothly.
>>
>> In retrospect, this was the same issue we'd run into before, having too
>> many partitions, which we'd previously solved by throwing some guesses
>> at coalesce to make it magically go away (a sizing sketch follows this
>> quoted message).
>>
>> But now I feel like we have a much better understanding of why the
>> numbers need to be what they are, which is great.
>>
>> So, thanks for all the input and helping me understand what's going on.
>>
>> It'd be great to see some of the optimizations to BlockManager happen,
>> but in the end I understand why it needs to track what it does. And
>> admittedly, I was using a small cluster anyway.
>>
>> - Stephen
>>
>>
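
A minimal sketch of the partition sizing Stephen describes above,
assuming a Spark 0.8-era Scala driver (the master URL, input path, and
byte counts here are illustrative placeholders):

    import org.apache.spark.SparkContext

    // Derive a partition count from total input size and a target
    // partition size, then coalesce down to it (no shuffle needed).
    val totalInputBytes     = 500L * 1024 * 1024 * 1024  // ~500 GB input
    val targetPartitionSize = 600L * 1024 * 1024         // ~600 MB each
    val numPartitions =
      math.max(1, (totalInputBytes / targetPartitionSize).toInt)

    val sc = new SparkContext("spark://master:7077", "partition-sizing")
    val records = sc.textFile("s3n://bucket/reports/*")  // placeholder path
                    .coalesce(numPartitions)

This arithmetic gives ~850 partitions, while Stephen reports ~700, so
treat the target size as a tuning knob rather than an exact formula.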
>
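
And a hedged sketch of Aaron's workaround quoted above: in 0.8-era
Spark, configuration was read from Java system properties, so the flag
must be set before the SparkContext is created (the master URL and app
name are placeholders):

    import org.apache.spark.SparkContext

    object ConsolidationOff {
      def main(args: Array[String]): Unit = {
        // Must run before the SparkContext is constructed.
        System.setProperty("spark.shuffle.consolidateFiles", "false")

        val sc = new SparkContext("spark://master:7077", "consolidation-off")
        // ... run the job as usual; shuffles now skip file consolidation,
        // avoiding the metadata blowup at the cost of more small files.
        sc.stop()
      }
    }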

Re: oome from blockmanager

Posted by Aaron Davidson <il...@gmail.com>.
It is available in branch-0.8 and master.


On Tue, Nov 5, 2013 at 10:23 AM, Ashish Rangole <ar...@gmail.com> wrote:

> Awesome indeed! Is this available in the main branch now?
>
> Thanks!
> On Nov 5, 2013 9:50 AM, "Stephen Haberman" <st...@gmail.com>
> wrote:
>
>>
>> > As a follow-up on this, the memory footprint of all shuffle metadata
>> > has been greatly reduced. For your original workload with 7k
>> > mappers, 7k reducers, and 5 machines, the total metadata size should
>> > have decreased from ~3.3 GB to ~80 MB.
>>
>> Wow! Awesome work, Aaron! Thanks for the amazing quick follow up.
>>
>> - Stephen
>>
>>
