Posted to issues@spark.apache.org by "tianshuo (JIRA)" <ji...@apache.org> on 2014/11/17 23:00:34 UTC

[jira] [Comment Edited] (SPARK-4452) Enhance Sort-based Shuffle to avoid spilling small files

    [ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215222#comment-14215222 ] 

tianshuo edited comment on SPARK-4452 at 11/17/14 10:00 PM:
------------------------------------------------------------

Hi, [~sandyr]:
Your concern that data structures would no longer be in charge of their own spilling is legitimate. That's why I'm trying to make an incremental change:
1. The data structure still asks the ShuffleMemoryManager for memory and decides whether to spill itself.
2. But the ShuffleMemoryManager can also trigger the spill of an object if the memory quota of a thread is used up.

Point 2 happens only as a last resort, when there is not enough memory for the requesting object.
Also, as you mentioned in the third solution, if the shuffle memory manager considers fairness among objects, it has to have a way to trigger the spilling of an object when the current allocation is "not fair". The memory manager has more global knowledge about memory allocation, so giving it the ability to spill could lead to better memory allocation. If spilling can only be triggered from the object itself, as it is currently, one object may not be aware of the memory usage of other objects and may keep holding the memory.

My point is that the data structure should be able to trigger spilling by itself, but it should also be able to handle a request from the ShuffleMemoryManager to spill. I'm also considering letting the object decline to spill, to address the concern you mentioned.
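
The two spill paths described above can be sketched roughly as follows. This is only an illustration under stated assumptions: SpillableSketch, MemoryManagerSketch, BufferSketch, and spillOnRequest are hypothetical names, not the actual Spark API or the code in the working branch, and the model is single-threaded with one quota instead of per-thread bookkeeping.

```scala
import scala.collection.mutable

// Hypothetical interface: something the manager may ask to spill.
trait SpillableSketch {
  /** Called by the manager. Returns bytes freed; 0 means the object declined. */
  def spillOnRequest(): Long
}

// Toy manager: one quota, with bytes tracked per holding object.
class MemoryManagerSketch(quotaPerThread: Long) {
  private val held = mutable.LinkedHashMap.empty[SpillableSketch, Long]
  private def used: Long = held.values.sum

  /** Returns the number of bytes granted, possibly fewer than requested. */
  def tryToAcquire(requester: SpillableSketch, bytes: Long): Long = {
    // Last resort: only while the quota would be exceeded, ask other
    // holders to spill. The guard is re-checked before each holder.
    for {
      other <- held.keys.toList
      if other != requester && used + bytes > quotaPerThread
    } {
      val freed = other.spillOnRequest()
      held(other) = math.max(0L, held(other) - freed)
    }
    val granted = math.max(0L, math.min(bytes, quotaPerThread - used))
    held(requester) = held.getOrElse(requester, 0L) + granted
    granted
  }

  def release(owner: SpillableSketch, bytes: Long): Unit = {
    held(owner) = math.max(0L, held.getOrElse(owner, 0L) - bytes)
  }
}

// Toy spillable buffer; canSpill = false models an object that declines.
class BufferSketch(val name: String, manager: MemoryManagerSketch,
                   canSpill: Boolean = true) extends SpillableSketch {
  private var bytes = 0L
  var spillCount = 0
  def memoryUsed: Long = bytes

  // Path 1: the structure asks for memory and decides to spill itself
  // when the request is not fully granted.
  def insert(size: Long): Unit = {
    val granted = manager.tryToAcquire(this, size)
    bytes += granted
    if (granted < size) {
      manager.release(this, bytes)
      bytes = 0L
      spillCount += 1
    }
  }

  // Path 2: the manager asks this structure to spill; it may decline.
  def spillOnRequest(): Long = {
    if (!canSpill || bytes == 0L) 0L
    else {
      val freed = bytes
      bytes = 0L
      spillCount += 1
      freed
    }
  }
}

object SpillProtocolDemo {
  /** Returns (map spills, sorter spills, bytes held by the sorter). */
  def run(allowEviction: Boolean): (Int, Int, Long) = {
    val manager = new MemoryManagerSketch(quotaPerThread = 100L)
    val map = new BufferSketch("map", manager, canSpill = allowEviction)
    val sorter = new BufferSketch("sorter", manager)
    map.insert(100L)  // the first object takes the whole quota
    sorter.insert(40L) // the second object arrives later
    (map.spillCount, sorter.spillCount, sorter.memoryUsed)
  }
}
```

In this toy model, when the first object accepts the request it is spilled once and the second object gets its memory. When the first object declines, the second object falls back to spilling itself, which is the current behavior.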


 



> Enhance Sort-based Shuffle to avoid spilling small files
> --------------------------------------------------------
>
>                 Key: SPARK-4452
>                 URL: https://issues.apache.org/jira/browse/SPARK-4452
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: tianshuo
>
> When an Aggregator is used with ExternalSorter in a task, Spark creates many small files, which can cause a "too many open files" error during merging.
> This happens when using the sort-based shuffle. The issue is caused by multiple factors:
> 1. There seems to be a bug in setting the elementsRead variable in ExternalSorter, which renders trackMemoryThreshold (defined in Spillable) useless for triggering spilling. The PR to fix it is https://github.com/apache/spark/pull/3302
> 2. The current ShuffleMemoryManager does not work well when there are two spillable objects in a thread, which in this case are ExternalSorter and ExternalAppendOnlyMap (used by Aggregator). Here is an example: because of map-side aggregation, ExternalAppendOnlyMap is created first to read the RDD. It may ask for as much memory as it can get, which is totalMem/numberOfThreads. Later, when ExternalSorter is created in the same thread, the ShuffleMemoryManager may refuse to allocate more memory to it, since the memory has already been given to the earlier requester (ExternalAppendOnlyMap). That causes the ExternalSorter to keep spilling small files (due to the lack of memory).
> I'm currently working on a PR to address these two issues. It will include the following changes:
> 1. The ShuffleMemoryManager should track not only the memory usage of each thread, but also the object that holds the memory.
> 2. The ShuffleMemoryManager should be able to trigger the spilling of a spillable object. This way, if a new object in a thread requests memory, the old occupant can be evicted/spilled, which avoids problem 2. Previously, spillable objects triggered spilling by themselves, so one object might not spill even if another object in the same thread needed more memory. After this change, the ShuffleMemoryManager can trigger the spilling of an object when it needs to.
> 3. Make the iterator of ExternalAppendOnlyMap spillable. Previously, ExternalAppendOnlyMap returned a destructive iterator and could not be spilled after the iterator was returned. This should be changed so that even after the iterator is returned, the ShuffleMemoryManager can still spill it.
> Currently, I have a working branch in progress: https://github.com/tsdeng/spark/tree/enhance_memory_manager
> I have already made change 3 and have a prototype of changes 1 and 2 (evicting a spillable from the memory manager), which is still in progress.
> I will send a PR when it's done.
> Any feedback or thoughts on this change are highly appreciated!
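
To put rough numbers on problem 2 in the quoted description, here is a toy Scala model of a sorter that buffers fixed-size records and spills whenever a memory request is refused. StarvationSketch, spillFiles, and all the sizes are made up for illustration; the real ExternalSorter checks memory less often and estimates sizes by sampling, so only the trend should be read from this.

```scala
object StarvationSketch {
  /**
   * Counts the spill files written while buffering `totalBytes` of records
   * of `recordBytes` each, when at most `available` bytes can ever be
   * granted to this object.
   */
  def spillFiles(totalBytes: Long, recordBytes: Long, available: Long): Int = {
    var granted = 0L   // memory currently granted to the sorter
    var buffered = 0L  // bytes currently buffered in memory
    var files = 0
    var remaining = totalBytes
    while (remaining > 0) {
      buffered += recordBytes
      remaining -= recordBytes
      if (buffered > granted) {
        val want = buffered - granted
        val got = math.min(want, available - granted)
        granted += got
        if (got < want) {
          // Request refused: write the buffer out as one file and start over.
          files += 1
          buffered = 0L
          granted = 0L
        }
      }
    }
    files
  }
}
```

With 1000 bytes of 10-byte records, `spillFiles(1000L, 10L, 250L)` gives 3 files in this model, while `spillFiles(1000L, 10L, 0L)`, the case where an earlier object already holds the whole quota, gives 100 files of one record each.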


