Posted to issues@spark.apache.org by "Tianshuo Deng (JIRA)" <ji...@apache.org> on 2014/11/20 19:57:36 UTC

[jira] [Comment Edited] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

    [ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219795#comment-14219795 ] 

Tianshuo Deng edited comment on SPARK-4452 at 11/20/14 6:57 PM:
----------------------------------------------------------------

Hi, [~matei]:

My implementation is closer to the second approach you suggested. I will put up a design doc, but I would like to give a preview of the implementation first.

I have already implemented the following, and it seems to work for me:

1. Memory allocation and spilling are divided into two levels: SpillableTaskMemoryManager handles memory allocation and spilling for the current thread/task, while ShuffleMemoryManager coordinates memory allocation among threads/tasks.

2. SpillableTaskMemoryManager: objects are grouped by thread, and each STMM maps to one thread/task. If an object requires more memory, it asks its STMM for it. The STMM in turn asks the ShuffleMemoryManager for more memory for the current thread; if the returned amount does not satisfy the request, the STMM tries to spill other objects in the current thread to free memory. Note that the objects it may spill are thread-local, so there is no contention.

3. ShuffleMemoryManager: the algorithm for allocating memory across threads is basically unchanged. The only difference is that spillables no longer ask the SMM directly for more memory; instead, the STMM requests memory on behalf of the thread.

With this change, spilling is triggered from the STMM. The design has the following properties in mind (a rough sketch follows the list):

- Incremental change: the thread memory allocation algorithm is unchanged, so each task/thread still gets a fair share of memory.
- Spilling is thread-local and is triggered from the STMM, which avoids unnecessary locking and contention.
- Two levels of memory allocation draw a clear line between allocating memory across tasks and allocating memory for (and spilling) objects within the current task. This distinction makes contention management clearer and easier.
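
For concreteness, here is a rough sketch of how the two levels could fit together. ShuffleMemoryManager is a real Spark class, but the version below is a simplified stand-in; SpillableTaskMemoryManager, Spillable, and every method name in this sketch are assumptions based on the description above, not the actual patch.

{code:scala}
// Illustrative sketch only: these names and signatures are assumptions
// based on the design described above, not the actual patch.

trait Spillable {
  // Spill in-memory contents to disk; returns the number of bytes freed.
  def spill(): Long
}

// Level 2: one instance per thread/task; all state is thread-local.
class SpillableTaskMemoryManager(global: ShuffleMemoryManager) {
  private val spillables = scala.collection.mutable.ArrayBuffer.empty[Spillable]

  def register(s: Spillable): Unit = { spillables += s }

  // Called by a spillable in this thread when it needs numBytes more.
  def acquire(requester: Spillable, numBytes: Long): Long = {
    // Level 1: ask the global manager for this thread's share.
    var granted = global.tryToAcquire(numBytes)
    // On shortfall, spill the *other* thread-local objects. No locking
    // is needed here: everything in `spillables` belongs to this thread.
    val it = spillables.iterator.filter(_ ne requester)
    while (granted < numBytes && it.hasNext) {
      val freed = it.next().spill()
      global.release(freed)                                // return spilled bytes
      granted += global.tryToAcquire(numBytes - granted)   // and retry
    }
    granted
  }
}

// Level 1: global, cross-thread coordinator. Simplified stand-in for
// Spark's real ShuffleMemoryManager, which uses a fair-share policy.
class ShuffleMemoryManager(maxMemory: Long) {
  private var used = 0L
  def tryToAcquire(numBytes: Long): Long = synchronized {
    val grant = math.min(numBytes, maxMemory - used)
    used += grant
    grant
  }
  def release(numBytes: Long): Unit = synchronized { used -= numBytes }
}
{code}

The key property is in acquire: the only synchronized path is the global grant/release, while the spill decision and the spills themselves stay on the owning thread, so no cross-thread locking is involved.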



> Shuffle data structures can starve others on the same thread for memory 
> ------------------------------------------------------------------------
>
>                 Key: SPARK-4452
>                 URL: https://issues.apache.org/jira/browse/SPARK-4452
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.1.0
>            Reporter: Tianshuo Deng
>            Assignee: Tianshuo Deng
>            Priority: Critical
>
> When an Aggregator is used with an ExternalSorter in a task, Spark creates many small files, which can cause a "too many open files" error during merging.
> Currently, ShuffleMemoryManager does not work well when there are two spillable objects in a thread, which in this case are ExternalSorter and ExternalAppendOnlyMap (used by Aggregator). Here is an example (illustrated with hypothetical numbers after this description): because of map-side aggregation, an ExternalAppendOnlyMap is created first to read the RDD. It may ask for as much memory as it can get, which is totalMem/numberOfThreads. When an ExternalSorter is later created in the same thread, the ShuffleMemoryManager can refuse to allocate more memory to it, since the memory has already been given to the previously requesting object (the ExternalAppendOnlyMap). That causes the ExternalSorter to keep spilling small files (due to the lack of memory).
> I'm currently working on a PR to address these two issues. It will include the following changes:
> 1. The ShuffleMemoryManager should track not only the memory usage of each thread, but also the object that holds the memory.
> 2. The ShuffleMemoryManager should be able to trigger the spilling of a spillable object. That way, when a new object in a thread requests memory, the old occupant can be evicted/spilled. Previously, spillable objects triggered spilling by themselves, so one object might not trigger spilling even when another object in the same thread needed more memory. After this change, the ShuffleMemoryManager can trigger the spilling of an object whenever it needs to.
> 3. Make the iterator of ExternalAppendOnlyMap spillable (sketched after this description). Previously, ExternalAppendOnlyMap returned a destructive iterator and could not be spilled once the iterator was returned. This should change so that the ShuffleMemoryManager can still spill the map even after its iterator has been handed out.
> Currently, I have a working branch in progress: https://github.com/tsdeng/spark/tree/enhance_memory_manager. It already contains change 3 and a prototype of changes 1 and 2 that evicts spillables from the memory manager; it is still in progress. I will send a PR when it's done.
> Any feedback or thoughts on this change are highly appreciated!
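
To make the starvation scenario in the quoted description concrete, here is a back-of-the-envelope sketch with hypothetical numbers (a 1 GiB pool and 4 threads); none of the names below are real Spark APIs.

{code:scala}
// Hypothetical numbers illustrating the starvation described above.
object StarvationExample extends App {
  val totalMem = 1024L * 1024 * 1024             // 1 GiB shuffle pool
  val numberOfThreads = 4
  val perThreadCap = totalMem / numberOfThreads  // 256 MiB per thread

  // Step 1: ExternalAppendOnlyMap (map-side aggregation) grows first
  // and claims the whole per-thread share.
  val threadUsage = perThreadCap

  // Step 2: ExternalSorter, created later in the same thread, asks for
  // 64 MiB, but the thread is already at its cap, so it gets 0 bytes.
  val request = 64L * 1024 * 1024
  val granted = math.max(0L, math.min(request, perThreadCap - threadUsage))

  // With nothing granted, ExternalSorter spills after every small batch
  // of records, producing the many tiny files that later trigger
  // "too many open files" during the merge.
  println(s"granted to ExternalSorter: $granted bytes")  // prints 0
}
{code}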
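Change 3 is the subtle one: a destructive iterator consumes the in-memory map as it goes, so a spill that happens after iteration has started must write out only the not-yet-consumed remainder. Below is a minimal sketch of that idea; the class name, the writeToDisk parameter, and the locking scheme are illustrative assumptions, not the actual ExternalAppendOnlyMap code.

{code:scala}
// Sketch of change 3: an iterator that can still be spilled after it has
// been handed out. All names are invented for illustration.
class SpillableIterator[T](initial: Iterator[T],
                           writeToDisk: Iterator[T] => Iterator[T])
  extends Iterator[T] {

  private var current: Iterator[T] = initial
  private val lock = new Object

  // Called by the memory manager: write the unconsumed remainder to
  // disk and continue iteration from a disk-backed iterator.
  def spill(): Unit = lock.synchronized {
    current = writeToDisk(current)
  }

  // Iteration races with spilling, hence the lock on every access.
  override def hasNext: Boolean = lock.synchronized(current.hasNext)
  override def next(): T = lock.synchronized(current.next())
}
{code}

The point of this shape is that the memory manager can keep a handle to the same object the consumer is iterating and call spill() on it under memory pressure without invalidating the consumer's iterator.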



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org