Posted to issues@spark.apache.org by "Wenchen Fan (JIRA)" <ji...@apache.org> on 2018/08/13 12:57:00 UTC

[jira] [Resolved] (SPARK-22713) OOM caused by the memory contention and memory leak in TaskMemoryManager

     [ https://issues.apache.org/jira/browse/SPARK-22713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-22713.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.4.0

Issue resolved by pull request 21369
[https://github.com/apache/spark/pull/21369]

> OOM caused by the memory contention and memory leak in TaskMemoryManager
> ------------------------------------------------------------------------
>
>                 Key: SPARK-22713
>                 URL: https://issues.apache.org/jira/browse/SPARK-22713
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 2.1.1, 2.1.2
>            Reporter: Lijie Xu
>            Assignee: Eyal Farago
>            Priority: Critical
>             Fix For: 2.4.0
>
>
> The pdf version of this issue with high-quality figures is available at https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/report/OOM-TaskMemoryManager.pdf.
> *[Abstract]* 
> I recently encountered an OOM error in a PageRank application (_org.apache.spark.examples.SparkPageRank_). After profiling the application, I found that the OOM error is related to memory contention in the shuffle spill phase. Here, memory contention means that a task tries to release some old memory consumers from memory in order to make room for new memory consumers. After analyzing the OOM heap dump, I found that the root cause is a memory leak in _TaskMemoryManager_. Since memory contention is common in the shuffle phase, this is a critical bug/defect. In the following sections, I use the application dataflow, execution log, heap dump, and source code to identify the root cause.
> *[Application]* 
> This is a PageRank application from Spark’s example library. The following figure shows the application dataflow. The source code is available at \[1\].
> !https://raw.githubusercontent.com/JerryLead/Misc/master/OOM-TasksMemoryManager/figures/PageRankDataflow.png|width=100%!
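> For readers viewing this in plain text, the iterative core of the application is sketched below. It is paraphrased from the standard _SparkPageRank_ example and is only illustrative; the exact code I ran is at \[1\]. The point is that every iteration contains a _join_ and a _reduceByKey_, i.e. two shuffles per iteration, which is why (as the steps below describe) a single reduce task builds two _ExternalAppendOnlyMap_ instances (E1 and E2) in sequence.
> {code:scala}
> // Condensed sketch of the PageRank iteration (paraphrased; see [1] for the exact code).
> // Each iteration shuffles twice: once for join (E1) and once for reduceByKey (E2).
> import org.apache.spark.{SparkConf, SparkContext}
>
> object PageRankSketch {
>   def main(args: Array[String]): Unit = {
>     val sc = new SparkContext(new SparkConf().setAppName("PageRankSketch"))
>
>     val links = sc.textFile(args(0))
>       .map { line => val p = line.split("\\s+"); (p(0), p(1)) }
>       .distinct()
>       .groupByKey()
>       .cache()                              // links:ShuffledRDD, cached if memory allows
>
>     var ranks = links.mapValues(_ => 1.0)
>     for (_ <- 1 to args(1).toInt) {
>       val contribs = links.join(ranks)      // 1st shuffle -> E1 in the reduce task
>         .values
>         .flatMap { case (urls, rank) => urls.map(u => (u, rank / urls.size)) }
>       ranks = contribs.reduceByKey(_ + _)   // 2nd shuffle -> E2
>         .mapValues(0.15 + 0.85 * _)
>     }
>
>     ranks.count()                           // force evaluation of the iterations
>     sc.stop()
>   }
> }
> {code}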
> *[Failure symptoms]*
> This application has a map stage and many iterative reduce stages. An OOM error occurs in a reduce task (Task-28) as follows.
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/Stage.png?raw=true|width=100%!
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/task.png?raw=true|width=100%!
>  
> *[OOM root cause identification]*
> Each executor has 1 CPU core and 6.5GB memory, so it only runs one task at a time. After analyzing the application dataflow, error log, heap dump, and source code, I found that the following steps lead to the OOM error.
> => The MemoryManager finds that there is not enough memory to cache the _links:ShuffledRDD_ (rdd-5-28, red circles in the dataflow figure).
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/ShuffledRDD.png?raw=true|width=100%!
> => The task needs to shuffle twice (1st shuffle and 2nd shuffle in the dataflow figure).
> => The task needs to generate two _ExternalAppendOnlyMap_ instances (E1 for the 1st shuffle and E2 for the 2nd shuffle) in sequence.
> => The 1st shuffle begins and ends. E1 aggregates all the shuffled data of the 1st shuffle and grows to 3.3 GB.
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/FirstShuffle.png?raw=true|width=100%!
> => The 2nd shuffle begins. E2 aggregates the shuffled data of the 2nd shuffle and finds that there is not enough memory left. This triggers the memory contention.
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/SecondShuffle.png?raw=true|width=100%!
> => To handle the memory contention, the _TaskMemoryManager_ releases E1 (spills it onto disk) and assumes that the 3.3GB space is now free (a minimal model of this eviction step is sketched after these steps).
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/MemoryContention.png?raw=true|width=100%!
> => E2 continues to aggregate the shuffled records of the 2nd shuffle. However, E2 encounters an OOM error while shuffling.
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/OOMbefore.png?raw=true|width=100%!
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/OOMError.png?raw=true|width=100%!
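> To make the contention step above concrete, below is a minimal, self-contained Scala model of the eviction behaviour described in these steps. It is *not* Spark's _TaskMemoryManager_; all names in the sketch are made up. It shows the key assumption: when a new consumer cannot acquire memory, the manager spills other consumers and immediately counts their memory as freed, which only matches reality if the spilled data really becomes unreachable on the JVM heap.
> {code:scala}
> // Toy model of the eviction step (illustrative only; not Spark's TaskMemoryManager).
> import scala.collection.mutable
>
> trait ToyConsumer {
>   def spill(): Long  // asked to release memory; returns the bytes it claims to free
> }
>
> class ToyMemoryManager(capacity: Long) {
>   private val consumers = mutable.LinkedHashSet[ToyConsumer]()
>   private var reserved = 0L
>
>   def register(c: ToyConsumer): Unit = consumers += c
>
>   def acquire(bytes: Long, requester: ToyConsumer): Boolean = {
>     // Memory contention: ask other consumers to spill until the request fits.
>     val others = consumers.iterator.filter(_ ne requester)
>     while (reserved + bytes > capacity && others.hasNext) {
>       reserved -= others.next().spill()  // the manager now ASSUMES these bytes are free;
>                                          // if the spilled data is still referenced
>                                          // on-heap, the JVM has actually freed nothing
>     }
>     if (reserved + bytes <= capacity) { reserved += bytes; true } else false
>   }
> }
> {code}
> In the failing task, E1 plays the role of the spilled consumer and E2 the requester: the bookkeeping says the 3.3GB is free, but as the following sections show, the heap still holds it.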
> *[Guess]* 
> The task memory usage below reveals that there is no drop in memory usage. So, the cause may be that the 3.3GB _ExternalAppendOnlyMap_ (E1) is not actually released by the _TaskMemoryManager_.
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/GCFigure.png?raw=true|width=100%!
> *[Root cause]* 
> After analyzing the heap dump, I found that the guess is right: the 3.3GB _ExternalAppendOnlyMap_ (E1) is actually not released. The 1.6GB object is the _ExternalAppendOnlyMap_ (E2).
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/heapdump.png?raw=true|width=100%!
> *[Question]* 
> Why is the released _ExternalAppendOnlyMap_ still in memory?
> The source code of _ExternalAppendOnlyMap_ shows that the _currentMap_ (_AppendOnlyMap_) is set to _null_ when the spill action finishes.
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/SourceCode.png?raw=true|width=100%!
> *[Root cause in the source code]* I further analyzed the reference chain of the unreleased _ExternalAppendOnlyMap_. The reference chain shows that the 3.3GB _ExternalAppendOnlyMap_ is still referenced by the _upstream/readingIterator_ and further referenced by the _TaskMemoryManager_, as follows. So, the root cause in the source code is that the _ExternalAppendOnlyMap_ is still referenced by other iterators (setting the _currentMap_ to _null_ is not enough).
> !https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/figures/References.png?raw=true|width=100%!
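> The sketch below reproduces the shape of this reference chain in a self-contained way (names are illustrative; this is not the actual Spark source). Even after _currentMap_ is set to _null_, the underlying data stays reachable through the reading iterator that is still held (and, in the real code, reachable from the consumer registered in the _TaskMemoryManager_), so the GC cannot reclaim it.
> {code:scala}
> // Toy model of the leak pattern (illustrative only; not the Spark source code).
> class ToyAppendOnlyMap {
>   // Stands in for the ~3.3GB of aggregated records held by E1.
>   private val data = new Array[Byte](8 * 1024 * 1024)
>   def iterator: Iterator[Byte] = data.iterator  // the iterator keeps `data` reachable
> }
>
> class ToyExternalMap {
>   private var currentMap: ToyAppendOnlyMap = new ToyAppendOnlyMap
>   // Handed out when the map is read/spilled; in the real code a reference like this
>   // remains reachable from the task-level memory manager.
>   var readingIterator: Iterator[Byte] = currentMap.iterator
>
>   def forceSpill(): Unit = {
>     // ... write the contents of readingIterator to disk ...
>     currentMap = null  // intended to release the in-memory map
>     // As the heap dump shows: readingIterator still keeps the map's data alive,
>     // so the 3.3GB is never actually garbage collected.
>   }
> }
> {code}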
> *[Potential solution]*
> Set the _upstream/readingIterator_ to _null_ after the _forceSpill()_ action. I will try this solution in the coming days.
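> Applied to the toy _ToyExternalMap_ above, the idea amounts to dropping the iterator reference once the spill has completed. This is a sketch of the idea only; the final fix may differ.
> {code:scala}
> // Sketch of the proposed fix, replacing forceSpill() in the toy model above.
> def forceSpill(): Unit = {
>   // ... write the contents of readingIterator to disk ...
>   currentMap = null
>   readingIterator = null  // drop the last on-heap reference so the spilled map
>                           // becomes unreachable and can be garbage collected
> }
> {code}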
> [References]
> [1] PageRank source code. https://github.com/JerryLead/SparkGC/blob/master/src/main/scala/applications/graph/PageRank.scala
> [2] Task execution log. https://github.com/JerryLead/Misc/blob/master/OOM-TasksMemoryManager/log/TaskExecutionLog.txt 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org