Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:20:04 UTC

[jira] [Updated] (SPARK-17325) Inconsistent Spillable threshold and AppendOnlyMap growing threshold may trigger out-of-memory errors

     [ https://issues.apache.org/jira/browse/SPARK-17325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-17325:
---------------------------------
    Labels: bulk-closed  (was: )

> Inconsistent Spillable threshold and AppendOnlyMap growing threshold may trigger out-of-memory errors
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17325
>                 URL: https://issues.apache.org/jira/browse/SPARK-17325
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.6.2, 2.0.0
>            Reporter: Lijie Xu
>            Priority: Major
>              Labels: bulk-closed
>
> While reading the shuffle source code, I noticed a potential out-of-memory error in ExternalSorter.
> The problem is that the memory usage of the AppendOnlyMap (i.e., PartitionedAppendOnlyMap in ExternalSorter) can greatly exceed its spill threshold: `currentMemory` can reach roughly 2 times `myMemoryThreshold` in `Spillable.maybeSpill()`. In that case the task's current execution memory usage (the AppendOnlyMap) has far exceeded its execution memory limit ((1 - spark.memory.storageFraction) / #taskNum), which can lead to out-of-memory errors.
> Example: suppose the spill threshold has grown to 250 MB while the AppendOnlyMap occupies 200 MB. An incoming key/value record then finds the map full and triggers the AppendOnlyMap's size expansion. After the expansion, the AppendOnlyMap may occupy 400 MB (or slightly less), far larger than both the spill threshold and the execution memory limit.
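>
> To make the interaction concrete, here is a minimal, self-contained Scala sketch (not the actual Spark implementation; the names `spillThreshold`, `mapSizeBytes`, `growTable` and the fixed 2x doubling are illustrative simplifications of `Spillable.maybeSpill()` and the AppendOnlyMap's table growth):
> {code:scala}
> // Illustrative sketch only: a spill check that looks at the current size estimate,
> // paired with a hash map whose backing table doubles when it is full.
> object SpillVsGrowthSketch {
>
>   val spillThreshold: Long = 250L << 20   // assume the spill threshold has grown to 250 MB
>   var mapSizeBytes: Long   = 200L << 20   // current estimated size of the AppendOnlyMap
>
>   // Simplified spill check: spill only when the current estimate exceeds the threshold.
>   def maybeSpill(currentMemory: Long): Boolean = currentMemory >= spillThreshold
>
>   // Simplified growth: when the map is full, its backing array roughly doubles.
>   def growTable(currentSize: Long): Long = 2 * currentSize
>
>   def main(args: Array[String]): Unit = {
>     // Before the insert: 200 MB < 250 MB, so no spill is triggered.
>     println(s"spill before insert? ${maybeSpill(mapSizeBytes)}")   // false
>
>     // The insert finds the map full and doubles the backing array in one step.
>     mapSizeBytes = growTable(mapSizeBytes)                         // ~400 MB
>
>     // The map now sits far above the spill threshold (and the task's
>     // execution-memory limit), which is where the OOM risk comes from.
>     println(s"after growth: ${mapSizeBytes >> 20} MB vs threshold ${spillThreshold >> 20} MB")
>   }
> }
> {code}
> The point of the sketch is only the ordering: the spill check sees the pre-growth estimate, while the allocation that follows can roughly double the map's footprint in a single step.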



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org