Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2019/12/31 03:46:00 UTC

[jira] [Resolved] (SPARK-30379) Avoid OOM when using collection accumulator

     [ https://issues.apache.org/jira/browse/SPARK-30379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-30379.
----------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 27038
[https://github.com/apache/spark/pull/27038]

> Avoid OOM when using collection accumulator
> -------------------------------------------
>
>                 Key: SPARK-30379
>                 URL: https://issues.apache.org/jira/browse/SPARK-30379
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>             Fix For: 3.0.0
>
>
> One Spark job on our cluster uses a collection accumulator to collect something and encountered an exception like:
> ```
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOf(Arrays.java:3332)
>     at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
>     at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
>     at java.lang.StringBuilder.append(StringBuilder.java:136)
>     at java.lang.StringBuilder.append(StringBuilder.java:131)
>     at java.util.AbstractCollection.toString(AbstractCollection.java:462)
>     at java.util.Collections$UnmodifiableCollection.toString(Collections.java:1035)
>     at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2$$anonfun$apply$3.apply(LiveEntity.scala:596)
>     at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2$$anonfun$apply$3.apply(LiveEntity.scala:596)
>     at scala.Option.map(Option.scala:146)
>     at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2.apply(LiveEntity.scala:596)
>     at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2.apply(LiveEntity.scala:591)
> ```
> `LiveEntityHelpers.newAccumulatorInfos` converts each `AccumulableInfo` to a `v1.AccumulableInfo` by calling `toString` on the accumulator's value. For a collection accumulator, the string representation can take much more memory than the collection itself (for example, a collection accumulator of long values), and building it can cause an OOM. In this job, the driver memory is 6g.
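
To illustrate the problem described above, here is a minimal, standalone Java sketch. It does not use Spark and is not taken from pull request 27038; the class `BoundedToString` and the helper `boundedToString` are hypothetical names introduced only for this example. It compares the length of the full `toString` of a large unmodifiable collection of long values (the same `java.util` code path that appears in the stack trace) with a rendering capped at a fixed number of elements, which is one possible way to keep the string small.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not Spark code and not the actual fix.
final class BoundedToString {

    // Renders at most maxElements entries, then a count of the omitted ones.
    static String boundedToString(Collection<?> values, int maxElements) {
        StringBuilder sb = new StringBuilder("[");
        Iterator<?> it = values.iterator();
        int shown = 0;
        while (it.hasNext() && shown < maxElements) {
            if (shown > 0) {
                sb.append(", ");
            }
            sb.append(it.next());
            shown++;
        }
        int omitted = values.size() - shown;
        if (omitted > 0) {
            if (shown > 0) {
                sb.append(", ");
            }
            sb.append("... ").append(omitted).append(" more");
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        // A collection of large long values, similar in spirit to the
        // "collection accumulator of long values" in the report.
        List<Long> values = new ArrayList<>();
        for (long i = 0; i < 1_000_000L; i++) {
            values.add(Long.MAX_VALUE - i);
        }
        Collection<Long> view = Collections.unmodifiableCollection(values);

        // Full rendering: one StringBuilder that grows (and is copied)
        // until it holds every element as decimal text.
        String full = view.toString();
        // Capped rendering: size is independent of the collection size.
        String bounded = boundedToString(view, 3);

        System.out.println("full length:    " + full.length());
        System.out.println("bounded length: " + bounded.length());
        System.out.println(bounded);
    }
}
```

For a small input such as the list 1 to 5 with a limit of 2, the helper returns `[1, 2, ... 3 more]`. Whether truncation is acceptable depends on what the consumer of the string needs; this sketch only shows that the full rendering grows with the number of elements while a capped one does not.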



--
This message was sent by Atlassian Jira
(v8.3.4#803005)