Posted to issues@spark.apache.org by "Victor Tso (Jira)" <ji...@apache.org> on 2020/09/04 04:27:00 UTC

[jira] [Created] (SPARK-32795) ApplicationInfo#removedExecutors can cause OOM

Victor Tso created SPARK-32795:
----------------------------------

             Summary: ApplicationInfo#removedExecutors can cause OOM
                 Key: SPARK-32795
                 URL: https://issues.apache.org/jira/browse/SPARK-32795
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.0
            Reporter: Victor Tso


!image-2020-09-03-23-23-45-294.png!

In my case, the standalone Spark master process had a max heap of 1 GB. ExecutorDesc objects consumed 738 MB of it, and the vast majority of those objects were the 18.5M entries in removedExecutors. This caused the master to OOM and leave the application driver process dangling.
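
As a rough sanity check on those figures, the per-entry cost can be estimated by dividing the reported heap usage by the reported entry count. This is only a back-of-the-envelope sketch: it assumes "738mb" means MiB and attributes every byte to removedExecutors entries, so the result is an upper bound of roughly 40 bytes per entry. The variable names are illustrative.

```python
# Figures quoted in the report; the unit interpretation is an assumption.
executor_desc_bytes = 738 * 1024 * 1024   # "738mb", assumed to mean MiB
removed_executors = 18_500_000            # "18.5M removedExecutors"

# Upper bound: pretends every ExecutorDesc byte belongs to a removed executor.
bytes_per_entry = executor_desc_bytes / removed_executors
print(f"about {bytes_per_entry:.0f} bytes per entry")
```

Each entry is small; the problem is that nothing limits how many of them accumulate.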

The root cause was that the worker node ran out of disk space. For reasons I have not determined, it then went into a fast, endless loop of launching new executors, each of which crashed in turn. The count reached about 18M before the master could no longer hold the history.
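
One possible mitigation for this kind of unbounded growth is to cap how many removed executors are retained. The sketch below illustrates the idea in Python rather than in Spark's own Scala code; ExecutorRecord, BoundedExecutorHistory, and the limit of 1000 are all hypothetical names and values chosen for illustration, not Spark APIs or settings.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorRecord:
    """Hypothetical stand-in for the master's per-executor bookkeeping entry."""
    executor_id: int
    state: str


class BoundedExecutorHistory:
    """Keeps only the most recent `limit` removed executors."""

    def __init__(self, limit: int) -> None:
        if limit <= 0:
            raise ValueError("limit must be positive")
        # A deque with maxlen drops its oldest entry once it is full.
        self._removed = deque(maxlen=limit)
        # Running total, kept so the overall count survives eviction.
        self.total_removed = 0

    def record_removal(self, record: ExecutorRecord) -> None:
        self._removed.append(record)
        self.total_removed += 1

    def retained(self) -> list:
        return list(self._removed)


# Simulate a crash loop: executors are launched and removed in quick succession.
history = BoundedExecutorHistory(limit=1000)
for executor_id in range(100_000):
    history.record_removal(ExecutorRecord(executor_id, "FAILED"))
```

With a cap like this, memory use stays proportional to the limit rather than to the number of crashes, while the running total still shows that a crash loop occurred.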



--
This message was sent by Atlassian Jira
(v8.3.4#803005)