Posted to issues@spark.apache.org by "Stanislav Savulchik (Jira)" <ji...@apache.org> on 2022/01/16 16:38:00 UTC

[jira] [Commented] (SPARK-27337) QueryExecutionListener never cleans up listeners from the bus after SparkSession is cleared

    [ https://issues.apache.org/jira/browse/SPARK-27337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476829#comment-17476829 ] 

Stanislav Savulchik commented on SPARK-27337:
---------------------------------------------

Hi,

I found this ticket while investigating an apparent memory leak in a long-running Spark 3.1.1 driver (a Java process) that executes various jobs submitted by an external scheduler.

I took a heap dump (jmap -dump:live,file=dump.hprof <pid>) during an idle period with no running jobs and opened it in Eclipse Memory Analyzer. The picture was similar to the one posted by [~vinooganesh].

[^Screenshot 2022-01-16 at 23.16.10.png]

Every posted job is given a fresh SparkSession instance via the SparkSession#newSession method. After a job is done, its SparkSession instance is no longer referenced and is expected to be garbage collected along with all accumulated session state.

However, in some cases old SparkSessions are still referenced from AsyncEventQueue, more specifically from ExecutionListenerBus instances still residing in its listeners queue. This holds even after manual System.gc() calls and the periodic ones triggered by the Spark context cleaner.
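
To make the suspected retention path concrete, here is a minimal, self-contained Java model. It is not Spark code: SharedBus, Session and SessionListener are hypothetical stand-ins for the context-level listener bus, SparkSession and ExecutionListenerBus. It only shows that a per-session listener left in a long-lived bus keeps its session strongly reachable until the listener is removed explicitly.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Simplified model, not Spark code: all names here are hypothetical.
final class ListenerRetentionModel {

    /** Stand-in for a SparkSession that accumulates per-session state. */
    static final class Session {
        final String id;
        Session(String id) { this.id = id; }
    }

    /** Stand-in for a per-session listener such as ExecutionListenerBus. */
    static final class SessionListener {
        final Session session; // strong reference back to the session
        SessionListener(Session session) { this.session = session; }
    }

    /** Stand-in for the long-lived listener bus owned by the context. */
    static final class SharedBus {
        private final List<SessionListener> listeners = new CopyOnWriteArrayList<>();
        void add(SessionListener l) { listeners.add(l); }
        boolean remove(SessionListener l) { return listeners.remove(l); }
        int size() { return listeners.size(); }
    }

    /** Runs the given number of jobs, each with a fresh session, and returns how many listeners remain on the bus. */
    static int runJobs(SharedBus bus, int jobs, boolean unregister) {
        for (int i = 0; i < jobs; i++) {
            Session session = new Session("job-" + i);
            SessionListener listener = new SessionListener(session);
            bus.add(listener);
            // ... the job would run here ...
            if (unregister) {
                bus.remove(listener);
            }
            // The local reference to the session ends here, but the session
            // stays reachable through the bus if the listener was not removed.
        }
        return bus.size();
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + runJobs(new SharedBus(), 5, false));
        System.out.println("with cleanup:    " + runJobs(new SharedBus(), 5, true));
    }
}
```

In this model every listener that is not unregistered pins one session, which matches the shape of the heap dump: the number of retained sessions grows with the number of listeners still on the bus.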

I tried to correlate this with Spark driver metrics, and my current guess is that the stuck ExecutionListenerBus instances are caused by dropped events on the _shared_ queue.
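
For context on what "dropped events" means here: as far as I understand, each AsyncEventQueue is bounded, and an event posted while the queue is full is discarded and counted instead of blocking the poster. The Java sketch below models only that behaviour. BoundedEventQueue and its methods are made-up names, and the sketch does not demonstrate that a dropped event is what keeps an ExecutionListenerBus registered; that part remains a guess.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Simplified model, not Spark code: names are hypothetical.
final class BoundedEventQueue {
    private final LinkedBlockingQueue<String> queue;
    private final AtomicLong dropped = new AtomicLong();

    BoundedEventQueue(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    /** Non-blocking post: returns false and counts the event as dropped when the queue is full. */
    boolean post(String event) {
        if (queue.offer(event)) {
            return true;
        }
        dropped.incrementAndGet();
        return false;
    }

    /** Takes the next queued event, or returns null if none is queued. */
    String poll() {
        return queue.poll();
    }

    long droppedCount() {
        return dropped.get();
    }

    public static void main(String[] args) {
        BoundedEventQueue q = new BoundedEventQueue(2);
        q.post("e1");
        q.post("e2");
        boolean accepted = q.post("e3"); // the queue is already full here
        System.out.println("accepted=" + accepted + ", dropped=" + q.droppedCount());
    }
}
```

Any consumer that relies on seeing every event, for example to release a resource when a particular event arrives, would miss the discarded ones in this model.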

I would appreciate it if anyone could verify my reasoning. Thank you.

> QueryExecutionListener never cleans up listeners from the bus after SparkSession is cleared
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27337
>                 URL: https://issues.apache.org/jira/browse/SPARK-27337
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Vinoo Ganesh
>            Priority: Major
>         Attachments: Screenshot 2022-01-16 at 23.16.10.png, image001-1.png
>
>
> As a result of [https://github.com/apache/spark/commit/9690eba16efe6d25261934d8b73a221972b684f3], it looks like there is a memory leak (specifically [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala#L131]).
> Because the listener bus on the context still holds a reference to each listener (even after the SparkSession is cleared), the listeners are never cleaned up. This means that if you close and recreate Spark sessions fairly frequently, you leak memory every single time.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
