Posted to issues@spark.apache.org by "Stanislav Savulchik (Jira)" <ji...@apache.org> on 2022/01/24 15:00:00 UTC

[jira] [Comment Edited] (SPARK-27337) QueryExecutionListener never cleans up listeners from the bus after SparkSession is cleared

    [ https://issues.apache.org/jira/browse/SPARK-27337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477546#comment-17477546 ] 

Stanislav Savulchik edited comment on SPARK-27337 at 1/24/22, 2:59 PM:
-----------------------------------------------------------------------

[~vinooganesh] thanks for the response.

I haven't tried 3.2.0 yet, but the linked PR looks promising!

I also tried increasing the shared queue size (spark.scheduler.listenerbus.eventqueue.shared.capacity = 80000) to keep up with the incoming event rate and resolve the dropped events issue. I am still observing the effect on metrics after the change, but it seems that the memory leak is gone. I still have to verify that by taking another heap dump.

UPDATE: getting rid of dropped events does not, in fact, solve the memory leak.



> QueryExecutionListener never cleans up listeners from the bus after SparkSession is cleared
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27337
>                 URL: https://issues.apache.org/jira/browse/SPARK-27337
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Vinoo Ganesh
>            Priority: Major
>         Attachments: Screenshot 2022-01-16 at 23.16.10.png, image001-1.png
>
>
> As a result of [https://github.com/apache/spark/commit/9690eba16efe6d25261934d8b73a221972b684f3], it looks like there is a memory leak (specifically [https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala#L131]).
> Because the Listener Bus on the context still has a reference to the listener (even after the SparkSession is cleared), the listener is never cleaned up. This means that if you close and recreate Spark sessions fairly frequently, you leak a listener every single time.
>  
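
The retention pattern described above can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark code: `ListenerBus`, `Listener`, `Session`, and `surviving_listeners` are hypothetical names made up for the illustration, and the `close()` remedy is an assumption about one possible fix, not the change made in the linked PR.

```python
import gc
import weakref


class ListenerBus:
    """Toy stand-in for a long-lived, context-wide bus (not Spark's class)."""

    def __init__(self):
        # Strong references: anything added here stays alive with the bus.
        self.listeners = []

    def add(self, listener):
        self.listeners.append(listener)

    def remove(self, listener):
        self.listeners.remove(listener)


class Listener:
    """Hypothetical per-session listener."""


class Session:
    """Hypothetical session that registers one listener on the shared bus."""

    def __init__(self, bus):
        self.bus = bus
        self.listener = Listener()
        bus.add(self.listener)

    def close(self):
        # Assumed remedy for the sketch: deregister explicitly on close.
        self.bus.remove(self.listener)


def surviving_listeners(bus, sessions, deregister):
    """Create and drop `sessions` sessions; count listeners still alive."""
    refs = []
    for _ in range(sessions):
        session = Session(bus)
        refs.append(weakref.ref(session.listener))
        if deregister:
            session.close()
        del session  # the session itself is gone from here on
    gc.collect()
    return sum(1 for ref in refs if ref() is not None)


if __name__ == "__main__":
    print("without deregistration:", surviving_listeners(ListenerBus(), 5, False))
    print("with deregistration:", surviving_listeners(ListenerBus(), 5, True))
```

In this model every dropped session leaves its listener reachable through the bus unless it is removed explicitly, which matches the growth one would expect to see across repeated heap dumps.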


