Posted to issues@spark.apache.org by "Min Shen (Jira)" <ji...@apache.org> on 2019/10/14 16:56:00 UTC

[jira] [Commented] (SPARK-21492) Memory leak in SortMergeJoin

    [ https://issues.apache.org/jira/browse/SPARK-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951154#comment-16951154 ] 

Min Shen commented on SPARK-21492:
----------------------------------

We have been running the latest version of the PR [https://github.com/apache/spark/pull/25888] in LinkedIn's production clusters for a week now.

With the most recent changes, all corner cases seem to have been handled.

Jobs that previously failed because of this issue are now able to complete.

We have also observed a general reduction in spills during joins in our cluster.

We would like to know whether the community is also working on a fix for this issue and, if so, whether there is a timeline for it.

[~cloud_fan] [~jiangxb1987] [~taoluo]

> Memory leak in SortMergeJoin
> ----------------------------
>
>                 Key: SPARK-21492
>                 URL: https://issues.apache.org/jira/browse/SPARK-21492
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0, 2.3.0, 2.3.1, 3.0.0
>            Reporter: Zhan Zhang
>            Priority: Major
>
> In SortMergeJoin, if the iterator is not exhausted, the Sort causes a memory leak. The memory is not released until the task ends and cannot be used by other operators in the meantime, which causes a performance drop or OOM.
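
To illustrate the description above, here is a minimal toy model in Python. It is not Spark code: `MemoryPool`, `SortedBuffer`, `merge_join`, and `leaked_after_join` are hypothetical names invented for this sketch, and the `eager_cleanup` flag only shows one possible mitigation pattern, not necessarily what any particular PR does. The assumption is that a sorted buffer frees its memory only once its iterator is drained, so a join that stops early on one side leaves the other side's memory held.

```python
class MemoryPool:
    """Toy stand-in for a per-task memory manager (hypothetical, not Spark's API)."""

    def __init__(self):
        self.used = 0

    def acquire(self, n):
        self.used += n

    def release(self, n):
        self.used -= n


class SortedBuffer:
    """Sorts rows up front and holds pool memory until drained or closed."""

    def __init__(self, rows, pool):
        self.pool = pool
        self.rows = sorted(rows)
        self.pos = 0
        self.held = len(self.rows)  # one memory unit per buffered row
        pool.acquire(self.held)

    def __iter__(self):
        return self

    def __next__(self):
        if self.pos >= len(self.rows):
            self.close()  # memory is freed only once the iterator is drained
            raise StopIteration
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        # Idempotent: releasing twice must not double-count.
        if self.held:
            self.pool.release(self.held)
            self.held = 0


def merge_join(left, right, eager_cleanup=False):
    """Inner join of two sorted streams of unique keys; stops when either side ends."""
    out = []
    l = next(left, None)
    r = next(right, None)
    while l is not None and r is not None:
        if l == r:
            out.append(l)
            l = next(left, None)
            r = next(right, None)
        elif l < r:
            l = next(left, None)
        else:
            r = next(right, None)
    if eager_cleanup:
        # Release both sides as soon as the join is done, drained or not.
        left.close()
        right.close()
    return out


def leaked_after_join(eager_cleanup):
    """Returns the joined keys and the memory still held after the join."""
    pool = MemoryPool()
    left = SortedBuffer([2, 1], pool)
    right = SortedBuffer([6, 5, 4, 3, 2], pool)
    rows = merge_join(left, right, eager_cleanup)
    return rows, pool.used
```

In this toy, the left side runs out after the single match on key 2, so the right side is never drained: `leaked_after_join(False)` returns `([2], 5)`, with 5 units still held, while `leaked_after_join(True)` returns `([2], 0)`.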



--
This message was sent by Atlassian Jira
(v8.3.4#803005)