Posted to issues@spark.apache.org by "Lantao Jin (Jira)" <ji...@apache.org> on 2020/01/13 02:35:00 UTC

[jira] [Commented] (SPARK-30494) Duplicates cached RDD when create or replace an existing view

    [ https://issues.apache.org/jira/browse/SPARK-30494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013976#comment-17013976 ] 

Lantao Jin commented on SPARK-30494:
------------------------------------

I will file a PR soon.

> Duplicates cached RDD when create or replace an existing view
> -------------------------------------------------------------
>
>                 Key: SPARK-30494
>                 URL: https://issues.apache.org/jira/browse/SPARK-30494
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Lantao Jin
>            Priority: Major
>
> The issue can be reproduced with the following commands:
> {code}
> beeline> create or replace temporary view temp1 as select 1
> beeline> cache table temp1
> beeline> create or replace temporary view temp1 as select 1, 2
> beeline> cache table temp1
> {code}
> The cached RDD for the plan "select 1" stays in memory until the session closes. This cached data cannot be used, since the view temp1 has been replaced by another plan. This is a memory leak.
> assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1, 2")).isDefined)
> assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isDefined)
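The mechanism described above can be illustrated with a small toy model. This is only a sketch and not Spark's CacheManager: ToySession, ToyDemo and the uncacheOld flag are hypothetical names, and the model assumes a cache keyed by the query text rather than by the view name. Dropping the old entry on replace is shown as one possible direction, not as the content of the upcoming PR.

```scala
import scala.collection.mutable

// Toy model only: NOT Spark's CacheManager. All names here are hypothetical.
final class ToySession {
  // View name -> query text currently bound to that name.
  private val views = mutable.Map.empty[String, String]
  // Cache keyed by the query (the "plan"), not by the view name.
  private val cache = mutable.Map.empty[String, String]

  def createOrReplaceTempView(name: String, query: String, uncacheOld: Boolean): Unit = {
    if (uncacheOld) {
      // Drop the cache entry of the plan that is about to be replaced.
      views.get(name).foreach(old => cache.remove(old))
    }
    views(name) = query
  }

  def cacheTable(name: String): Unit = {
    val query = views(name)
    if (!cache.contains(query)) cache(query) = s"materialized($query)"
  }

  def isCached(query: String): Boolean = cache.contains(query)
}

object ToyDemo {
  // Replays the four commands from the report and returns
  // (is "select 1" still cached, is "select 1, 2" cached).
  def run(uncacheOld: Boolean): (Boolean, Boolean) = {
    val s = new ToySession
    s.createOrReplaceTempView("temp1", "select 1", uncacheOld)
    s.cacheTable("temp1")
    s.createOrReplaceTempView("temp1", "select 1, 2", uncacheOld)
    s.cacheTable("temp1")
    (s.isCached("select 1"), s.isCached("select 1, 2"))
  }
}
```

In this model, ToyDemo.run(uncacheOld = false) leaves both entries in the cache, which mirrors the two assertions in the report, while ToyDemo.run(uncacheOld = true) keeps only the entry for the plan the view currently points to.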


