Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/06/01 16:05:00 UTC

[jira] [Assigned] (SPARK-39357) pmCache memory leak caused by IsolatedClassLoader

     [ https://issues.apache.org/jira/browse/SPARK-39357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-39357:
------------------------------------

    Assignee:     (was: Apache Spark)

> pmCache memory leak caused by IsolatedClassLoader
> -------------------------------------------------
>
>                 Key: SPARK-39357
>                 URL: https://issues.apache.org/jira/browse/SPARK-39357
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.4, 3.2.1
>            Reporter: tianshuang
>            Priority: Major
>         Attachments: Xnip2022-06-01_23-09-35.jpg, Xnip2022-06-01_23-19-35.jpeg, Xnip2022-06-01_23-32-39.jpg
>
>
> I found this bug in Spark 2.4.4. Because the related code has not changed, the bug still exists on master. A brief description follows.
> In May 2015, [SPARK-6907|https://github.com/apache/spark/commit/daa70bf135f23381f5f410aa95a1c0e5a2888568] introduced an isolated classloader for HiveMetastore to support loading multiple Hive versions, but that change broke the [RawStore cleanup mechanism|https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/java/org/apache/hive/service/server/ThreadFactoryWithGarbageCleanup.java#L27-L42].
> The `ThreadWithGarbageCleanup` class used by `HiveServer2-Handler-Pool`, `HiveServer2-Background-Pool` and `HiveServer2-HttpHandler-Pool` is loaded by AppClassLoader. Its source contains the line `RawStore threadLocalRawStore = HiveMetaStore.HMSHandler.getRawStore();`, which reads the `threadLocalMS` instance of the `HiveMetaStore.HMSHandler` class loaded by AppClassLoader.
> During thread execution, however, the `client` is actually created through the isolated classloader. When the `RawStore` instance is obtained through `HiveMetaStore.HMSHandler#getMSForConf`, the `ms` instance is set into `threadLocalMS`, but that static `threadLocalMS` belongs to the `HMSHandler` class loaded by IsolatedClassLoader$$anon$1. In other words, the set and the get do not operate on the same `threadLocalMS` instance.
> As a result, the `RawStore` instance obtained in `ThreadWithGarbageCleanup#cacheThreadLocalRawStore` is null and the subsequent `RawStore` cleanup logic never takes effect. Because the `shutdown` method of the `RawStore` instance is never called, the `pmCache` of `JDOPersistenceManagerFactory` leaks memory.
> A long-running Spark ThriftServer ends up with frequent GCs, resulting in poor performance.
> I analyzed the heap dump using MAT and executed the following OQL: `SELECT * FROM INSTANCEOF java.lang.Class c WHERE c.@displayName.contains("class org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler ")`. Two instances of the `HMSHandler` *Class* can be found in the heap, and each of them holds its own static `threadLocalMS` instance.
> Executing the OQL `select * from org.datanucleus.api.jdo.JDOPersistenceManagerFactory` shows that the `pmCache` of the `JDOPersistenceManagerFactory` instance occupies 1.3GB of memory.
> Executing the OQL `SELECT * FROM INSTANCEOF java.lang.Class c WHERE c.@displayName.contains("class org.apache.hive.service.server.ThreadFactoryWithGarbageCleanup")` shows that the static `threadRawStoreMap` of `ThreadFactoryWithGarbageCleanup` contains no elements. This confirms the analysis above: `HMSHandler.getRawStore()` in `ThreadWithGarbageCleanup#cacheThreadLocalRawStore` reads the `threadLocalMS` instance of the `HMSHandler` loaded by AppClassLoader instead of the `threadLocalMS` instance of the `HMSHandler` loaded by IsolatedClassLoader$$anon$1.
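
The core of the analysis above is that a class defined by two different classloaders has two independent sets of static fields. The following is a minimal sketch of that effect only, not Spark or Hive code: `IsolatedStaticDemo`, `Handler` and `IsolatingLoader` are hypothetical stand-ins for the test harness, `HMSHandler` and `IsolatedClassLoader`, and the sketch assumes the compiled class file is reachable as a classpath resource (Java 9+ for `readAllBytes`).

```java
import java.io.IOException;
import java.io.InputStream;

public class IsolatedStaticDemo {

    /** Hypothetical stand-in for HMSHandler: owns a static ThreadLocal. */
    public static class Handler {
        private static final ThreadLocal<Object> threadLocalMS = new ThreadLocal<>();

        public static void setRawStore(Object ms) { threadLocalMS.set(ms); }

        public static Object getRawStore() { return threadLocalMS.get(); }
    }

    /** Hypothetical stand-in for IsolatedClassLoader: defines one class itself instead of delegating. */
    static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // everything else is shared with the parent
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Re-read the class bytes and define a second, independent copy of the class.
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Returns {classes differ, isolated copy sees the store, app copy sees null}. */
    public static boolean[] demo() throws Exception {
        String name = Handler.class.getName();
        ClassLoader app = Handler.class.getClassLoader();
        Class<?> isolated = new IsolatingLoader(app, name).loadClass(name);

        // "Set" goes through the isolated copy, as the metastore client does.
        Object store = new Object();
        isolated.getMethod("setRawStore", Object.class).invoke(null, store);
        Object seenByIsolated = isolated.getMethod("getRawStore").invoke(null);

        // "Get" goes through the app-loaded copy, as ThreadWithGarbageCleanup does.
        Object seenByApp = Handler.getRawStore();

        return new boolean[] { isolated != Handler.class, seenByIsolated == store, seenByApp == null };
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = demo();
        System.out.println("distinct Class objects: " + r[0]);
        System.out.println("isolated copy sees the store: " + r[1]);
        System.out.println("app-loaded copy sees null: " + r[2]);
    }
}
```

In this sketch the value set through the isolated copy is invisible to the app-loaded copy on the same thread, which is the same shape as `cacheThreadLocalRawStore` getting null while the real `RawStore` stays referenced elsewhere.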



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
