Posted to issues@spark.apache.org by "Dimitris Batis (Jira)" <ji...@apache.org> on 2021/03/10 11:22:00 UTC
[jira] [Created] (SPARK-34689) Spark Thrift Server: Memory leak for SparkSession objects
Dimitris Batis created SPARK-34689:
--------------------------------------
Summary: Spark Thrift Server: Memory leak for SparkSession objects
Key: SPARK-34689
URL: https://issues.apache.org/jira/browse/SPARK-34689
Project: Spark
Issue Type: Bug
Components: Spark Core, SQL
Affects Versions: 3.1.1, 3.0.1
Environment:
Our settings are as follows. Please note that this was occurring even before we added the ExplicitGCInvokesConcurrent option (i.e. it happened even when a full GC was performed every 20 minutes).
spark-defaults.conf:
{code}
spark.master spark://...:7077,...:7077
spark.master.rest.enabled true
spark.eventLog.enabled false
spark.eventLog.dir file:///...
spark.driver.cores 1
spark.driver.maxResultSize 4g
spark.driver.memory 5g
spark.executor.memory 1g
spark.executor.logs.rolling.maxRetainedFiles 2
spark.executor.logs.rolling.strategy size
spark.executor.logs.rolling.maxSize 1G
spark.local.dir ...
spark.sql.ui.retainedExecutions=10
spark.ui.retainedDeadExecutors=10
spark.worker.ui.retainedExecutors=10
spark.worker.ui.retainedDrivers=10
spark.ui.retainedJobs=30
spark.ui.retainedStages=100
spark.ui.retainedTasks=500
spark.appStateStore.asyncTracking.enable=false
spark.sql.shuffle.partitions=200
spark.default.parallelism=200
spark.task.reaper.enabled=true
spark.task.reaper.threadDump=false
spark.memory.offHeap.enabled=true
spark.memory.offHeap.size=4g
{code}
spark-env.sh:
{code}
HADOOP_CONF_DIR="/.../hadoop/etc/hadoop"
SPARK_WORKER_CORES=28
SPARK_WORKER_MEMORY=54g
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.appDataTtl=172800 -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=40 "
SPARK_DAEMON_JAVA_OPTS="-Dlog4j.configuration=file:///.../log4j.properties -Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.dir="..." -Dspark.deploy.zookeeper.url=...:2181,...:2181,...:2181 -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=40"
{code}
start-thriftserver.sh:
{code}
export SPARK_DAEMON_MEMORY=16g
exec "${SPARK_HOME}"/sbin/spark-daemon.sh submit $CLASS 1 \
--conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
--conf "spark.ui.retainedJobs=30" \
--conf "spark.ui.retainedStages=100" \
--conf "spark.ui.retainedTasks=500" \
--conf "spark.sql.ui.retainedExecutions=10" \
--conf "spark.appStateStore.asyncTracking.enable=false" \
--conf "spark.cleaner.periodicGC.interval=20min" \
--conf "spark.sql.autoBroadcastJoinThreshold=-1" \
--conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseG1GC -XX:MaxGCPauseMillis=200" \
--conf "spark.driver.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xloggc:/.../thrift_driver_gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=7 -XX:GCLogFileSize=35M -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=11990 -XX:+ExplicitGCInvokesConcurrent" \
--conf "spark.metrics.namespace=..." --name "..." --packages io.delta:delta-core_2.12:0.7.0 --hiveconf spark.ui.port=4038 --hiveconf spark.cores.max=22 --hiveconf spark.executor.cores=3 --hiveconf spark.executor.memory=6144M --hiveconf spark.scheduler.mode=FAIR --hiveconf spark.scheduler.allocation.file=.../conf/thrift-scheduler.xml \
--conf spark.sql.thriftServer.incrementalCollect=true "$@"
{code}
Reporter: Dimitris Batis
When running the Spark Thrift Server (3.0.1), we noticed that each new JDBC connection creates a new SparkSession object. However, this object (and everything it references) remains in memory indefinitely after the JDBC connection is closed, and full GCs do not remove it. After about 18 hours of heavy use, more than 46,000 such objects had accumulated.
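One way to check such a count is to take a class histogram of the Thrift Server driver JVM (for example with `jmap -histo:live <pid>`, which counts only live objects) and read the row for `org.apache.spark.sql.SparkSession`. The following is a minimal sketch of parsing that output, not necessarily how the figure above was obtained. The helper name `count_instances` and the sample histogram text are hypothetical.

```python
def count_instances(histogram_text, class_name="org.apache.spark.sql.SparkSession"):
    """Return the instance count for class_name in `jmap -histo` output, or 0."""
    for line in histogram_text.splitlines():
        fields = line.split()
        # Data rows look like: "<rank>: <#instances> <#bytes> <class name> [(module)]"
        if len(fields) >= 4 and fields[0].endswith(":") and fields[3] == class_name:
            return int(fields[1])
    return 0


# Hypothetical histogram excerpt, for illustration only (not real measurements).
SAMPLE = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:        120000        9600000  [C
   2:            42           2016  org.apache.spark.sql.SparkSession
Total        120042        9602016
"""

print(count_instances(SAMPLE))
```

Taking a histogram before and after a batch of connections and comparing the two counts shows whether the number of SparkSession instances keeps growing.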
In a small local installation test, I replicated the behavior by simply opening a JDBC connection, executing SHOW SCHEMAS and closing the connection. For each connection, a new SparkSession object is created and never removed. I have noticed the same behavior in Spark 3.1.1 as well.
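The reproduction steps above can be sketched as a simple loop. This is only an illustration under stated assumptions: the function name `run_reproduction` and its parameters are hypothetical, and `connect` is any callable that returns a DB-API style connection to the Thrift Server. The commented wiring uses PyHive, which talks to the same HiveServer2 Thrift endpoint as a JDBC client but is not JDBC itself, so it is assumed (not confirmed here) to create sessions in the same way.

```python
from contextlib import closing


def run_reproduction(connect, iterations=100, query="SHOW SCHEMAS"):
    """Open a connection, run one query, and close it again, `iterations` times."""
    for _ in range(iterations):
        with closing(connect()) as conn:
            with closing(conn.cursor()) as cur:
                cur.execute(query)
                cur.fetchall()  # drain the result so the statement completes
    return iterations


# Example wiring (assumption: PyHive is installed and the server listens on
# localhost:10000; adjust to your installation):
#
#   from pyhive import hive
#   run_reproduction(lambda: hive.connect(host="localhost", port=10000))
```

If the leak is present, the number of SparkSession instances in the driver heap should grow by roughly one per iteration and not drop after a full GC.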
--
This message was sent by Atlassian Jira
(v8.3.4#803005)