Posted to issues@spark.apache.org by "Jungtaek Lim (JIRA)" <ji...@apache.org> on 2018/05/31 14:37:00 UTC
[jira] [Created] (SPARK-24441) Expose total size of states in HDFSBackedStateStoreProvider
Jungtaek Lim created SPARK-24441:
------------------------------------
Summary: Expose total size of states in HDFSBackedStateStoreProvider
Key: SPARK-24441
URL: https://issues.apache.org/jira/browse/SPARK-24441
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 2.3.0
Reporter: Jungtaek Lim
While Spark exposes state metrics for a single state, it still doesn't expose the overall memory usage of state (loadedMaps) in HDFSBackedStateStoreProvider.
Since HDFSBackedStateStoreProvider caches multiple versions of the entire state in a hashmap, it can occupy much more memory than a single version of the state. With the default value of minVersionsToRetain, the cache map can grow to more than 100 times the size of a single state. It would be better to expose this as well so that end users can determine the actual memory usage for state.
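To illustrate the point, here is a minimal sketch (not Spark's actual implementation; all names and the byte-counting heuristic are hypothetical) of why the proposed metric differs from the per-state size: a loadedMaps-style cache holds one full snapshot per retained version, so total memory is the sum over all cached versions rather than the size of the latest one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a loadedMaps-style cache keeps one full state map
// per retained version, so the metric this issue asks to expose (total
// cache size) is the sum over ALL cached versions.
public class StateCacheSizing {
    // version -> full snapshot of state (key -> value bytes) at that version
    static final Map<Long, Map<String, byte[]>> loadedMaps = new HashMap<>();

    // Rough size of one version: key characters plus value bytes.
    static long sizeOfVersion(Map<String, byte[]> map) {
        long total = 0;
        for (Map.Entry<String, byte[]> e : map.entrySet()) {
            total += e.getKey().length() + e.getValue().length;
        }
        return total;
    }

    // Total memory across all cached versions -- the number end users
    // would need to see to understand actual state memory usage.
    static long totalCacheSize() {
        long total = 0;
        for (Map<String, byte[]> m : loadedMaps.values()) {
            total += sizeOfVersion(m);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, byte[]> v1 = new HashMap<>();
        v1.put("key1", new byte[100]);
        loadedMaps.put(1L, v1);

        // The next version contains the prior entries plus new ones.
        Map<String, byte[]> v2 = new HashMap<>(v1);
        v2.put("key2", new byte[100]);
        loadedMaps.put(2L, v2);

        // latest=208 total=312
        System.out.println("latest=" + sizeOfVersion(v2)
                + " total=" + totalCacheSize());
    }
}
```

With many retained versions (minVersionsToRetain defaults to 100), the gap between the single-version size and the total grows accordingly, which is why exposing only the former understates memory pressure.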
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org