Posted to issues@spark.apache.org by "Gengliang Wang (Jira)" <ji...@apache.org> on 2022/11/08 22:14:00 UTC

[jira] [Created] (SPARK-41053) Support disk-based KV store in Spark live UI

Gengliang Wang created SPARK-41053:
--------------------------------------

             Summary: Support disk-based KV store in Spark live UI
                 Key: SPARK-41053
                 URL: https://issues.apache.org/jira/browse/SPARK-41053
             Project: Spark
          Issue Type: Umbrella
          Components: Spark Core, Web UI
    Affects Versions: 3.4.0
            Reporter: Gengliang Wang


The current architecture of the Spark live UI and the Spark history server (SHS) is too simple to serve large clusters and heavy workloads:
 * Spark stores all the live UI data in memory. The size can reach a few GBs and affects the driver's stability (OOM).
 * Only 1000 SQL queries are retained, and we can't simply raise that limit under the current architecture. Memory profiling shows that one query execution detail can take about 800KB while one task requires about 0.3KB, so 1000 SQL queries with 1000 * 2000 tasks need roughly 1.4GB for query execution and task data alone (see the sketch after this list). The Spark UI stores UI data for jobs/stages/executors as well, so retaining 10k queries may take more than 14GB.
 * SHS has to parse JSON-format event logs on its initial start. Uncompressed event logs can be as large as a few GBs, and parsing them can be quite slow; some users reported waiting more than half an hour.
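
For reference, here is a back-of-envelope version of the estimate above as a small Scala sketch. The constants are the profiling numbers quoted in this description; the object and variable names are purely illustrative:

{code:scala}
// Rough memory estimate for the current in-memory live UI store, using the
// profiling numbers above (~800KB per SQL execution detail, ~0.3KB per task).
object LiveUIMemoryEstimate {
  def main(args: Array[String]): Unit = {
    val queries       = 1000        // current retention limit for SQL queries
    val tasksPerQuery = 2000
    val perQueryKB    = 800.0       // one query execution detail
    val perTaskKB     = 0.3         // one task record

    val executionGB = queries * perQueryKB / 1e6                  // ~0.8 GB
    val taskGB      = queries * tasksPerQuery * perTaskKB / 1e6   // ~0.6 GB

    // ~1.4 GB for execution + task data alone, before jobs/stages/executors;
    // scaling to 10k queries pushes this past 14 GB.
    println(f"total = ${executionGB + taskGB}%.1f GB")
  }
}
{code}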

 

The proposal is to:
 # Store all the live UI data in a local RocksDB instance with protobuf serialization (a minimal sketch follows this list).
 # Reuse the live UI RocksDB files directly in SHS.
 # When the RocksDB files are unavailable to SHS, write event logs in protobuf for faster replay.
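
As an illustration of item 1, below is a minimal sketch of writing a serialized UI record to a local RocksDB instance through the standard RocksDB Java API. The key scheme and the payload are assumptions made for this sketch; in the actual proposal the payload would be protobuf message bytes (toByteArray on write, parseFrom on read) rather than a placeholder string:

{code:scala}
import java.nio.charset.StandardCharsets.UTF_8
import org.rocksdb.{Options, RocksDB}

// Sketch only: persist live UI records in a local RocksDB instance instead of
// the driver heap. Key layout and payload below are illustrative.
object LiveUIRocksDBSketch {
  def main(args: Array[String]): Unit = {
    RocksDB.loadLibrary()
    val options = new Options().setCreateIfMissing(true)
    val db = RocksDB.open(options, "/tmp/spark-live-ui-kvstore")
    try {
      // Type-prefixed key so records of one UI type are contiguous and can be
      // range-scanned when a UI page is rendered.
      val key = "sqlExecution/1042".getBytes(UTF_8)
      // In the proposal this would be protobuf bytes, e.g. message.toByteArray.
      val value = "<protobuf-encoded execution data>".getBytes(UTF_8)

      db.put(key, value)        // write path: on each live UI update
      val bytes = db.get(key)   // read path: when the UI (or SHS) serves a page
      assert(bytes != null)     // would be decoded via Message.parseFrom(bytes)
    } finally {
      db.close()
    }
  }
}
{code}

Because these files sit on local disk, SHS could open them directly (item 2) instead of replaying JSON event logs.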

 


