Posted to issues@spark.apache.org by "L. C. Hsieh (Jira)" <ji...@apache.org> on 2021/07/28 19:53:00 UTC

[jira] [Resolved] (SPARK-36236) RocksDB state store: Add additional metrics for better observability into state store operations

     [ https://issues.apache.org/jira/browse/SPARK-36236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

L. C. Hsieh resolved SPARK-36236.
---------------------------------
    Fix Version/s: 3.2.0
       Resolution: Fixed

Issue resolved by pull request 33455
[https://github.com/apache/spark/pull/33455]

> RocksDB state store: Add additional metrics for better observability into state store operations
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-36236
>                 URL: https://issues.apache.org/jira/browse/SPARK-36236
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Structured Streaming
>    Affects Versions: 3.1.2
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>            Priority: Major
>             Fix For: 3.2.0
>
>
> Proposing to add the following new metrics to {{customMetrics}} under {{stateOperators}} in the {{StreamingQueryProgress}} event. These metrics give better visibility into the RocksDB-based state store in streaming jobs.
>  * {{rocksdbGetCount}}: Number of get calls to the DB (doesn’t include gets from WriteBatch, the in-memory batch used for staging writes).
>  * {{rocksdbPutCount}}: Number of put calls to the DB (doesn’t include puts to WriteBatch, the in-memory batch used for staging writes).
>  * {{rocksdbTotalBytesReadByGet/rocksdbTotalBytesWrittenByPut}}: Number of uncompressed bytes read/written by get/put operations.
>  * {{rocksdbReadBlockCacheHitCount/rocksdbReadBlockCacheMissCount}}: Number of block cache hits/misses, indicating how effective the RocksDB block cache is at avoiding local disk reads.
>  * {{rocksdbTotalBytesReadByCompaction/rocksdbTotalBytesWrittenByCompaction}}: Number of bytes the compaction process read from disk and wrote to disk.
>  * {{rocksdbTotalCompactionTime}}: Time (in ns) taken by compactions (both background compactions and the optional compaction initiated during the commit).
>  * {{rocksdbWriterStallDuration}}: Time (in ns) the writer was stalled due to a background compaction or flushing of the immutable memtables to disk.
>  * {{rocksdbTotalBytesReadThroughIterator}}: Some stateful operations (such as timeout processing in FlatMapGroupsWithState and watermarking) require reading all the data in the DB through an iterator. This metric reports the total size of uncompressed data read using the iterator.
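
As an illustration of how the proposed metrics might be consumed, here is a minimal Python sketch that derives a block cache hit ratio per stateful operator. It assumes the progress event is available as a plain dict with {{customMetrics}} nested under each entry of {{stateOperators}}, as described above (PySpark's StreamingQuery.lastProgress returns a dict of this general shape). The helper name block_cache_hit_ratio and all sample values are hypothetical, and the metric names are taken from this proposal, so they may differ from what a given Spark release actually reports.

```python
from typing import Any, Dict, List, Optional


def block_cache_hit_ratio(progress: Dict[str, Any]) -> List[Optional[float]]:
    """Return one block cache hit ratio per stateful operator.

    The ratio is hits / (hits + misses); None is returned for an operator
    that reported no block cache reads at all.
    """
    ratios: List[Optional[float]] = []
    for op in progress.get("stateOperators", []):
        metrics = op.get("customMetrics", {})
        # Metric names as proposed in this issue (assumption: unchanged at merge).
        hits = metrics.get("rocksdbReadBlockCacheHitCount", 0)
        misses = metrics.get("rocksdbReadBlockCacheMissCount", 0)
        total = hits + misses
        ratios.append(hits / total if total else None)
    return ratios


# Hypothetical progress event with made-up values, trimmed to the fields
# this sketch reads.
sample_progress = {
    "stateOperators": [
        {
            "customMetrics": {
                "rocksdbGetCount": 400,
                "rocksdbPutCount": 250,
                "rocksdbReadBlockCacheHitCount": 300,
                "rocksdbReadBlockCacheMissCount": 100,
            },
        }
    ]
}

print(block_cache_hit_ratio(sample_progress))  # [0.75]
```

A persistently low ratio would suggest that many reads are going to local disk rather than being served from the block cache.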



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
