Posted to user@flink.apache.org by Abdul Rahman <ab...@gmail.com> on 2022/01/22 06:51:10 UTC
Question about MapState size
Hello,
I have a streaming application that has an operator based on
KeyedCoProcessFunction. The operator has a MapState object. I store
some data in this operator with a fixed TTL. I would like to monitor
the size/count of this state over time, since it's related to some
operational metrics we want to track. It seems like a simple thing to do,
but I haven't come up with a way to do so.
Given that iterating over the complete map is an expensive operation,
I only plan to do so periodically. The first issue is that the
stream is keyed, so any time I do a count of the MapState, I don't get
the complete size of the state object, but only count pertaining to
the specific key of partition. Is there a way around this?
Secondly, is there a way to monitor RocksDB usage over time? I can
find managed memory metrics, but these do not include the disk space
RocksDB uses. Is there a way to get this from standard Flink metrics,
either task manager or job manager?
Re: Question about MapState size
Posted by Yun Tang <my...@live.com>.
Hi Abdul,
What does "only count pertaining to the specific key of partition" mean? Is the count for the map belonging to a specific selected key, or for all the maps in the whole map state?
You can leverage RocksDB's native metrics to monitor RocksDB usage, such as total-sst-files-size[1], which reports the total size of the SST files on disk for each RocksDB instance.
[1] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/config/#state-backend-rocksdb-metrics-total-sst-files-size
Best
Yun Tang
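For readers following along, the metric mentioned above is disabled by default and is turned on through the Flink configuration. A minimal sketch of a flink-conf.yaml fragment, assuming the RocksDB state backend is already in use. The second option, estimate-num-keys, is not mentioned in the thread; it is included here on the assumption that an approximate key count per state is an acceptable stand-in for iterating the map, since RocksDB only estimates this value and expired TTL entries may still be counted until they are cleaned up.

```yaml
# Total size of SST files on disk, reported per RocksDB instance
# (the metric referenced in [1]).
state.backend.rocksdb.metrics.total-sst-files-size: true

# Assumption: an estimated key count is sufficient for monitoring.
# This is an approximation maintained by RocksDB, not an exact count.
state.backend.rocksdb.metrics.estimate-num-keys: true
```

Enabling native RocksDB metrics can add some overhead, so it may be worth enabling only the ones actually needed and checking the impact on a test job first.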