Posted to user@flink.apache.org by burgesschen <tc...@bloomberg.net> on 2018/07/16 15:57:42 UTC

Ever increasing key space

Hi everyone,

We are building a Flink job that keys on a dynamic value. Only a few events
share the same key, and events with new keys are consumed constantly.

For each key, some keyed state is created the first time the key is seen.
We clean up that keyed state with a timer if the key has not been seen for
X minutes.

My question is:
If the key space is ever increasing, does that result in an ever increasing
checkpoint size even though I clean up the keyed state?

Thank you!


Best,
-Chen



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Ever increasing key space

Posted by Yun Tang <my...@live.com>.
Hi Chen

From your description, I think you call keyedState.clear() to clear the state of a key that has not been seen for several minutes.

  *   For HeapKeyedStateBackend, clear() removes the related content from memory immediately, so there is no need to worry about the checkpoint size increasing.
  *   For RocksDBKeyedStateBackend, clear() records a delete operation for the key bytes in the DB, but the actual removal (the point at which the deleted key no longer occupies any space) generally happens only when a compaction runs. In other words, after calling keyedState.clear() for the current key you should not expect the checkpoint size to decrease immediately, but it will decrease eventually because RocksDB keeps running compactions in the background. If you are still worried about this, consider increasing the number of background compaction threads for RocksDB by calling DBOptions.setMaxBackgroundCompactions or DBOptions.setIncreaseParallelism.
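
To illustrate the second point, here is a minimal sketch of the "delete marker now, space reclaimed at compaction" behaviour. It is a toy append-only store written only for this explanation and is not how RocksDB is implemented; the class TombstoneDemo and its methods (put, delete, get, storedEntries, compact) are hypothetical names, and storedEntries() merely stands in for the on-disk size that a checkpoint would pick up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Toy append-only store: a model of delete markers plus compaction, NOT RocksDB. */
public class TombstoneDemo {

    /** One log record; a null value marks a tombstone (a recorded delete). */
    private static final class Entry {
        final String key;
        final String value;

        Entry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<Entry> log = new ArrayList<>();

    public void put(String key, String value) {
        log.add(new Entry(key, value));
    }

    /** A delete only appends a tombstone, so the stored size grows by one record. */
    public void delete(String key) {
        log.add(new Entry(key, null));
    }

    /** The newest record for a key wins; a tombstone reads as "absent". */
    public Optional<String> get(String key) {
        for (int i = log.size() - 1; i >= 0; i--) {
            Entry e = log.get(i);
            if (e.key.equals(key)) {
                return Optional.ofNullable(e.value);
            }
        }
        return Optional.empty();
    }

    /** Number of records physically held, including stale values and tombstones. */
    public int storedEntries() {
        return log.size();
    }

    /** Compaction keeps only the newest live value per key and drops tombstones. */
    public void compact() {
        Map<String, String> latest = new LinkedHashMap<>();
        for (Entry e : log) {
            latest.put(e.key, e.value); // later records overwrite earlier ones
        }
        log.clear();
        for (Map.Entry<String, String> e : latest.entrySet()) {
            if (e.getValue() != null) {
                log.add(new Entry(e.getKey(), e.getValue()));
            }
        }
    }

    public static void main(String[] args) {
        TombstoneDemo store = new TombstoneDemo();
        store.put("k1", "a");
        store.put("k2", "b");
        store.delete("k1");                        // like keyedState.clear() for key k1
        System.out.println(store.storedEntries()); // 3: the delete added a record
        store.compact();
        System.out.println(store.storedEntries()); // 1: only k2 remains
    }
}
```

In this model the key is logically gone as soon as delete() returns, but the space is only given back by compact(), which is why the checkpoint size lags behind the clean-up rather than growing forever.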

Best,
Yun