Posted to user@flink.apache.org by bupt_ljy <bu...@163.com> on 2019/11/05 13:15:07 UTC

Re: RocksDB state on HDFS seems not being cleanned up

This should have been sent to the user mailing list. Moving it here...


 Original Message 
Sender: bupt_ljy<bu...@163.com>
Recipient: dev<de...@flink.apache.org>
Date: Tuesday, Nov 5, 2019 21:13
Subject: Re: RocksDB state on HDFS seems not being cleanned up


Hi Shuwen,

The “shared” folder holds state files that are shared among multiple checkpoints, which happens when you enable incremental checkpointing [1]. Therefore, it is reasonable that its size keeps growing if you set “state.checkpoints.num-retained” to a large value.

[1] https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html

Best,
Jiayi Liao

 Original Message 
Sender: shuwen zhou<ja...@gmail.com>
Recipient: dev<de...@flink.apache.org>
Date: Tuesday, Nov 5, 2019 17:59
Subject: RocksDB state on HDFS seems not being cleanned up

Hi Community,

I have a job running on Flink 1.9.0 on YARN, using RocksDB with checkpoints on HDFS and incremental checkpointing enabled. I have some MapState in the code with the following TTL config:

    import org.apache.flink.api.common.state.StateTtlConfig
    import org.apache.flink.api.common.time.Time

    val ttlConfig = StateTtlConfig
      .newBuilder(Time.minutes(30))
      .updateTtlOnCreateAndWrite()
      .cleanupInBackground()
      .cleanupFullSnapshot()
      .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
      .build()

After running for around 2 days, the checkpoint folder looks like this:

     44.4 M  /flink-chk743e4568a70b626837b/chk-40
     65.9 M  /flink-chk743e4568a70b626837b/chk-41
     91.7 M  /flink-chk743e4568a70b626837b/chk-42
     96.1 M  /flink-chk743e4568a70b626837b/chk-43
     48.1 M  /flink-chk743e4568a70b626837b/chk-44
     71.6 M  /flink-chk743e4568a70b626837b/chk-45
     50.9 M  /flink-chk743e4568a70b626837b/chk-46
     90.2 M  /flink-chk743e4568a70b626837b/chk-37
     49.3 M  /flink-chk743e4568a70b626837b/chk-38
     96.9 M  /flink-chk743e4568a70b626837b/chk-39
    797.9 G  /flink-chk743e4568a70b626837b/shared

The ./shared folder keeps growing and does not seem to be cleaned up. However, when I disabled incremental checkpointing, expired full snapshots were removed automatically.

Is there any way to remove outdated state on HDFS to stop it from growing?

Thanks.

--
Best Wishes,
Shuwen Zhou
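
As a footnote to the explanation at the top of this thread: the link between the number of retained checkpoints and the size of the shared folder can be illustrated with a toy model. This is only a sketch and not Flink code. The object name, the function liveSharedFiles and the sst-N file names are all made up for illustration. It assumes that each incremental checkpoint references a set of shared files and that a file can only be deleted once no retained checkpoint references it.

```scala
// Toy model, NOT Flink code: each checkpoint is the set of shared files it references.
object SharedFilesSketch {

  // Shared files that must stay on disk when only the newest
  // `numRetained` checkpoints are kept (hypothetical helper).
  def liveSharedFiles(checkpoints: Seq[Set[String]], numRetained: Int): Set[String] =
    checkpoints.takeRight(numRetained).flatten.toSet

  def main(args: Array[String]): Unit = {
    // Hypothetical history: compaction gradually replaces old files with new ones.
    val history = Seq(
      Set("sst-1", "sst-2"),
      Set("sst-2", "sst-3"),
      Set("sst-3", "sst-4"),
      Set("sst-4", "sst-5")
    )

    println(liveSharedFiles(history, 1).size) // prints 2
    println(liveSharedFiles(history, 3).size) // prints 4
  }
}
```

In this model, retaining more checkpoints keeps more old shared files alive. It says nothing about whether that alone accounts for the 797.9 G reported in this thread.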