Posted to user@flink.apache.org by Till Rohrmann <tr...@apache.org> on 2018/11/07 15:54:33 UTC

Re: RocksDB checkpointing dir per TM

This is a very good point, Elias. We forgot to add these options to the
configuration documentation after a refactoring. I will fix it.

Cheers,
Till

On Fri, Oct 26, 2018 at 8:27 PM Elias Levy <fe...@gmail.com>
wrote:

> There is also state.backend.rocksdb.localdir.  Oddly, I can find the
> documentation for it in the 1.5 docs
> <https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/state/checkpointing.html#state-backend-rocksdb-localdir>,
> but not in the 1.6 docs
> <https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/state/checkpointing.html>.
> The option is still in master
> <https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOptions.java#L33-L37>,
> and it is used
> <https://github.com/apache/flink/blob/e62a7eabe34da42e4cb56d1b14905a3b9e9c9bd4/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackend.java#L289>
> .
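>
> For illustration, a minimal flink-conf.yaml sketch that points this
> option at the NVMe disk. The paths are hypothetical, and the
> comma-separated form for spreading files over several directories is an
> assumption to verify against the docs for the release in use:
>
> ```yaml
> state.backend: rocksdb
> # Hypothetical directories on the NVMe mount; RocksDB working files go here
> state.backend.rocksdb.localdir: /mnt/nvme/flink/rocksdb-a,/mnt/nvme/flink/rocksdb-b
> ```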
>
> On Fri, Oct 26, 2018 at 3:01 AM Andrey Zagrebin <an...@data-artisans.com>
> wrote:
>
>> Hi Taher,
>>
>> TMs keep state locally while running, so in this case the RocksDB files
>> already belong to the TM.
>> You can point every TM to the same NVMe disk path on each node; the
>> relevant Flink options are:
>> - io.tmp.dirs
>> - taskmanager.state.local.root-dirs
>> This data is transient and temporary in nature. It does not survive a
>> job failure.
>>
>> A checkpoint is a logical snapshot of the operator state across all
>> involved TMs,
>> so it belongs to the job and is usually uploaded to a distributed file
>> system available to all TMs.
>> The location is set with the Flink option 'state.checkpoints.dir'.
>> This way the job can restore from it with a different set of TMs.
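>>
>> As a rough sketch, the options above might look like this together in
>> flink-conf.yaml. The /mnt/nvme mount point and the HDFS URI are
>> hypothetical placeholders, not values from this thread:
>>
>> ```yaml
>> # Transient, TM-local working data on the NVMe disk (assumed mount point)
>> io.tmp.dirs: /mnt/nvme/flink/tmp
>> taskmanager.state.local.root-dirs: /mnt/nvme/flink/local-state
>> # Checkpoint data on a file system reachable from every TM
>> state.checkpoints.dir: hdfs:///flink/checkpoints
>> ```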
>>
>> Best,
>> Andrey
>>
>> > On 26 Oct 2018, at 08:29, Taher Koitawala <ta...@gslab.com>
>> wrote:
>> >
>> > Hi All,
>> >           Our current cluster configuration uses one HDD, which is
>> mainly for root, and another NVMe disk per node. [1] We want to make sure all
>> TMs write their own RocksDB files to the NVMe disk only. How do we do that?
>> >
>> > [2] Is it also possible to specify multiple directories per TM so that
>> we have an even spread when the RocksDB files are written?
>> >
>> > Thanks,
>> > Taher Koitawala
>>
>>