Posted to user@flink.apache.org by Taher Koitawala <ta...@gslab.com> on 2018/10/26 06:29:16 UTC

RocksDB checkpointing dir per TM

Hi All,
          Our current cluster configuration uses one HDD, which is mainly
for root, and another NVMe disk per node. [1] We want to make sure all TMs
write their own RocksDB files to the NVMe disk only; how do we do that?

[2] Is it also possible to specify multiple directories per TM so that the
RocksDB files are spread evenly when they are written?

Thanks,
Taher Koitawala

Re: RocksDB checkpointing dir per TM

Posted by Till Rohrmann <tr...@apache.org>.
This is a very good point, Elias. We actually forgot to add these options to
the configuration documentation after a refactoring. I will fix it.

Cheers,
Till

On Fri, Oct 26, 2018 at 8:27 PM Elias Levy <fe...@gmail.com>
wrote:

> There is also state.backend.rocksdb.localdir.  Oddly, I can find the
> documentation for it in the 1.5 docs
> <https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/state/checkpointing.html#state-backend-rocksdb-localdir>,
> but not in the 1.6 docs
> <https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/state/checkpointing.html>.
> The option is still in master
> <https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOptions.java#L33-L37>,
> and it is used
> <https://github.com/apache/flink/blob/e62a7eabe34da42e4cb56d1b14905a3b9e9c9bd4/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackend.java#L289>
> .
>
> On Fri, Oct 26, 2018 at 3:01 AM Andrey Zagrebin <an...@data-artisans.com>
> wrote:
>
>> Hi Taher,
>>
>> TMs keep their state locally while running; in this case, the RocksDB
>> files already belong to the TM.
>> You can point the local state directories to the same NVMe disk location
>> on each node; the relevant Flink options here are:
>> - io.tmp.dirs
>> - taskmanager.state.local.root-dirs
>> This data is transient and temporary in nature; it does not survive a
>> job failure.
>>
>> A checkpoint is a logical snapshot of the operator state for all
>> involved TMs, so it belongs to the job and is usually uploaded to a
>> distributed file system available on all TMs.
>> The location is set via the Flink option 'state.checkpoints.dir'.
>> This way the job can restore from it with a different set of TMs.
>>
>> Best,
>> Andrey
>>
>> > On 26 Oct 2018, at 08:29, Taher Koitawala <ta...@gslab.com>
>> wrote:
>> >
>> > Hi All,
>> >           Our current cluster configuration uses one HDD, which is
>> > mainly for root, and another NVMe disk per node. [1] We want to make
>> > sure all TMs write their own RocksDB files to the NVMe disk only; how
>> > do we do that?
>> >
>> > [2] Is it also possible to specify multiple directories per TM so that
>> > the RocksDB files are spread evenly when they are written?
>> >
>> > Thanks,
>> > Taher Koitawala
>>
>>

Re: RocksDB checkpointing dir per TM

Posted by Elias Levy <fe...@gmail.com>.
There is also state.backend.rocksdb.localdir.  Oddly, I can find the
documentation for it in the 1.5 docs
<https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/state/checkpointing.html#state-backend-rocksdb-localdir>,
but not in the 1.6 docs
<https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/state/checkpointing.html>.
The option is still in master
<https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBOptions.java#L33-L37>,
and it is used
<https://github.com/apache/flink/blob/e62a7eabe34da42e4cb56d1b14905a3b9e9c9bd4/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackend.java#L289>
.
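
For question [2]: as far as I can tell from RocksDBStateBackend, this option
also accepts a comma-separated list of local paths and spreads the
per-operator RocksDB directories across them. A rough flink-conf.yaml sketch
(the NVMe mount points are just placeholders, untested):

    state.backend.rocksdb.localdir: /mnt/nvme1/flink/rocksdb,/mnt/nvme2/flink/rocksdb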

On Fri, Oct 26, 2018 at 3:01 AM Andrey Zagrebin <an...@data-artisans.com>
wrote:

> Hi Taher,
>
> TMs keep their state locally while running; in this case, the RocksDB
> files already belong to the TM.
> You can point the local state directories to the same NVMe disk location
> on each node; the relevant Flink options here are:
> - io.tmp.dirs
> - taskmanager.state.local.root-dirs
> This data is transient and temporary in nature; it does not survive a job
> failure.
>
> A checkpoint is a logical snapshot of the operator state for all involved
> TMs, so it belongs to the job and is usually uploaded to a distributed
> file system available on all TMs.
> The location is set via the Flink option 'state.checkpoints.dir'.
> This way the job can restore from it with a different set of TMs.
>
> Best,
> Andrey
>
> > On 26 Oct 2018, at 08:29, Taher Koitawala <ta...@gslab.com>
> wrote:
> >
> > Hi All,
> >           Our current cluster configuration uses one HDD, which is
> > mainly for root, and another NVMe disk per node. [1] We want to make
> > sure all TMs write their own RocksDB files to the NVMe disk only; how
> > do we do that?
> >
> > [2] Is it also possible to specify multiple directories per TM so that
> > the RocksDB files are spread evenly when they are written?
> >
> > Thanks,
> > Taher Koitawala
>
>

Re: RocksDB checkpointing dir per TM

Posted by Taher Koitawala <ta...@gslab.com>.
Thanks!

On Fri 26 Oct, 2018, 3:31 PM Andrey Zagrebin, <an...@data-artisans.com>
wrote:

> Hi Taher,
>
> TMs keep their state locally while running; in this case, the RocksDB
> files already belong to the TM.
> You can point the local state directories to the same NVMe disk location
> on each node; the relevant Flink options here are:
> - io.tmp.dirs
> - taskmanager.state.local.root-dirs
> This data is transient and temporary in nature; it does not survive a job
> failure.
>
> A checkpoint is a logical snapshot of the operator state for all involved
> TMs, so it belongs to the job and is usually uploaded to a distributed
> file system available on all TMs.
> The location is set via the Flink option 'state.checkpoints.dir'.
> This way the job can restore from it with a different set of TMs.
>
> Best,
> Andrey
>
> > On 26 Oct 2018, at 08:29, Taher Koitawala <ta...@gslab.com>
> wrote:
> >
> > Hi All,
> >           Our current cluster configuration uses one HDD, which is
> > mainly for root, and another NVMe disk per node. [1] We want to make
> > sure all TMs write their own RocksDB files to the NVMe disk only; how
> > do we do that?
> >
> > [2] Is it also possible to specify multiple directories per TM so that
> > the RocksDB files are spread evenly when they are written?
> >
> > Thanks,
> > Taher Koitawala
>
>

Re: RocksDB checkpointing dir per TM

Posted by Andrey Zagrebin <an...@data-artisans.com>.
Hi Taher,

TMs keep their state locally while running; in this case, the RocksDB files already belong to the TM.
You can point the local state directories to the same NVMe disk location on each node; the relevant Flink options here are:
- io.tmp.dirs
- taskmanager.state.local.root-dirs
This data is transient and temporary in nature; it does not survive a job failure.
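
For example, assuming the NVMe disk is mounted under /mnt/nvme (a placeholder;
adjust to your actual mount point), something along these lines in
flink-conf.yaml should keep the local working state on that disk:

    io.tmp.dirs: /mnt/nvme/flink/tmp
    taskmanager.state.local.root-dirs: /mnt/nvme/flink/local-state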

A checkpoint is a logical snapshot of the operator state for all involved TMs,
so it belongs to the job and is usually uploaded to a distributed file system available on all TMs.
The location is set via the Flink option 'state.checkpoints.dir'.
This way the job can restore from it with a different set of TMs.
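
For example (the HDFS URI is only illustrative; any distributed file system
reachable from all TMs works):

    state.checkpoints.dir: hdfs:///flink/checkpoints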

Best,
Andrey

> On 26 Oct 2018, at 08:29, Taher Koitawala <ta...@gslab.com> wrote:
> 
> Hi All,
>           Our current cluster configuration uses one HDD, which is mainly for root, and another NVMe disk per node. [1] We want to make sure all TMs write their own RocksDB files to the NVMe disk only; how do we do that?
> 
> [2] Is it also possible to specify multiple directories per TM so that the RocksDB files are spread evenly when they are written?
> 
> Thanks,
> Taher Koitawala