Posted to user@flink.apache.org by Krzysztof Zarzycki <k....@gmail.com> on 2016/04/07 09:54:57 UTC

RocksDB state checkpointing is expensive?

Hi,
I have read the documentation and source code for state management with
RocksDB, and before I use it I'm concerned about one thing: am I right that
currently, when state is checkpointed, the whole RocksDB state is
snapshotted? There is no incremental (diff) snapshotting, is there? If so,
this seems infeasible for keeping state measured in tens or hundreds of GBs
(and you do reach that size when you want to keep the state embedded in the
streaming application instead of going out to Cassandra/HBase or another
DB). Snapshotting such a large state will simply cost too much.

Samza, as a good point of comparison, writes every state change to a Kafka
topic, treating that changelog as the snapshot. Of course, at app restart,
recovering the state by replaying the full changelog would be too costly,
which is why the changelog topic is compacted. Also, I think Samza takes a
state snapshot from time to time anyway (but I'm not sure of that).

Thanks for answering my doubts,
Krzysztof
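
To make the changelog idea above concrete, here is a minimal sketch in plain Python. It is only an illustration: it does not use Kafka or Samza, and the names `compact` and `restore_from_changelog` are hypothetical. It assumes that compaction keeps the latest record per key and that a value of `None` marks a deletion.

```python
def compact(changelog):
    """Keep only the most recent record for each key.

    `changelog` is a list of (key, value) pairs in write order.
    A value of None is treated as a deletion marker (an assumption
    of this sketch) and is kept so that replay still removes the key.
    """
    latest = {}
    for key, value in changelog:
        # Remove and re-insert so the dict order follows the last write.
        latest.pop(key, None)
        latest[key] = value
    return list(latest.items())


def restore_from_changelog(changelog):
    """Rebuild the key/value state by replaying the changelog in order."""
    state = {}
    for key, value in changelog:
        if value is None:
            state.pop(key, None)  # deletion marker
        else:
            state[key] = value
    return state


# Six writes, but only three distinct keys.
changelog = [("a", 1), ("b", 1), ("a", 2), ("a", 3), ("b", None), ("c", 5)]
compacted = compact(changelog)

# Replaying the compacted log yields the same state with fewer records.
assert restore_from_changelog(compacted) == restore_from_changelog(changelog)
print(len(changelog), "records before compaction,", len(compacted), "after")
```

The point of the sketch is that recovery time depends on the number of records replayed, so a compacted log bounds it by the number of distinct keys rather than the number of updates.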

Re: RocksDB state checkpointing is expensive?

Posted by Krzysztof Zarzycki <k....@gmail.com>.

OK, Thanks Aljoscha for the info!
Guys, great work on Flink, I really love it :)

Cheers,
Krzysztof

On Thu, 7 Apr 2016 at 10:48, Aljoscha Krettek <al...@apache.org> wrote:

> Hi,
> you are right. Currently there is no incremental checkpointing and
> therefore, at each checkpoint, we essentially copy the whole RocksDB
> database to HDFS (or whatever filesystem you chose as a backup location).
> As far as I know, Stephan will start working on adding support for
> incremental snapshots this week or next week.
>
> Cheers,
> Aljoscha

Re: RocksDB state checkpointing is expensive?

Posted by Aljoscha Krettek <al...@apache.org>.

Hi,
you are right. Currently there is no incremental checkpointing and
therefore, at each checkpoint, we essentially copy the whole RocksDB
database to HDFS (or whatever filesystem you chose as a backup location).
As far as I know, Stephan will start working on adding support for
incremental snapshots this week or next week.

Cheers,
Aljoscha
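
The difference between copying the whole state at each checkpoint and an incremental snapshot can be sketched with a toy key/value store in plain Python. This is not how Flink or RocksDB implement checkpointing, and it says nothing about how the planned incremental support will work. The class `KeyedState` and the function `restore_from_snapshots` are hypothetical, and deletions are left out to keep the sketch short.

```python
class KeyedState:
    """Toy in-memory state that remembers which keys changed since the last snapshot."""

    def __init__(self):
        self.data = {}
        self.dirty = set()

    def put(self, key, value):
        self.data[key] = value
        self.dirty.add(key)

    def full_snapshot(self):
        """Copy every entry, regardless of what changed."""
        self.dirty.clear()
        return dict(self.data)

    def incremental_snapshot(self):
        """Copy only the entries written since the previous snapshot."""
        delta = {key: self.data[key] for key in self.dirty}
        self.dirty.clear()
        return delta


def restore_from_snapshots(base, deltas):
    """Rebuild state from one full snapshot plus later deltas, applied in order."""
    state = dict(base)
    for delta in deltas:
        state.update(delta)
    return state


state = KeyedState()
for i in range(1000):
    state.put(f"key-{i}", 0)
base = state.full_snapshot()          # copies all 1000 entries

state.put("key-7", 1)
state.put("key-42", 1)
delta = state.incremental_snapshot()  # copies only the 2 changed entries

assert restore_from_snapshots(base, [delta]) == state.data
print(len(base), "entries in the full snapshot,", len(delta), "in the delta")
```

In this toy model the cost of a full snapshot grows with the total state size, while the cost of a delta grows with the amount of change between checkpoints. The trade-off is that restore has to apply the base and every delta after it.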
