Posted to user@flink.apache.org by Zach Cox <zc...@gmail.com> on 2016/04/05 22:24:00 UTC

Kafka state backend?

Hi - as clarified in another thread [1], stateful operators store all of
their current state in the backend on each checkpoint. Just curious whether
Kafka topics with log compaction have ever been considered as a possible
state backend?

Samza [2] uses RocksDB as a local state store, with all writes also going
to a log-compacted Kafka topic for persistence. This seems like it might
also be a good alternative backend in Flink for jobs with large amounts of
long-lasting state. You would give up some throughput (due to Kafka
producer writes) but there would be almost nothing to do on checkpoints.
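
To make the idea concrete, here is a rough toy sketch of the changelog pattern described above. It is not Samza or Flink code: a plain dict stands in for the local RocksDB store, a Python list stands in for the log-compacted Kafka topic, and the names ChangelogBackedStore, compact, and restore are made up for illustration. Real log compaction runs in the broker in the background and eventually drops tombstones; this sketch simply keeps the latest record per key.

```python
# Toy model only: a dict stands in for RocksDB and a list for a Kafka topic.
# All names here are hypothetical; none of this is a real Samza/Flink/Kafka API.

class ChangelogBackedStore:
    def __init__(self, changelog=None):
        self.local = {}  # stand-in for the local RocksDB store
        # stand-in for the log-compacted Kafka topic (append-only list of records)
        self.changelog = changelog if changelog is not None else []

    def put(self, key, value):
        self.local[key] = value
        # Every write also goes to the changelog, which costs some throughput.
        self.changelog.append((key, value))

    def delete(self, key):
        self.local.pop(key, None)
        # A None value plays the role of a tombstone record.
        self.changelog.append((key, None))

    def get(self, key):
        return self.local.get(key)


def compact(changelog):
    """Keep only the latest record per key, preserving offset order."""
    latest = {}
    for offset, (key, value) in enumerate(changelog):
        latest[key] = (offset, value)
    ordered = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_, value) in ordered]


def restore(changelog):
    """Rebuild the local store after a failure by replaying the changelog."""
    store = ChangelogBackedStore(changelog=list(changelog))
    for key, value in changelog:
        if value is None:
            store.local.pop(key, None)
        else:
            store.local[key] = value
    return store


if __name__ == "__main__":
    store = ChangelogBackedStore()
    store.put("a", 1)
    store.put("b", 2)
    store.put("a", 3)
    store.delete("b")

    compacted = compact(store.changelog)
    recovered = restore(compacted)
    print(compacted)
    print(recovered.local)
```

Since the state is already durable in the log after each write, a checkpoint in this model has little left to persist, which is the trade-off mentioned above.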

Just wanted to propose the idea and see if it has already been discussed,
or maybe I'm missing some reasons why it would be a bad idea.

Thanks,
Zach

[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-state-stored-in-backend-and-deleting-old-checkpoint-state-td5927.html
[2]
http://samza.apache.org/learn/documentation/0.10/container/state-management.html#local-state-in-samza

Re: Kafka state backend?

Posted by Zach Cox <zc...@gmail.com>.
Ah I didn't see that, thanks for the link! Glad this is being discussed.

On Thu, Apr 7, 2016 at 5:06 AM Aljoscha Krettek <al...@apache.org> wrote:

> Hi Zach,
> I'm afraid someone already beat you to it :-)
> https://issues.apache.org/jira/browse/FLINK-3692
>
> In the issue we touch on some of the difficulties with this that stem from
> the differences in the guarantees that Flink and Samza try to give.
>
> Cheers,
> Aljoscha

Re: Kafka state backend?

Posted by Aljoscha Krettek <al...@apache.org>.
Hi Zach,
I'm afraid someone already beat you to it :-)
https://issues.apache.org/jira/browse/FLINK-3692

In the issue we touch on some of the difficulties with this that stem from
the differences in the guarantees that Flink and Samza try to give.

Cheers,
Aljoscha
