Posted to jira@kafka.apache.org by "Nikolay Izhikov (Jira)" <ji...@apache.org> on 2020/04/30 16:33:00 UTC

[jira] [Commented] (KAFKA-3184) Add Checkpoint for In-memory State Store

    [ https://issues.apache.org/jira/browse/KAFKA-3184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096718#comment-17096718 ] 

Nikolay Izhikov commented on KAFKA-3184:
----------------------------------------

Hello, [~guozhang]

I've prepared a PR with the simplest implementation of an in-memory state store checkpointer.

If persistent mode is enabled, then:
  * a checkpoint thread is started on KeyValueStore#init.
  * on every InMemoryKeyValueStore#COUNT_FLUSH_TO_STORE-th flush, a copy of InMemoryKeyValueStore#map is passed to the checkpoint thread.
  * the checkpoint thread persists the data every time it sees a new instance of InMemoryKeyValueStore#map.
  * persisted data is loaded on KeyValueStore#init.
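
To make the steps above concrete, here is a minimal, self-contained sketch of the idea. It is not the code from the PR: the class name CheckpointedInMemoryStore, the value of COUNT_FLUSH_TO_STORE, the properties-file format, and the queue-based hand-off (used here instead of the thread watching for a new map instance) are all assumptions made for illustration, and the real store would implement KeyValueStore rather than stand alone.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of an in-memory store with an asynchronous checkpointer. */
class CheckpointedInMemoryStore {
    // Stand-in for InMemoryKeyValueStore#COUNT_FLUSH_TO_STORE; the value 3 is an assumption.
    static final int COUNT_FLUSH_TO_STORE = 3;
    // Sentinel instance (compared by reference) that tells the checkpoint thread to stop.
    private static final Map<String, String> STOP = new TreeMap<>();

    private final Path checkpointFile;
    private final Map<String, String> map = new TreeMap<>();
    private final BlockingQueue<Map<String, String>> snapshots = new LinkedBlockingQueue<>();
    private Thread checkpointThread;
    private int flushCount;

    CheckpointedInMemoryStore(Path checkpointFile) {
        this.checkpointFile = checkpointFile;
    }

    /** Loads previously persisted data, then starts the checkpoint thread. */
    void init() throws IOException {
        if (Files.exists(checkpointFile)) {
            Properties props = new Properties();
            try (InputStream in = Files.newInputStream(checkpointFile)) {
                props.load(in);
            }
            for (String key : props.stringPropertyNames()) {
                map.put(key, props.getProperty(key));
            }
        }
        checkpointThread = new Thread(this::runCheckpointLoop, "checkpoint-thread");
        checkpointThread.start();
    }

    void put(String key, String value) { map.put(key, value); }

    String get(String key) { return map.get(key); }

    /** On every COUNT_FLUSH_TO_STORE-th flush, hands a copy of the map to the checkpoint thread. */
    void flush() {
        if (++flushCount % COUNT_FLUSH_TO_STORE == 0) {
            snapshots.add(new TreeMap<>(map)); // copy is taken on the caller's thread
        }
    }

    /** Asks the checkpoint thread to finish pending snapshots and waits for it to exit. */
    void close() throws InterruptedException {
        snapshots.add(STOP);
        checkpointThread.join();
    }

    // Persists each snapshot it receives until the STOP sentinel arrives.
    private void runCheckpointLoop() {
        try {
            for (Map<String, String> s = snapshots.take(); s != STOP; s = snapshots.take()) {
                persist(s);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes to a temporary file first, then replaces the previous checkpoint.
    private void persist(Map<String, String> snapshot) throws IOException {
        Properties props = new Properties();
        props.putAll(snapshot);
        Path tmp = checkpointFile.resolveSibling(checkpointFile.getFileName() + ".tmp");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            props.store(out, null);
        }
        Files.move(tmp, checkpointFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Because the copy is taken on the thread that calls flush, the checkpoint thread never reads the live map, so the sketch needs no locking around it. Writes made after the last checkpointed flush are not in the file and would still have to be restored from the changelog.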

Could you please take a look?

> Add Checkpoint for In-memory State Store
> ----------------------------------------
>
>                 Key: KAFKA-3184
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3184
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Nikolay Izhikov
>            Priority: Major
>              Labels: user-experience
>
> Currently Kafka Streams does not checkpoint the persistent state store upon committing, because doing so would be expensive: it "stops the world" and writes to disk. For example, with RocksDB a naive copy would require copying the whole file directory.
> However, for in-memory stores checkpointing may be doable asynchronously, and hence quickly. The benefit of having an intermediate checkpoint is that it avoids restoring from scratch when standby tasks are not present.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)