Posted to dev@kafka.apache.org by Thomas Becker <to...@Tivo.com> on 2017/05/09 13:36:55 UTC

GlobalKTable not checkpointing offsets but reusing store

I'm experimenting with a streams application that does a KStream-
GlobalKTable join, and I'm seeing some unexpected behavior when re-
running the application. First, the offsets of the topic backing the
GlobalKTable do not appear to be checkpointed to a file as I expected.
This results in the RocksDB store being rebuilt every time I run the
app, which is time-consuming. After some investigation, it appears that
the offset map written to the checkpoint file is only updated once the
application reaches a steady state, so the initial state of the global
table after a restore is never checkpointed unless additional messages
arrive. As a result, the entire topic backing the global table is
re-read and re-inserted into the same RocksDB instance every time the
app starts, which makes the store very large (since it then contains
multiple copies of every message) and triggers lots of compactions. Is
this intended? Should I open a JIRA?
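For context, the checkpoint mechanism being discussed is a small file that maps each topic-partition to the last restored offset, so a restart can resume from there instead of replaying the whole topic into RocksDB. Below is a minimal, self-contained sketch of such a file. It is modeled loosely on the layout of Kafka's internal checkpoint files (a version line, an entry count, then one "topic partition offset" line per entry), but the class and method names here are illustrative, not the actual Kafka Streams API.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of an offset checkpoint file for a global store.
// Keys are "topic partition" strings; values are the last restored offsets.
public class OffsetCheckpointSketch {

    // Persist the offsets so a later restart can skip re-reading the topic.
    static void write(Path file, Map<String, Long> offsets) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(file)) {
            w.write("0\n");                      // format version
            w.write(offsets.size() + "\n");      // number of entries
            for (Map.Entry<String, Long> e : offsets.entrySet()) {
                // one line per partition: "topic partition offset"
                w.write(e.getKey() + " " + e.getValue() + "\n");
            }
        }
    }

    // Read the checkpoint back. An absent file means "no checkpoint yet",
    // which is exactly the case that forces a full restore of the store.
    static Map<String, Long> read(Path file) throws IOException {
        Map<String, Long> offsets = new HashMap<>();
        if (!Files.exists(file)) {
            return offsets;
        }
        List<String> lines = Files.readAllLines(file);
        int count = Integer.parseInt(lines.get(1).trim());
        for (int i = 2; i < 2 + count; i++) {
            String[] parts = lines.get(i).trim().split(" ");
            offsets.put(parts[0] + " " + parts[1], Long.parseLong(parts[2]));
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("global-store", ".checkpoint");
        write(file, Map.of("global-topic 0", 42L));
        System.out.println(read(file));
    }
}
```

The behavior described above amounts to this file never being written after the initial restore completes, so on the next start `read` finds nothing (or stale data) and the restore consumer begins again from the earliest offset.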


--


    Tommy Becker

    Senior Software Engineer

    O +1 919.460.4747

    tivo.com


________________________________

This email and any attachments may contain confidential and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments) by others is prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete this email and any attachments. No employee or agent of TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo Inc. may only be made by a signed written agreement.