You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Nicholas Telford (Jira)" <ji...@apache.org> on 2022/01/28 14:54:00 UTC

[jira] [Created] (KAFKA-13627) Topology changes shouldn't require a full reset of local state

Nicholas Telford created KAFKA-13627:
----------------------------------------

             Summary: Topology changes shouldn't require a full reset of local state
                 Key: KAFKA-13627
                 URL: https://issues.apache.org/jira/browse/KAFKA-13627
             Project: Kafka
          Issue Type: Improvement
          Components: streams
    Affects Versions: 3.1.0
            Reporter: Nicholas Telford


[KIP-816|https://cwiki.apache.org/confluence/display/KAFKA/KIP-816%3A+Topology+changes+without+local+state+reset]

When changes are made to a Topology that modifies its structure, users must use the Application Reset tool to reset the local state of their application prior to deploying the change. Consequently, these changes require rebuilding all local state stores from their changelog topics in Kafka.

The time and cost of rebuilding state stores is determined by the size of the state stores, and their recent write history, as rebuilding a store entails replaying all recent writes to the store. For applications that have very large stores, or stores with extremely high write-rates, the time and cost of rebuilding all state in the application can be prohibitively expensive. This is a significant barrier to building highly scalable applications with good availability.

Changes to the Topology that do not directly affect a state store should not require the local state of that store to be reset/deleted. This would allow applications to scale to very large data sets, whilst permitting the application behaviour to evolve over time.
h1. Background

Tasks in a Kafka Streams Topology are logically grouped by “Topic Group'' (aka. Subtopology). Topic Groups are assigned an ordinal (number), based on their position in the Topology. This Topic Group ordinal is used as the prefix for all Task IDs: {{{}<topic-group-ordinal>_<partition-number>{}}}, e.g. {{2_14}}

If new Topic Groups are added, old Topic Groups are removed, or existing Topic Groups are re-arranged, this can cause the assignment of ordinals to change {_}even for Topic Groups that have not been modified{_}.

When the assignment of ordinals to Topic Groups changes, existing Tasks are invalidated, as they no longer correspond to the correct Topic Groups. Local state is located in directories that include the Task ID (e.g. {{{}/state/dir/2_14/mystore/rocksdb/…{}}}), and since the Tasks have all been invalidated, all existing local state directories are also invalid.

Attempting to start an application that has undergone these ordinal changes, without first clearing the local state, will cause Kafka Streams to attempt to use the existing local state for the wrong Tasks. Kafka Streams detects this discrepancy and prevents the application from starting.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)