Posted to jira@kafka.apache.org by "Sagar Rao (Jira)" <ji...@apache.org> on 2021/09/20 17:21:00 UTC

[jira] [Assigned] (KAFKA-13296) Verify old assignment within StreamsPartitionAssignor

     [ https://issues.apache.org/jira/browse/KAFKA-13296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sagar Rao reassigned KAFKA-13296:
---------------------------------

    Assignee: Sagar Rao

> Verify old assignment within StreamsPartitionAssignor
> -----------------------------------------------------
>
>                 Key: KAFKA-13296
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13296
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Sagar Rao
>            Priority: Major
>
> `StreamsPartitionAssignor` is responsible for assigning partitions and tasks to all StreamsThreads within an application.
> While it takes care not to assign a single partition/task to two threads, there is only limited verification of this. In particular, we had one incident in which a zombie thread/consumer did not clean up its own internal state correctly due to KAFKA-12983. This unclean zombie state meant that the _old assignment_ reported to `StreamsPartitionAssignor` contained a single partition for two consumers. As a result, both threads/consumers later revoked the same partition and the zombie thread could commit its unclean work (even though it should have been fenced), leading to duplicate output under EOS_v2.
> We should consider adding a check to `StreamsPartitionAssignor` that the _old assignment_ is valid, i.e., no partition should be missing and no partition should be assigned to two consumers. If the check fails, we should log the invalid _old assignment_ and send an error code back to all consumers indicating that they should shut down "unclean" (i.e., without flushing and without committing any offsets or transactions).
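
The check proposed above could look roughly like the following. This is only a minimal sketch, not code from Kafka: the class `OldAssignmentCheck`, its `Result` type, and the use of plain strings for consumer ids and partitions (instead of Kafka's own types) are assumptions made for illustration, and it does not show how the assignor would log the result or encode an error for the consumers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Kafka: validates the previously owned
// partitions that each consumer reports at the start of a rebalance.
final class OldAssignmentCheck {

    // Hypothetical result type holding both kinds of violations.
    static final class Result {
        // partition -> all consumers that claim to own it (only if more than one)
        final Map<String, List<String>> duplicates;
        // partitions that no consumer claims to own
        final Set<String> missing;

        Result(Map<String, List<String>> duplicates, Set<String> missing) {
            this.duplicates = duplicates;
            this.missing = missing;
        }

        boolean isValid() {
            return duplicates.isEmpty() && missing.isEmpty();
        }
    }

    // ownedByConsumer: consumer id -> partitions in its reported old assignment.
    // allPartitions: every partition that is expected to have an owner.
    static Result validate(Map<String, Set<String>> ownedByConsumer,
                           Set<String> allPartitions) {
        // Invert the map so that each partition lists its claimed owners.
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : ownedByConsumer.entrySet()) {
            for (String partition : entry.getValue()) {
                owners.computeIfAbsent(partition, k -> new ArrayList<>())
                      .add(entry.getKey());
            }
        }

        // A partition claimed by more than one consumer is a violation.
        Map<String, List<String>> duplicates = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : owners.entrySet()) {
            if (entry.getValue().size() > 1) {
                duplicates.put(entry.getKey(), entry.getValue());
            }
        }

        // A partition claimed by nobody is reported as missing.
        Set<String> missing = new TreeSet<>(allPartitions);
        missing.removeAll(owners.keySet());

        return new Result(duplicates, missing);
    }
}
```

In the incident described above, the zombie and the live consumer would both report the same partition, so `duplicates` would be non-empty and `isValid()` would return false. Whether an unowned partition should always count as an error (for example, right after new members join) is a design question this sketch leaves open.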



--
This message was sent by Atlassian Jira
(v8.3.4#803005)