Posted to jira@kafka.apache.org by "A. Sophie Blee-Goldman (Jira)" <ji...@apache.org> on 2021/02/05 22:59:01 UTC

[jira] [Resolved] (KAFKA-10678) Re-deploying Streams app causes rebalance and task migration

     [ https://issues.apache.org/jira/browse/KAFKA-10678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

A. Sophie Blee-Goldman resolved KAFKA-10678.
--------------------------------------------
      Assignee: A. Sophie Blee-Goldman
    Resolution: Fixed

Resolved via [https://github.com/apache/kafka/pull/9978]

[~thebearmayor] this should be fixed in the upcoming 2.8.0 and 2.6.2 releases, which are currently in progress (and in 2.7.1, though I'm not sure of the schedule for that yet). If/when you're able to upgrade to one of these, please verify that the task shuffling due to redeployment has been mitigated. And obviously, reopen this ticket if not – thanks!

> Re-deploying Streams app causes rebalance and task migration
> ------------------------------------------------------------
>
>                 Key: KAFKA-10678
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10678
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.6.0, 2.6.1
>            Reporter: Bradley Peterson
>            Assignee: A. Sophie Blee-Goldman
>            Priority: Major
>             Fix For: 2.8.0, 2.7.1, 2.6.2
>
>         Attachments: after, before, broker
>
>
> Re-deploying our Streams app causes a rebalance, even when using static group membership. Worse, the rebalance creates standby tasks, even when the previous task assignment was balanced and stable.
> Our app is currently using Streams 2.6.1-SNAPSHOT (due to [KAFKA-10633]) but we saw the same behavior in 2.6.0. The app runs on 4 EC2 instances, each with 4 streams threads, and data stored on persistent EBS volumes. During a redeploy, all EC2 instances are stopped, new instances are launched, and the EBS volumes are attached to the new instances. We do not use interactive queries. {{session.timeout.ms}} is set to 30 minutes, and the deployment finishes well under that. {{num.standby.replicas}} is 0.
> h2. Expected Behavior
> Given a stable and balanced task assignment prior to deploying, we expect to see the same task assignment after deploying. Even if a rebalance is triggered, we do not expect to see new standby tasks.
> h2. Observed Behavior
> Attached are the "Assigned tasks to clients" log lines from before and after deploying. The "before" lines are from over 24 hours ago: the task assignment is well balanced, and "Finished stable assignment of tasks, no followup rebalances required." is logged. The "after" log lines show the same assignment of active tasks, plus some additional standby tasks. There are also log lines about adding and removing active tasks, which I don't quite understand.
> I've also included logs from the broker showing the rebalance was triggered for "Updating metadata".
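
For readers trying to reproduce the setup in the quoted report, the settings it names could be expressed roughly as below. This is only a sketch under assumptions, not the reporter's actual code: the class name, the helper method, and the instance id value are hypothetical, and a real Streams app would additionally need {{application.id}} and {{bootstrap.servers}}, which the report does not give. Plain string keys are used so the snippet runs without the Kafka client on the classpath.

```java
import java.time.Duration;
import java.util.Properties;

public class RedeployConfigSketch {

    // Builds only the settings mentioned in the report.
    // instanceId is a hypothetical per-host identifier; the report does not
    // say which values were actually used.
    static Properties describedConfig(String instanceId) {
        Properties props = new Properties();
        // Static group membership: each instance keeps a stable id across restarts.
        props.setProperty("group.instance.id", instanceId);
        // 30 minutes, as stated in the report.
        props.setProperty("session.timeout.ms",
                String.valueOf(Duration.ofMinutes(30).toMillis()));
        // No standby replicas are configured, per the report.
        props.setProperty("num.standby.replicas", "0");
        // 4 stream threads per instance, per the report.
        props.setProperty("num.stream.threads", "4");
        return props;
    }

    public static void main(String[] args) {
        // "streams-host-1" is an illustrative value only.
        Properties props = describedConfig("streams-host-1");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With {{num.standby.replicas}} at 0, as here, the standby tasks in the "after" logs are not something this configuration asks for.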



--
This message was sent by Atlassian Jira
(v8.3.4#803005)