Posted to jira@kafka.apache.org by "Guozhang Wang (Jira)" <ji...@apache.org> on 2020/04/01 21:53:00 UTC

[jira] [Created] (KAFKA-9801) Static member could get empty assignment unexpectedly

Guozhang Wang created KAFKA-9801:
------------------------------------

             Summary: Static member could get empty assignment unexpectedly
                 Key: KAFKA-9801
                 URL: https://issues.apache.org/jira/browse/KAFKA-9801
             Project: Kafka
          Issue Type: Bug
          Components: consumer, streams
    Affects Versions: 2.4.0
            Reporter: Guozhang Wang
            Assignee: Guozhang Wang


Take the following example trace where static members are joining the group:

1. A static member with instance A joins the group with an empty member.id; the coordinator generates member.id 1 for A and adds it to the group. The group state is PreparingRebalance.

2. The group is formed and now we move on to CompletingRebalance.

3. Another member joins the group, causing it to transition back to PreparingRebalance, which could send a REBALANCE_IN_PROGRESS error to member A as well.

4. Member A gets the REBALANCE_IN_PROGRESS error and tries to re-join (again with an empty member.id).

5. The group is not advanced to CompletingRebalance again.

6. The group gets the second join-group request from the known instance A with an empty member.id, generates a new member.id 2, and replaces member.id 1 with it.

7. The group gets the assignment from leader which only includes member.id 1 and not member.id 2.

8. The assignment for member.id 1 is dropped on the broker side while the assignment for member.id 2 is set to an empty byte array.

9. The empty byte array is sent back to instance A, causing the following error:

{code}
[2020-03-27T21:13:01-05:00] (streams-soak-2-5_soak_i-054b83e98b7ed6285_streamslog) org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'version': java.nio.BufferUnderflowException
	at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:110)
{code}
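
As a rough illustration of why an empty byte array fails at the 'version' field, here is a minimal sketch. It assumes the assignment starts with a 16-bit version; readVersion is a hypothetical stand-in, not Kafka's actual deserializer.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

public class EmptyAssignmentRead {
    // Hypothetical stand-in for reading the leading 'version' field of a
    // serialized assignment (assumed here to be a 16-bit integer).
    static short readVersion(ByteBuffer assignment) {
        return assignment.getShort();
    }

    public static void main(String[] args) {
        // The broker sent back an empty byte array as the assignment.
        ByteBuffer empty = ByteBuffer.wrap(new byte[0]);
        try {
            readVersion(empty);
        } catch (BufferUnderflowException e) {
            // There are no bytes to read, so the very first field underflows.
            System.out.println("underflow while reading 'version'");
        }
    }
}
```

In the real client the underflow is wrapped in a SchemaException, as the log line above shows.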

This error is only triggered when several conditions line up, which is why it has not been observed very frequently.
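
To make the sequence in the trace concrete, the sketch below is a toy model of the coordinator's bookkeeping. All class, method and field names are hypothetical and the logic is simplified from the steps described above; it is not the broker's actual code.

```java
import java.util.HashMap;
import java.util.Map;

public class StaticMemberTrace {
    // group.instance.id -> current member.id (hypothetical bookkeeping)
    private final Map<String, String> instanceToMemberId = new HashMap<>();
    // member.id -> assignment bytes; empty until the leader provides one
    private final Map<String, byte[]> assignments = new HashMap<>();
    private int nextId = 1;

    // Join-group with an empty member.id: a fresh member.id is generated
    // and replaces any previous one registered for the same instance.
    public String join(String instanceId) {
        String memberId = String.valueOf(nextId++);
        String old = instanceToMemberId.put(instanceId, memberId);
        if (old != null) {
            assignments.remove(old);
        }
        assignments.put(memberId, new byte[0]);
        return memberId;
    }

    // Sync-group from the leader: entries for unknown member.ids are
    // dropped, and known members the leader did not cover get an empty array.
    public void sync(Map<String, byte[]> fromLeader) {
        assignments.replaceAll((id, previous) -> fromLeader.getOrDefault(id, new byte[0]));
    }

    public byte[] assignmentFor(String instanceId) {
        return assignments.get(instanceToMemberId.get(instanceId));
    }

    public static void main(String[] args) {
        StaticMemberTrace group = new StaticMemberTrace();
        String first = group.join("A");   // step 1: member.id 1
        Map<String, byte[]> leaderAssignment = new HashMap<>();
        leaderAssignment.put(first, new byte[]{0, 1}); // leader only knows member.id 1
        group.join("A");                  // step 6: member.id 2 replaces 1
        group.sync(leaderAssignment);     // steps 7-8
        System.out.println("assignment length for A: " + group.assignmentFor("A").length);
    }
}
```

In this model instance A ends up with a zero-length assignment, because the leader computed its assignment against the member.id that was later replaced.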

Personally I think this error is also correlated with the observed https://issues.apache.org/jira/browse/KAFKA-9659, which I will keep investigating (and will update in this ticket).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)