Posted to jira@kafka.apache.org by "Randall Hauch (Jira)" <ji...@apache.org> on 2020/06/24 20:04:02 UTC

[jira] [Updated] (KAFKA-9624) test_throttled_reassignment as EndToEndTest

     [ https://issues.apache.org/jira/browse/KAFKA-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Hauch updated KAFKA-9624:
---------------------------------
    Fix Version/s:     (was: 2.6.0)
                   2.7.0

Since this is not a blocker issue, as part of the 2.6.0 release process I'm changing the fix version to `2.7.0`. If this is incorrect, please respond and discuss on the "[DISCUSS] Apache Kafka 2.6.0 release" discussion mailing list thread.

> test_throttled_reassignment as EndToEndTest
> -------------------------------------------
>
>                 Key: KAFKA-9624
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9624
>             Project: Kafka
>          Issue Type: Bug
>          Components: system tests
>    Affects Versions: 2.4.1
>            Reporter: Matthew Wong
>            Priority: Minor
>              Labels: test
>             Fix For: 2.7.0
>
>
> The test_throttled_reassignment test fails because the consumer that is used to validate reassignment does not start in time to consume all messages. This does not seem to be an issue with the throttling of the reassignment, since increasing the timeout allowed the test to pass in multiple consecutive runs locally. This test seemed to rely on the default JmxTool for the console consumer, which was removed in this commit: [{{179d0d7}}|https://github.com/apache/kafka/commit/179d0d73d65ab2c3eb8bc79c70b9893f07038447]
> With the JmxTool, the console consumer would check whether it had partitions assigned to it before beginning to consume. Although the test occasionally failed with the JmxTool, it began to fail much more often after the removal. The failure messages followed the format below, with varying numbers of missed messages. The missed messages are the first messages sent by the producer.
> ```535 acked message did not make it to the Consumer. They are: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19...plus 515 more. Total Acked: 192792, Total Consumed: 192259. We validated that the first 535 of these missing messages correctly made it into Kafka's data files. This suggests they were lost on their way to the consumer.```
> In the scope of the test, this error suggests that the test is hitting the race condition described in produce_consume_validate.py, which uses a timeout to prevent the consumer from missing initial messages. Rewriting this test as an EndToEndTest allows it to use the EndToEndTest's verifiable consumer, which can await partition assignment, addressing the race condition.
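
The race described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: it uses neither Kafka nor ducktape, and `FakeConsumer`, `run`, and the local `wait_until` helper are hypothetical names invented for illustration, not the test's actual code. It assumes a consumer that only sees messages sent after its partition assignment completes, and compares producing immediately with producing only after the assignment has been awaited.

```python
import threading
import time


def wait_until(condition, timeout_sec, backoff_sec=0.01, err_msg=""):
    """Hypothetical helper: poll condition() until it is true or time runs out."""
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if condition():
            return
        time.sleep(backoff_sec)
    raise TimeoutError(err_msg)


class FakeConsumer:
    """Toy stand-in for a consumer; it drops messages sent before assignment."""

    def __init__(self, assignment_delay_sec):
        self.assigned = threading.Event()
        self.consumed = []
        self._delay = assignment_delay_sec

    def start(self):
        # Partition assignment completes some time after startup.
        threading.Timer(self._delay, self.assigned.set).start()

    def deliver(self, msg):
        # Assumption: anything produced before assignment is never seen.
        if self.assigned.is_set():
            self.consumed.append(msg)


def run(await_assignment):
    """Produce 100 messages and return the acked ones the consumer missed."""
    consumer = FakeConsumer(assignment_delay_sec=0.05)
    consumer.start()
    if await_assignment:
        # Block the producer until the consumer reports an assignment.
        wait_until(consumer.assigned.is_set, timeout_sec=5,
                   err_msg="consumer never received a partition assignment")
    acked = list(range(100))
    for msg in acked:
        consumer.deliver(msg)
    return [m for m in acked if m not in consumer.consumed]
```

In this model, `run(await_assignment=False)` loses the earliest messages, which mirrors the shape of the failure quoted above, while `run(await_assignment=True)` loses none. A fixed startup timeout sits between the two: it only helps when it happens to be longer than the assignment delay.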



--
This message was sent by Atlassian Jira
(v8.3.4#803005)