Posted to dev@kafka.apache.org by "Amit Daga (JIRA)" <ji...@apache.org> on 2017/05/08 19:19:04 UTC
[jira] [Commented] (KAFKA-4479) Streams tests should pass without
hardcoded Time.SYSTEM in GroupCoordinator
[ https://issues.apache.org/jira/browse/KAFKA-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001370#comment-16001370 ]
Amit Daga commented on KAFKA-4479:
----------------------------------
[~ijuma] Initial findings: after debugging for a while, I found that the test fails before we even try to close the stream runnables. The lines after [1] are never executed.
The test report is attached for your reference (test.zip). Please let me know your thoughts on this.
[1] https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/QueryableStateIntegrationTest.java#L357
> Streams tests should pass without hardcoded Time.SYSTEM in GroupCoordinator
> ---------------------------------------------------------------------------
>
> Key: KAFKA-4479
> URL: https://issues.apache.org/jira/browse/KAFKA-4479
> Project: Kafka
> Issue Type: Improvement
> Reporter: Ismael Juma
> Assignee: Amit Daga
> Priority: Minor
> Labels: newbie, newbie++
> Attachments: test.zip
>
>
> If we pass `KafkaServer.time` to `GroupCoordinator`[1], some streams tests like QueryableStateIntegrationTest fail semi-regularly. [~damianguy] looked into it and described it as:
> {quote}
> Looking at the sequence of events, one thread is stopped, and hence leaves the group triggering a rebalance, but the other thread doesn’t seem to get the memo, tries to commit, fails, and then game-over.
> So, in the case where it fails, the one alive thread is not getting a rebalance. This would happen during a `poll(..)`, right? However, I can see the thread is polling many times after the other thread has shut down.
> It tries to commit every time around the loop, so:
> poll(..)
> process(..)
> maybeCommit(..)
> and there is like < 10ms between calls to `poll`.
> {quote}
> A theory was that the mock time was not advancing enough to trigger a rebalance in the group coordinator. However, the consumer is closed, so that should trigger a `LeaveGroup` request and it's unclear why a rebalance is not triggered for the live consumer.
> PR where this issue was first seen and discussed: https://github.com/apache/kafka/pull/2095
> [1] https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L222
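To make the mock-time theory from the description concrete, here is a minimal sketch of why a clock that never advances would stop a timeout-driven rebalance, while an explicit leave should still trigger one. Every name in it (`ManualClock`, `ToyCoordinator`, the member ids, the 10-second session timeout) is hypothetical and chosen for illustration only. It is not Kafka's `GroupCoordinator`, `MockTime`, or `LeaveGroup` handling, and it makes no claim about the actual cause of the failure.

```java
import java.util.HashMap;
import java.util.Map;

public class MockTimeSketch {

    /** Hypothetical manual clock: time only moves when sleep() is called. */
    static final class ManualClock {
        private long nowMs = 0L;
        long milliseconds() { return nowMs; }
        void sleep(long ms) { nowMs += ms; }
    }

    /** Toy group coordinator (an assumption for illustration, not Kafka's). */
    static final class ToyCoordinator {
        private final ManualClock clock;
        private final long sessionTimeoutMs;
        private final Map<String, Long> lastHeartbeatMs = new HashMap<>();
        private int generation = 0; // bumped on every simulated rebalance

        ToyCoordinator(ManualClock clock, long sessionTimeoutMs) {
            this.clock = clock;
            this.sessionTimeoutMs = sessionTimeoutMs;
        }

        void join(String memberId) {
            lastHeartbeatMs.put(memberId, clock.milliseconds());
            generation++;
        }

        void heartbeat(String memberId) {
            // Only refresh members that are still in the group.
            lastHeartbeatMs.computeIfPresent(memberId, (id, last) -> clock.milliseconds());
        }

        /** Explicit leave: rebalances immediately, independent of the clock. */
        void leave(String memberId) {
            if (lastHeartbeatMs.remove(memberId) != null) {
                generation++;
            }
        }

        /** Timeout path: only fires once the clock has moved far enough. */
        void expireDeadMembers() {
            long now = clock.milliseconds();
            boolean removed = lastHeartbeatMs.values()
                    .removeIf(last -> now - last >= sessionTimeoutMs);
            if (removed) {
                generation++;
            }
        }

        int generation() { return generation; }
        int memberCount() { return lastHeartbeatMs.size(); }
    }

    public static void main(String[] args) {
        ManualClock clock = new ManualClock();
        ToyCoordinator coordinator = new ToyCoordinator(clock, 10_000L);
        coordinator.join("thread-1");
        coordinator.join("thread-2");

        // thread-2 stops silently and the clock is never advanced:
        // the timeout path cannot detect it.
        int before = coordinator.generation();
        coordinator.expireDeadMembers();
        System.out.println("rebalance with frozen clock: "
                + (coordinator.generation() != before));

        // Once the clock advances past the session timeout, and only
        // thread-1 has sent a heartbeat, thread-2 is expired.
        clock.sleep(10_000L);
        coordinator.heartbeat("thread-1");
        coordinator.expireDeadMembers();
        System.out.println("rebalance after advancing clock: "
                + (coordinator.generation() != before));

        // An explicit leave does not depend on the clock at all.
        int beforeLeave = coordinator.generation();
        coordinator.leave("thread-1");
        System.out.println("rebalance on explicit leave: "
                + (coordinator.generation() != beforeLeave));
    }
}
```

In this toy model the explicit leave path is independent of the clock, which matches the reasoning in the description: a closed consumer should cause a rebalance even if mock time never advances, so a frozen clock alone would not explain the failure.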