Posted to dev@kafka.apache.org by "Amit Daga (JIRA)" <ji...@apache.org> on 2017/05/08 19:19:04 UTC

[jira] [Commented] (KAFKA-4479) Streams tests should pass without hardcoded Time.SYSTEM in GroupCoordinator

    [ https://issues.apache.org/jira/browse/KAFKA-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001370#comment-16001370 ] 

Amit Daga commented on KAFKA-4479:
----------------------------------

[~ijuma] Initial findings: after debugging for a while, I found that the test fails before we even try to close the stream runnables. The lines after [1] are never executed.

The test report is attached for your reference (test.zip). Please let me know your thoughts on this.

[1] https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/QueryableStateIntegrationTest.java#L357

> Streams tests should pass without hardcoded Time.SYSTEM in GroupCoordinator
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-4479
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4479
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Ismael Juma
>            Assignee: Amit Daga
>            Priority: Minor
>              Labels: newbie, newbie++
>         Attachments: test.zip
>
>
> If we pass `KafkaServer.time` to `GroupCoordinator`[1], some streams tests like QueryableStateIntegrationTest fail semi-regularly. [~damianguy] looked into it and described it as:
> {quote}
> Looking at the sequence of events, one thread is stopped, and hence leaves the group triggering a rebalance, but the other thread doesn’t seem to get the memo, tries to commit, fails, and then game-over.
> So.. the case that it fails the one alive thread is not getting a rebalance. This would happen during a `poll(..)` right? However I can see the thread is polling many times after the other thread has shutdown.
> It tries to commit every time around the loop, so:
> poll(..)
> process(..)
> maybeCommit(..)
> and there is like < 10ms between calls to `poll`.
> {quote}
> A theory was that the mock time was not advancing enough to trigger a rebalance in the group coordinator. However, the consumer is closed, so that should trigger a `LeaveGroup` request and it's unclear why a rebalance is not triggered for the live consumer.
> PR where this issue was first seen and discussed: https://github.com/apache/kafka/pull/2095
> [1] https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L222
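
The mock-time theory from the description can be illustrated with a small, self-contained sketch. It does not use any Kafka classes: `ManualClock` and `SessionTracker` are hypothetical stand-ins for a mock `Time` and a coordinator-side session check, and the 10-second timeout is an arbitrary assumption. It only shows why a timeout-driven rebalance would never fire if the mock clock is not advanced. As the description notes, this alone would not explain why an explicit `LeaveGroup` fails to trigger a rebalance.

```java
import java.util.concurrent.atomic.AtomicLong;

public class MockTimeSketch {

    /** Hypothetical manual clock: time moves only when advance() is called. */
    static final class ManualClock {
        private final AtomicLong nowMs = new AtomicLong(0);

        long milliseconds() {
            return nowMs.get();
        }

        void advance(long ms) {
            nowMs.addAndGet(ms);
        }
    }

    /** Hypothetical stand-in for a session-timeout check driven by the injected clock. */
    static final class SessionTracker {
        private final ManualClock clock;
        private final long sessionTimeoutMs;
        private final long lastHeartbeatMs;

        SessionTracker(ManualClock clock, long sessionTimeoutMs) {
            this.clock = clock;
            this.sessionTimeoutMs = sessionTimeoutMs;
            this.lastHeartbeatMs = clock.milliseconds();
        }

        /** Expired only once the injected clock has moved past the timeout. */
        boolean isExpired() {
            return clock.milliseconds() - lastHeartbeatMs >= sessionTimeoutMs;
        }
    }

    public static void main(String[] args) {
        ManualClock clock = new ManualClock();
        SessionTracker member = new SessionTracker(clock, 10_000L); // assumed timeout

        // Stand-in for many poll/process/maybeCommit iterations: real time
        // passes, but nothing advances the manual clock.
        for (int i = 0; i < 1_000; i++) {
            // poll(..); process(..); maybeCommit(..);
        }
        System.out.println("expired without advancing clock: " + member.isExpired());

        // Only an explicit advance makes the timeout observable.
        clock.advance(10_000L);
        System.out.println("expired after advancing clock: " + member.isExpired());
    }
}
```

With a system clock the first check would eventually flip on its own. With an injected manual clock it depends entirely on the test advancing time.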


