Posted to jira@kafka.apache.org by "Justine Olshan (Jira)" <ji...@apache.org> on 2021/07/18 03:12:00 UTC

[jira] [Updated] (KAFKA-13102) Topic IDs not propagated to metadata cache quickly enough for Fetch path

     [ https://issues.apache.org/jira/browse/KAFKA-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justine Olshan updated KAFKA-13102:
-----------------------------------
    Description: 
Currently, the fetch path for replicas relies on the topic IDs in the metadata cache. However, topic ID information is propagated through the UpdateMetadata request, which is too slow. At first the topic has no ID in the metadata cache, so we send an older version of the request; then we get the ID and have to close the session. This will likely happen on broker startup and with new topics. This has resulted in more partitions in error and frequent closing of sessions, and has made tests like ConsumerBounceTest#testCloseDuringRebalance extremely flaky.

A quick test with topic IDs stored in the replica manager during the handling of LeaderAndIsr (LISR) requests showed significantly fewer errors and made ConsumerBounceTest#testCloseDuringRebalance much less flaky (passing 50/50 runs vs. 11/50 runs).

The task now is figuring out the best strategy to store topic IDs for the fetch path using the IDs from the LISR request.

  was:
Currently, the fetch path for replicas relies on the topic IDs in the metadata cache. However, topic ID information is propagated through the UpdateMetadata request, which is too slow. This has resulted in more partitions in error and frequent closing of sessions, and has made tests like ConsumerBounceTest#testCloseDuringRebalance extremely flaky.

A quick test with topic IDs stored in the replica manager during the handling of LISR requests showed significantly fewer errors and made ConsumerBounceTest#testCloseDuringRebalance much less flaky (passing 50/50 runs vs. 11/50 runs).

The task now is figuring out the best strategy to store topic IDs for the fetch path using the IDs from the LISR request.


> Topic IDs not propagated to metadata cache quickly enough for Fetch path
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-13102
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13102
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Justine Olshan
>            Assignee: Justine Olshan
>            Priority: Major
>
> Currently, the fetch path for replicas relies on the topic IDs in the metadata cache. However, topic ID information is propagated through the UpdateMetadata request, which is too slow. At first the topic has no ID in the metadata cache, so we send an older version of the request; then we get the ID and have to close the session. This will likely happen on broker startup and with new topics. This has resulted in more partitions in error and frequent closing of sessions, and has made tests like ConsumerBounceTest#testCloseDuringRebalance extremely flaky.
> A quick test with topic IDs stored in the replica manager during the handling of LeaderAndIsr (LISR) requests showed significantly fewer errors and made ConsumerBounceTest#testCloseDuringRebalance much less flaky (passing 50/50 runs vs. 11/50 runs).
> The task now is figuring out the best strategy to store topic IDs for the fetch path using the IDs from the LISR request.
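To make the failure mode concrete, here is a rough, self-contained sketch of the behavior described above. It is not Kafka code: TopicIdStore, FetchSession and sessionsClosed are hypothetical names made up for illustration, and the "session" is reduced to a flag that records whether the last fetch was sent with a topic ID. It only shows why an ID that arrives between two fetches forces a session close, while an ID stored when the LISR request is handled does not.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class TopicIdSketch {

    // Hypothetical stand-in for wherever the fetch path looks up topic IDs
    // (the metadata cache today, or IDs kept by the replica manager).
    static final class TopicIdStore {
        private final Map<String, UUID> ids = new ConcurrentHashMap<>();

        void put(String topic, UUID id) { ids.put(topic, id); }

        Optional<UUID> get(String topic) { return Optional.ofNullable(ids.get(topic)); }
    }

    // Hypothetical, heavily simplified fetch session: it only remembers
    // whether the previous fetch was sent with a topic ID or without one.
    static final class FetchSession {
        private boolean open = false;
        private boolean usesTopicIds = false;
        private int closedCount = 0;

        void fetch(Optional<UUID> topicId) {
            boolean wantIds = topicId.isPresent();
            if (open && usesTopicIds != wantIds) {
                // The request format changed mid-session, so the old session is closed.
                closedCount++;
            }
            open = true;
            usesTopicIds = wantIds;
        }

        int closedCount() { return closedCount; }
    }

    // Simulates two consecutive fetches for a new topic.
    // storeIdOnLeaderAndIsr = true: the ID is stored before the first fetch.
    // storeIdOnLeaderAndIsr = false: the ID only shows up between the two
    // fetches, as with a late UpdateMetadata request.
    static int sessionsClosed(boolean storeIdOnLeaderAndIsr) {
        TopicIdStore store = new TopicIdStore();
        FetchSession session = new FetchSession();
        UUID id = UUID.randomUUID();

        if (storeIdOnLeaderAndIsr) {
            store.put("topic-a", id);
        }
        session.fetch(store.get("topic-a"));

        store.put("topic-a", id); // late arrival; no effect if already stored
        session.fetch(store.get("topic-a"));

        return session.closedCount();
    }

    public static void main(String[] args) {
        System.out.println("ID arrives late:        sessions closed = " + sessionsClosed(false));
        System.out.println("ID stored at LISR time: sessions closed = " + sessionsClosed(true));
    }
}
```

In this toy model the late-arriving ID costs one closed session per topic, which matches the symptom in the description. Where the IDs should actually be stored for the fetch path is the open question of this ticket and is not something the sketch tries to answer.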



--
This message was sent by Atlassian Jira
(v8.3.4#803005)