Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2015/03/11 20:31:38 UTC

[jira] [Resolved] (SAMZA-591) Slow down reconnects in KafkaSystemConsumer

     [ https://issues.apache.org/jira/browse/SAMZA-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini resolved SAMZA-591.
-----------------------------------
    Resolution: Not a Problem

I know why. It's because all BrokerProxies are calling refreshBrokers simultaneously due to line 140 of BrokerProxy.scala. When a partition is dropped, they all trigger a refresh. The refresh is thread safe, but it results in the same dropped partition getting checked many times in quick succession. Each thread uses exponential backoff, but if 20 BP threads all start with 100ms delays, we can expect to see this.
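
As a rough back-of-the-envelope check, here is a small sketch of how many refresh attempts that adds up to in the first second. This is not Samza code. The doubling factor, the lack of jitter, and every thread starting at the same instant are assumptions made for illustration; only the 20 threads and the 100ms initial delay come from the description above. The function names are hypothetical.

```python
def attempt_times(initial_delay_ms, factor, horizon_ms):
    """Times (ms) at which one thread attempts a refresh, up to horizon_ms.

    Assumes an immediate first attempt at t=0, then a delay that is
    multiplied by `factor` after every failed attempt (no jitter, no cap).
    """
    times = [0]
    delay = initial_delay_ms
    t = delay
    while t <= horizon_ms:
        times.append(t)
        delay *= factor
        t += delay
    return times


def total_attempts(num_threads, initial_delay_ms, factor, horizon_ms):
    """Total attempts across threads that all start backing off together."""
    return num_threads * len(attempt_times(initial_delay_ms, factor, horizon_ms))


if __name__ == "__main__":
    # One thread alone: attempts at 0, 100, 300 and 700 ms.
    print(attempt_times(100, 2, 1000))
    # Twenty threads in lockstep: 20 * 4 attempts within the first second.
    print(total_attempts(20, 100, 2, 1000))
```

Under those assumptions a single thread only makes four attempts in the first second, which is reasonable on its own. It is the multiplication across threads that produces the burst of log lines.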

Closing as not a problem.

> Slow down reconnects in KafkaSystemConsumer
> -------------------------------------------
>
>                 Key: SAMZA-591
>                 URL: https://issues.apache.org/jira/browse/SAMZA-591
>             Project: Samza
>          Issue Type: Bug
>          Components: kafka
>    Affects Versions: 0.9.0
>            Reporter: Chris Riccomini
>            Assignee: Chris Riccomini
>              Labels: newbie
>             Fix For: 0.9.0
>
>         Attachments: SAMZA-591-0.patch
>
>
> During a preferred leadership election in Kafka, I see a ton of these messages:
> {noformat}
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> 2015-03-11 17:25:40 GetOffset [INFO] Validating offset 1625488 for topic and partition [topic1,7]
> 2015-03-11 17:25:40 KafkaSystemConsumer [WARN] While refreshing brokers for [topic1,7]: kafka.common.NotLeaderForPartitionException. Retrying.
> {noformat}
> It looks like we are not pausing between retries in KafkaSystemConsumer.refreshBrokers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)