Posted to commits@samza.apache.org by "Jakob Homan (JIRA)" <ji...@apache.org> on 2014/10/23 00:29:34 UTC

[jira] [Commented] (SAMZA-440) UnknownTopicOrPartitionCode results in infinite loop in BrokerProxy

    [ https://issues.apache.org/jira/browse/SAMZA-440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180675#comment-14180675 ] 

Jakob Homan commented on SAMZA-440:
-----------------------------------

+1

> UnknownTopicOrPartitionCode results in infinite loop in BrokerProxy
> -------------------------------------------------------------------
>
>                 Key: SAMZA-440
>                 URL: https://issues.apache.org/jira/browse/SAMZA-440
>             Project: Samza
>          Issue Type: Bug
>          Components: kafka
>    Affects Versions: 0.8.0
>            Reporter: Chris Riccomini
>            Assignee: Chris Riccomini
>             Fix For: 0.8.0
>
>         Attachments: SAMZA-440-0.patch
>
>
> We have seen several occasions where shifting partitions in a Kafka cluster results in some Samza containers getting stuck with:
> {noformat}
> 2014-10-22 15:10:48 BrokerProxy [INFO] Creating new SimpleConsumer for host eat1-app582.corp:10251 for system kafka
> 2014-10-22 15:10:48 BrokerProxy [WARN] Got non-recoverable error codes during multifetch. Throwing an exception to trigger reconnect. Errors: Error([all-service-call-events,10],3,kafka.common.UnknownTopicOrPartitionException)
> 2014-10-22 15:10:48 BrokerProxy [WARN] Restarting consumer due to kafka.common.UnknownTopicOrPartitionException. Turn on debugging to get a full stack trace.
> 2014-10-22 15:10:58 BrokerProxy [INFO] Creating new SimpleConsumer for host eat1-app582.corp:10251 for system kafka
> 2014-10-22 15:10:58 BrokerProxy [WARN] Got non-recoverable error codes during multifetch. Throwing an exception to trigger reconnect. Errors: Error([all-service-call-events,10],3,kafka.common.UnknownTopicOrPartitionException)
> 2014-10-22 15:10:58 BrokerProxy [WARN] Restarting consumer due to kafka.common.UnknownTopicOrPartitionException. Turn on debugging to get a full stack trace.
> 2014-10-22 15:11:08 BrokerProxy [INFO] Creating new SimpleConsumer for host eat1-app582.corp:10251 for system kafka
> {noformat}
> The problem appears to be a misunderstanding of how Kafka works. If a partition is moved to another broker and the BrokerProxy continues fetching from the old broker, it will throw an UnknownTopicOrPartitionException and try to reconnect to the same broker. It will do this indefinitely. Instead, the BrokerProxy should abdicate the TopicAndPartition and allow the new broker to pick it up.
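
The proposed behavior can be illustrated with a minimal sketch. This is not the attached patch and does not use the real Samza or Kafka classes: ProxySketch, owned, abdicated, and handleErrors are hypothetical names, and partitions are represented as plain strings. The only detail taken from the log above is that error code 3 corresponds to UnknownTopicOrPartitionException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AbdicateSketch {
    // Error code reported in the log above for UnknownTopicOrPartitionException.
    static final short UNKNOWN_TOPIC_OR_PARTITION = 3;

    /** Hypothetical stand-in for one BrokerProxy's bookkeeping. */
    static class ProxySketch {
        // Partitions this proxy currently fetches from its broker.
        final Set<String> owned = new HashSet<>();
        // Partitions handed back so that another proxy can take them over.
        final List<String> abdicated = new ArrayList<>();

        /** Gives up partitions the broker no longer knows; keeps all others. */
        void handleErrors(Map<String, Short> errorCodes) {
            for (Map.Entry<String, Short> e : errorCodes.entrySet()) {
                boolean unknown = e.getValue() == UNKNOWN_TOPIC_OR_PARTITION;
                // Abdicate instead of reconnecting to the same broker forever.
                if (unknown && owned.remove(e.getKey())) {
                    abdicated.add(e.getKey());
                }
            }
        }
    }

    public static void main(String[] args) {
        ProxySketch proxy = new ProxySketch();
        proxy.owned.add("all-service-call-events-10");
        proxy.owned.add("all-service-call-events-11");

        Map<String, Short> errors = new HashMap<>();
        errors.put("all-service-call-events-10", UNKNOWN_TOPIC_OR_PARTITION);
        errors.put("all-service-call-events-11", (short) 0);
        proxy.handleErrors(errors);

        System.out.println("abdicated: " + proxy.abdicated);
        System.out.println("still owned: " + proxy.owned);
    }
}
```

In this sketch the moved partition leaves the proxy's owned set after a single failed fetch, so the proxy has nothing left to retry against the old broker. How an abdicated partition gets reassigned to the proxy for the new broker is outside the sketch.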



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)