Posted to dev@kafka.apache.org by "Jun Rao (JIRA)" <ji...@apache.org> on 2016/09/23 20:45:21 UTC

[jira] [Commented] (KAFKA-4207) Partitions stopped after a rapid restart of a broker

    [ https://issues.apache.org/jira/browse/KAFKA-4207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15517526#comment-15517526 ] 

Jun Rao commented on KAFKA-4207:
--------------------------------

This seems to be the same issue as reported in https://issues.apache.org/jira/browse/KAFKA-1342. We can probably just consolidate the two issues.

> Partitions stopped after a rapid restart of a broker
> ----------------------------------------------------
>
>                 Key: KAFKA-4207
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4207
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.9.0.1, 0.10.0.1
>            Reporter: Dustin Cote
>
> Environment:
> 4 Kafka brokers
> 10,000 topics with one partition each, replication factor 3
> Partitions with 4KB data each
> No data being produced or consumed
> Scenario:
> Initiate controlled shutdown on one broker
> Interrupt the controlled shutdown prior to completion with a SIGKILL
> Immediately start a new broker with the same broker ID as the broker that was just killed
> Symptoms:
> After starting the new broker, the other three brokers in the cluster will see under-replicated partitions forever for some of the partitions that are hosted on the broker that was killed and restarted
> Cause:
> Today, the controller sends a StopReplica command for each replica hosted on a broker that has initiated a controlled shutdown.  For a large number of replicas this can take a while.  When the broker that is doing the controlled shutdown is killed, the StopReplica commands remain queued up even though the request queue to the broker is cleared.  When the broker comes back online, the StopReplica commands that were queued get sent to the broker that just started up.
> CC: [~junrao] since he's familiar with the scenario seen here
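
The race described under "Cause" can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kafka's controller code: the names ToyController, ToyBroker, pending, request_queue, and simulate are hypothetical, and the model assumes that the restarted broker simply acts on any StopReplica command it receives. It shows how commands that survive the clearing of the request queue could end up stopping replicas on the freshly started broker.

```python
from collections import deque


class ToyController:
    """Toy model of the race in the issue; NOT Kafka's actual controller."""

    def __init__(self):
        self.request_queue = deque()  # requests ready to go out to the broker
        self.pending = deque()        # StopReplica commands not yet in that queue

    def begin_controlled_shutdown(self, partitions):
        # One StopReplica command per replica hosted on the broker.
        self.pending.extend(("StopReplica", p) for p in partitions)

    def pump(self, n):
        # Move up to n pending commands into the outgoing request queue.
        for _ in range(min(n, len(self.pending))):
            self.request_queue.append(self.pending.popleft())

    def on_broker_failure(self):
        # Assumption: only the request queue is cleared; pending commands survive.
        self.request_queue.clear()

    def drain_to(self, broker):
        # Once the broker is back, everything still pending is delivered to it.
        self.pump(len(self.pending))
        while self.request_queue:
            broker.handle(self.request_queue.popleft())


class ToyBroker:
    def __init__(self, partitions):
        self.active = set(partitions)  # replicas this broker is serving

    def handle(self, request):
        kind, partition = request
        if kind == "StopReplica":
            self.active.discard(partition)


def simulate(num_partitions=10, queued_before_kill=3):
    """Return the replicas still active on the restarted broker."""
    partitions = [f"topic-{i}-0" for i in range(num_partitions)]
    controller = ToyController()
    controller.begin_controlled_shutdown(partitions)
    controller.pump(queued_before_kill)  # a few commands reach the request queue
    controller.on_broker_failure()       # SIGKILL: the request queue is cleared
    restarted = ToyBroker(partitions)    # same broker ID, fresh process
    controller.drain_to(restarted)       # stale commands hit the new broker
    return sorted(restarted.active)


if __name__ == "__main__":
    print(simulate())
```

In this model, every command that was still pending when the broker was killed is applied to the new process, so only the replicas whose commands were cleared from the request queue remain active. That would be consistent with the other brokers seeing some partitions as under-replicated.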



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)