Posted to users@kafka.apache.org by "Yu Bo (技术中心)" <bo...@sohu-inc.com> on 2014/01/13 15:18:13 UTC
[Kafka-users] badversion exception and unable to recover
Hi all,
Because of a BadVersion exception, broker id 1 is no longer a leader. The Kafka process on that server is still running, but broker id 1 now acts only as a replica.
I want to recover broker id 1 so that it can become a leader again.
Here is what I have done:
1. Used “kafka-reassign-partitions.sh” to remove broker id 1 from the replica group. After that, broker id 1 should no longer be involved with the Kafka cluster.
However, the server.log files on the other servers are printing lots of errors, like:
[2014-01-13 21:56:58,360] INFO Reconnect due to socket error: null (kafka.consumer.SimpleConsumer)
[2014-01-13 21:56:58,362] WARN [ReplicaFetcherThread-0-1], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 35321893; ClientId: ReplicaFetcherThread-0-1; ReplicaId: 2; MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [co,4] -> PartitionFetchInfo(0,1048576) (kafka.server.ReplicaFetcherThread)
java.net.ConnectException: Connection refused
at sun.nio.ch.Net.connect(Native Method)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:500)
at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57)
at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:44)
at kafka.consumer.SimpleConsumer.reconnect(SimpleConsumer.scala:57)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:79)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:110)
2. Killed the Kafka process running on broker id 1.
3. Started the Kafka service on broker id 1 again, but it was never added back to the replica group, let alone becoming a leader.
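For reference, the input to step 1 can be sketched as follows. This is only a minimal sketch, not the exact file used here: it builds the JSON document that kafka-reassign-partitions.sh reads. The topic and partition ([co,4]) are taken from the log above; the target replica list [2, 3] and the helper name build_reassignment are assumptions for illustration.

```python
import json


def build_reassignment(topic, partition_replicas):
    """Build the JSON document read by kafka-reassign-partitions.sh.

    partition_replicas maps a partition number to the list of broker ids
    that should hold its replicas after the reassignment.
    """
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": p, "replicas": list(r)}
            for p, r in sorted(partition_replicas.items())
        ],
    }


# Hypothetical target: keep partition 4 of topic "co" off broker 1.
# Broker ids 2 and 3 are assumed; use the ids of your live brokers.
plan = build_reassignment("co", {4: [2, 3]})
print(json.dumps(plan, indent=2))
```

The printed JSON would be saved to a file and passed to kafka-reassign-partitions.sh; the command-line flags differ between Kafka versions, so check the tool's help output. To bring broker 1 back, the same format applies with 1 listed among the replicas. The first broker in each replica list is the preferred leader for that partition.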
The question is: what should we do in order to restore the Kafka cluster to its original state?
Thanks.
Best regards,
--
Yu Bo | Sohu Media Plaza, Precision Advertising R&D Center, Technical Group | tel:61135569 15101511427 (H) | boyu@sohu-inc.com<ma...@sohu-inc.com>
Re: [Kafka-users] badversion exception and unable to recover
Posted by Guozhang Wang <wa...@gmail.com>.
Hello Bo,
Step 1 alone should be enough under normal operation. Did you see any
errors or exceptions in the controller log regarding the partition reassignment?
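One way to check is to filter the controller log for warnings and errors that mention reassignment. The snippet below is a rough sketch under assumptions: the log lines follow the same "[timestamp] LEVEL message" layout as the server.log excerpt in the original message, the function name find_reassignment_problems is made up for illustration, and the sample lines are invented, not real Kafka output.

```python
import re

# Matches log4j lines of the form "[timestamp] LEVEL message".
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]+) (?P<msg>.*)$")


def find_reassignment_problems(lines, levels=("WARN", "ERROR", "FATAL")):
    """Return (timestamp, level, message) for suspicious log lines.

    A line is reported when its level is in `levels` and its message
    mentions reassignment (case-insensitive).
    """
    hits = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m and m.group("level") in levels and "reassign" in m.group("msg").lower():
            hits.append((m.group("ts"), m.group("level"), m.group("msg")))
    return hits


# Invented sample lines (hypothetical messages, not real Kafka output).
sample = [
    "[2014-01-13 21:50:00,000] INFO Partitions to be reassigned (hypothetical message)",
    "[2014-01-13 21:50:01,000] ERROR Partition reassignment failed (hypothetical message)",
]
for hit in find_reassignment_problems(sample):
    print(hit)
```

To run it against a real log, pass an open file object instead of the sample list; the file name and location (for example a controller.log under the broker's log directory) depend on your log4j configuration.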
Guozhang