Posted to users@kafka.apache.org by Bongyeon Kim <bo...@gmail.com> on 2014/06/10 05:11:28 UTC

org.apache.zookeeper.KeeperException$BadVersionException

Hi, team. 

I’m using 0.8.1.
I found some strange log messages repeated in server.log on one of my brokers, and they are still being logged now.

server.log ======================================================================================
...
[2014-06-09 10:41:47,402] ERROR Conditional update of path /brokers/topics/topicTRACE/partitions/1/state with data {"controller_epoch":19,"leader":2,"version":1,"leader_epoch":43,"isr":[4,2]} and expected version 439 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/topicTRACE/partitions/1/state (kafka.utils.ZkUtils$)
[2014-06-09 10:41:47,402] INFO Partition [topicTRACE,1] on broker 2: Cached zkVersion [439] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2014-06-09 10:41:47,402] INFO Partition [topicDEBUG,0] on broker 2: Shrinking ISR for partition [topicDEBUG,0] from 1,3,2 to 2 (kafka.cluster.Partition)
[2014-06-09 10:41:47,416] ERROR Conditional update of path /brokers/topics/topicDEBUG/partitions/0/state with data {"controller_epoch":19,"leader":2,"version":1,"leader_epoch":43,"isr":[2]} and expected version 1424 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/topicDEBUG/partitions/0/state (kafka.utils.ZkUtils$)
[2014-06-09 10:41:47,432] INFO Partition [topicDEBUG,0] on broker 2: Cached zkVersion [1424] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2014-06-09 10:41:47,432] INFO Partition [topicCDR,3] on broker 2: Shrinking ISR for partition [topicCDR,3] from 4,1,2 to 2 (kafka.cluster.Partition)
[2014-06-09 10:41:47,435] ERROR Conditional update of path /brokers/topics/topicCDR/partitions/3/state with data {"controller_epoch":19,"leader":2,"version":1,"leader_epoch":46,"isr":[2]} and expected version 541 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/topicCDR/partitions/3/state (kafka.utils.ZkUtils$)
[2014-06-09 10:41:47,435] INFO Partition [topicCDR,3] on broker 2: Cached zkVersion [541] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2014-06-09 10:41:48,426] INFO Partition [topicTRACE,1] on broker 2: Shrinking ISR for partition [topicTRACE,1] from 4,3,2 to 4,2 (kafka.cluster.Partition)
...
=================================================================================================
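The "Cached zkVersion [...] not equal to that in zookeeper" lines mean the broker's cached version of the partition state znode no longer matches the version stored in ZooKeeper. As a rough illustration, the znode content, its current version, and the active controller can be read directly with the ZooKeeper Java client; this is only a sketch, and the connect string, topic, and partition below are placeholders taken from the log above.

== example: inspecting the partition state znode (sketch) ============

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class CheckPartitionState {
    public static void main(String[] args) throws Exception {
        // Placeholder connect string; use the ensemble the brokers are configured with.
        final CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, new Watcher() {
            public void process(WatchedEvent event) {
                if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                    connected.countDown();
                }
            }
        });
        connected.await(10, TimeUnit.SECONDS);
        try {
            // Topic and partition are placeholders taken from the log above.
            String path = "/brokers/topics/topicTRACE/partitions/1/state";
            Stat stat = new Stat();
            byte[] state = zk.getData(path, false, stat);
            // stat.getVersion() is the znode version that the broker's cached
            // zkVersion (439 in the log above) is being compared against.
            System.out.println("state  : " + new String(state, "UTF-8"));
            System.out.println("version: " + stat.getVersion());

            // /controller shows which broker is currently acting as the controller.
            byte[] controller = zk.getData("/controller", false, null);
            System.out.println("controller: " + new String(controller, "UTF-8"));
        } finally {
            zk.close();
        }
    }
}

======================================================================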

and I found some errors and warnings in controller.log.


controller.log ======================================================================================
...
[2014-06-09 10:42:03,962] WARN [Controller-3-to-broker-1-send-thread], Controller 3 fails to send a request to broker id:1,host:c-ccp-tk1-a58,port:9091 (kafka.controller.RequestSendThread)
java.net.SocketTimeoutException
        at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
        at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
        at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
        at kafka.utils.Utils$.read(Utils.scala:375)
        at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
        at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
        at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
        at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
        at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:146)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
[2014-06-09 10:42:03,964] ERROR [Controller-3-to-broker-1-send-thread], Controller 3 epoch 21 failed to send UpdateMetadata request with correlation id 1 to broker id:1,host:c-ccp-tk1-a58,port:9091. Reconnecting to broker. (kafka.controller.RequestSendThread)
java.nio.channels.ClosedChannelException
        at kafka.network.BlockingChannel.send(BlockingChannel.scala:89)
        at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
        at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)

...

[2014-06-09 10:42:38,064] WARN [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [topicTRACE,0]. Elect leader 3 from live brokers 3. There's potential data loss. (kafka.controller.OfflinePartitionLeaderSelector)
...
=================================================================================================

Why does this happen? Is there any possibility of data loss?
What do I have to do to get my brokers back to normal? Do I have to restart this broker?
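
On the data-loss question, one partial mitigation on the producer side is to require acknowledgement from the full ISR (request.required.acks=-1 in the 0.8 producer). The sketch below is only an illustration; the broker address and topic are placeholders from the logs above, and this narrows, but does not remove, the loss window when an unclean leader election happens.

== example: producer requiring full-ISR acks (sketch) ================

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class AckAllProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker list taken from the logs above.
        props.put("metadata.broker.list", "c-ccp-tk1-a58:9091");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // -1: the leader waits for the full in-sync replica set before acking,
        // which narrows (but does not remove) the loss window on leader changes.
        props.put("request.required.acks", "-1");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("topicTRACE", "example message"));
        producer.close();
    }
}

======================================================================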


Thanks in advance.



Re: org.apache.zookeeper.KeeperException$BadVersionException

Posted by Bongyeon Kim <bo...@gmail.com>.
There are no errors in controller.log and no entries in state-change.log.


On Wed, Jun 11, 2014 at 11:03 PM, Jun Rao <ju...@gmail.com> wrote:

> It sounds like broker 5 didn't get the needed message from the controller.
> Was there any error in the controller and state-change log when the above
> error started?
>
> Thanks,
>
> Jun



-- 
*Sincerely*
*,**Bongyeon Kim*

Java Developer & Engineer
Seoul, Korea
Mobile:  +82-10-9369-1314
Email:  bongyeonkim@gmail.com
Twitter:  http://twitter.com/tigerby
Facebook:  http://facebook.com/tigerby
Wiki: http://tigerby.com

Re: org.apache.zookeeper.KeeperException$BadVersionException

Posted by Jun Rao <ju...@gmail.com>.
It sounds like broker 5 didn't get the needed message from the controller.
Was there any error in the controller and state-change log when the above
error started?

Thanks,

Jun


On Tue, Jun 10, 2014 at 10:00 PM, Bongyeon Kim <bo...@gmail.com>
wrote:

> No, broker 5 is alive with log.
>
>
> [2014-06-11 13:59:45,170] ERROR Conditional update of path
> /brokers/topics/topicTRACE/partitions/0/state with data
> {"controller_epoch":1,"leader":5,"version":1,"leader_epoch":0,"isr":[5]}
> and expected version 2 failed due to
> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
> BadVersion for /brokers/topics/topicTRACE/partitions/0/state
> (kafka.utils.ZkUtils$)
>
>
> How can I issue metadata request?
>
>
> Thanks.

Re: org.apache.zookeeper.KeeperException$BadVersionException

Posted by Bongyeon Kim <bo...@gmail.com>.
No, broker 5 is alive, and it keeps logging this:


[2014-06-11 13:59:45,170] ERROR Conditional update of path /brokers/topics/topicTRACE/partitions/0/state with data {"controller_epoch":1,"leader":5,"version":1,"leader_epoch":0,"isr":[5]} and expected version 2 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/topicTRACE/partitions/0/state (kafka.utils.ZkUtils$)


How can I issue a metadata request?
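
For illustration, a topic metadata request can be issued with the 0.8 Java SimpleConsumer API roughly as sketched below; the host, port, and topic are placeholders taken from earlier in the thread.

== example: issuing a topic metadata request (sketch) ================

import java.util.Collections;

import kafka.javaapi.PartitionMetadata;
import kafka.javaapi.TopicMetadata;
import kafka.javaapi.TopicMetadataRequest;
import kafka.javaapi.TopicMetadataResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class MetadataRequestSketch {
    public static void main(String[] args) {
        // Placeholder host/port; point at the broker you want to ask.
        SimpleConsumer consumer =
                new SimpleConsumer("c-ccp-tk1-a60", 9091, 10000, 64 * 1024, "metadata-check");
        try {
            TopicMetadataRequest request =
                    new TopicMetadataRequest(Collections.singletonList("topicTRACE"));
            TopicMetadataResponse response = consumer.send(request);
            for (TopicMetadata topic : response.topicsMetadata()) {
                for (PartitionMetadata partition : topic.partitionsMetadata()) {
                    System.out.println("partition: " + partition.partitionId()
                            + ", leader: " + (partition.leader() == null ? "none" : partition.leader().id())
                            + ", isr: " + partition.isr());
                }
            }
        } finally {
            consumer.close();
        }
    }
}

======================================================================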


Thanks.

On Jun 11, 2014, at 1:26 PM, Jun Rao <ju...@gmail.com> wrote:

> The info from kafka-topics is the correct one. Is broker 5 dead? It seems
> that you can issue metadata request to it.
> 
> Thanks,
> 
> Jun


Re: org.apache.zookeeper.KeeperException$BadVersionException

Posted by Jun Rao <ju...@gmail.com>.
The info from kafka-topics is the correct one. Is broker 5 dead? It seems
that you can issue a metadata request to it.

Thanks,

Jun


On Tue, Jun 10, 2014 at 8:26 PM, Bongyeon Kim <bo...@gmail.com>
wrote:

> With some WARN log, Yes, it is.
>
> and I found interesting things separately before I mentioned.
> I have another clusters. I run 2 brokers on 1 machine for test. and I see
> same problem before I mentioned, but I can’t see any error log on
> controller.log.
>
> At this time, when I list topics with kafka-topic tool, I see information
> like below.
>
> == kafka-topics.sh ===================================================
>
> Topic:topicTRACE        PartitionCount:2        ReplicationFactor:2
> Configs:retention.ms=3600000
>         Topic: topicTRACE       Partition: 0    Leader: 6       Replicas:
> 5,6   Isr: 6
>         Topic: topicTRACE       Partition: 1    Leader: 6       Replicas:
> 6,5   Isr: 6,5
>
> ======================================================================
>
> but, producer keeps producing broker 5 which seems to be dead.
>
> when I get metadata from broker using my version of java api tool, info of
> leader and isr is different.
> Speaking properly, metadata from one broker is same as data from
> kafka-topics tool, metadata from another broker is different.
>
> ========================================================================
>
> $ bin/kafka-run-class.sh com.kthcorp.daisy.ccprt.util.KafkaMetadata
> c-ccp-tk1-a60:9091 topicTRACE
> ---- topic info ----
> partition: 0, leader: 5, replica: [id:5,host:c-ccp-tk1-a60,port:9091,
> id:6,host:c-ccp-tk1-a60,port:9092], isr:
> [id:5,host:c-ccp-tk1-a60,port:9091, id:6,host:c-ccp-tk1-a60,port:9092]
> partition: 1, leader: 6, replica: [id:6,host:c-ccp-tk1-a60,port:9092,
> id:5,host:c-ccp-tk1-a60,port:9091], isr:
> [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091]
>
> ========================================================================
>
> $ bin/kafka-run-class.sh com.kthcorp.daisy.ccprt.util.KafkaMetadata
> c-ccp-tk1-a60:9092 topicTRACE
> ---- topic info ----
> partition: 0, leader: 6, replica: [id:5,host:c-ccp-tk1-a60,port:9091,
> id:6,host:c-ccp-tk1-a60,port:9092], isr: [id:6,host:c-ccp-tk1-a60,port:9092]
> partition: 1, leader: 6, replica: [id:6,host:c-ccp-tk1-a60,port:9092,
> id:5,host:c-ccp-tk1-a60,port:9091], isr:
> [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091]
>
> ========================================================================
>
> which one is correct? why is it happened?
>
>
> Thanks~

Re: org.apache.zookeeper.KeeperException$BadVersionException

Posted by Bongyeon Kim <bo...@gmail.com>.
Yes, it was up, though with some WARN logs.

Also, I found something interesting, separate from what I mentioned before.
I have another cluster where I run 2 brokers on 1 machine for testing. I see the same problem I mentioned before, but I can't see any error logs in controller.log.

At this point, when I list topics with the kafka-topics tool, I see the information below.

== kafka-topics.sh ===================================================

Topic:topicTRACE        PartitionCount:2        ReplicationFactor:2     Configs:retention.ms=3600000
        Topic: topicTRACE       Partition: 0    Leader: 6       Replicas: 5,6   Isr: 6
        Topic: topicTRACE       Partition: 1    Leader: 6       Replicas: 6,5   Isr: 6,5

======================================================================

But the producer keeps producing to broker 5, which seems to be dead.

When I get metadata from the brokers using my own Java API tool, the leader and ISR info is different.
To be precise, the metadata from one broker is the same as the output of the kafka-topics tool, while the metadata from the other broker is different.

========================================================================

$ bin/kafka-run-class.sh com.kthcorp.daisy.ccprt.util.KafkaMetadata c-ccp-tk1-a60:9091 topicTRACE
---- topic info ----
partition: 0, leader: 5, replica: [id:5,host:c-ccp-tk1-a60,port:9091, id:6,host:c-ccp-tk1-a60,port:9092], isr: [id:5,host:c-ccp-tk1-a60,port:9091, id:6,host:c-ccp-tk1-a60,port:9092]
partition: 1, leader: 6, replica: [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091], isr: [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091]

========================================================================

$ bin/kafka-run-class.sh com.kthcorp.daisy.ccprt.util.KafkaMetadata c-ccp-tk1-a60:9092 topicTRACE
---- topic info ----
partition: 0, leader: 6, replica: [id:5,host:c-ccp-tk1-a60,port:9091, id:6,host:c-ccp-tk1-a60,port:9092], isr: [id:6,host:c-ccp-tk1-a60,port:9092]
partition: 1, leader: 6, replica: [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091], isr: [id:6,host:c-ccp-tk1-a60,port:9092, id:5,host:c-ccp-tk1-a60,port:9091]        

========================================================================

Which one is correct? Why does this happen?
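
One note on the producer that keeps writing to broker 5: the 0.8 producer refreshes its cached topic metadata only periodically (and after a failed send), so it can keep using a stale leader for a while. The sketch below shows the relevant settings; the values and broker list are illustrative only, not recommendations.

== example: producer metadata refresh settings (sketch) ==============

import java.util.Properties;

import kafka.producer.ProducerConfig;

public class ProducerMetadataRefreshSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker list taken from the outputs above.
        props.put("metadata.broker.list", "c-ccp-tk1-a60:9091,c-ccp-tk1-a60:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // Default is 600000 (10 minutes); a smaller value makes the producer
        // re-fetch topic metadata, and so notice a leader change, sooner.
        props.put("topic.metadata.refresh.interval.ms", "60000");
        // Failed sends are retried, and metadata is refreshed before each retry.
        props.put("message.send.max.retries", "5");
        props.put("retry.backoff.ms", "500");

        // Constructing the config validates the property values.
        ProducerConfig config = new ProducerConfig(props);
        System.out.println("producer config created: " + props);
    }
}

======================================================================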


Thanks~



On Jun 10, 2014, at 11:28 PM, Jun Rao <ju...@gmail.com> wrote:

> Ok. Was this host (broker id:1,host:c-ccp-tk1-a58,port:9091) up when the
> controller had SocketTimeoutException?
> 
> Thanks,
> 
> Jun


Re: org.apache.zookeeper.KeeperException$BadVersionException

Posted by Jun Rao <ju...@gmail.com>.
Ok. Was this host (broker id:1,host:c-ccp-tk1-a58,port:9091) up when the
controller had SocketTimeoutException?

Thanks,

Jun


On Mon, Jun 9, 2014 at 10:11 PM, Bongyeon Kim <bo...@gmail.com>
wrote:

> No, I can't see any ZK session expiration log.
>
> What do I have to do to prevent this? Would increasing
> 'zookeeper.session.timeout.ms' help?

Re: org.apache.zookeeper.KeeperException$BadVersionException

Posted by Bongyeon Kim <bo...@gmail.com>.
No, I can't see any ZK session expiration log.

What do I have to do to prevent this? Would increasing
'zookeeper.session.timeout.ms' help?
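
For reference, that timeout is a broker-side setting in server.properties. A
minimal sketch with illustrative values (not a recommendation for this cluster):

  # server.properties on each broker (0.8.x)
  # default session timeout is 6000 ms; a larger value tolerates longer GC
  # pauses or network blips before ZooKeeper expires the broker's session
  zookeeper.session.timeout.ms=30000
  # timeout for the initial connection to the ZooKeeper ensemble
  zookeeper.connection.timeout.ms=30000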


On Tue, Jun 10, 2014 at 12:58 PM, Jun Rao <ju...@gmail.com> wrote:

> This is probably related to kafka-1382. The root cause is likely ZK session
> expiration in the broker. Did you see any?
>
> Thanks,
>
> Jun

-- 
Sincerely,
Bongyeon Kim

Java Developer & Engineer
Seoul, Korea
Mobile:  +82-10-9369-1314
Email:  bongyeonkim@gmail.com
Twitter:  http://twitter.com/tigerby
Facebook:  http://facebook.com/tigerby
Wiki: http://tigerby.com

Re: org.apache.zookeeper.KeeperException$BadVersionException

Posted by Jun Rao <ju...@gmail.com>.
This is probably related to kafka-1382. The root cause is likely ZK session
expiration in the broker. Did you see any?
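
(Session expirations are logged by the broker itself, so one quick check is to
search broker 2's server.log around or shortly before the time the ISR churn
started, e.g. 10:41, for ZooKeeper state-change messages such as lines
mentioning 'Expired'.)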

Thanks,

Jun

