Posted to users@kafka.apache.org by Bhavesh Mistry <mi...@gmail.com> on 2014/06/13 01:39:00 UTC

Kafka High Level Consumer Fail Over

Hi Kafka Dev Team/ Users,

We have a high-level consumer group consuming from 32 partitions of a
topic. We run 48 consumers in this group across multiple servers and keep
16 of them as back-up consumers. Our expectation is that when an active
consumer dies, meaning ZooKeeper no longer lists an owner for a particular
partition, a back-up consumer will take over that partition. But we do not
see this behavior: after an active consumer died, the back-up consumers
did not pick up its partitions. Please let us know what we can do to
achieve this. It is a very likely scenario when rolling out new code on
the consumer side (we will be doing incremental code rollouts). Please see
the exception below. We are using version 0.8 for now.

[mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a], exception during rebalance
org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/mupd_logmon_hb_events/ids/mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
    at kafka.utils.ZkUtils$.readData(Unknown Source)
    at kafka.consumer.TopicCount$.constructTopicCount(Unknown Source)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(Unknown Source)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(Unknown Source)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(Unknown Source)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(Unknown Source)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/mupd_logmon_hb_events/ids/mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:950)
    at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103)
    at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770)
    at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766)
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)


11 Jun 2014 14:12:16,710 ERROR [mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a_watcher_executor] (kafka.utils.Logging$class.error:?) - [mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException: mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a can't rebalance after 4 retries
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(Unknown Source)
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(Unknown Source)
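
For reference, a minimal 0.8 high-level consumer along these lines, with the
rebalance retry/backoff settings raised above the defaults (the default of 4
retries is what produces the "can't rebalance after 4 retries" error above),
might look like the sketch below. The class name, topic name, ZooKeeper
connect string, and property values are placeholders only, not an actual
setup, and the property names are taken from the 0.8 consumer configuration.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class FailoverConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181,zk2:2181,zk3:2181"); // placeholder hosts
        props.put("group.id", "mupd_logmon_hb_events");
        // Allow more rebalance attempts than the default 4, with a longer pause
        // between attempts, so a slow or rolling restart does not exhaust them.
        props.put("rebalance.max.retries", "10");
        props.put("rebalance.backoff.ms", "3000");
        // The session timeout bounds how long a hard-killed consumer's ephemeral
        // registration lingers in ZooKeeper before its partitions can be claimed.
        props.put("zookeeper.session.timeout.ms", "6000");
        props.put("auto.commit.interval.ms", "1000");

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // One stream in this JVM; back-up instances run the same code and simply
        // hold zero partitions until a rebalance assigns them some.
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("my_topic", 1));

        for (KafkaStream<byte[], byte[]> stream : streams.get("my_topic")) {
            ConsumerIterator<byte[], byte[]> it = stream.iterator();
            while (it.hasNext()) {
                byte[] payload = it.next().message();
                // process payload ...
            }
        }
    }
}

Note that a consumer which dies without calling shutdown() keeps its ephemeral
entry under /consumers/<group>/ids until zookeeper.session.timeout.ms expires,
so the back-up consumers cannot claim its partitions until that happens.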



Thanks,
Bhavesh

Re: Kafka High Level Consumer Fail Over

Posted by Guozhang Wang <wa...@gmail.com>.
I did not see any attachments in either email?

Guozhang



Re: Kafka High Level Consumer Fail Over

Posted by Bhavesh Mistry <mi...@gmail.com>.
Hi Guozhang,

We have our own monitoring tool that reads this data from ZooKeeper, and
the attached screenshot shows partitions with no owner, yet the back-up
consumers are not picking them up. I had to restart the Java process on
the machine that hit the above issue. Also, I sent you the exception in my
previous email. Please review the image.
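
For reference, the ownership data such a tool reads lives under the consumer
group's path in ZooKeeper; a rough sketch from a ZooKeeper shell
(zkCli.sh -server zk1:2181, with the host and topic name as placeholders):

# live consumer registrations for the group
ls /consumers/mupd_logmon_hb_events/ids

# partitions of the topic that currently have an owner znode
ls /consumers/mupd_logmon_hb_events/owners/your_topic

# which consumer thread owns a given partition, e.g. partition 0
get /consumers/mupd_logmon_hb_events/owners/your_topic/0

A partition with no child under owners, or whose owner refers to a consumer
id no longer listed under ids, is the "no owner" case described above.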


Thanks,

Bhavesh



Re: Kafka High Level Consumer Fail Over

Posted by Guozhang Wang <wa...@gmail.com>.
They should automatically pick up the partitions with no owner.

Could you use kafka.tools.ConsumerOffsetChecker to verify which partitions
do not have an owner, and check the back-up consumers' logs for the
rebalance process? Are there any exceptions or warnings there?
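
For example, something along these lines run from the Kafka installation
directory (the ZooKeeper host, group, and topic are placeholders, and the
exact option names vary slightly across 0.8.x releases, so check the tool's
help output):

bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
    --zookeeper zk1:2181 \
    --group mupd_logmon_hb_events \
    --topic your_topic

The output lists each partition's offset, lag, and owner, so an unowned
partition should stand out.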

Guozhang



Re: Kafka High Level Consumer Fail Over

Posted by Bhavesh Mistry <mi...@gmail.com>.
We have a 3-node cluster, with each consumer in the group on a separate
physical box. The consumer that died is
"mupd_logmon_hb_events_sdc-q1-logstream-8-1402448850475-6521f70a", and on
that box I see the above exception. What can I configure so that when a
partition in the consumer group has no "Owner", the other (back-up)
consumers in the group take over? Please let me know.

Thanks in advance for your help.

Thanks,
Bhavesh



Re: Kafka High Level Consumer Fail Over

Posted by Guozhang Wang <wa...@gmail.com>.
From which consumer instance did you see these exceptions?

Guozhang

