Posted to users@kafka.apache.org by Evan Chan <ev...@ooyala.com> on 2012/01/11 20:32:47 UTC

Dead kafka consumers claim partitions, new ones can't claim any

Hi,

We are using Kafka 0.6 and testing it on EC2.  We have an issue where some
processes running the ZK-based high-level consumer (Scala consumer) get
killed before they have a chance to call ConsumerConnector.shutdown().
They seem to leave nodes hanging around in ZK.
If we restart the process, the consumer in the new process will error out
because it cannot claim any partitions.
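
One way to make sure shutdown() is called on an orderly exit is a JVM
shutdown hook. The following is a minimal sketch, not Kafka code: the
ShutdownableConsumer interface and the ConsumerShutdownHook class are
hypothetical stand-ins, and only the shutdown() call itself comes from the
description above. Shutdown hooks run on a normal exit or on SIGTERM, but
not when the process is killed with SIGKILL or the JVM crashes, so this
would only cover part of the problem.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the consumer connector; only shutdown() is assumed.
interface ShutdownableConsumer {
    void shutdown();
}

final class ConsumerShutdownHook {
    private ConsumerShutdownHook() {}

    /**
     * Registers a JVM shutdown hook that calls consumer.shutdown() at most
     * once, and returns the hook thread so the caller can remove it later.
     */
    static Thread register(final ShutdownableConsumer consumer) {
        final AtomicBoolean done = new AtomicBoolean(false);
        Thread hook = new Thread(() -> {
            // Guard against the body being run more than once.
            if (done.compareAndSet(false, true)) {
                consumer.shutdown();
            }
        }, "consumer-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

In an application, the argument would be a thin wrapper that delegates to
the real connector's shutdown() method.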

The only way I know of getting around this is to use a ZK client and
manually delete nodes.

Is there any way for the high-level consumer nodes in ZK to be made
ephemeral, so that if a process gets killed, the state won't last forever
and prevent subsequent consumers from claiming partitions?
Any chance this has been fixed in 0.7?

thanks,
Evan

-- 
*Evan Chan*
Senior Software Engineer |
ev@ooyala.com | (650) 996-4600
www.ooyala.com | blog <http://www.ooyala.com/blog> |
@ooyala<http://www.twitter.com/ooyala>

Re: Dead kafka consumers claim partitions, new ones can't claim any

Posted by Evan Chan <ev...@ooyala.com>.
Sorry, it seems that we misconfigured our ZooKeeper cluster, and the
servers were acting as individual nodes rather than as one cluster.  That
was probably it.  :-p
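
A quick way to spot this kind of misconfiguration is to ask each ZooKeeper
server for its mode using the four-letter "stat" command, which replies
with a "Mode:" line (standalone, leader, or follower). The sketch below
assumes that command is enabled, as it is on the 3.3.x line; newer releases
may restrict it. The ZkModeCheck class and the host:port arguments are
hypothetical. A server that reports "standalone" when it should belong to a
cluster would be consistent with the symptom described here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class ZkModeCheck {
    private ZkModeCheck() {}

    /** Returns the value of the "Mode:" line in a stat reply, or null if absent. */
    static String parseMode(String statResponse) {
        for (String line : statResponse.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Mode:")) {
                return trimmed.substring("Mode:".length()).trim();
            }
        }
        return null;
    }

    /** Sends "stat" to one server and reads the reply until the server closes the socket. */
    static String stat(String host, int port, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write("stat".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument is host:port for one ZooKeeper server (hypothetical values).
        for (String arg : args) {
            String[] hostPort = arg.split(":");
            String mode = parseMode(stat(hostPort[0], Integer.parseInt(hostPort[1]), 3000));
            System.out.println(arg + " -> " + mode);
        }
    }
}
```

Running it against every server in the connection string and comparing the
reported modes should show whether the servers formed one cluster.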

On Wed, Jan 11, 2012 at 1:30 PM, Neha Narkhede <ne...@gmail.com> wrote:

> Evan,
>
> What you are describing seems to be this bug in ZooKeeper 3.3.3 -
> https://issues.apache.org/jira/browse/ZOOKEEPER-1208
>
> However, to confirm whether that's the case, do you mind uploading the
> entire log for the consumer?
>
> Thanks,
> Neha

Re: Dead kafka consumers claim partitions, new ones can't claim any

Posted by Neha Narkhede <ne...@gmail.com>.
Evan,

What you are describing seems to be this bug in ZooKeeper 3.3.3 -
https://issues.apache.org/jira/browse/ZOOKEEPER-1208

However, to confirm whether that's the case, do you mind uploading the
entire log for the consumer?

Thanks,
Neha

On Wed, Jan 11, 2012 at 11:33 AM, Evan Chan <ev...@ooyala.com> wrote:
> Oh, here are the consumer logs when it can't claim a partition:
>
>
> 2012-01-11 02:23:04 ZookeeperConsumerConnector [WARN] No broker partions consumed by consumer thread whatever_ip-10-116-81-39.ec2.internal-1326248583706-0 for topic player_logs
> 2012-01-11 02:23:04 ZookeeperConsumerConnector [INFO] Consumer whatever_ip-10-116-81-39.ec2.internal-1326248583706 selected partitions :

Re: Dead kafka consumers claim partitions, new ones can't claim any

Posted by Evan Chan <ev...@ooyala.com>.
Oh, here are the consumer logs when it can't claim a partition:


2012-01-11 02:23:04 ZookeeperConsumerConnector [WARN] No broker partions consumed by consumer thread whatever_ip-10-116-81-39.ec2.internal-1326248583706-0 for topic player_logs
2012-01-11 02:23:04 ZookeeperConsumerConnector [INFO] Consumer whatever_ip-10-116-81-39.ec2.internal-1326248583706 selected partitions :

