Posted to users@kafka.apache.org by Yury Ruchin <yu...@gmail.com> on 2014/06/24 09:02:32 UTC

Kafka 0.8's VerifyConsumerRebalance reports an error

Hi,

I've run into the following problem. I am reading from a 50-partition
Kafka topic using the high-level consumer with 8 streams. I'm using an
8-thread pool, each thread handling one stream. After a short time, the
threads stop reading from their streams. The lag between the topic's latest
offset and the consumer's offset keeps growing as new messages come in.
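
For reference, here is a minimal sketch of how 50 partitions would be spread
over 8 streams. It assumes a range-style split (contiguous blocks, with the
first few streams taking one extra partition), which is my understanding of
what the 0.8 high-level consumer does, and it assumes partition ids run from
0 to 49 and that all 8 streams belong to one group. The function name
range_assignment is made up for illustration and is not part of Kafka.

```python
def range_assignment(num_partitions, num_streams):
    """Split partition ids 0..num_partitions-1 into contiguous blocks, one per stream."""
    per_stream, extra = divmod(num_partitions, num_streams)
    assignment = []
    for position in range(num_streams):
        # The first `extra` streams each take one partition more than the rest.
        start = per_stream * position + min(position, extra)
        count = per_stream + (1 if position < extra else 0)
        assignment.append(list(range(start, start + count)))
    return assignment


if __name__ == "__main__":
    for position, partitions in enumerate(range_assignment(50, 8)):
        print(position, len(partitions), partitions)
```

Under these assumptions every partition ends up with exactly one owning
stream, so a healthy group should have an owner znode for each partition.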

I looked in ZK at /consumers/<consumer_name>/owners/<topic_name> and see a
list of znodes corresponding to the full list of partitions: [1, 2, 3,
...]. When I do a zk get on, e.g.,
/consumers/<consumer_name>/owners/<topic_name>/1, I see a valid consumer
name that matches the Kafka client logs, e.g.
<consumer_name>_<node_name>-1403049600000-abc12345-0. However, when I run
the VerifyConsumerRebalance tool, I see the following:

No owner for partition [<topic_name>,1] (kafka.tools.VerifyConsumerRebalance$)

No owner for partition [<topic_name>,2] (kafka.tools.VerifyConsumerRebalance$)

...

No owner for partition [<topic_name>,50] (kafka.tools.VerifyConsumerRebalance$)

According to this output, no partition has an owner, which seemingly
contradicts what I see in ZK.
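
The manual check described above can be sketched roughly as follows. The
znode layout is the one quoted in this message; everything else is an
assumption for illustration: owners_by_path is a hypothetical dict of znode
path to data that you would fill in yourself (for example from zk get
output), and neither helper is part of Kafka or ZooKeeper.

```python
def owner_znode_path(group, topic, partition):
    """Build the owner znode path for one partition, using the layout quoted above."""
    return "/consumers/{}/owners/{}/{}".format(group, topic, partition)


def partitions_without_owner(owners_by_path, group, topic, partitions):
    """Return the partitions whose owner znode is missing or holds no data.

    owners_by_path is a hypothetical {znode path: data string} dump
    collected by hand; it is not produced by any Kafka tool.
    """
    missing = []
    for partition in partitions:
        owner = owners_by_path.get(owner_znode_path(group, topic, partition))
        if not owner:
            missing.append(partition)
    return missing
```

If this returns an empty list for every partition of the topic while the
tool still reports "No owner", the two views of ZK disagree.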

What could cause such a problem, and how can I troubleshoot it further?

Thanks!

Re: Kafka 0.8's VerifyConsumerRebalance reports an error

Posted by Neha Narkhede <ne...@gmail.com>.
This is a bug in the tool. Please file a bug and attach these error/info
logs to it.

Thanks,
Neha


On Thu, Jun 26, 2014 at 5:24 AM, Yury Ruchin <yu...@gmail.com> wrote:

> I have set log level to DEBUG and saw something strange in the output. For
> each topic partition, I see the following pattern:
>
> [2014-06-26 16:00:24,467] ERROR No owner for partition [<topic_name>,0]
> (kafka.tools.VerifyConsumerRebalance$)
>
> ...
>
> [2014-06-26 16:00:24,469] INFO Owner of partition [<topic_name>,0] is
> <consumer_name>_<node_name>-1403049600000-abc12345-0
> (kafka.tools.VerifyConsumerRebalance$)
>
> As I understand VerifyConsumerRebalance.scala, those 2 messages should be
> mutually exclusive, but they both appear for every partition. With the
> default log settings, only the ERROR-level one is shown to the user.
>
> Is this a problem with the tool?

Re: Kafka 0.8's VerifyConsumerRebalance reports an error

Posted by Yury Ruchin <yu...@gmail.com>.
I have set log level to DEBUG and saw something strange in the output. For
each topic partition, I see the following pattern:

[2014-06-26 16:00:24,467] ERROR No owner for partition [<topic_name>,0] (kafka.tools.VerifyConsumerRebalance$)

...

[2014-06-26 16:00:24,469] INFO Owner of partition [<topic_name>,0] is <consumer_name>_<node_name>-1403049600000-abc12345-0 (kafka.tools.VerifyConsumerRebalance$)

As I understand VerifyConsumerRebalance.scala, those two messages should be
mutually exclusive, but they both appear for every partition. With the
default log settings, only the ERROR-level one is shown to the user.

Is this a problem with the tool?
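
To check this across all partitions rather than by eye, a small script along
these lines could scan the captured tool output. It is only a sketch: the two
message formats are taken from the log lines above, the function name
contradictory_partitions is made up, and it assumes the whole output has been
saved into one string (lines may be wrapped as they are in this email).

```python
import re

# Patterns follow the two message formats quoted above; \s+ tolerates line wraps.
NO_OWNER = re.compile(r"No owner for partition \[([^,\]]+),(\d+)\]")
HAS_OWNER = re.compile(r"Owner of partition \[([^,\]]+),(\d+)\]\s+is\s+(\S+)")


def contradictory_partitions(log_text):
    """Map (topic, partition) to owner for partitions reported both ways."""
    missing = {(topic, int(part)) for topic, part in NO_OWNER.findall(log_text)}
    owned = {
        (topic, int(part)): owner
        for topic, part, owner in HAS_OWNER.findall(log_text)
    }
    return {key: owner for key, owner in owned.items() if key in missing}
```

Any entry in the returned dict is a partition for which the tool logged both
"No owner" and an owner name in the same run.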

2014-06-25 2:04 GMT+04:00 Neha Narkhede <ne...@gmail.com>:

> I would turn on DEBUG on the tool to see which url it reads and doesn't
> find the owners.

Re: Kafka 0.8's VerifyConsumerRebalance reports an error

Posted by Neha Narkhede <ne...@gmail.com>.
I would turn on DEBUG logging for the tool to see which URL it reads when
it doesn't find the owners.




On Tue, Jun 24, 2014 at 11:28 AM, Yury Ruchin <yu...@gmail.com> wrote:

> I've just double-checked. The URL is correct, the same one is used by Kafka
> clients.

Re: Kafka 0.8's VerifyConsumerRebalance reports an error

Posted by Yury Ruchin <yu...@gmail.com>.
I've just double-checked. The URL is correct; it is the same one used by the
Kafka clients.


2014-06-24 22:21 GMT+04:00 Neha Narkhede <ne...@gmail.com>:

> Is it possible that maybe the zookeeper url used for the
> VerifyConsumerRebalance tool is incorrect?

Re: Kafka 0.8's VerifyConsumerRebalance reports an error

Posted by Neha Narkhede <ne...@gmail.com>.
Is it possible that the ZooKeeper URL used for the
VerifyConsumerRebalance tool is incorrect?

