Posted to users@kafka.apache.org by James Brown <jb...@easypost.com> on 2016/11/01 20:50:41 UTC

Duplicate consumer group in ListGroups in Kafka 0.10.1?

Here's another strange bug that we're seeing after upgrading to Kafka
0.10.1.0: one of our consumer groups is appearing twice in the list, and
appears to belong to two different nodes.

% kafka-consumer-groups.sh --bootstrap-server localhost:40172 --list \
    | sort | uniq -c | sort -n | grep -v '^ *1'
      2 details-log-etl

If I manually send a ListGroups request to each node, the offending
consumer group shows up twice (once as owned by broker ID 1 and once as
owned by broker ID 2). If I manually send an OffsetFetchRequest to Broker
#1 and Broker #2 with the given group name, I get back conflicting
responses:

(from Broker #1):
OffsetFetchResponse_v1(topics=[(topic='tracking.details',
partitions=[(partition=0, offset=85606947, metadata='', error_code=0)])])

(from Broker #2):
OffsetFetchResponse_v1(topics=[(topic='tracking.details',
partitions=[(partition=0, offset=83718751, metadata='', error_code=0)])])

The offset=85606947 response is correct.
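
For anyone wanting to automate this cross-broker check for monitoring, here is a minimal Python sketch. It assumes the per-broker ListGroups and OffsetFetch results have already been collected into plain dictionaries. The function names and the "other-group" entry are hypothetical and only there for illustration; the group, topic, and offsets are the ones quoted above.

```python
from collections import defaultdict


def groups_on_multiple_brokers(groups_by_broker):
    """Return {group: sorted broker ids} for groups listed by more than one broker.

    groups_by_broker maps a broker id to the group names that broker
    returned for a ListGroups request.
    """
    owners = defaultdict(set)
    for broker_id, groups in groups_by_broker.items():
        for group in groups:
            owners[group].add(broker_id)
    return {g: sorted(ids) for g, ids in owners.items() if len(ids) > 1}


def conflicting_offsets(offsets_by_broker):
    """Return {(topic, partition): {broker_id: offset}} where brokers disagree.

    offsets_by_broker maps a broker id to {(topic, partition): offset},
    as taken from each broker's OffsetFetch response for one group.
    """
    seen = defaultdict(dict)
    for broker_id, offsets in offsets_by_broker.items():
        for tp, offset in offsets.items():
            seen[tp][broker_id] = offset
    return {tp: per for tp, per in seen.items() if len(set(per.values())) > 1}


# Data shaped like the responses quoted above ("other-group" is made up).
listed = {1: ["details-log-etl", "other-group"], 2: ["details-log-etl"]}
fetched = {
    1: {("tracking.details", 0): 85606947},
    2: {("tracking.details", 0): 83718751},
}

duplicates = groups_on_multiple_brokers(listed)
conflicts = conflicting_offsets(fetched)
print(duplicates)
print(conflicts)
```

A monitoring job could alert whenever either dictionary is non-empty.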

If I use the GroupCoordinatorRequest API, both broker 1 and broker 2 report
broker 1 as the coordinator. The actual consuming application
seems unaffected and is proceeding as expected using broker 1.
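
As background on why a single coordinator is expected: to the best of my understanding, the broker picks a group's coordinator by hashing the group id onto a partition of the __consumer_offsets topic, and the leader of that partition acts as coordinator. The sketch below mirrors that mapping in Python. It assumes the default offsets.topic.num.partitions of 50, and the function names are hypothetical. Treat it as an approximation of the broker logic rather than a copy of it.

```python
def java_string_hashcode(s):
    """Emulate Java's String.hashCode(), which works on UTF-16 code units."""
    data = s.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF  # keep 32 bits, like a Java int
    # Reinterpret the unsigned 32-bit value as a signed Java int.
    return h - (1 << 32) if h >= (1 << 31) else h


def offsets_partition_for(group_id, num_partitions=50):
    """Partition of __consumer_offsets assumed to hold this group's state.

    num_partitions is an assumption here; check offsets.topic.num.partitions
    on the cluster in question.
    """
    h = java_string_hashcode(group_id)
    # Integer.MIN_VALUE has no positive counterpart in Java; map it to 0.
    h = 0 if h == -(1 << 31) else abs(h)
    return h % num_partitions


print(offsets_partition_for("details-log-etl"))
```

Whichever broker leads the resulting partition should be the only one holding state for the group, so a second broker answering for the same group suggests stale state on that broker.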

This isn't actually breaking anything critical (since, like I said, actual
consumers seem to be doing the right thing), but it's breaking monitoring,
and it concerns me that such a duplicate is possible.

I haven't tried bouncing the consumer yet to see if that fixes it; I
figured I'd e-mail out just in case there was anything else folks wanted me
to look at first.
-- 
James Brown
Engineer

Re: Duplicate consumer group in ListGroups in Kafka 0.10.1?

Posted by James Brown <jb...@easypost.com>.
Oh, and one more tidbit: below are the responses if I manually send a
DescribeGroupsRequest to each of the brokers with the given consumer group
name:

(from Broker #1):
DescribeGroupsResponse_v0(groups=[(error_code=0, group='details-log-etl',
state='Stable', protocol_type='consumer', protocol='standard',
members=[(member_id='tracking.etl-6e0e1be8-939c-445e-bf44-26ed6c37b147',
client_id='tracking.etl', client_host='/fd00:ea51:d057:0:1:0:0:2',
member_metadata=b'\x00\x00\x00\x00\x00\x01\x00\rtest-messages\xff\xff\xff\xff',
member_assignment=b'\x00\x00\x00\x00\x00\x01\x00\x10tracking.details\x00\x00\x00\x01\x00\x00\x00\x00\xff\xff\xff\xff')])])

(from Broker #2):
DescribeGroupsResponse_v0(groups=[(error_code=0,
group='details-log-etl', state='AwaitingSync', protocol_type='consumer',
protocol='',
members=[(member_id='tracking.etl-3c81b0e8-7683-474a-ab85-d809392db6ed',
client_id='tracking.etl', client_host='/fd00:ea51:d057:0:1:0:0:2',
member_metadata=b'', member_assignment=b'')])])
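
The member_metadata and member_assignment fields above are opaque bytes in the consumer group protocol's wire format. A small sketch for decoding them follows, assuming the version 0 layout (int16 version, an array of topics, then nullable user data). The helper names are hypothetical, and an empty buffer, as returned by Broker #2, is reported as None.

```python
import struct


def _int16(buf, pos):
    return struct.unpack_from(">h", buf, pos)[0], pos + 2


def _int32(buf, pos):
    return struct.unpack_from(">i", buf, pos)[0], pos + 4


def _string(buf, pos):
    # Strings are prefixed with an int16 length.
    length, pos = _int16(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def _nullable_bytes(buf, pos):
    # Byte arrays are prefixed with an int32 length; -1 means null.
    length, pos = _int32(buf, pos)
    if length < 0:
        return None, pos
    return buf[pos:pos + length], pos + length


def decode_subscription(buf):
    """Decode member_metadata into (version, topics, user_data)."""
    if not buf:
        return None
    version, pos = _int16(buf, 0)
    count, pos = _int32(buf, pos)
    topics = []
    for _ in range(count):
        topic, pos = _string(buf, pos)
        topics.append(topic)
    user_data, pos = _nullable_bytes(buf, pos)
    return version, topics, user_data


def decode_assignment(buf):
    """Decode member_assignment into (version, {topic: [partitions]}, user_data)."""
    if not buf:
        return None
    version, pos = _int16(buf, 0)
    count, pos = _int32(buf, pos)
    assignment = {}
    for _ in range(count):
        topic, pos = _string(buf, pos)
        n, pos = _int32(buf, pos)
        partitions = []
        for _ in range(n):
            partition, pos = _int32(buf, pos)
            partitions.append(partition)
        assignment[topic] = partitions
    user_data, pos = _nullable_bytes(buf, pos)
    return version, assignment, user_data


# Bytes copied from the Broker #1 response above.
metadata = b'\x00\x00\x00\x00\x00\x01\x00\rtest-messages\xff\xff\xff\xff'
assigned = (b'\x00\x00\x00\x00\x00\x01\x00\x10tracking.details'
            b'\x00\x00\x00\x01\x00\x00\x00\x00\xff\xff\xff\xff')
print(decode_subscription(metadata))
print(decode_assignment(assigned))
```

Decoded this way, the Broker #1 member holds partition 0 of tracking.details, while the Broker #2 member has no assignment at all, which fits its AwaitingSync state.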




-- 
James Brown
Engineer