Posted to jira@kafka.apache.org by "Vahid Hashemian (JIRA)" <ji...@apache.org> on 2018/07/23 23:34:00 UTC

[jira] [Comment Edited] (KAFKA-7044) kafka-consumer-groups.sh NullPointerException describing round robin or sticky assignors

    [ https://issues.apache.org/jira/browse/KAFKA-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553496#comment-16553496 ] 

Vahid Hashemian edited comment on KAFKA-7044 at 7/23/18 11:33 PM:
------------------------------------------------------------------

[~ejpearson], thanks for providing the additional info. It seems that the call to {{endOffsets(...)}} leads to a call to {{Fetcher.fetchOffsetsByTimes(...)}}, which involves a timeout. The default timeout currently in place is quite long (30 secs), but I wonder if it somehow kicks in, causing the call that finds the end offsets for the given partitions to return with only a partial list of end offsets. Could you increase the timeout and see whether anything changes on your side (for example, you can use {{getConsumer.endOffsets(topicPartitions.asJava, 60000)}} to double the timeout)?

Is it always the same partition 11 that causes the NPE for you? Is there anything (size, lag, ...) different about that partition?
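
To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It is not the actual {{ConsumerGroupCommand}} code: the class {{EndOffsetLookup}}, its method names, and the string keys standing in for {{TopicPartition}} are all hypothetical. It only shows that unboxing the result of a lookup in a partial map throws a {{NullPointerException}}, which is consistent with the {{scala.Predef$.Long2long}} frame at the top of the reported stack trace, and how a null check avoids it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical stand-in for the end-offset lookup; not Kafka code.
class EndOffsetLookup {

    // Map.get returns null for a missing key, and auto-unboxing that null
    // into a primitive long throws NullPointerException.
    static long unsafeEndOffset(Map<String, Long> endOffsets, String partition) {
        return endOffsets.get(partition);
    }

    // Defensive variant: a missing entry becomes an empty OptionalLong.
    static OptionalLong safeEndOffset(Map<String, Long> endOffsets, String partition) {
        Long offset = endOffsets.get(partition);
        return offset == null ? OptionalLong.empty() : OptionalLong.of(offset);
    }

    public static void main(String[] args) {
        // Assumption: the lookup came back with an entry for only one partition.
        Map<String, Long> partial = new HashMap<>();
        partial.put("topic-0", 42L);

        System.out.println(safeEndOffset(partial, "topic-0").isPresent());
        System.out.println(safeEndOffset(partial, "topic-11").isPresent());

        try {
            unsafeEndOffset(partial, "topic-11");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException on unboxing a missing end offset");
        }
    }
}
```

If the end-offset map really is partial, a guard of this kind would let the tool report an unknown end offset for that partition instead of failing; whether that is the right fix depends on why the entry is missing.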

I'm also interested to know what comes back from this call in {{Fetcher.sendListOffsetsRequest(...)}}:
{code:java}
RequestFuture<ListOffsetResult> future =
    sendListOffsetRequest(entry.getKey(), entry.getValue(), requireTimestamps);{code}


> kafka-consumer-groups.sh NullPointerException describing round robin or sticky assignors
> ----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7044
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7044
>             Project: Kafka
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 1.1.0
>         Environment: CentOS 7.4, Oracle JDK 1.8
>            Reporter: Jeff Field
>            Assignee: Vahid Hashemian
>            Priority: Minor
>
> We've recently moved to using the round robin assignor for one of our consumer groups, and started testing the sticky assignor. In both cases, using Kafka 1.1.0, we get a null pointer exception *unless* the group being described is rebalancing:
> {code:java}
> kafka-consumer-groups --bootstrap-server fqdn:9092 --describe --group groupname-for-consumer
> Error: Executing consumer group command failed due to null
> [2018-06-12 01:32:34,179] DEBUG Exception in consumer group command (kafka.admin.ConsumerGroupCommand$)
> java.lang.NullPointerException
> at scala.Predef$.Long2long(Predef.scala:363)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$getLogEndOffsets$2.apply(ConsumerGroupCommand.scala:612)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$getLogEndOffsets$2.apply(ConsumerGroupCommand.scala:610)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.List.foreach(List.scala:392)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.immutable.List.map(List.scala:296)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.getLogEndOffsets(ConsumerGroupCommand.scala:610)
> at kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.describePartitions(ConsumerGroupCommand.scala:328)
> at kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.collectConsumerAssignment(ConsumerGroupCommand.scala:308)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.collectConsumerAssignment(ConsumerGroupCommand.scala:544)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10$$anonfun$13.apply(ConsumerGroupCommand.scala:571)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10$$anonfun$13.apply(ConsumerGroupCommand.scala:565)
> at scala.collection.immutable.List.flatMap(List.scala:338)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10.apply(ConsumerGroupCommand.scala:565)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10.apply(ConsumerGroupCommand.scala:558)
> at scala.Option.map(Option.scala:146)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.collectGroupOffsets(ConsumerGroupCommand.scala:558)
> at kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.describeGroup(ConsumerGroupCommand.scala:271)
> at kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.describeGroup(ConsumerGroupCommand.scala:544)
> at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:77)
> at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala)
> [2018-06-12 01:32:34,255] DEBUG Removed sensor with name connections-closed: (org.apache.kafka.common.metrics.Metrics){code}


