Posted to jira@kafka.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/02/11 22:41:00 UTC

[jira] [Commented] (KAFKA-6397) Consumer should not block setting initial positions of unavailable partitions

    [ https://issues.apache.org/jira/browse/KAFKA-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360154#comment-16360154 ] 

ASF GitHub Bot commented on KAFKA-6397:
---------------------------------------

hachikuji opened a new pull request #4557: KAFKA-6397: Consumer should not block setting positions of unavailable partitions
URL: https://github.com/apache/kafka/pull/4557
 
 
   Prior to this patch, the consumer always blocks in poll() if there are any partitions which are awaiting their initial positions. This behavior was inconsistent with normal fetch behavior since we allow fetching on available partitions even if one or more of the assigned partitions becomes unavailable _after_ initial offset lookup. With this patch, the consumer will do offset resets asynchronously, which allows other partitions to make progress even if the initial positions for some partitions cannot be found.
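
The non-blocking behavior described above can be pictured with a toy model. This is only a sketch, not the code from the patch: the class `AsyncResetSketch`, its `poll(Set)` signature, and the `available` set are hypothetical names, and the offset reset is simplified to complete immediately (at an assumed offset of 0) once a partition's leader is reachable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model (not Kafka code): poll skips partitions without a reachable leader instead of blocking. */
final class AsyncResetSketch {
    // Partition name -> fetch position; null while the initial position is still unknown.
    private final Map<String, Long> positions = new LinkedHashMap<>();

    AsyncResetSketch(List<String> assigned) {
        for (String partition : assigned) {
            positions.put(partition, null);
        }
    }

    /**
     * One poll pass. `available` is a hypothetical stand-in for the set of
     * partitions whose leader can currently be reached.
     */
    List<String> poll(Set<String> available) {
        List<String> fetched = new ArrayList<>();
        for (Map.Entry<String, Long> entry : positions.entrySet()) {
            if (!available.contains(entry.getKey())) {
                continue; // unavailable: leave the position unset and move on
            }
            if (entry.getValue() == null) {
                entry.setValue(0L); // assumption: the reset resolves to offset 0
            }
            entry.setValue(entry.getValue() + 1); // pretend one record was fetched
            fetched.add(entry.getKey());
        }
        return fetched;
    }

    /** Returns the current position, or null if it has not been set yet. */
    Long position(String partition) {
        return positions.get(partition);
    }
}
```

In this model a call to `poll` with only one of two partitions available returns data for that partition and leaves the other without a position, which is the situation the old blocking behavior never allowed to arise.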
   
   I have added several new unit tests in `FetcherTest` and `KafkaConsumerTest` to verify the new behavior. One minor compatibility implication worth mentioning is apparent from the change I made in `DynamicBrokerReconfigurationTest`. Previously it was possible to assume that all partitions had a fetch position after `poll()` completed with a non-empty assignment. This assumption is no longer generally true, but you can force the positions to be updated using the `position()` API, which still blocks indefinitely until a position is available.
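
For code that relied on the old assumption, one possible workaround is to ask for each assigned partition's position explicitly. The helper below is a hypothetical sketch, not part of the patch: `PositionForcer` and `forcePositions` are made-up names, and the lookup is passed in as a function so the snippet compiles without the Kafka client on the classpath.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

/** Hypothetical helper: resolves a position for every assigned partition. */
final class PositionForcer {
    /**
     * Calls positionLookup once per partition and collects the results.
     * If the lookup blocks (as the position() API is described to do),
     * this method blocks until every partition has a position.
     */
    static <P> Map<P, Long> forcePositions(Collection<P> assignment,
                                           ToLongFunction<P> positionLookup) {
        Map<P, Long> resolved = new LinkedHashMap<>();
        for (P partition : assignment) {
            resolved.put(partition, positionLookup.applyAsLong(partition));
        }
        return resolved;
    }
}
```

With a real consumer this would presumably be invoked as `PositionForcer.forcePositions(consumer.assignment(), consumer::position)`. Since `position()` can block indefinitely for an unavailable partition, doing this reintroduces the stall the patch removes, so it seems best suited to tests or startup checks.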
   
   Note that this patch also removes the logic to cache committed offsets in `SubscriptionState` since it was no longer needed (the consumer's `committed()` API always does an offset lookup anyway). In addition to avoiding the complexity of maintaining the cache, this avoids wasteful offset lookups to refresh the cache when `commitAsync()` is used.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   



> Consumer should not block setting initial positions of unavailable partitions
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-6397
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6397
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>              Labels: consumer
>             Fix For: 1.2.0
>
>
> Currently the consumer will block in poll() after receiving its assignment in order to set the starting offset for every assigned partition. If the topic is deleted or if a partition is unavailable, the consumer can be stuck indefinitely. Most of the time this is not a problem since the starting offset is obtained from the committed offsets, which does not depend on partition availability. However, if there are no committed offsets or if the user has manually called {{seekToBeginning}} or {{seekToEnd}}, then we will need to do a lookup for the starting offset from the partition leader, which will stall the consumer until the partition is available or recreated. It would be better to let the consumer fetch on partitions which are available and periodically check availability for the rest. 


