Posted to jira@kafka.apache.org by "Chris Egerton (Jira)" <ji...@apache.org> on 2021/06/22 17:39:00 UTC

[jira] [Created] (KAFKA-12980) Allow consumers to return from poll when position advances

Chris Egerton created KAFKA-12980:
-------------------------------------

             Summary: Allow consumers to return from poll when position advances
                 Key: KAFKA-12980
                 URL: https://issues.apache.org/jira/browse/KAFKA-12980
             Project: Kafka
          Issue Type: Improvement
          Components: consumer
            Reporter: Chris Egerton


When {{Consumer::poll}} is invoked on a topic with an open transaction, and that transaction is then aborted, {{poll}} does not return before its timeout expires unless there are other records available in that topic after the aborted transaction.

Instead, {{poll}} could return in this case, even when no records are available.

This would simplify reading to the end of a topic, a pattern in which the end offsets of the topic are listed and a consumer for that topic is then polled until its [position|https://kafka.apache.org/28/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#position(org.apache.kafka.common.TopicPartition)] is at or beyond each of those offsets (for example, [Connect does this|https://github.com/apache/kafka/blob/fce771579c3e20f20949c4c7e0a5e3a16c57c7f0/connect/runtime/src/main/java/org/apache/kafka/connect/util/KafkaBasedLog.java#L322-L345] when reading to the end of any of its internal topics).

We could update the existing language in the docs for {{Consumer::poll}} from
{quote}This method returns immediately if there are records available.
{quote}
to
{quote}This method returns immediately if there are records available or if the position advances past control records.
{quote}
 

A workaround for existing users who would like this behavior is to poll with short timeouts and manually check the consumer's position between polls. This is fairly tedious, however, and may lead to excess CPU and network utilization, depending on how quickly the caller needs to know that the end of the topic has been reached.
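
As a rough illustration of that workaround, here is a minimal sketch of the loop. To keep it self-contained it does not use the Kafka client library: {{PositionedPoller}} is a hypothetical stand-in for the {{poll}} and {{position}} calls on a consumer, partitions are identified by plain strings, and {{readToEnd}}, {{pollInterval}} and {{deadline}} are names invented for this example. The {{endOffsets}} map is assumed to have been listed before the loop starts.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

public class ReadToEndSketch {

    /** Hypothetical stand-in for the two consumer calls the loop relies on. */
    public interface PositionedPoller {
        /** Stands in for Consumer::poll; returned records are ignored in this sketch. */
        void poll(Duration timeout);

        /** Stands in for Consumer::position for a single partition. */
        long position(String partition);
    }

    /**
     * Polls with a short timeout and re-checks positions after every poll.
     * Returns true once every position is at or beyond its end offset,
     * or false if the deadline passes first.
     */
    public static boolean readToEnd(PositionedPoller consumer,
                                    Map<String, Long> endOffsets,
                                    Duration pollInterval,
                                    Duration deadline) {
        long deadlineNanos = System.nanoTime() + deadline.toNanos();
        Map<String, Long> remaining = new HashMap<>(endOffsets);
        while (true) {
            // Drop every partition whose position has reached its end offset,
            // whether it got there via records or by skipping control records.
            remaining.entrySet().removeIf(e -> consumer.position(e.getKey()) >= e.getValue());
            if (remaining.isEmpty()) {
                return true;
            }
            if (System.nanoTime() - deadlineNanos >= 0) {
                return false;
            }
            // A short timeout bounds how long an advance past an aborted
            // transaction can go unnoticed, at the cost of more frequent polls.
            consumer.poll(pollInterval);
        }
    }
}
```

The trade-off described above shows up in {{pollInterval}}: a smaller value notices the advanced position sooner but polls more often. If {{poll}} returned when the position advances, the interval could be as long as the caller is willing to wait.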


