Posted to jira@kafka.apache.org by "Alex Dunayevsky (JIRA)" <ji...@apache.org> on 2018/04/03 14:33:00 UTC

[jira] [Created] (KAFKA-6743) ConsumerPerformance fails to consume all messages on topics with large number of partitions

Alex Dunayevsky created KAFKA-6743:
--------------------------------------

             Summary: ConsumerPerformance fails to consume all messages on topics with large number of partitions
                 Key: KAFKA-6743
                 URL: https://issues.apache.org/jira/browse/KAFKA-6743
             Project: Kafka
          Issue Type: Bug
          Components: core, tools
    Affects Versions: 0.11.0.2
            Reporter: Alex Dunayevsky


ConsumerPerformance fails to consume all messages on topics with a large number of partitions because its polling loop timeout is relatively short (1000 ms by default) and the end user can neither access nor modify it.

Demo: Create a topic with 10 000 partitions, send 50 000 000 records of 100 bytes each using kafka-producer-perf-test, and consume them using kafka-consumer-perf-test (ConsumerPerformance). You will likely notice that the number of records returned by kafka-consumer-perf-test is many times smaller than the expected 50 000 000. This is caused by the way ConsumerPerformance is implemented: in some cases it can take long enough to process/iterate through the records polled in batches that the elapsed time exceeds the default hardcoded polling loop timeout, which is probably not what we want from this utility.
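
To make the failure mode concrete, here is a minimal sketch in Python of a polling loop that gives up once no records have arrived for longer than a timeout. It is a simplified model written for this report, not the actual ConsumerPerformance source: the function name, its parameters, and the simulated clock are all assumptions, and what actually causes the quiet period on a 10 000 partition topic is not modeled.

```python
from typing import Iterable, List


def consume_with_idle_timeout(
    batches: Iterable[List[int]],
    poll_cost_ms: int,
    expected: int,
    timeout_ms: int = 1000,
) -> int:
    """Simplified model of a polling loop with an idle timeout.

    batches: what successive poll() calls return; an empty list models
        a poll that came back with no records.
    poll_cost_ms: simulated wall-clock cost of one loop iteration.
    expected: number of records the caller wants to read.
    timeout_ms: how long the loop tolerates seeing no records.
    """
    now_ms = 0            # simulated clock, not real time
    last_consumed_ms = 0  # time of the last non-empty poll
    consumed = 0
    source = iter(batches)
    while consumed < expected and now_ms - last_consumed_ms <= timeout_ms:
        batch = next(source, None)
        if batch is None:
            break  # simulated input exhausted
        now_ms += poll_cost_ms
        if batch:
            last_consumed_ms = now_ms
        consumed += len(batch)
    return consumed


# Hypothetical workload: 1.5 s with no records, then 4 batches of 500.
workload = [[] for _ in range(15)] + [[0] * 500 for _ in range(4)]

print(consume_with_idle_timeout(workload, poll_cost_ms=100, expected=2000))
print(consume_with_idle_timeout(workload, poll_cost_ms=100, expected=2000,
                                timeout_ms=5000))
```

With the 1000 ms default the loop exits during the quiet period and reports 0 records; with a 5000 ms timeout the same workload is read in full (2000 records). The only difference between the two runs is the timeout value.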

We have two options:
1) Increase the polling loop timeout in the ConsumerPerformance implementation. It defaults to 1000 ms and is hardcoded, so it cannot be changed, but we could expose it as an OPTIONAL kafka-consumer-perf-test parameter so that it is configurable at the script level and available to the end user.
2) Decrease max.poll.records at the Consumer config level. This is not a good option, though, since we do not want to touch the default settings.
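
Option 1 could look roughly like the following Python sketch of an optional command-line parameter that keeps the current 1000 ms default. This only illustrates the idea: the flag name `--timeout`, the program name, and the validation rule are assumptions made here, and the real tool would need the equivalent change in its own option parser.

```python
import argparse
from typing import List, Optional

# Current hardcoded value described in this issue.
DEFAULT_POLL_LOOP_TIMEOUT_MS = 1000


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse options for a hypothetical consumer perf tool."""
    parser = argparse.ArgumentParser(prog="consumer-perf-sketch")
    parser.add_argument(
        "--timeout",  # hypothetical flag name
        type=int,
        default=DEFAULT_POLL_LOOP_TIMEOUT_MS,
        help="Maximum time in ms the polling loop waits without "
             "receiving records (optional, default: %(default)s).",
    )
    args = parser.parse_args(argv)
    if args.timeout <= 0:
        parser.error("--timeout must be a positive number of milliseconds")
    return args


# Omitting the flag keeps today's behavior; passing it overrides the default.
print(parse_args([]).timeout)
print(parse_args(["--timeout", "10000"]).timeout)
```

Because the parameter is optional and defaults to the existing value, existing invocations would behave exactly as before.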



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)