Posted to users@kafka.apache.org by Javier Arias Losada <ja...@gmail.com> on 2019/01/25 09:43:54 UTC

long time between consumer gracefully stopping and rebalance

Hello Kafka Users,

This is a follow-up to a previous question I sent about high latency in
our Kafka Streams service during rebalancing.

As a quick reminder, our stateless service has very tight latency
requirements, and we are seeing unacceptably high latency (some messages
are consumed more than 10 secs after being produced) when a consumer leaves
the group gracefully.

After further investigation we have found out that at least for small
consumer groups the rebalance is taking less than 500ms. So we thought,
where is this huge latency when removing one consumer (>10s) coming from?

We realized that it is the time between the consumer exiting gracefully and
the rebalance kicking in.

Those previous tests were executed with all-default configurations in both
Kafka and the Kafka Streams application.
We then changed the configuration to:

  properties.put("max.poll.records", 50); // defaults to 1000 in Kafka Streams
  properties.put("auto.offset.reset", "latest"); // defaults to latest
  properties.put("heartbeat.interval.ms", 1000);
  properties.put("session.timeout.ms", 6000);
  properties.put("group.initial.rebalance.delay.ms", 0);
  properties.put("max.poll.interval.ms", 6000);

And the result is that the time for the rebalance to start dropped to a bit
more than 5 secs.

We also tested killing a consumer non-gracefully with 'kill -9'; the
result is that the time to trigger the rebalance is exactly the same.
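
As a rough sanity check on that figure: assuming the group coordinator only
notices the departure when the member's session expires (which the identical
'kill -9' timing hints at, but which is not confirmed here), the delay should
fall between session.timeout.ms minus heartbeat.interval.ms and
session.timeout.ms, depending on how recently the last heartbeat was sent.
A minimal sketch of that arithmetic with the values above; the class and
method names are purely illustrative:

```java
public class SessionExpiryWindow {

    // Time from the consumer stopping until its session expires, given how long
    // before stopping it sent its last heartbeat. Assumes (hypothetically) that
    // the session timer is restarted on every heartbeat and that no explicit
    // leave-group request reaches the coordinator.
    public static long detectionDelayMs(long sessionTimeoutMs, long msSinceLastHeartbeat) {
        return sessionTimeoutMs - msSinceLastHeartbeat;
    }

    public static void main(String[] args) {
        long sessionTimeoutMs = 6000;    // session.timeout.ms from the test
        long heartbeatIntervalMs = 1000; // heartbeat.interval.ms from the test

        // Worst case: the consumer stops right after sending a heartbeat.
        long worst = detectionDelayMs(sessionTimeoutMs, 0);
        // Best case: the consumer stops just before the next heartbeat was due.
        long best = detectionDelayMs(sessionTimeoutMs, heartbeatIntervalMs);

        System.out.println("expected delay between " + best + " and " + worst + " ms");
    }
}
```

With 6000 and 1000 this gives a window of 5000 to 6000 ms, which is in line
with the "a bit more than 5 secs" we measured.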

So we have some questions:
- We expected that when the consumer stops gracefully the rebalance would
be triggered right away. Is that the expected behaviour? Why isn't it
happening in our tests?
- How can we reduce the time between a consumer gracefully exiting and the
rebalance being triggered? What are the tradeoffs? More unneeded rebalances?


For more context, our Kafka version is 1.1.0: we installed Confluent Platform
4.1.0, and looking at the libs we found, for example,
kafka/kafka_2.11-1.1.0-cp1.jar. On the consumer side, we are using Kafka
Streams 2.1.0.

Thank you very much.
Javier Arias Losada.