Posted to users@kafka.apache.org by Rajiv Kurian <ra...@signalfx.com> on 2016/04/27 20:23:08 UTC

Debugging high log flush latency

We monitor the p95 log flush latency on all our Kafka nodes, and
occasionally we see it creep up on one node from the regular figure of
under 15 ms to above 150 ms.

Restarting the node usually doesn't help. The latency seems to recover on
its own over time, but we are not sure of the underlying cause. The affected
broker's bytes-in/second and messages-in/second are in line with the other
brokers in the cluster. When one of these incidents happens, it usually
lasts for hours.

Thanks,
Rajiv