Posted to user@zookeeper.apache.org by Jun Li <jl...@gmail.com> on 2020/08/03 00:06:01 UTC

What are the semantics of the ZooKeeper metric "zookeeper_MaxRequestLatency" exported by the Prometheus Adapter?

Hi:

I have a 5-node ZooKeeper cluster deployed, and we monitor it by turning on
the Prometheus Adapter in ZooKeeper. On the dashboard, I see that after the
ZK cluster is restarted, "zookeeper_MaxRequestLatency" climbs from 0 ms to
2.7 seconds, then to 10 seconds, then to 15 seconds over about 40 minutes.
It then stays at 15 seconds for the following 12 hours as a flat line (that
is, the value no longer changes).

All of the replicas show similar behavior, with small differences in the
max request latency value.

So my questions are:

(1) What does "max request latency" mean? It seems to be the maximum
latency observed since the process started, and NOT one calculated over,
say, a 5-minute window. Is that correct?

(2) What is a reasonable max request latency? I am concerned that the
15 seconds I am seeing is too high.
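
To make question (1) concrete, here is a small Python sketch of the two
behaviors I am trying to tell apart. It is only an illustration: the class
names, the 300-second window, and the sample values are all made up, and it
is not taken from ZooKeeper's actual implementation.

```python
from collections import deque


class LifetimeMax:
    """Max of every sample seen since the process started; never decreases."""

    def __init__(self):
        self.value = 0

    def record(self, latency_ms):
        self.value = max(self.value, latency_ms)


class WindowedMax:
    """Max of the samples recorded within the last `window_s` seconds."""

    def __init__(self, window_s):
        self.window_s = window_s
        self.samples = deque()  # (timestamp_s, latency_ms), oldest first

    def record(self, now_s, latency_ms):
        self.samples.append((now_s, latency_ms))

    def value(self, now_s):
        # Drop samples that have aged out of the window, then take the max.
        while self.samples and self.samples[0][0] <= now_s - self.window_s:
            self.samples.popleft()
        return max((lat for _, lat in self.samples), default=0)


def demo():
    lifetime = LifetimeMax()
    windowed = WindowedMax(window_s=300)
    # Hypothetical (time in seconds, latency in ms) samples:
    # one slow request early on, then only fast ones.
    for t, lat in [(0, 20), (60, 15000), (120, 30), (600, 25), (660, 40)]:
        lifetime.record(lat)
        windowed.record(t, lat)
    return lifetime.value, windowed.value(now_s=700)


if __name__ == "__main__":
    print(demo())
```

With these made-up samples, demo() returns (15000, 40). The lifetime
maximum stays pinned at the single slow request, which would match the flat
line I see on the dashboard, while a 5-minute windowed maximum would have
dropped back down once that request aged out.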

Regards,

Jun