Posted to issues@zookeeper.apache.org by "Mathieu Gaudin (Jira)" <ji...@apache.org> on 2023/02/08 09:21:00 UTC
[jira] [Resolved] (ZOOKEEPER-4358) Latency metrics showing surprising results for a Kerberos-enabled cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mathieu Gaudin resolved ZOOKEEPER-4358.
---------------------------------------
Resolution: Not A Problem
> Latency metrics showing surprising results for a Kerberos-enabled cluster
> -------------------------------------------------------------------------
>
> Key: ZOOKEEPER-4358
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4358
> Project: ZooKeeper
> Issue Type: Bug
> Components: metric system
> Affects Versions: 3.6.2
> Reporter: Mathieu Gaudin
> Priority: Minor
> Attachments: image-2021-08-27-16-10-28-783.png, image-2021-08-27-16-37-50-112.png
>
>
> Hi,
> I'm trying to understand why the min/avg/max latency values are showing surprising results. The graph below shows the max latency value of a particular node for the last 7 days. The value increases gradually over time and only ever decreases when the node is restarted, as if the metric value gets reset.
> [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/ServerStats.java#L226]
> !image-2021-08-27-16-10-28-783.png|width=984,height=204!
> * 3 nodes
> * Kerberos enabled
> * TGT ticket cache enabled
> I believe the min/avg/max latency values should show more realistic variation. It seems very unlikely that the max latency value is expected to keep increasing for as long as the node is running.
> [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/ServerStats.java#L142]
>     public void updateLatency(Request request, long currentTime) {
>         long latency = currentTime - request.createTime;
>         if (latency < 0) {
>             return;
>         }
>         requestLatency.addDataPoint(latency);  // <-- line in question
>         if (request.getHdr() != null) {
>             // Only quorum request should have header
>             ServerMetrics.getMetrics().UPDATE_LATENCY.add(latency);
>         } else {
>             // All read request should goes here
>             ServerMetrics.getMetrics().READ_LATENCY.add(latency);
>         }
>     }
> The method it calls leads me to think that the max latency metric is only updated when the currently stored maximum happens to be lower than the new value.
> [https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/metric/AvgMinMaxCounter.java#L51]
>     private void setMax(long value) {
>         long current;
>         while (value > (current = max.get()) && !max.compareAndSet(current, value)) {  // <-- lines in question
>             // no op
>         }
>     }
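
To make the behaviour of the loop quoted above concrete, here is a minimal, self-contained sketch. `MaxTracker`, `add`, `getMax` and `reset` are hypothetical names chosen for illustration and are not the ZooKeeper classes themselves; only the compare-and-set loop mirrors the quoted `setMax`. Under that assumption, the stored maximum can only rise until something explicitly resets it, which would be consistent with a graph that only drops when the process restarts.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a max-tracking counter; not the ZooKeeper class.
public class MaxTracker {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    // Raise the stored maximum only when the new value is larger.
    // The loop retries if another thread changed max between get() and compareAndSet().
    public void add(long value) {
        long current;
        while (value > (current = max.get()) && !max.compareAndSet(current, value)) {
            // no op: re-read and retry
        }
    }

    public long getMax() {
        return max.get();
    }

    // Illustrative reset; without a call like this the maximum never decreases.
    public void reset() {
        max.set(Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        MaxTracker tracker = new MaxTracker();
        long[] samples = {5, 120, 7, 3, 40};
        for (long sample : samples) {
            tracker.add(sample);
            System.out.println("sample=" + sample + " max=" + tracker.getMax());
        }
        tracker.reset();
        tracker.add(9);
        System.out.println("after reset: max=" + tracker.getMax());
    }
}
```

In this sketch a single slow sample (120) pins the reported maximum for every later reading, however small the following samples are, until `reset()` is called.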
> Below is a graph for a particular node from a completely different cluster, covering the last 2 days. The node has not been restarted and all the data is from the same process. It shows more realistic variation in the max latency metric, as one would normally expect.
> !image-2021-08-27-16-37-50-112.png|width=1084,height=222!
> Thanks in advance for your time,
> Math
--
This message was sent by Atlassian Jira
(v8.20.10#820010)